The combination of MySQL and its storage engine InnoDB is almost the most widely used relational database nowadays, and Repeatable Read is the most common in the isolation level.
However, compared to PostgreSQL, InnoDB has several problems that cannot be solved elegantly at the Repeatable Read level.
- Lost updates
- Phantom read
Lost updates in PostgreSQL can be completely solved without additional hacks. As for phantom reads, there are some small tricks that can be used, such as range types
and other mechanisms.
Nevertheless, MySQL still has to be careful to identify the pitfalls and deal with them properly by developers to solve such problems. In my previous article, we introduced three ways to address lost updates. Those approaches provide a more flexible solution to lost updates and are suitable for a variety of scenarios.
In this article, we will further explore how to properly solve the write skew caused by phantom reads.
There are many types of scenes that result in phantom reads, but in general, they all have the following pattern.
- Search a specific range.
- Do something according to the results of the range (Create, Update, Delete).
- The operation will directly affect the original range results.
Suppose it is only an update or a delete, the most straightforward way to avoid write skew is to use an exclusive lock. If you use FOR UPDATE
at the beginning of SELECT
, then two concurrent transactions will be forced to go one after the other, thus effectively avoiding the write skew in the race condition.
However, in the case of a create, the solution is not so intuitive. Because there is no corresponding row to lock in SELECT
, the row is created later. So how to solve it?
Meeting Room Booking System
Before introducing the solution, let’s use a practical example to describe the problem caused by phantom read.
There is a meeting room system that provides users to reserve a meeting room, and when the user has successfully reserved the meeting room, a new corresponding data will be added in the table as follows.
user | room_id | start_time | end_time |
---|---|---|---|
A | 123 | 2022-05-01 10:00 | 2022-05-01 11:00 |
The above table records that user A
reserved the meeting room 123
for one hour on 5/1 at 10 am.
The behavior of this system will be similar to the following pseudo code.
count = `SELECT COUNT(*) FROM booking
WHERE room_id = 123 AND
start_time < '2022-05-01 11:00' AND
end_time > '2022-05-01 10:00'`
if count == 0:
`INSERT INTO booking (user, room_id, start_time, end_time)
VALUES ('A', 123, '2022-05-01 10:00', '2022-05-01 11:00')`
When the user is sure that the meeting room is unoccupied for the corresponding time slot, then the user can insert an entry as a reservation and the next user will not have a time conflict. Doesn’t that seem nice?
The problem occurs when two users want to occupy the same time slot in the same meeting room simultaneously, and they can both pass the first SELECT
validation, so they can both insert a reservation, and a conflict occurs. And such a situation can not be solved by adding a lock, because there is no row to lock at the beginning.
Solve by Uniqueness (Incomplete Solution)
Since there is no way to turn a simultaneous operation into a sequential operation through a lock, we let one of them simply fail. To do so we need to add some constraints, e.g. unique constraints, to the table.
One approach is to create a unique constraint index on the room_id, start_time columns, so that the second person trying to reserve the same time slot will fail.
The problem is solved if we restrict the use of each room to a maximum of one hour.
But if the meeting room can be booked for more than an hour, another problem arises.
- User A is reserved for 5/1 from 10am to 12pm
- User B is reserved for 5/1 from 11:00 to 12:00
When both User A and B are operating at the same time, this unique constraint obviously cannot be effective, and then the conflict around the meeting room remains.
Materialize Conflicts (Correct Solution)
To solve such phantom reads, the developer must use some tricks to reveal conflicts hidden under the same table.
One way is to create a new table and pre-fill it with data to act as a coordinator for simultaneous operations. In the case of this meeting room system, we can create a table time_slots
that lists all time slots in advance as follows.
room_id | time_slot |
---|---|
123 | 2022-05-01 08:00 ~ 09:00 |
123 | 2022-05-01 09:00 ~ 10:00 |
123 | 2022-05-01 10:00 ~ 11:00 |
123 | 2022-05-01 11:00 ~ 12:00 |
When the meeting room is to be reserved, we not only execute SELECT
on the original booking
, but also SELECT
on time_slots
, and we can add FOR UPDATE
because the data already exists. It is worth noting that the new SELECT FOR UPDATE
is executed before the original SELECT
.
In that case, when the expected time slots of two simultaneous users overlap, they will be blocked by the exclusive lock and become one after the other, and the latter will fail directly because it sees the result of the previous completion.
Conclusion
I have to say such a solution is difficult and not intuitive. However, in order not to sacrifice performance when using MySQL, the isolation level is not configured to be Serializable, which means complexity must be traded off for performance during execution.
It is a trade-off between complexity and performance. In fact, using FOR UPDATE
to process synchronization in such a scenario does affect performance, and if booking
is a table that may have phantom reads in all contexts, then making booking
individually Serializable is a feasible solution.
When using a database, we must know the capabilities of the database and understand all the unsolvable situations of the database, so that we can know what kind of behaviors are potential risks when designing and developing.
In addition, how to properly address the risks is also an important topic. Although the use cases are not exactly the same for everyone, the patterns are similar, and learning how to solve each pattern will help you to deal with similar situations quickly in the future.
This article provides a solution for write skew when “creating” data, while the previous article is about solving write skew when “updating” data. These should cover most of the situations that you might encounter. If anyone has encountered other kinds of MySQL race conditions, please feel free to discuss them with me as well.