I need to do the following in four zillion threads at the same time (well, probably closer to 2 threads, but either way…): If such-and-such does not exist, insert it.
Currently, the application says:
select rows from the db
if row count == 0:
    do some stuff
    insert a row
I have a situation where this runs in 2 separate threads at the same time. Both threads may (and so far always do) check for existing rows before either one inserts a row, so we end up with duplicate rows. This is bad.
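To make the interleaving concrete, here is a minimal in-memory sketch of the same check-then-insert race. It is not the real application code: the list stands in for the table, the `CyclicBarrier` artificially forces the unlucky timing (both checks finish before either insert), and the class and variable names are made up for illustration.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CyclicBarrier;

// Hypothetical stand-in for the real code path; the list plays the role of the table.
final class CheckThenInsertRace {

    static int run() throws InterruptedException {
        List<String> table = new CopyOnWriteArrayList<>();
        // Forces both threads to finish their check before either one inserts.
        CyclicBarrier bothChecked = new CyclicBarrier(2);

        Runnable worker = () -> {
            boolean missing = table.isEmpty();      // "select rows; row count == 0?"
            try {
                bothChecked.await();                // wait until the other thread has checked too
            } catch (Exception e) {
                throw new IllegalStateException(e);
            }
            if (missing) {
                table.add("grant");                 // "insert a row"
            }
        };

        Thread a = new Thread(worker);
        Thread b = new Thread(worker);
        a.start();
        b.start();
        a.join();
        b.join();
        return table.size();
    }

    public static void main(String[] args) throws InterruptedException {
        // Both threads saw an empty table, so both inserted.
        System.out.println("rows inserted: " + run()); // prints "rows inserted: 2"
    }
}
```

Each operation on the list is thread-safe on its own; the duplicate comes from the gap between the check and the insert, which is the same gap the database version has.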
All the algorithms I can think of fall short one way or another.
For example, if I do this:
open tx
insert a row
select rows
if row count > 1, rollback
else commit
When the transaction uses READ_COMMITTED isolation, one thread won’t see the other thread’s uncommitted insert, so both commit and duplicates are possible. With READ_UNCOMMITTED isolation, each thread may see the other thread’s row and both will roll back, leaving no row at all. I think I’ll have the same problem if I use a MERGE statement instead of inserting then selecting (or vice versa).
Is there an algorithm that guarantees exactly 1 row will be inserted when the above runs concurrently? FWIW, I’m using DB2, MyBatis, and XML-based tx management with Spring, but I’ll gladly translate from something else if possible.
I’m a newbie when it comes to concurrency, so if this question reveals ignorance remedied by a book or article you know of, please share.
EDIT:
The insert statement above is to lazily grant users something iff they don’t have it. In this case, a uniqueness constraint would be appropriate. Elsewhere in the app, however, it would not be appropriate. 🙁
I’ll make the example a bit more concrete soon so it’s easier to understand.
The only solution we could find was to issue the “insert if no records meeting criteria exist” command in a serializable transaction. The SQL itself could be anything that inserts records if none meeting certain criteria exist (e.g. MERGE would work). In this scenario, once one thread checks for the existence of a row meeting the criteria of the select statement, no other thread may insert anything meeting those criteria until the tx is completed. For performance, we opened a new tx just for the merge operation and committed immediately upon completion.
We tested this solution with DB2 and Oracle with success.
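For anyone who wants to see the shape of this outside Spring, here is a rough sketch in plain JDBC. It is not the configuration we actually used (ours was Spring XML tx management with MyBatis). `SerializableTx` and `Work` are hypothetical names, and the caller is assumed to pass a connection dedicated to this one operation, which plays the part of the "new tx just for the merge". The MERGE or insert-if-absent SQL is supplied by the caller, so none is shown here.

```java
import java.sql.Connection;
import java.sql.SQLException;

// Hypothetical helper: runs one unit of work in its own serializable transaction.
final class SerializableTx {

    @FunctionalInterface
    interface Work {
        void run(Connection conn) throws SQLException;
    }

    static void run(Connection conn, Work work) throws SQLException {
        // Remember the connection's settings so they can be restored afterwards.
        int previousIsolation = conn.getTransactionIsolation();
        boolean previousAutoCommit = conn.getAutoCommit();

        // Set the isolation level before any statement runs in the transaction.
        conn.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE);
        conn.setAutoCommit(false);
        try {
            work.run(conn);   // e.g. execute the MERGE / insert-if-absent statement
            conn.commit();    // commit immediately so locks are held as briefly as possible
        } catch (SQLException | RuntimeException e) {
            conn.rollback();
            throw e;
        } finally {
            conn.setTransactionIsolation(previousIsolation);
            conn.setAutoCommit(previousAutoCommit);
        }
    }
}
```

A call would look something like `SerializableTx.run(conn, c -> { /* prepare and execute the MERGE on c */ });`. One caveat: under serializable isolation the database may reject one of two competing transactions with a deadlock or lock-timeout error, so the caller probably wants to catch that and retry.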