Recently I had to deal with a problem that I imagined would be pretty common: given a database table with a large (million+) number of rows to be processed, and various processors running in various machines / threads, how to safely allow each processor instance to get a chunk of work (say 100 items) without interfering with one another?
The reason I am getting a chunk at a time is for performance reasons – I don’t want to go to the database for each item.
There are a few approaches – you could associate each processor a token, and have a SPROC that sets that token against the next [n] available items; perhaps something like:
(note – needs suitable isolation-level; perhaps serializable:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE)(edited to fix TSQL)
You’d also want a timeout (
@expiry) on this, so that when a processor goes down you don’t lose work. You’d also need a task to clear the owner on things that are past theirExpiry.