I have a database-synchronisation task that takes some time to process: there are in the region of 120k leaf records, and they are remote and relatively slow to access.
Currently, my app does a fairly naive process of:
1. Get list of all the local Contacts
2. For each local contact, get all the related data
3. Then get the matching remote contact
4. Compare the two and do stuff to bring them in sync
Step 1 returns data before it’s finished, and step 4 doesn’t involve comparisons between different contacts in the same set.
What I was hoping to do was use some sort of queue construct and start populating it in step 1, then immediately move onto step 2 and start processing items as they come in, using multiple threads.
The process then becomes:
- Start populating the queue with contacts
- While there are items in the queue
  - Start a thread and:
    - Take the front contact from the queue
    - Fetch the remote contact
    - Compare them
    - Perform the required updates
Am I correct in the assumption that I can create a new ConcurrentQueue, start populating it, then loop over it as I might a single-threaded simple collection?
(I’ve not put in any error-checking or the actual threading, to keep the example simple)
    using System.Collections.Concurrent;

    class Program
    {
        static void Main(string[] args)
        {
            Processor p = new Processor();
            p.Process();
        }
    }

    class Processor
    {
        // volatile: written by the fetching thread, read by the processing loop
        volatile bool FetchComplete = false;
        ConcurrentQueue<Contact> q = new ConcurrentQueue<Contact>();

        public void Process()
        {
            this.PopulateQueue(); // this will be fired off using QueueUserWorkItem for example

            // keep looping until the fetch has finished AND the queue has been drained,
            // otherwise items still queued when the fetch completes would be skipped
            while (!FetchComplete || !q.IsEmpty)
            {
                Contact contact;
                // TryDequeue returns false if the queue is empty at this moment
                if (q.TryDequeue(out contact))
                {
                    ProcessContact(contact); // this will also be in QueueUserWorkItem
                }
            }
        }

        // a long running process that fills the queue with Contacts
        private void PopulateQueue()
        {
            this.FetchComplete = false;
            // foreach contact in database
            Contact contact = new Contact(); // contact will come from DB
            this.q.Enqueue(contact);
            // end foreach
            this.FetchComplete = true;
        }

        private void ProcessContact(Contact contact)
        {
            // do magic with contact
        }
    }
You might be better off using BlockingCollection instead of ConcurrentQueue. The reason is that the former will block the thread calling Take until an item appears in the queue. This is useful when the thread processing the Contact instances empties the queue before the fetching thread has retrieved them all.

In general your strategy is pretty solid. I use it all the time. It is often referred to as the producer-consumer pattern. When more than two stages are involved in the processing, it is called the pipeline pattern. In that case you would have two or more queues instead of the typical one. You can imagine scenarios where each stage forwards the work item on to the next stage via another queue.
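
As a rough illustration of the producer-consumer approach with BlockingCollection, here is a minimal sketch rather than a drop-in solution. The Contact class, the SyncSketch.Run helper, the bounded capacity of 100, and the use of Task.Run for the worker threads are all assumptions made for the example; the real fetch, compare, and update logic would go where the comments indicate.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the Contact type from the question.
class Contact
{
    public int Id { get; set; }
}

static class SyncSketch
{
    // Runs one producer and several consumers.
    // Returns the number of contacts the consumers processed.
    public static int Run(int contactCount, int consumerCount)
    {
        int processed = 0;

        // Bounded (capacity is an arbitrary assumption) so the producer
        // cannot run arbitrarily far ahead of the consumers.
        using (var queue = new BlockingCollection<Contact>(boundedCapacity: 100))
        {
            // Producer: stands in for the slow "fetch local contacts" step.
            Task producer = Task.Run(() =>
            {
                try
                {
                    for (int i = 0; i < contactCount; i++)
                    {
                        queue.Add(new Contact { Id = i }); // blocks while the queue is full
                    }
                }
                finally
                {
                    // Signals that no more items are coming; this takes the
                    // place of the FetchComplete flag in the question.
                    queue.CompleteAdding();
                }
            });

            // Consumers: each blocks until an item is available, and its loop
            // ends once the queue is both empty and marked complete.
            Task[] consumers = Enumerable.Range(0, consumerCount)
                .Select(_ => Task.Run(() =>
                {
                    foreach (Contact contact in queue.GetConsumingEnumerable())
                    {
                        // Fetch the remote contact, compare, and update here.
                        Interlocked.Increment(ref processed);
                    }
                }))
                .ToArray();

            producer.Wait();
            Task.WaitAll(consumers);
        }

        return processed;
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine(SyncSketch.Run(contactCount: 1000, consumerCount: 4));
    }
}
```

Because GetConsumingEnumerable blocks instead of spinning, the consumers need neither a Count check nor a separate completion flag. For consumers that spend most of their time waiting on a slow remote source, dedicated threads or TaskCreationOptions.LongRunning may suit better than Task.Run; that choice is left open here.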