I am working on a small multithreading project. The system can be divided into two parts, A and B, and the data flows from A to B.
Part A keeps fetching raw data from the outside world, applies some transformations, and then generates thousands of new data items; let’s call each one an A_OUTPUT.
Part B does some calculations on each A_OUTPUT and generates even more data, perhaps ten times the number of A_OUTPUTs.
I am confused about how to synchronize the 2 parts.
My own design is to create a work queue between the two parts, with a lock protecting it, plus an event that indicates whether the work queue is empty.
Part A consists of multiple threads. Each thread fetches data from outside and generates A_OUTPUTs. Each time an A thread generates an A_OUTPUT, it acquires the queue lock, pushes the A_OUTPUT onto the queue, releases the lock, and then triggers the event.
Part B consists of a supervisor thread and several worker threads. The supervisor thread initially blocks on the event. Once the event is triggered, the supervisor locks the queue, fetches all A_OUTPUTs from it, releases the lock, dispatches the A_OUTPUTs to the worker threads, and then waits on the event again.
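A minimal sketch of the design described above, assuming C++11 and using a condition variable in place of the event. The names `AOutput`, `WorkQueue`, `push`, and `drain` are hypothetical and only serve the illustration.

```cpp
#include <condition_variable>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical payload type standing in for one A_OUTPUT.
struct AOutput {
    int value;
};

// Hypothetical work queue shared between part A and part B's supervisor.
class WorkQueue {
public:
    // Called by each A thread: lock, push, unlock, then trigger the event.
    void push(AOutput item) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push_back(std::move(item));
        }
        event_.notify_one();
    }

    // Called by B's supervisor: block until the queue is non-empty,
    // then take every pending A_OUTPUT in one step.
    std::vector<AOutput> drain() {
        std::vector<AOutput> batch;
        std::unique_lock<std::mutex> lock(mutex_);
        event_.wait(lock, [this] { return !items_.empty(); });
        batch.swap(items_);  // constant-time hand-over while the lock is held
        return batch;
    }

private:
    std::mutex mutex_;
    std::condition_variable event_;
    std::vector<AOutput> items_;
};
```

The supervisor would call `drain()` in a loop and hand each returned batch to the worker threads.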
The problem with this design is obvious: B's supervisor thread races with the multiple A threads for the queue lock. By the time B finally owns the lock, there may already be ten or more A_OUTPUTs in the queue, and the oldest one may have been generated a long time ago. I want each A_OUTPUT to be processed as quickly as possible.
I know I could split the work queue into several smaller queues, or add more B supervisor threads to the lock battle, to shorten the average time each A_OUTPUT waits before it gets processed. But is there a more appropriate design?
Another question: are there any paradigms or design patterns for multithreaded programs with different purposes?
This is a classic problem, the producer-consumer problem, which is well described on Wikipedia.
I can recommend the following approach:
Synchronize access to the queue with a mutex. Keep two condition variables: one to signal that the queue is not full (you need to handle the case where the Producer produces more data than the Consumer can consume) and another to signal that the queue has data.
- The Producer checks whether the queue is full. If it is, it waits for the “not full” condition; otherwise it produces some data, puts it into the queue, and notifies the “has data” condition.
- The Consumer checks whether the queue has any data. If it does not, it waits for the “has data” condition; otherwise it consumes the data and notifies the “not full” condition.
Also, you can use a lock-free queue for better performance. Check TBB or the recently announced Boost.Lockfree (under review at the moment). By the way, with TBB the whole task is much simpler: just use their dispatcher and containers and forget about explicit synchronization.
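The mutex and two condition variables described above could be sketched in C++11 roughly as follows. This is only an illustration under assumptions: `BoundedQueue`, its `capacity` parameter, and `run_demo` are hypothetical names, and the capacity of 8 is arbitrary.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical bounded queue: one mutex, two condition variables.
template <typename T>
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);  // a zero-capacity queue would block every push
    }

    // Producer side: wait for "not full", enqueue, notify "has data".
    void push(T item) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_full_.wait(lock, [this] { return items_.size() < capacity_; });
        items_.push(std::move(item));
        lock.unlock();  // release before notifying so the woken thread can lock at once
        has_data_.notify_one();
    }

    // Consumer side: wait for "has data", dequeue, notify "not full".
    T pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        has_data_.wait(lock, [this] { return !items_.empty(); });
        T item = std::move(items_.front());
        items_.pop();
        lock.unlock();
        not_full_.notify_one();
        return item;
    }

private:
    std::mutex mutex_;
    std::condition_variable not_full_;
    std::condition_variable has_data_;
    std::queue<T> items_;
    const std::size_t capacity_;
};

// Hypothetical demo: one producer thread pushes 0..99 while the calling
// thread consumes all 100 values and returns their sum.
long run_demo() {
    BoundedQueue<int> queue(8);
    std::thread producer([&queue] {
        for (int i = 0; i < 100; ++i) queue.push(i);
    });
    long sum = 0;
    for (int i = 0; i < 100; ++i) sum += queue.pop();
    producer.join();
    return sum;
}
```

With this shape, several B worker threads could call `pop()` directly, so a separate supervisor thread may not be needed; whether that fits depends on how the work has to be dispatched.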