Suppose I have code like this:
for(i = 0; i < i_max; i++)
    for(j = 0; j < j_max; j++)
        // do something
and I want to run it on multiple threads (assume the // do something tasks are independent of each other; think of Monte Carlo simulations, for instance). My question is this: is it necessarily better to create a thread for each value of i than to create a thread for each value of j? Something like this:
for(i = 0; i < i_max; i++)
    create_thread(j_max);
Additionally: what would be a suitable number of threads? Should I just create i_max threads or, perhaps, use a semaphore so that only k < i_max threads run concurrently at any given time?
Thank you,
The best way to apportion the workload is workload-dependent.
Broadly: for a uniformly parallelizable workload (many similar, independent iterations), use OpenMP; for a heterogeneous workload, use a thread pool. Avoid managing your own threads if you can.
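If you do end up managing threads yourself, one common shape is a fixed number of workers sized to the hardware rather than one thread per value of i or j. The following is only a minimal sketch under that assumption: run_grid and its task parameter are hypothetical names, and the workers claim (i, j) pairs from a shared atomic counter instead of being gated by a semaphore.

```cpp
#include <algorithm>
#include <atomic>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper: runs task(i, j) for every pair in
// [0, i_max) x [0, j_max) on a fixed number of worker threads.
void run_grid(int i_max, int j_max, const std::function<void(int, int)>& task) {
    const long long total = static_cast<long long>(i_max) * j_max;
    std::atomic<long long> next{0};  // next unclaimed flat index

    // hardware_concurrency() may return 0 when the value is unknown,
    // so fall back to a single worker in that case.
    const unsigned workers = std::max(1u, std::thread::hardware_concurrency());

    auto worker = [&] {
        // Each worker keeps claiming the next flat index until none remain,
        // which balances the load even if tasks take different times.
        for (long long k = next.fetch_add(1); k < total; k = next.fetch_add(1)) {
            task(static_cast<int>(k / j_max), static_cast<int>(k % j_max));
        }
    };

    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) pool.emplace_back(worker);
    for (auto& t : pool) t.join();  // wait for all tasks to finish
}
```

With this layout the choice between "one thread per i" and "one thread per j" disappears: the two loops are flattened into i_max * j_max tasks, and the thread count depends on the machine rather than on the loop bounds. The task must be safe to call from several threads at once.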
A Monte Carlo simulation should be a good candidate for truly parallel code rather than a thread pool.
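As a rough illustration of the OpenMP route, here is a sketch that estimates pi by Monte Carlo over the same i/j loop nest. The functions run_task and estimate_pi are hypothetical stand-ins for the "// do something" body, and seeding each task from its index is just one simple way to keep the tasks independent.

```cpp
#include <cstdint>
#include <random>

// Hypothetical stand-in for "// do something": one independent task that
// throws `samples` random points into the unit square and counts those
// falling inside the quarter circle.
std::uint64_t run_task(int i, int j, int j_max, int samples) {
    // Each task owns its generator, seeded from the task index, so the
    // result does not depend on which thread happens to run it.
    std::mt19937_64 rng(static_cast<std::uint64_t>(i) * j_max + j);
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    std::uint64_t hits = 0;
    for (int s = 0; s < samples; ++s) {
        const double x = unit(rng);
        const double y = unit(rng);
        if (x * x + y * y <= 1.0) ++hits;
    }
    return hits;
}

double estimate_pi(int i_max, int j_max, int samples) {
    std::uint64_t hits = 0;
    // collapse(2) merges both loops into one iteration space that the
    // runtime splits across its threads; reduction(+:hits) gives each
    // thread a private counter and sums them at the end.
    #pragma omp parallel for collapse(2) reduction(+:hits)
    for (int i = 0; i < i_max; ++i)
        for (int j = 0; j < j_max; ++j)
            hits += run_task(i, j, j_max, samples);

    const double total = static_cast<double>(i_max) * j_max * samples;
    return 4.0 * static_cast<double>(hits) / total;
}
```

Here the runtime picks the number of threads, so there is no need to decide between i_max threads and some smaller k. With GCC or Clang, OpenMP is enabled by -fopenmp; without that flag the pragma is ignored and the loops run serially with the same result. The collapse clause needs OpenMP 3.0 or later; with an older implementation, drop it and parallelize only the outer loop.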
By the way: if you are on Visual C++, Visual C++ 10 (Visual Studio 2010) includes an interesting new Concurrency Runtime for precisely this type of problem. It is somewhat analogous to the Task Parallel Library that was added to .NET Framework 4 to ease the implementation of multicore/multi-CPU code.