I am running a piece of software that is very parallel. There are about 400 commands I need to run that don’t depend on each other at all, so I just fork them off and hope that having more CPUs means more processes executed per unit time.
Code:
foreach cmd ($CMD_LIST)
    $cmd &    # fork it off
end
Very simple. Here are my testing results:
On 1 CPU, this took 1006 seconds, or 16 minutes 46 seconds.
With 10 CPUs, it took 600 seconds, or 10 minutes!
Why wouldn’t the time taken divide (roughly) by 10? I feel cheated here =(
Edit: of course I’m willing to provide any additional details you want to know; I’m just not sure what’s relevant because, in simplest terms, this is what I’m doing.
You are assuming your processes are 100% CPU-bound.
If your processes do any disk or network I/O, the bottleneck will be those operations. They share a fixed amount of bandwidth, so adding CPUs does not speed them up. For example, if one process downloads a file at 100k/s, two processes will get about 50k/s each, so you would see no improvement at all. You could even see performance degrade because of the extra overhead.
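If you want to check whether one of your commands really is CPU-bound, here is a rough sketch that compares the CPU time a child process used against the wall-clock time it took. It assumes Python 3 on a Unix-like system (the child CPU counters from os.times() are not filled in on Windows), and cpu_fraction is a made-up helper name, not part of any library. The demo command is just a sleep, standing in for a job that spends its time waiting.

```python
import os
import subprocess
import sys
import time


def cpu_fraction(argv):
    """Run argv and return (child CPU seconds) / (wall-clock seconds).

    Hypothetical helper for illustration. A value near 1.0 suggests a
    CPU-bound single-threaded command; a value near 0 suggests it mostly
    waits on disk, network, or something else.
    """
    before = os.times()
    start = time.monotonic()
    subprocess.run(argv, check=False)  # blocks until the child exits
    wall = time.monotonic() - start
    after = os.times()
    cpu = (after.children_user - before.children_user) + (
        after.children_system - before.children_system
    )
    return cpu / wall if wall > 0 else 0.0


# Demo: a child that only sleeps, so it uses almost no CPU.
sleeper = [sys.executable, "-c", "import time; time.sleep(1.0)"]
print(f"CPU fraction of a sleeping child: {cpu_fraction(sleeper):.2f}")
```

Replace the sleeper list with one of your real commands (as a list of arguments). If the fraction comes out well below 1, extra CPUs alone are unlikely to help that command much.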
See Amdahl’s law: it lets you estimate the performance improvement from parallelising a task when you know the proportion between its parallelisable and non-parallelisable parts.
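To put numbers on that, here is a small sketch that applies Amdahl's law, speedup = 1 / ((1 - p) + p / n), to the timings from the question. It treats everything that did not scale (I/O contention, overhead) as the "serial" fraction, which is a simplifying assumption rather than a measurement. The function names are mine, chosen for illustration.

```python
def amdahl_speedup(p, n):
    """Speedup predicted by Amdahl's law on n CPUs when a fraction p
    of the work can run in parallel."""
    return 1.0 / ((1.0 - p) + p / n)


def parallel_fraction(t1, tn, n):
    """Invert Amdahl's law: estimate p from the time on 1 CPU (t1)
    and the time on n CPUs (tn)."""
    speedup = t1 / tn
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / n)


# Timings reported in the question.
t1, t10, cpus = 1006.0, 600.0, 10

p = parallel_fraction(t1, t10, cpus)
print(f"observed speedup:      {t1 / t10:.2f}x")
print(f"implied parallel part: {p:.2f}")
print(f"limit with many CPUs:  {1.0 / (1.0 - p):.2f}x")
```

Under that assumption, the two timings imply that only about 45% of the run time scaled with CPU count, and that no number of CPUs would get you much past a 1.8x speedup until the shared bottleneck is removed.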