I am doing web crawling in Java on a server with 32 virtual processors. How can I make full use of these processors? I've seen some suggestions about multi-threaded programming, but I wonder how that would ensure all the processors are used, since we can do multi-threaded programming on a single-processor machine as well.
There is no simple answer to this … except the way to ensure all processors are used is to use multi-threading the right way. (Note: that is a circular answer!)
Basically, the way to get effective use of multiple processors is to:
This is difficult enough when you are doing simple computation. For a web crawler, you've got the additional problems that the threads will be competing for network and (possibly) remote server bandwidth, and they will typically be attempting to put their results into a shared data structure or database.
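To make the shared-data-structure point concrete, here is a minimal sketch of several worker threads writing into one concurrent set. The class name `SharedResultsSketch`, the `collect` method and the lower-casing "normalization" step are all hypothetical stand-ins; a real crawler would fetch and parse pages at that point. Each task gathers its results locally and publishes them in one go, which is one way (not the only one) to cut down on contention.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: worker threads publishing results into a shared set.
class SharedResultsSketch {

    // Processes each batch on a fixed pool and returns the set of distinct results.
    static Set<String> collect(List<List<String>> batches, int threads) {
        // Thread-safe set backed by ConcurrentHashMap; no single global lock on add.
        Set<String> seen = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (List<String> batch : batches) {
            pool.submit(() -> {
                // Work on a thread-local list first (assumption: the real work,
                // e.g. fetching a page, would happen here) ...
                List<String> local = new ArrayList<>();
                for (String url : batch) {
                    local.add(url.toLowerCase(Locale.ROOT));
                }
                // ... then touch the shared structure once per batch.
                seen.addAll(local);
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        Set<String> result = collect(
                List.of(List.of("A", "b"), List.of("a", "C")), 4);
        System.out.println(result.size() + " distinct entries");
    }
}
```

If the results go to a database instead, the same idea applies: batch the writes per task rather than having every thread hit the shared resource for each item.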
That’s about all that can be said at this level of generality …
And as @veer correctly points out, you can't "ensure" it.
Actually, if you go overboard, a load of threads can reduce throughput because of contention. Just throwing lots of threads at the problem is rarely a good idea.
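One common way to avoid going overboard is to run the work on a bounded pool whose size is derived from the processor count, rather than creating a thread per task. The following is only a sketch: the class `BoundedPoolSketch`, the helper names and the multiplier of 2 are assumptions for illustration, and the right pool size for a crawler has to be found by measuring, since the threads spend much of their time waiting on the network.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: a bounded thread pool sized from the processor count.
class BoundedPoolSketch {

    // Pool size as a multiple of the processor count, never below 1.
    // The multiplier is a tuning knob (assumption), not a recommended value.
    static int poolSize(int processors, int multiplier) {
        return Math.max(1, processors * multiplier);
    }

    // Runs all tasks on a fixed-size pool and returns results in task order.
    static List<Integer> runAll(List<Callable<Integer>> tasks, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Integer> results = new ArrayList<>();
            // invokeAll blocks until every task is done and keeps task order.
            for (Future<Integer> f : pool.invokeAll(tasks)) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        int threads = poolSize(cores, 2);
        List<Callable<Integer>> tasks = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            final int n = i;
            tasks.add(() -> n * n); // stand-in for "crawl one page"
        }
        System.out.println(runAll(tasks, threads).size() + " tasks done on "
                + threads + " threads");
    }
}
```

With this shape you can change the multiplier and compare throughput, instead of guessing how many threads the machine can take.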