I am using the code below on a dictionary of about 100,000 keys and values. I want to make it faster with multiprocessing or multithreading, since each loop iteration is independent of the others. Can anyone tell me how to apply this, and which of the two (multiprocessing or multithreading) is better suited to this kind of task?
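    from urlparse import urlparse

    def ProcessAllURLs(URLs):
        # Iterating over the dictionary yields its keys (the URLs).
        for eachurl in URLs:
            x = urlparse(eachurl)
            # netloc is an attribute of the parsed result, not of the URL string.
            print x.netloc

    ProcessAllURLs(URLs)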
Thanks
I would recommend Python’s multiprocessing library. In particular, study the section labeled “Using a pool of workers”. It should be pretty quick to rework the above code so that it uses all available cores of your system.
One tip, though: don't print URLs from the pool workers. It is better to pass the results back to the main process and print them from there. Printing from several processes at once produces jumbled, interleaved console output.
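To make that concrete, here is a minimal sketch of the pool-of-workers approach, not a definitive implementation. The names extract_netloc and process_all_urls, the workers parameter, and the chunksize default of 1000 are my own assumptions for illustration; the import fallback is there only so the same code runs on both Python 2 (as in the question) and Python 3.

```python
from multiprocessing import Pool

try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2, as in the question


def extract_netloc(url):
    # Runs inside a worker process: return the host instead of printing it.
    return urlparse(url).netloc


def process_all_urls(urls, workers=None, chunksize=1000):
    # workers=None lets Pool default to the number of available cores.
    # chunksize=1000 is an arbitrary illustrative value, not a tuned one.
    pool = Pool(processes=workers)
    try:
        # map() returns results in input order; chunksize sends URLs to the
        # workers in batches rather than one at a time.
        return pool.map(extract_netloc, list(urls), chunksize=chunksize)
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    # Hypothetical sample data standing in for the 100,000-entry dictionary.
    sample = {"http://example.com/a": 1, "https://docs.python.org/3/": 2}
    # All printing happens here, in the main process.
    for host in process_all_urls(sample):
        print(host)
```

The worker function has to live at module level so it can be pickled and sent to the workers, and the `if __name__ == "__main__":` guard keeps child processes from re-running the pool setup on platforms that start workers by importing the script. Since urlparse does very little work per URL, it is worth timing this against the plain loop on your real data; batching with chunksize helps keep the inter-process overhead down.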