I use IPython's parallel-processing facility for a big map operation. While waiting for the map operation to finish, I'd like to show the user how many of the jobs have finished, how many are running, and how many remain. How can I find that information?
Here is what I do. I create a profile that uses a local engine and start two workers. In the shell:
$ ipython profile create --parallel --profile=local
$ ipcluster start --n=2 --profile=local
Here is the client Python script:
#!/usr/bin/env python

def meat(i):
    import numpy as np
    import time
    import sys
    seconds = np.random.randint(2, 15)
    time.sleep(seconds)
    return seconds

import time
from IPython.parallel import Client

c = Client(profile='local')
dview = c[:]
ar = dview.map_async(meat, range(4))
elapsed = 0
while True:
    print 'After %d s: %d running' % (elapsed, len(c.outstanding))
    if ar.ready():
        break
    time.sleep(1)
    elapsed += 1
print ar.get()
Example output from the script:
After 0 s: 2 running
After 1 s: 2 running
After 2 s: 2 running
After 3 s: 2 running
After 4 s: 2 running
After 5 s: 2 running
After 6 s: 2 running
After 7 s: 2 running
After 8 s: 2 running
After 9 s: 2 running
After 10 s: 2 running
After 11 s: 2 running
After 12 s: 2 running
After 13 s: 2 running
After 14 s: 1 running
After 15 s: 1 running
After 16 s: 1 running
After 17 s: 1 running
After 18 s: 1 running
After 19 s: 1 running
After 20 s: 1 running
After 21 s: 1 running
After 22 s: 1 running
After 23 s: 1 running
[9, 14, 10, 3]
As you can see, I can get the number of currently running jobs, but not the number of jobs that have completed (or remain). How can I tell how many of map_async's jobs have finished?
The AsyncResult has a msg_ids attribute. The outstanding jobs are the intersection of that with rc.outstanding (rc being the Client, named c in the script above), and the completed jobs are the difference:
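A minimal sketch of that set arithmetic follows. The helper name map_progress is hypothetical, and the message IDs are stand-in strings so the snippet runs without a cluster; with a live client you would pass ar.msg_ids and rc.outstanding instead.

```python
def map_progress(msg_ids, outstanding):
    """Split one map's message IDs into (pending, done) sets.

    msg_ids     -- the IDs belonging to a single AsyncResult (ar.msg_ids)
    outstanding -- IDs the client is still waiting on (rc.outstanding)
    """
    ids = set(msg_ids)
    pending = ids.intersection(outstanding)  # submitted but not yet finished
    done = ids.difference(outstanding)       # no longer outstanding
    return pending, done


# Stand-in data (assumption): four messages from our map, plus an
# unrelated outstanding message 'x' that belongs to some other request.
msg_ids = ['a', 'b', 'c', 'd']
outstanding = {'c', 'd', 'x'}

pending, done = map_progress(msg_ids, outstanding)
print('%d done, %d pending' % (len(done), len(pending)))
```

Inside the polling loop of the script above, this would be called as map_progress(ar.msg_ids, c.outstanding). Note that the counts are per message, not per input element. As far as I know, a direct view such as c[:] sends one chunk to each engine, which would explain why the example output never shows more than 2 running for 4 inputs; a load-balanced view should give finer-grained counts.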