Basically, I have a large object that I want to perform some function on,

Asked: June 19, 2026

Basically, I have a large object that I want to perform some function on, that lends itself well to parallel processing. In this example, I have a large matrix and I want to compute all pairwise inner products between column vectors.

Please take a look at the following IPython Notebook.

I realise that the @interactive decorator is not necessary in this context. I also tried removing the @require decorator, but the impact was negligible.

My question is: Is there any way available to improve the performance of the parallel machinery?

I don't know the implementation details of the map methods. Could I avoid overhead by pushing the function that is executed in parallel to the engines in the view? I can't imagine that it is sent with every argument, though.

Chunking the argument list myself and writing a function for remote use that works on that seems silly as well.

I tried the notebook on a four-core machine; the results shown in the notebook are from a two-core machine.

1 Answer

Editorial Team · Answer 1 · 2026-06-19T02:38:22+00:00

The main performance issue here is that the Fortran-contiguous layout you set up locally does not survive the network transfer: after the push, mat on the engines is C-contiguous, not F-contiguous.

You can see this with:

print(mat.flags)
%px print(mat.flags)
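
To illustrate how a memory layout can get lost in transit, here is a minimal local sketch using only NumPy. It assumes the array is flattened to raw bytes and rebuilt from its shape and dtype on the receiving side; that is a stand-in for illustration, not a description of IPython's actual serialization code. The names round_trip, original, and restored are hypothetical.

```python
import numpy as np


def round_trip(arr):
    """Flatten an array to raw bytes and rebuild it from shape and dtype.

    Hypothetical stand-in for a network transfer; not IPython's real code path.
    """
    payload = arr.tobytes()  # tobytes() defaults to C (row-major) order
    return np.frombuffer(payload, dtype=arr.dtype).reshape(arr.shape)


# Start from a Fortran-ordered (column-major) array.
original = np.asfortranarray(np.arange(12, dtype=np.float64).reshape(3, 4))
restored = round_trip(original)

# The values are identical, but the layout flags differ after the round trip.
print(original.flags["F_CONTIGUOUS"], original.flags["C_CONTIGUOUS"])
print(restored.flags["F_CONTIGUOUS"], restored.flags["C_CONTIGUOUS"])
```

The values compare equal before and after, so nothing looks wrong unless you check the flags explicitly.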

Adding:

%px mat = numpy.asfortranarray(mat)

should get your performance back (as illustrated in my tweaked version of your notebook).
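
As for why the layout matters for this workload: pairwise inner products read whole columns, and only in Fortran order does each column occupy one contiguous run of memory. A small local sketch, assuming NumPy only (the helper pairwise_column_products and the array sizes are made up for illustration, not taken from the notebook):

```python
import itertools

import numpy as np


def pairwise_column_products(mat):
    """Return {(i, j): <col_i, col_j>} for every pair of columns with i < j."""
    n_cols = mat.shape[1]
    return {
        (i, j): float(np.dot(mat[:, i], mat[:, j]))
        for i, j in itertools.combinations(range(n_cols), 2)
    }


c_mat = np.arange(12, dtype=np.float64).reshape(4, 3)  # C order by default
f_mat = np.asfortranarray(c_mat)  # same values, column-major layout

# A column slice is a contiguous 1-D view only in the Fortran-ordered array.
print(c_mat[:, 0].flags["C_CONTIGUOUS"], f_mat[:, 0].flags["C_CONTIGUOUS"])

products = pairwise_column_products(f_mat)
print(products)
```

Both layouts give the same products; the difference is only in how far apart the elements of each column sit in memory, which is what shows up in the timings on large matrices.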

To diagnose this, I did my best to isolate where the bottlenecks were. AsyncResult.serial_time and AsyncResult.wall_time were useful here. When serial_time is long, the task is actually taking a long time on the engines, rather than spending lots of time in the IPython pipes. That led me to think that the task itself was slow on the engines, so I ran the task remotely on a single engine, and it was still slow (nothing parallel involved). Here's a notebook tracking down the issue.
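
A rough sketch of how those two numbers can be compared. The helper summarize_timing is hypothetical (it is not part of IPython), and the stand-in object with made-up timings exists only so the example runs without a cluster; it assumes serial_time is the summed compute time across tasks and wall_time is the elapsed time for the whole call, both in seconds.

```python
from types import SimpleNamespace


def summarize_timing(ar, n_engines):
    """Compare compute time on the engines with elapsed wall-clock time.

    `ar` is anything exposing `serial_time` and `wall_time` in seconds,
    e.g. a finished AsyncResult. Hypothetical helper, not part of IPython.
    """
    ideal_wall = ar.serial_time / n_engines  # perfect split, zero overhead
    return {
        "serial_time": ar.serial_time,
        "wall_time": ar.wall_time,
        "ideal_wall": ideal_wall,
        "overhead": ar.wall_time - ideal_wall,
    }


# Stand-in object with made-up numbers so the sketch runs without a cluster.
fake_result = SimpleNamespace(serial_time=8.0, wall_time=4.5)
report = summarize_timing(fake_result, n_engines=2)
print(report)
```

With a real cluster you would pass the AsyncResult returned by a map call once it has finished. A small overhead next to a large serial_time points at the task itself rather than the transport.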

Side note:

The @interactive decorator is only necessary for functions that are not interactively defined (i.e. module functions, not functions defined in the notebook), so it’s redundant in your notebook.
