I am using a PBS-based cluster and running IPython parallel over a set of

Question


Asked: June 17, 2026


I am using a PBS-based cluster and running IPython parallel over a set of nodes, each with either 24 or 32 cores and memory ranging from 24G to 72G; this heterogeneity is due to our cluster having history to it. In addition, I have jobs that I am sending to the IPython cluster that have varying resource requirements (cores and memory). I am looking for a way to submit jobs to the ipython cluster that know about their resource requirements and those of the available engines. I imagine there is a way to deal with this situation gracefully using IPython functionality, but I have not found it. Any suggestions as to how to proceed?


1 Answer

Editorial Team · Answer 1 · 2026-06-17T22:27:34+00:00

In addition to graph dependencies, which you indicate that you already get, IPython tasks can have functional dependencies. These can be arbitrary functions, like tasks themselves. A functional dependency runs before the real task, and if it returns False or raises a special parallel.UnmetDependency exception, the task will not be run on that engine, and will be retried somewhere else.

So to use this, you need a function that checks whatever metric you need. For instance, let’s say we only want to run a task on your nodes with a minimum amount of memory. Here is a function that checks the total memory on the system (in bytes):

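def minimum_mem(limit):
    """Return True if this machine has at least `limit` bytes of total memory."""
    # Imports live inside the function so it is self-contained on the engine.
    import sys
    mem = 0  # stays 0 if the total cannot be determined
    if sys.platform == 'darwin': # or BSD in general?
        from subprocess import check_output
        mem = int(check_output(['sysctl', '-n', 'hw.memsize']))
    else: # linux
        with open("/proc/meminfo") as f:
            for line in f:
                if line.startswith("MemTotal"):
                    # /proc/meminfo reports kB; convert to bytes
                    mem = 1024 * int(line.split()[1])
                    break
    return mem >= limit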

kB = 1024
MB = 1024 * kB
GB = 1024 * MB

So minimum_mem(4 * GB) returns True if and only if the system has at least 4 GB of total memory. If you want to check available memory instead of total memory, you can use the MemFree and Inactive values in /proc/meminfo to determine what is not already in use.
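A minimal sketch of that available-memory variant, assuming a Linux engine. The name minimum_free_mem and the meminfo_path parameter are hypothetical (the parameter exists only so the function can be tried against a sample file). It prefers the MemAvailable field where the kernel reports one, and otherwise falls back to MemFree plus Inactive:

```python
def minimum_free_mem(limit, meminfo_path="/proc/meminfo"):
    """Return True if at least `limit` bytes of memory appear to be unused."""
    # Collect the fields of interest; /proc/meminfo reports values in kB.
    fields = {}
    with open(meminfo_path) as f:
        for line in f:
            name, _, rest = line.partition(":")
            if name in ("MemAvailable", "MemFree", "Inactive"):
                fields[name] = 1024 * int(rest.split()[0])
    if "MemAvailable" in fields:
        # The kernel's own estimate of memory available to new work.
        free = fields["MemAvailable"]
    else:
        # Fallback: free pages plus inactive pages.
        free = fields.get("MemFree", 0) + fields.get("Inactive", 0)
    return free >= limit
```

Like minimum_mem, this only sees the state of the node at the moment the check runs, so it is a hint rather than a reservation.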

Now you can submit tasks only to engines with sufficient RAM by applying the @parallel.depend decorator:

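import ipyparallel as parallel  # named IPython.parallel before IPython 4

rc = parallel.Client()
view = rc.load_balanced_view()  # dependencies are handled by the load-balanced scheduler

@parallel.depend(minimum_mem, 8 * GB)
def big_mem_task(n):
    import os, socket
    return "big", socket.gethostname(), os.getpid(), n

amr = view.map(big_mem_task, range(10))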

Similarly, you can apply restrictions based on the number of CPUs (multiprocessing.cpu_count is a useful function there).
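A CPU check in the same style might look like the sketch below. The name minimum_cpus is hypothetical; it reports the core count of the whole node, not the cores currently idle:

```python
def minimum_cpus(n):
    """Return True if this machine reports at least `n` CPUs."""
    # Import inside the function so it also works when shipped to an engine.
    import multiprocessing
    return multiprocessing.cpu_count() >= n
```

It can be attached to a task the same way as the memory check, for example @parallel.depend(minimum_cpus, 32) to target only the 32-core nodes.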

Here is a notebook that uses these to restrict assignment of some dumb tasks.

Typically, the model is to run one IPython engine per core (not per node), but if you have specific multicore tasks, then you may want to use a smaller number (e.g. N/2 or N/4). If your tasks are really big, then you may actually want to restrict it to one engine per node. If you are running more engines per node, then you will want to be a bit careful about running high-resource tasks together. As I have written them, these checks do not take into account other tasks on the same node, so if a node has 16 GB of RAM and you have two tasks that each need 10 GB, you will need to be more careful about how you track available resources.
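One rough way to choose the engine count is to work out how many tasks of a given size fit on a node at once. The helper below is a hypothetical illustration, not part of IPython, and assumes every task on the node has the same core and memory footprint:

```python
def engines_per_node(node_cores, node_mem_gb, task_cores, task_mem_gb):
    """How many tasks of a given size fit on one node at the same time."""
    by_cores = node_cores // task_cores          # limit imposed by cores
    by_mem = int(node_mem_gb // task_mem_gb)     # limit imposed by memory
    return max(0, min(by_cores, by_mem))

# A 16 GB node running single-core tasks that each need 10 GB:
# memory, not cores, is the limit, so only one engine should run there.
print(engines_per_node(24, 16, 1, 10))
```

Starting that many engines on each node sidesteps the co-scheduling problem for a uniform workload; mixed workloads still need the per-task checks described above.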
