I’m writing some software which will manage a few hundred small systems in the

Asked: May 17, 2026 (2026-05-17T15:21:53+00:00)

I’m writing some software which will manage a few hundred small systems in “the field” over an intermittent 3G (or similar) connection.

Home base will need to send jobs to the systems in the field (eg, “report on your status”, “update your software”, etc), and the systems in the field will need to send jobs back to the server (eg, “a failure has been detected”, “here is some data”, etc).

I’ve spent some time looking at Celery and it seems to be a perfect fit: celeryd running at home base could collect jobs for the systems in the field, a celeryd running on the field systems could collect jobs for the server, and these jobs could be exchanged as clients become available.

So, is Celery a good fit for this problem? Specifically:

The majority of tasks will be directed to an individual worker (eg, “send the ‘get_status’ job to ‘system51’”) — will this be a problem?
Does it gracefully handle adverse network conditions (like, eg, connections dying)?
What functionality is only available if RabbitMQ is being used as a backend? (I’d rather not run RabbitMQ on the field systems)
Is there any other reason Celery could make my life difficult if I use it like I’ve described?

Thanks!

(it would be valid to suggest that Celery is overkill, but there are other reasons that it would make my life easier, so I would like to consider it)

1 Answer

Editorial Team · Answer 1 · 2026-05-17T15:21:53+00:00

The majority of tasks will be directed
to an individual worker (eg, “send the
‘get_status’ job to ‘system51’”) —
will this be a problem?

Not at all. Just create a queue for each worker. For example, each node can consume from a shared round-robin queue called default, plus its own queue named after the node:

(a)$ celeryd -n a.example.com -Q default,a.example.com
(b)$ celeryd -n b.example.com -Q default,b.example.com
(c)$ celeryd -n c.example.com -Q default,c.example.com

Routing a task directly to a node is simple:

>>> get_status.apply_async(args, kwargs, queue="a.example.com")

or by configuration using a Router:

# Always route "app.get_status" to "a.example.com"
CELERY_ROUTES = {"app.get_status": {"queue": "a.example.com"}}
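With a few hundred field systems, a static entry per task may not be enough, because the target node changes from call to call. A minimal sketch of a router class that picks the per-node queue at send time is shown here. It assumes the class-based router interface (`route_for_task`) used by the Celery versions that read `CELERY_ROUTES`; the `NodeRouter` name and the `node` keyword argument are hypothetical conventions for this example, not part of Celery, and the task itself would have to accept that `node` argument.

```python
class NodeRouter(object):
    """Send a task to the queue named after its target node.

    Hypothetical convention: the caller passes the target as a `node`
    keyword argument, e.g. get_status.apply_async(kwargs={"node": "system51"}).
    """

    def route_for_task(self, task, args=None, kwargs=None):
        node = (kwargs or {}).get("node")
        if node is None:
            # No target given: return None so the next router
            # (or the default queue) handles the task.
            return None
        # Per-node queues are named after the node, as in the
        # celeryd -Q examples above.
        return {"queue": node}


# Routers are consulted in order; this one only claims tasks
# that carry a `node` keyword argument.
CELERY_ROUTES = (NodeRouter(),)
```

Passing queue="..." explicitly to apply_async, as shown earlier, remains the simplest option when the caller already knows the queue name.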

Does it gracefully handle adverse
network conditions (like, eg,
connections dying)?

The worker gracefully recovers from broker connection failures
(at least with RabbitMQ; I’m not sure about all the other backends, but this
is easy to test and fix: you only need to add the related exceptions to a list).

For the client you can always retry sending the task if the connection is down,
or you can set up HA with RabbitMQ: http://www.rabbitmq.com/pacemaker.html
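The client-side retry could look roughly like the sketch below. It is not a Celery API: `send_with_retry` and its parameters are hypothetical names for this example, and it assumes a dead connection surfaces as an `OSError` (socket errors do on Python 3). Adjust the `errors` tuple to whatever your broker transport actually raises.

```python
import time


def send_with_retry(send, max_attempts=5, base_delay=1.0,
                    errors=(OSError,), sleep=time.sleep):
    """Call send() until it succeeds or max_attempts is reached.

    Waits base_delay, 2 * base_delay, 4 * base_delay, ... seconds
    between attempts and re-raises the last error when giving up.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return send()
        except errors:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))


# Usage (assumes the get_status task from the routing example):
# send_with_retry(lambda: get_status.apply_async(queue="a.example.com"))
```

On an intermittent 3G link the caller may be offline for longer than any sensible retry loop, so jobs that must not be lost would still need to be stored locally until a send succeeds.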

What functionality is only available
if RabbitMQ is being used as a
backend? (I’d rather not run RabbitMQ
on the field systems)

Remote control commands require RabbitMQ, and with the other backends only “direct” exchanges are supported (not “topic” or “fanout”). But this will be supported in Kombu (http://github.com/ask/kombu).

I would seriously reconsider using RabbitMQ. Why do you think it’s not a good fit?
IMHO I wouldn’t look elsewhere for a system like this (except maybe ZeroMQ if the system
is transient and you don’t require message persistence).

Is there any other reason Celery could make my life
difficult if I use it like I’ve described?

I can’t think of anything from what you describe above. Since the concurrency model
is multiprocessing it does require some memory (I’m working on adding support for
thread pools and eventlet pools, which may help in some cases).

(it would be valid to suggest that Celery is overkill, but there are
other reasons that it would make my life easier, so I would like to
consider it)

In that case I think you use the word overkill lightly. It really depends
on how much code and tests you need to write without it. I think
it’s better to improve an already existing general solution, and in theory it sounds
like it should work well for your application.
