My Django web app's logic is heavily geared towards background task execution (both periodic and standalone, synchronous as well as asynchronous). All my research seems to point to Celery as the most recommended approach. I plan to eventually deploy on Heroku, and the fact that it has support for Celery + Redis (what I’m using for local development) is a big plus for me.
However, I need more extensive scheduling capabilities than Celery provides. I need some of my periodic tasks to be able to run on schedules like ‘run on the last Sunday of the month’. So I’ve implemented my own models in Django to store a recurrence rule and other needed parameters.
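To make the kind of rule concrete, here is a minimal sketch of computing ‘last Sunday of the month’ with only the standard library. The function name `last_weekday_of_month` is made up for illustration and is not part of Celery or Django.

```python
import calendar
from datetime import date


def last_weekday_of_month(year: int, month: int, weekday: int = calendar.SUNDAY) -> date:
    """Return the date of the last given weekday (default: Sunday) in a month."""
    # monthrange returns (weekday of the 1st, number of days in the month).
    last_day = calendar.monthrange(year, month)[1]
    last = date(year, month, last_day)
    # Step back 0-6 days from the end of the month to reach the requested weekday.
    offset = (last.weekday() - weekday) % 7
    return last.replace(day=last_day - offset)
```

For example, `last_weekday_of_month(2024, 1)` gives 28 January 2024. A stored recurrence rule could call something like this to work out the next due date.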
Now I’m stumped on how to interface my tables with Celery. Ideally, what I’d like to do is have my own Job model which holds the schedule, the task that should be run when the job becomes due, and the parameters for that task. Sort of like a function pointer in C++. Then I would run a daemon which keeps checking the job queue for jobs that have become due; if a job is periodic, the daemon creates the next job instance and pushes it into the queue, then runs the associated task with its parameters using Celery’s delay method or similar.
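Roughly, the loop I have in mind looks like the sketch below. It is plain Python rather than Django models, and every name in it (`Job`, `dispatch_due_jobs`, `send`) is hypothetical. `send` stands in for whatever actually hands the task to Celery, for instance a wrapper around `app.send_task(name, args=..., kwargs=...)`.

```python
from dataclasses import dataclass, field, replace
from datetime import datetime, timedelta
from typing import Callable, List, Optional


@dataclass
class Job:
    task_name: str                         # name of the task to run when due
    due_at: datetime                       # when this job instance becomes due
    args: tuple = ()                       # positional parameters for the task
    kwargs: dict = field(default_factory=dict)
    interval: Optional[timedelta] = None   # None means a one-off job


def dispatch_due_jobs(jobs: List[Job], now: datetime,
                      send: Callable[[str, tuple, dict], None]) -> List[Job]:
    """Send every due job through `send` and return the jobs still pending."""
    pending = []
    for job in jobs:
        if job.due_at > now:
            pending.append(job)            # not due yet, keep it in the queue
            continue
        if job.interval is not None:
            # Periodic job: queue the next instance before running this one.
            pending.append(replace(job, due_at=job.due_at + job.interval))
        send(job.task_name, job.args, job.kwargs)
    return pending
```

The daemon would call `dispatch_due_jobs` in a loop with the current time, sleeping between passes. A fixed `interval` is used here only to keep the sketch short; a real recurrence rule would compute the next `due_at` instead.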
Questions:
- Does this approach even make sense?
- If not, what alternative approach(es) can I use?
- If yes how do I go about designing that Job/Event queue…
I’d love to hear a better approach to doing this or if there’s an existing implementation of a job queue that might be suitable or a way to use celery’s job queue itself…
Thanks heaps..
The periodic tasks in Celery work pretty much like this. There’s a dedicated scheduler process (`celery beat`) which simply sends off tasks when they are due.
You can also create new schedulers to use with `beat` by subclassing the `celery.beat.Scheduler` class, and you can create custom schedules too (like the `crontab` schedule that is already built in) by subclassing `celery.schedules.schedule`.
There’s a database-backed scheduler implementation in the django-celery extension (`djcelery.schedulers.DatabaseScheduler`), which uses many tricks to avoid polling the database too frequently and so on (sadly it’s not well commented).
Scheduler: https://github.com/celery/celery/tree/master/celery/beat.py
schedules: https://github.com/celery/celery/tree/master/celery/schedules.py
DatabaseScheduler: https://github.com/celery/django-celery/tree/master/djcelery/schedulers.py
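To give a feel for what a custom schedule has to provide, here is a rough, Celery-free sketch of the contract. A Celery schedule exposes an `is_due(last_run_at)` method that reports whether the task should run now and how many seconds to wait before checking again. The class below only mirrors that idea in plain Python: `RuleSchedule` and `next_run_after` are made-up names, and `now` is passed in explicitly (Celery's own schedules obtain the current time themselves) so the example can be run on its own.

```python
from datetime import datetime, timedelta
from typing import Callable, Tuple


class RuleSchedule:
    """Hypothetical stand-in for a custom schedule driven by a recurrence rule."""

    def __init__(self, next_run_after: Callable[[datetime], datetime]):
        # next_run_after(dt) returns the first run time strictly after dt.
        self.next_run_after = next_run_after

    def is_due(self, last_run_at: datetime, now: datetime) -> Tuple[bool, float]:
        """Return (run now?, seconds until the scheduler should check again)."""
        next_run = self.next_run_after(last_run_at)
        if now >= next_run:
            # Due: tell the scheduler to run and when the following run is.
            following = self.next_run_after(now)
            return True, (following - now).total_seconds()
        # Not due yet: ask to be checked again at the next run time.
        return False, (next_run - now).total_seconds()
```

A real implementation would subclass `celery.schedules.schedule` and plug the recurrence rule into `is_due`; check the linked `schedules.py` for the exact return type and helper methods in the Celery version you use.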