I’m trying to write some code to import users’ data from another service via that service’s API. The way I’m going to set it up, all the request jobs will be kept in a queue that my simple importer program draws from. Handling one job at a time won’t come anywhere close to maxing out any of the computer’s resources, so I’m wondering: what is the standard way to structure a program to run multiple “jobs” at once? Should I be looking into threading, or possibly a program that pulls the jobs from the queue and launches instances of the importer program? Thanks for the help.
EDIT: What I have right now is in Python although I’m open to rewriting it in another language if need be.
Use a Producer-Consumer queue, with as many Consumer threads as you need to optimize resource usage on the host (sorry – that’s very vague advice, but the “right number” is problem-dependent).
If requests are lightweight you may well only need one Producer thread to handle them.
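As a rough illustration of that layout, here is a minimal sketch using Python's standard `queue` and `threading` modules. The names `import_user`, `run_import`, and `NUM_CONSUMERS` are made up for the example, and `import_user` is only a stand-in for the real API call; the thread count of 4 is an arbitrary starting point, not a recommendation.

```python
import queue
import threading

NUM_CONSUMERS = 4   # assumption: arbitrary starting point, tune for your workload
_STOP = object()    # sentinel that tells a consumer thread to exit


def import_user(job):
    """Hypothetical stand-in for the real call to the other service's API."""
    return f"imported {job}"


def consumer(jobs, results):
    """Pull jobs off the shared queue until the stop sentinel arrives."""
    while True:
        job = jobs.get()
        if job is _STOP:
            break
        results.put((job, import_user(job)))


def run_import(job_ids, num_consumers=NUM_CONSUMERS):
    """Single producer (the calling thread) feeding several consumer threads."""
    jobs = queue.Queue()
    results = queue.Queue()

    workers = [
        threading.Thread(target=consumer, args=(jobs, results))
        for _ in range(num_consumers)
    ]
    for worker in workers:
        worker.start()

    # Producer side: enqueue every job, then one sentinel per consumer.
    for job_id in job_ids:
        jobs.put(job_id)
    for _ in workers:
        jobs.put(_STOP)

    for worker in workers:
        worker.join()

    # All consumers have finished, so draining the result queue is safe here.
    outcome = {}
    while not results.empty():
        job, result = results.get()
        outcome[job] = result
    return outcome
```

Note that this sketch has no error handling: if `import_user` raises, that consumer thread simply dies, so a real importer would want a `try`/`except` around the call.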
Launching multiple processes could work too – the best choice depends on your requirements. Do you need the Producer to know whether the operation worked, or is it ‘fire-and-forget’? Do you need retry logic in the event of failure? How do you keep count of concurrent Consumers in this model? And so on.
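One possible way to cover those three questions in Python is the standard `concurrent.futures` module: the pool size caps the number of concurrent workers, and each future reports success or failure back to the submitter. The sketch below uses threads; `ProcessPoolExecutor` offers the same `submit` interface if separate processes suit you better, provided the task and its arguments can be pickled. The names `run_with_retry`, `run_jobs`, `MAX_WORKERS`, and `MAX_ATTEMPTS` are hypothetical, and the retry policy (immediate retry, fixed attempt count) is just an assumption for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

MAX_WORKERS = 4    # assumption: upper bound on concurrent consumers
MAX_ATTEMPTS = 3   # assumption: attempts per job before giving up (must be >= 1)


def run_with_retry(task, job, max_attempts=MAX_ATTEMPTS):
    """Call task(job), retrying immediately on any exception."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return task(job)
        except Exception as exc:
            last_error = exc
    # Every attempt failed: re-raise the last error so the caller sees it.
    raise last_error


def run_jobs(task, jobs, max_workers=MAX_WORKERS):
    """Run all jobs on a bounded pool; return (succeeded, failed) dicts."""
    succeeded, failed = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its job so results can be attributed.
        futures = {pool.submit(run_with_retry, task, job): job for job in jobs}
        for future in as_completed(futures):
            job = futures[future]
            try:
                succeeded[job] = future.result()
            except Exception as exc:
                failed[job] = exc
    return succeeded, failed
```

Here `task` would be whatever function performs a single import. Jobs that still fail after the last attempt end up in `failed` together with the exception, so the caller can log them or re-queue them later.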
For Python, take a look at this.