Context: I have a photo-uploading website set up. I need to perform operations on these uploads every time a user uploads a photo.
With help from other users here on SO, I came to the conclusion that I needed a background thread that accepted these “processing jobs”, so I could return a response to the user quickly and let the background thread work on these jobs.
I’m sort of “set” on a threading solution as opposed to a service, for instance, as it’s not possible for me to set up a service on the webserver. I’ve read some things on message queues and background threads, but what I’m really in need of is some practical pointers as to how I should proceed.
Also, are there any things I should be aware of? Off the top of my head, I’m thinking about the number of threads, and possibly hitting a snag with IIS or the server if too many threads are running. That’s why I’m thinking it should be a single background thread per user, and not a thread per job, as there could be MANY photos uploaded at once. So a single thread per user that takes care of the jobs in a queue-like fashion. Am I way off base?
You can run as many threads as you like, but you’ll run the risk of spending more time on context switching than on actual crunching. If the work is CPU-bound and you need good performance, you should use no more than 1 thread per CPU core. PLINQ uses this exact strategy: if you tell PLINQ to run a query, by default it will execute in parallel on as many threads as there are CPU cores available on your system.
If you’re going to implement a queue, you should be thinking of a FIFO queue: users put their work in, and a bunch of threads or servers pull work from this queue and do the work.
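To make the shape concrete, here is a minimal sketch of that idea in Python (chosen only for brevity; the same structure carries over to other platforms). The names `run_jobs`, `process` and `worker_count` are made up for illustration, and the pool size defaults to the number of CPU cores as suggested above. It is a sketch under those assumptions, not a drop-in implementation for a web app.

```python
import os
import queue
import threading


def run_jobs(jobs, process, worker_count=None):
    """Run process(job) for every job using a fixed pool of worker threads.

    Jobs are handed out in FIFO order. Returns a list of
    (job, result, error) tuples in completion order.
    """
    # Hypothetical default: one worker per CPU core.
    worker_count = worker_count or os.cpu_count() or 1
    work = queue.Queue()              # thread-safe FIFO queue
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            job = work.get()
            if job is None:           # sentinel: no more work for this worker
                work.task_done()
                return
            try:
                outcome = (job, process(job), None)
            except Exception as exc:  # keep the worker alive if one job fails
                outcome = (job, None, exc)
            with results_lock:
                results.append(outcome)
            work.task_done()

    threads = [threading.Thread(target=worker, daemon=True)
               for _ in range(worker_count)]
    for t in threads:
        t.start()
    for job in jobs:
        work.put(job)
    for _ in threads:
        work.put(None)                # one sentinel per worker
    work.join()                       # wait until every item is handled
    for t in threads:
        t.join()
    return results
```

The point of the design is that the number of threads is fixed up front and does not grow with the number of users or uploads; a burst of uploads only makes the queue longer.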
For example, you can use a SQL Server database to synchronize work across many machines using a FIFO queue in the database (this is just a simple table that you pull work from). It will scale pretty well, and it’s also more robust because work can be resumed if a worker crashes or times out.
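A rough sketch of such a table-backed queue follows. It uses Python's built-in `sqlite3` module purely as a stand-in so the example runs anywhere; the table name `jobs`, its columns and the helper functions are all hypothetical. On SQL Server the claim step would typically be written as a single statement with locking hints instead, so treat this as an illustration of the idea rather than the recommended SQL.

```python
import sqlite3


def create_queue(conn):
    # One row per job; the auto-incrementing id gives FIFO order.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " payload TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'pending')"
    )
    conn.commit()


def enqueue(conn, payload):
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO jobs (payload) VALUES (?)", (payload,))


def claim_next(conn):
    """Claim the oldest pending job; return (id, payload) or None if empty."""
    while True:
        row = conn.execute(
            "SELECT id, payload FROM jobs"
            " WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        with conn:
            # The status guard means only one worker can win this row.
            claimed = conn.execute(
                "UPDATE jobs SET status = 'processing'"
                " WHERE id = ? AND status = 'pending'",
                (row[0],),
            )
        if claimed.rowcount == 1:
            return row
        # Another worker claimed it first; try the next row.


def complete(conn, job_id):
    with conn:
        conn.execute("UPDATE jobs SET status = 'done' WHERE id = ?", (job_id,))


def requeue_abandoned(conn):
    """Put every unfinished 'processing' job back in the queue.

    Simplification: a real system would only requeue jobs whose claim
    is older than some timeout.
    """
    with conn:
        cur = conn.execute(
            "UPDATE jobs SET status = 'pending' WHERE status = 'processing'"
        )
    return cur.rowcount
```

Because a claimed job stays in the table until it is marked done, a job whose worker died is still there and can be picked up again, which is where the robustness comes from.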
You should read this question I posted about this a while back. Remus Rusanu posted some interesting links on the topic that discuss the use of a database to orchestrate workloads.