I am looking for an appropriate pattern and the best modern way to solve the following problem:
My application expects inputs from multiple sources, for example: GUI, file-system monitoring, voice commands, web requests, etc. When an input is received, I need to send it to some ProcessInput(InputData arg) method that starts processing the data in the background, without blocking the application from receiving and processing more data, and in some way returns the results whenever the processing is complete. Depending on the input, the processing can take significantly different amounts of time. For starters, I don’t need the ability to check the progress or cancel the processing.
After reading a dozen articles on MSDN and blog posts by some rock-star programmers, I am really confused about what pattern should be used here and, more importantly, which features of .NET to use.
My findings are:
- ThreadPool.QueueUserWorkItem – easiest to understand, not very convenient for returning the results
- BackgroundWorker – seems to be used only for rather simple tasks; do all workers run on a single thread?
- Event-based Asynchronous Pattern
- Tasks in Task Parallel Library
- C# 5 async/await – these seem to be shortcuts for Tasks from the Task Parallel Library
Notes:
Performance is important, so taking advantage of multi-core system when possible would be really nice.
This is not a web application.
My problem reminds me of a TCP server (really, any sort of server), where the application is constantly listening for new connections/data on multiple sockets. I found the article Asynchronous Server Socket, and I am curious whether that pattern could be a possible solution for me.
I’ve done a whole lot of asynchronous programming in my time. I find it useful to distinguish between background operations and asynchronous events. A “background operation” is something that you initiate, and some time later it completes. An “asynchronous event” is something that’s always going on independent of your program; you can subscribe, receive the events for a time, and then unsubscribe.
So, GUI inputs and file-system monitoring would be examples of asynchronous events; whereas web requests are background operations. Background operations can also be split into CPU-bound (e.g., processing some input in a pipeline) and I/O-bound (e.g., web request).
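To make the CPU-bound versus I/O-bound split concrete, here is a minimal sketch of both kinds of background operation expressed as Tasks. The names BackgroundOps, CountPrimesAsync and FetchLengthAsync are hypothetical, and Task.Delay is only a stand-in for a real web request.

```csharp
using System;
using System.Threading.Tasks;

public static class BackgroundOps
{
    // CPU-bound: Task.Run moves the computation onto a thread-pool thread.
    public static Task<int> CountPrimesAsync(int limit)
    {
        return Task.Run(() =>
        {
            int count = 0;
            for (int n = 2; n <= limit; n++)
            {
                bool isPrime = true;
                for (int d = 2; d * d <= n; d++)
                {
                    if (n % d == 0) { isPrime = false; break; }
                }
                if (isPrime) count++;
            }
            return count;
        });
    }

    // I/O-bound: no thread is held while the operation is waiting.
    public static async Task<int> FetchLengthAsync(string input)
    {
        await Task.Delay(50); // assumption: placeholder for an actual I/O call
        return input.Length;
    }
}
```

Both methods return a Task<int>, so the caller can await either one the same way without caring which kind of work is behind it.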
I make this distinction especially in .NET because different approaches have different strengths and weaknesses. When doing your evaluations, you also need to take into consideration how errors are propagated.
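As one example of what error propagation looks like, the sketch below shows an exception thrown inside a Task surfacing at the await in the caller. ErrorPropagation, ProcessAsync and DescribeAsync are hypothetical names used only for illustration.

```csharp
using System;
using System.Threading.Tasks;

public static class ErrorPropagation
{
    // Hypothetical processing step that fails for empty input.
    public static Task<int> ProcessAsync(string input)
    {
        return Task.Run(() =>
        {
            if (string.IsNullOrEmpty(input))
                throw new ArgumentException("Input must not be empty.");
            return input.Length;
        });
    }

    // The exception thrown on the thread-pool thread is captured by the Task
    // and rethrown at the await, so an ordinary try/catch handles it.
    public static async Task<string> DescribeAsync(string input)
    {
        try
        {
            int length = await ProcessAsync(input);
            return "ok: " + length;
        }
        catch (ArgumentException ex)
        {
            return "failed: " + ex.Message;
        }
    }
}
```

With ThreadPool.QueueUserWorkItem, by contrast, the same failure would have to be caught inside the work item and handed back to the caller manually.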
First, the options you’ve already found:
- ThreadPool.QueueUserWorkItem – almost the worst option around. It can only handle background operations (no events), and doesn’t handle I/O-bound operations well. Returning results and errors are both manual.
- BackgroundWorker (BGW) – not the worst, but definitely not the best. It also only handles background operations (no events), and doesn’t handle I/O-bound operations well. Each BGW runs in its own thread, which is bad because you can’t take advantage of the work-stealing, self-balancing nature of the thread pool. Furthermore, the completion notifications are (usually) all queued to a single thread, which can cause a bottleneck in very busy systems.
- Tasks in Task Parallel Library – Task is the best option for background operations, both CPU-bound and I/O-bound. I review several background operation options on my blog, but that blog post does not address asynchronous events at all.
- async/await – These allow a more natural expression of Task-based background operations. They also offer an easy way to synchronize back to the caller’s context if you want to (useful for UI-initiated operations).
Of these options, async/await are the easiest to use, with Task a close second. The problem with those is that they were designed for background operations and not asynchronous events.
Any asynchronous event source may be consumed using asynchronous operations (e.g., Task) as long as you have a sufficient buffer for those events. When you have a buffer, you can just restart the asynchronous operation each time it completes. Some buffers are provided by the OS (e.g., sockets have read buffers, UI windows have message queues, etc.), but you may have to provide other buffers yourself.
Having said that, here are my recommendations:
- Using either await/async or Task directly, use TAP to model at least your background operations.
- […] Tasks. The disadvantage to Dataflow is that it’s still developing and (IMO) not as stable as the rest of the Async support.
All three of these options are efficient (using the thread pool for any actual processing), and they all have well-defined semantics for error handling and results. I do recommend using TAP as much as possible; those parts can then easily be integrated into Dataflow or Rx.
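The idea of putting a buffer between an event source and Task-based processing can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses System.Threading.Channels as the buffer (in-box on .NET Core 3.0 and later), which is a choice made here and not something named in the answer, and InputPump, Post, Complete and RunAsync are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Channels;
using System.Threading.Tasks;

public sealed class InputPump
{
    // Unbounded buffer: event handlers can always enqueue without waiting.
    private readonly Channel<string> _buffer = Channel.CreateUnbounded<string>();

    // Call this from any event handler (GUI, file-system watcher, and so on).
    public void Post(string input) => _buffer.Writer.TryWrite(input);

    // Signals that no more inputs will arrive, so the consumer loop can end.
    public void Complete() => _buffer.Writer.Complete();

    // Drains the buffer and starts one thread-pool operation per input.
    // Results come back in the order the inputs were posted.
    public async Task<IReadOnlyList<string>> RunAsync(Func<string, string> process)
    {
        var pending = new List<Task<string>>();
        while (await _buffer.Reader.WaitToReadAsync())
        {
            while (_buffer.Reader.TryRead(out string input))
            {
                string item = input; // local copy captured by the lambda
                pending.Add(Task.Run(() => process(item)));
            }
        }
        return await Task.WhenAll(pending);
    }
}
```

A caller would create one InputPump, call Post from each input source, and await RunAsync once; because each input is processed with Task.Run, a slow input does not hold up the ones posted after it.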
You mentioned “voice commands” as one possible input source. You may be interested in a BuildWindows video where Stephen Toub sings — and uses Dataflow to harmonize his voice in near-realtime. (Stephen Toub is one of the geniuses behind TPL, Dataflow, and Async).