I develop and maintain a .NET 3.5 tool at work, and I am wondering whether it could run faster using .NET 4’s new TPL, or even the new async features, which are still in CTP.
The tool’s work can be roughly described as:
1. Retrieve a list of container files (currently .MSI files), a few dozen of them, roughly 50-70.
2. Iterate over the files and construct a runtime object representing each one.
3. For each runtime object created, run some queries on its contents (compare them with files on the system).
Items #2 and #3 are the lengthy ones, and I would like some opinions on how much the execution time (a few minutes right now) could be improved by using Parallel.ForEach or other ways of running this work in parallel.
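For concreteness, the parallel version of items #2 and #3 might look roughly like the sketch below. It assumes .NET 4 (for `Parallel.ForEach` and `ConcurrentBag<T>`) and that the per-file work shares no mutable state. `ContainerInfo`, `Load`, and `Matches` are hypothetical placeholders for the real runtime object and queries, not part of the actual tool.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for the runtime object built from one container file.
sealed class ContainerInfo
{
    public string Path;
    public long SizeInBytes;
}

static class ParallelScan
{
    // Item #2 (placeholder): build the runtime object for one file.
    static ContainerInfo Load(string path)
    {
        return new ContainerInfo { Path = path, SizeInBytes = new FileInfo(path).Length };
    }

    // Item #3 (placeholder): query the object; the real tool would compare
    // its contents with files on the system.
    static bool Matches(ContainerInfo info)
    {
        return info.SizeInBytes > 0;
    }

    // Processes all files in parallel and returns the matching paths, sorted.
    public static List<string> FindMatches(IEnumerable<string> paths)
    {
        // Thread-safe collection, because the loop body runs on several threads.
        var matches = new ConcurrentBag<string>();
        Parallel.ForEach(paths, path =>
        {
            ContainerInfo info = Load(path);
            if (Matches(info)) matches.Add(info.Path);
        });
        return matches.OrderBy(p => p).ToList();
    }

    static void Main(string[] args)
    {
        string dir = args.Length > 0 ? args[0] : ".";
        foreach (string p in FindMatches(Directory.GetFiles(dir, "*.msi")))
            Console.WriteLine(p);
    }
}
```

Whether this helps depends on where the time goes: if the files sit on a single disk and the work is mostly reading them, running the loop in parallel may gain little.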
Potential improvements I foresee are:
- Making use of multiple CPUs/cores.
- Keeping the app busy with other work while I/O operations (like reading files) are in progress.
Before I jump into development, do you think this kind of application can benefit from these?
I would run a profiler to see where your application is spending its time and then decide. If you find it is waiting for I/O completion, you may benefit from using the Asynchronous Programming Model. If you find you are compute-bound, then, depending on your anticipated runtime environment (multi-core or single-core), you may find multi-threaded computation to be of benefit. Of course, you may find that both cases apply.
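If a profiler is not at hand, a crude first check is to time the I/O phase and the compute phase separately with `Stopwatch`. The sketch below is only an illustration and works on .NET 3.5; `PhaseTimer`, the `sample.msi` default path, and the byte-sum "computation" are hypothetical stand-ins for the real loading and query code.

```csharp
using System;
using System.Diagnostics;
using System.IO;

static class PhaseTimer
{
    // Runs an action and returns the wall-clock time it took.
    public static TimeSpan Measure(Action action)
    {
        Stopwatch sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        return sw.Elapsed;
    }

    static void Main(string[] args)
    {
        // Hypothetical input file; pass a real path as the first argument.
        string path = args.Length > 0 ? args[0] : "sample.msi";

        // I/O phase: read the whole file from disk.
        byte[] data = null;
        TimeSpan io = Measure(() => data = File.ReadAllBytes(path));

        // Compute phase: placeholder CPU work over the bytes already in memory.
        long checksum = 0;
        TimeSpan cpu = Measure(() => { foreach (byte b in data) checksum += b; });

        Console.WriteLine("I/O: {0} ms, compute: {1} ms (checksum {2})",
            io.TotalMilliseconds, cpu.TotalMilliseconds, checksum);
    }
}
```

Keep in mind that the operating system caches files, so a second run over the same file can make the I/O phase look much cheaper than it is on a cold start.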
Incidentally, you can also use many of the .NET 4 threading features in .NET 3.5 by using Reactive Extensions. I am currently using this in a production .NET 3.5 application.