Imagine a feature of an application that requires up to 5 threads crunching data. These threads use buffers, mutexes and events to interact with each other. Performance is critical, and the language is C++.
The feature can be implemented as one (compilation) unit with one class, and only one instance of this class can be instantiated for the application. The class itself implements one of the threads in its run() method, which spawns the other 4 threads, manages them, and joins them when the user closes the application.
What is the advantage of choosing one of the following methods over another (please do let me know of any better approach)?
- Add 5 static methods to the class, each running a single thread, with the mutexes and other shared data as static class variables.
- Add 5 global functions (no scope) and use global variables, events and mutexes (as if it were C).
- Change the pattern entirely: add 4 more classes, each implementing one of the threads, and share data via global variables.
Here are some thoughts and issues to be considered (please correct them if they are wrong):
- With the threads as class members (static, of course), they can rely on the singleton to access non-static member functions. This also gives them a namespace, which by itself seems a good idea.
- Using static class methods, the class header file will soon contain many static variables (and other static helper methods). Having to declare variables in the class header file may bring additional dependencies to other units that include the header file. If the variables were declared globally, they could be hidden in a separate header file.
- Static class variables must also be defined somewhere in the code, so every declaration has to be typed twice.
- Compilers can take advantage of the namespace resolution for more optimized code (as opposed to global variables possibly in different units).
- The single unit can potentially be better optimized, whereas whole program optimization is slow and probably less fruitful.
- If the unit grows, I have to move some part of the code to a separate unit, so I will have one class with multiple (compilation) units. Is this an anti-pattern or not?
- If using more than one class, each handling one thread, the same question arises again: static methods or global functions to implement the threads? In addition, this requires more lines of code. That is not a real issue, but is it worth the additional overhead?
Please answer this assuming no library such as Qt, and then assuming that we can rely on QThread and implement one thread per run() method.
Edit1: The number of threads is fixed per design, number 5 is just an example. Please share your thoughts on the approaches/patterns and not on details.
Edit2: I have found this answer (to a different question) very helpful, I guess the first approach misuses classes as namespaces. Second approach can be mitigated if coupled with namespace.
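A minimal sketch of the second approach coupled with a namespace, assuming C++11 (std::thread, std::mutex). All names here (cruncher, g_mutex, g_total, addChunk, runOnce) are hypothetical and only illustrate the idea: the shared state and the thread functions live in an anonymous namespace inside the source file, so nothing leaks into a header.

```cpp
#include <mutex>
#include <thread>

namespace cruncher {

namespace {
// Internal linkage: these names are invisible to every other compilation unit.
std::mutex g_mutex;   // hypothetical shared lock
long g_total = 0;     // hypothetical shared data

// Thread function: a plain free function, no class needed.
void addChunk(long amount) {
    std::lock_guard<std::mutex> lock(g_mutex);
    g_total += amount;
}
} // anonymous namespace

// The only symbol the rest of the application would see (declared in the header).
long runOnce() {
    {
        std::lock_guard<std::mutex> lock(g_mutex);
        g_total = 0;
    }
    std::thread first(addChunk, 1L);
    std::thread second(addChunk, 2L);
    first.join();
    second.join();
    std::lock_guard<std::mutex> lock(g_mutex);
    return g_total;
}

} // namespace cruncher
```

The header for this unit would then declare only cruncher::runOnce(), so other units gain no dependency on the mutex or the data.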
Sources
First, you should read the whole series of concurrency articles by Herb Sutter:
http://herbsutter.com/2010/09/24/effective-concurrency-know-when-to-use-an-active-object-instead-of-a-mutex/
This is the link to the last article’s post, which contains the links to all the previous articles.
What’s your case?
According to the article How Much Scalability Do You Have or Need? ( http://drdobbs.com/parallel/201202924 ), you are in the O(K): Fixed case. That is, you have a fixed set of tasks to be executed concurrently.
By the description of your app, you have 5 threads, each one doing a very different thing, so you must have your 5 threads. You can hope that one or more of them can still divide its work among multiple threads (and thus use a thread pool), but this would be a bonus.
I will let you read the article for more information.
Design questions
About the singleton
Forget the singleton. This is a dumb, overused pattern.
If you really, really want to limit the number of instances of your class (and seriously, haven’t you something better to do than that?), you should separate the design in two: one class for the data, and one class that wraps the previous class to enforce the singleton limitation.
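A minimal sketch of that separation, assuming C++11. The names DataCruncher and CruncherSingleton are hypothetical: the first class carries the data and logic and knows nothing about instance counts, while the second only enforces the "one instance" policy.

```cpp
#include <cstddef>
#include <vector>

// Plain data/logic class: no instance-count policy, so tests can create as many as they like.
class DataCruncher {
public:
    void push(int sample) { samples_.push_back(sample); }
    std::size_t size() const { return samples_.size(); }

private:
    std::vector<int> samples_;
};

// Separate wrapper whose only job is to hand out the single application-wide instance.
class CruncherSingleton {
public:
    CruncherSingleton() = delete;  // never instantiated, used only through instance()

    static DataCruncher& instance() {
        // Function-local static: constructed on first use; initialization is thread-safe since C++11.
        static DataCruncher cruncher;
        return cruncher;
    }
};
```

Application code would go through CruncherSingleton::instance(), while unit tests can still build a DataCruncher directly.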
About compilation units
Make your headers and sources easy to read. If you need to split the implementation of a class across multiple sources, then so be it. I name the sources accordingly. For example, for a class MyClass, I would have:
About compiler optimisations
Recent compilers are able to inline code from different compilation units (I saw that option in Visual C++ 2008, IIRC). I don’t know if whole-program optimization works worse than "one unit" compilation, but even if it does, you can still divide your code into multiple sources, and then have one global source include everything. For example:
and then do your includes accordingly. But you should be sure this actually makes your performance better: don’t optimize unless you really need it and you have profiled for it.
Your case, but better?
Your question and comments speak about monolithic design more than about performance or threading issues, so I could be wrong, but what you need is simple refactoring.
I would use the 3rd method (one class per thread), because with classes comes private/public access, and thus, you can use that to protect the data owned by one thread only by making it private.
The following guidelines could help you:
1 – Each thread should be hidden in one non-static object
For the thread’s entry point, you can use either a private static method of that class or a function in an anonymous namespace (I would go for the function, but here, I want to access a private function of the class, so I will settle for the static method).
Usually, thread construction functions let you pass a pointer to a function with a void * context parameter, so use that to pass your this pointer to the main thread function.
Having one class per thread helps you isolate that thread, and thus that thread’s data, from the outer world: no other thread will be able to access that data, as it is private.
Here’s some code:
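A minimal sketch of guideline 1, assuming C++11 std::thread stands in for whatever thread construction function is actually used (the void * context idiom is the same with pthreads or the Win32 API). The class name Worker and its members are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

class Worker {
public:
    Worker() = default;
    Worker(const Worker&) = delete;
    Worker& operator=(const Worker&) = delete;

    ~Worker() {
        requestStop();
        join();
    }

    void start() {
        assert(!thread_.joinable());  // start() must be called at most once
        // Pass `this` as the opaque context, as a C-style thread API would require.
        thread_ = std::thread(&Worker::threadEntry, static_cast<void*>(this));
    }

    void requestStop() { stop_.store(true); }

    void join() {
        if (thread_.joinable()) thread_.join();
    }

    // Only meaningful after join(): the counter is not synchronized while the thread runs.
    unsigned long long iterations() const { return iterations_; }

private:
    // Private static entry point: recovers the object from the context pointer.
    static void threadEntry(void* context) {
        static_cast<Worker*>(context)->run();
    }

    // The real thread body, with full access to the private data.
    void run() {
        do {
            ++iterations_;  // stand-in for the actual data crunching
            std::this_thread::yield();
        } while (!stop_.load());
    }

    std::thread thread_;
    unsigned long long iterations_ = 0;  // owned by the worker thread only: private, no lock
    std::atomic<bool> stop_{false};      // shared with the controlling thread, hence atomic
};
```

The controlling thread would call start(), later requestStop() and join(); nothing outside Worker can touch iterations_ while the thread runs.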
Disclaimer: This wasn’t tested in a compiler. Take it as pseudo C++ code more than actual code. YMMV.
2 – Identify the data that is not shared.
This data can be hidden in the private section of the owning object. If it is protected by synchronization, then this protection is overkill (as the data is NOT shared).
3 – Identify the data that is shared
… and verify its synchronization (locks, atomic access)
4 – Each class should have its own header and source
… and protect the access to its (shared) data with synchronization, if necessary
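A minimal sketch of guidelines 3 and 4, assuming C++11. SharedBuffer is a hypothetical class that owns one piece of shared data and is the only code allowed to touch it; the condition variable plays the role of the "event" mentioned in the question.

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>

class SharedBuffer {
public:
    // Called by the producing thread.
    void push(int value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push_back(value);
        }
        ready_.notify_one();  // wake one waiting consumer, outside the lock
    }

    // Called by the consuming thread; blocks until an item is available.
    int pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !items_.empty(); });
        int value = items_.front();
        items_.pop_front();
        return value;
    }

private:
    std::mutex mutex_;               // guards items_
    std::condition_variable ready_;  // signals "items_ is not empty"
    std::deque<int> items_;          // the shared data, reachable only through push/pop
};
```

Because the lock and the data are private members of the same class, every access path goes through the synchronization.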
5 – Protect the access as much as possible
If one function is used by a class, and only a class, and does not really need access to the class internals, then it could be hidden in an anonymous namespace.
If one variable is owned by only one thread, hide it in the class as a private member variable.
etc.
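A minimal sketch of guideline 5, assuming everything below sits in a single source file. The names mean and Accumulator are hypothetical.

```cpp
#include <cstddef>
#include <vector>

namespace {
// Helper used by Accumulator only, and needing no access to its internals:
// the anonymous namespace keeps it invisible to other compilation units.
double mean(const std::vector<double>& values) {
    if (values.empty()) return 0.0;
    double sum = 0.0;
    for (std::size_t i = 0; i < values.size(); ++i) sum += values[i];
    return sum / static_cast<double>(values.size());
}
} // anonymous namespace

class Accumulator {
public:
    void add(double value) { samples_.push_back(value); }
    double average() const { return mean(samples_); }

private:
    std::vector<double> samples_;  // owned by a single thread: private, no lock needed
};
```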