I’m trying to micro-optimize my code at a very low level of the application architecture. Here is my concrete scenario:
- I have a parser class which parses a graph file (nodes, edges, adjacency entries etc.)
- The file format is versioned, so there exist parsers for each version which are implemented as separate classes (ParserV1, ParserV2, …).
- The parsers provide the same functionality to some upper layer in the application. Thus, they implement the same “interface”.
- In C++, I’d implement such an interface as an abstract class with all functions being pure virtual.
- Virtual functions need an extra memory lookup, can’t be bound statically at compile time and, much more importantly, prevent inlining of small methods in the parser classes. So the classical sub-classing idiom wouldn’t give me the best performance I can achieve.
[Before describing my possible solutions, I want to explain why I’m doing micro-optimization here (you may skip this paragraph): The parser class has a lot of small methods, where “small” means that they don’t do much. Most of them only read one or two bytes, or even only one bit, from a cached bit stream. So it should be possible to implement them very efficiently, so that a call, when inlined, only needs a handful of machine instructions.
The methods are called very often in the application, since they look up node attributes in a very big graph (the world-wide road network). This might happen about one million times per user request, and such a request should be as fast as possible.]
Which is the way to go here? I can see the following methods to solve the problem:
- Write an interface with pure virtual methods and subclass it. The performance will suffer.
- Do not write such an interface. Each parser defines the same methods on its own. The upper layer (which uses the parser) has pointers (as members) to each version subclass. In the beginning, instantiate the specific parser which should be used. Use a switch block and cast the parser instance to the explicit subclass whenever accessing a function. Will the performance be better? (if/switch block vs. virtual table lookup).
- Mix the two solutions 1. + 2.: Write an interface with pure virtual methods for seldom used methods, where performance isn’t highly critical. If it is critical, don’t provide a virtual method but use the second method.
- Improving 2.: Provide non-virtual methods in the abstract class; keep a version number as a member variable in the abstract class (a kind of home-made runtime type information) and implement the if/switch blocks and casts in these methods; then call the methods in the subclass. This provides both inlining and static binding.
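The last option can be sketched roughly as follows. This is only a minimal illustration under stated assumptions: the method names (nodeFlags, nodeFlagsImpl), the Version tag, and both byte layouts are hypothetical and not taken from the real file format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical sketch: names and byte layouts are illustrative assumptions.
class Parser {
public:
    enum class Version : std::uint8_t { V1, V2 };

    // Non-virtual entry point; dispatches on the stored version tag.
    inline std::uint8_t nodeFlags(std::size_t node) const;

protected:
    Parser(Version v, std::vector<std::uint8_t> data)
        : version_(v), data_(std::move(data)) {}
    ~Parser() = default;  // non-virtual: never delete through a Parser*

    Version version_;                 // home-made runtime type information
    std::vector<std::uint8_t> data_;  // stands in for the cached bit stream
};

class ParserV1 final : public Parser {
public:
    explicit ParserV1(std::vector<std::uint8_t> data)
        : Parser(Version::V1, std::move(data)) {}

    // Assumed V1 layout: one byte per node, flags in the low nibble.
    std::uint8_t nodeFlagsImpl(std::size_t node) const {
        assert(node < data_.size());
        return static_cast<std::uint8_t>(data_[node] & 0x0F);
    }
};

class ParserV2 final : public Parser {
public:
    explicit ParserV2(std::vector<std::uint8_t> data)
        : Parser(Version::V2, std::move(data)) {}

    // Assumed V2 layout: two bytes per node, flags in the second byte.
    std::uint8_t nodeFlagsImpl(std::size_t node) const {
        assert(2 * node + 1 < data_.size());
        return data_[2 * node + 1];
    }
};

// Defined after the subclasses so the static_casts see complete types.
// Each case calls a statically bound function that the compiler may inline.
inline std::uint8_t Parser::nodeFlags(std::size_t node) const {
    switch (version_) {
        case Version::V1:
            return static_cast<const ParserV1*>(this)->nodeFlagsImpl(node);
        case Version::V2:
            return static_cast<const ParserV2*>(this)->nodeFlagsImpl(node);
    }
    return 0;  // not reached for valid version tags
}
```

The upper layer would hold a `const Parser&` and call `nodeFlags(i)`. Whether the switch actually beats a virtual call is something only a profiler can tell.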
Are there better ways to solve this problem? Is there any idiom for this?
To clarify, I have a lot of functions which are version-independent (at least until now), and thus fit perfectly in some superclass. I will use a standard sub-classing design for most functions, while this question only covers a solution for the version-dependent functions to be optimised. (Some of them aren’t called very frequently and I can of course use virtual methods in these cases.) Besides this, I don’t like the idea of making the parser class decide which methods need to be performant and which don’t. (Although it would be possible to do so.)
First, of course, you should profile your code to figure out how much the virtual calls actually hurt performance in your particular case (apart from potentially weaker optimizations).
Putting those optimization effects aside, I’m almost sure you won’t get any significant performance gain by replacing a virtual function call (or a call through a function pointer, which is almost the same) with a switch that calls compile-time-known functions in its different cases.
If you really want a significant improvement, these are the most promising variants, IMHO:
1. Try to redesign your interface to offer coarser, more complex functions. For instance, if you have a function that reads a single vertex, modify it to read (up to) N vertices at once. And so on.
2. You may make your whole parsing code (the code that uses your parser) a template class/function that uses a template parameter to instantiate the needed parser. Here you’ll need neither an interface nor virtual functions. At the very beginning (where you identify the version), put a switch, and for every recognized version call this function with the appropriate template parameter.

The latter will probably be superior from the performance point of view; OTOH, it increases the code size.
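The first suggestion, a coarser interface, might look like the sketch below. The names (IParser, Vertex, readVertices) and the one-byte-per-vertex layout are assumptions made for illustration. The point is only that one virtual call is amortised over up to N vertices, while the per-vertex decoding stays in a small non-virtual helper that can be inlined into the loop.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical vertex record; the real attributes are not known here.
struct Vertex {
    std::uint8_t flags;
};

class IParser {
public:
    virtual ~IParser() = default;

    // Reads up to `count` vertices starting at `first` into `out`
    // and returns how many were actually read.
    virtual std::size_t readVertices(std::size_t first, std::size_t count,
                                     Vertex* out) const = 0;
};

class ParserV1 final : public IParser {
public:
    explicit ParserV1(std::vector<std::uint8_t> data) : data_(std::move(data)) {}

    std::size_t readVertices(std::size_t first, std::size_t count,
                             Vertex* out) const override {
        assert(out != nullptr);
        if (first >= data_.size()) return 0;
        const std::size_t n = std::min(count, data_.size() - first);
        for (std::size_t i = 0; i < n; ++i) {
            out[i].flags = decode(first + i);  // non-virtual, inlinable
        }
        return n;
    }

private:
    // Assumed layout: one byte per vertex, flags in the low nibble.
    std::uint8_t decode(std::size_t i) const {
        return static_cast<std::uint8_t>(data_[i] & 0x0F);
    }

    std::vector<std::uint8_t> data_;
};
```

A caller would fill a local buffer with one call and then loop over it, instead of making one virtual call per vertex.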
EDIT:
Here’s an example of (2):
The classes ParserV1, ParserV2, etc. do not have virtual functions. They also don’t inherit any interface. They just implement some functions, such as GetVertexCount.
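A minimal sketch of how this could fit together, assuming made-up byte layouts: apart from ParserV1, ParserV2 and GetVertexCount, every name here (GetVertexFlags, ProcessGraph, ProcessFile) is hypothetical. The version switch runs once; everything inside the template is statically bound and can be inlined.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// No virtual functions and no common base class.
class ParserV1 {
public:
    explicit ParserV1(std::vector<std::uint8_t> data) : data_(std::move(data)) {}

    // Assumed V1 layout: one byte per vertex.
    std::size_t GetVertexCount() const { return data_.size(); }
    std::uint8_t GetVertexFlags(std::size_t i) const {
        assert(i < GetVertexCount());
        return data_[i];
    }

private:
    std::vector<std::uint8_t> data_;
};

class ParserV2 {
public:
    explicit ParserV2(std::vector<std::uint8_t> data) : data_(std::move(data)) {}

    // Assumed V2 layout: two bytes per vertex, flags in the second byte.
    std::size_t GetVertexCount() const { return data_.size() / 2; }
    std::uint8_t GetVertexFlags(std::size_t i) const {
        assert(i < GetVertexCount());
        return data_[2 * i + 1];
    }

private:
    std::vector<std::uint8_t> data_;
};

// The code that uses the parser is a template, so every call below
// is resolved at compile time for the concrete parser type.
template <class Parser>
unsigned ProcessGraph(const Parser& parser) {
    unsigned sum = 0;
    for (std::size_t i = 0; i < parser.GetVertexCount(); ++i) {
        sum += parser.GetVertexFlags(i);
    }
    return sum;
}

// The only place where the version is inspected at run time.
unsigned ProcessFile(int version, std::vector<std::uint8_t> payload) {
    switch (version) {
        case 1: return ProcessGraph(ParserV1(std::move(payload)));
        case 2: return ProcessGraph(ParserV2(std::move(payload)));
        default: throw std::invalid_argument("unsupported file version");
    }
}
```

The cost is that ProcessGraph and everything it calls is instantiated once per parser version, which is where the larger code size comes from.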