I’ve built an interpreter in C++ for a language created by me.
One main problem in the design was that I had two different types in the language: number and string. So I have to pass around a struct like:
class myInterpreterValue
{
    myInterpreterType type;
    int intValue;
    string strValue;
};
Objects of this class are passed around millions of times a second, e.g. during a countdown loop in my language.
Profiling pointed out that 85% of the run time is spent in the allocation function of the string template.
This is pretty clear to me: my interpreter has a bad design and doesn’t use pointers enough. Yet I don’t have an option: I can’t use pointers in most cases, as I just have to make copies.
What can I do about this? Is a class like this a better idea?
vector<string> strTable;
vector<int> intTable;
class myInterpreterValue
{
    myInterpreterType type;
    int locationInTable;
};
So the class only knows what type it represents and the position in the table.
This, however, again has disadvantages:
I’d have to add temporary values to the string/int vector table and then remove them again, which would eat a lot of performance again.
How do interpreters of languages like Python or Ruby do this? They somehow need a struct that represents a value in the language, something that can be either an int or a string.
I suspect many values aren’t strings. So the first thing you can do is to get rid of the string object if you don’t need it. Put it into a union. Another thing is that probably many of your strings are only small, so you can get rid of the heap allocation if you store small strings in the object itself. LLVM has the SmallString template for that. And then you can use string interning, as another answer says too. LLVM has the StringPool class for that: call intern("foo") and get a smart pointer referring to a shared string potentially used by other myInterpreterValue objects too.
The union can be written like this:
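The following is only a minimal sketch of such a type-tagged value, not necessarily the code the answer had in mind. It assumes a C++17 compiler and uses std::variant, the standard-library counterpart of boost::variant, so that it compiles without Boost. The class name InterpreterValue and its accessors are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <variant>

// Hypothetical value type: holds either an int or a string, never both.
// std::variant keeps track of which alternative is currently stored.
class InterpreterValue {
public:
    InterpreterValue(int i) : value_(i) {}
    InterpreterValue(std::string s) : value_(std::move(s)) {}

    bool isInt() const { return std::holds_alternative<int>(value_); }
    bool isString() const { return std::holds_alternative<std::string>(value_); }

    // Accessors: the caller is expected to check the type first.
    int asInt() const {
        assert(isInt());
        return *std::get_if<int>(&value_);
    }
    const std::string& asString() const {
        assert(isString());
        return *std::get_if<std::string>(&value_);
    }

private:
    std::variant<int, std::string> value_;
};
```

With this layout an integer value never constructs a std::string at all, so a countdown loop that only touches numbers should not reach the string allocator.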
boost::variant does the type tagging for you. You can implement it like this, if you don’t have boost. The alignment can’t be obtained portably in C++ yet, so we push some types that possibly require a large alignment into the storage union. You get the idea.
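For readers without Boost, a hand-rolled tagged union might look roughly like the sketch below. It is an illustration under assumptions, not the answer’s original code: it relies on C++11 unrestricted unions, where the compiler gives the union the alignment of its strictest member, so the manual alignment trick described above is not needed. All names (TaggedValue, Type, asInt, asString) are hypothetical.

```cpp
#include <cassert>
#include <new>
#include <string>
#include <utility>

// Hypothetical hand-rolled tagged union: a type tag plus shared storage.
class TaggedValue {
public:
    enum class Type { Int, String };

    TaggedValue(int i) : type_(Type::Int), int_(i) {}
    TaggedValue(const std::string& s) : type_(Type::String), str_(s) {}

    TaggedValue(const TaggedValue& other) : type_(other.type_) {
        if (type_ == Type::String)
            new (&str_) std::string(other.str_);  // construct in place
        else
            int_ = other.int_;
    }

    TaggedValue& operator=(const TaggedValue& other) {
        if (this != &other) {
            TaggedValue copy(other);  // may throw; *this is still intact
            destroy();
            type_ = copy.type_;
            if (type_ == Type::String)
                new (&str_) std::string(std::move(copy.str_));
            else
                int_ = copy.int_;
        }
        return *this;
    }

    ~TaggedValue() { destroy(); }

    Type type() const { return type_; }
    int asInt() const { assert(type_ == Type::Int); return int_; }
    const std::string& asString() const {
        assert(type_ == Type::String);
        return str_;
    }

private:
    // Only the string alternative needs an explicit destructor call.
    void destroy() {
        using String = std::string;
        if (type_ == Type::String) str_.~String();
    }

    Type type_;
    union {  // anonymous union: both members share the same storage
        int int_;
        std::string str_;
    };
};
```

The tag decides which member is alive, so the class has to construct and destroy the string manually. That bookkeeping is what boost::variant takes off your hands.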