“Introduction”
I’m relatively new to C++. I went through all the basic stuff and managed to build 2-3 simple interpreters for my programming languages.
The first thing that gave me a headache, and still does, is implementing the type system of my language in C++.
Consider this: Ruby, Python, PHP and the like have many built-in types, which are obviously implemented in C.
My first attempt was to let a value in my language have one of three types: Int, String and Nil.
I came up with this:
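    #include <string>

    enum ValueType
    {
        Int, String, Nil
    };

    // Every Value carries both payloads; only the one matching `type` is meaningful.
    class Value
    {
    public:
        ValueType type;
        int intVal;
        std::string stringVal;
    };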
Yeah, wow, I know. It was extremely slow to pass this class around as the string allocator had to be called all the time.
Next, I tried something similar to this:
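    #include <string>

    enum ValueType
    {
        Int, String, Nil
    };

    // Defined elsewhere; holds every string the program uses.
    extern std::string stringTable[255];

    class Value
    {
    public:
        ValueType type;
        int index;   // the integer itself for Int, a position in stringTable for String
    };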
I would store all strings in stringTable and write their position to index. If the type of Value was Int, I just stored the integer itself in index; using an int index to look up another int wouldn’t make any sense, would it?
Anyway, this gave me a headache too. After a while, accessing the string from the table here, referencing it there and copying it somewhere else got out of hand, and I lost control. I had to put the interpreter draft aside.
Now: Okay, so C and C++ are statically typed.
- How do the main implementations of the languages mentioned above handle the different types in their programs (fixnums, bignums, nums, strings, arrays, resources, …)?
- What should I do to get maximum speed with many different available types?
- How do the solutions compare to my simplified versions above?
There are a couple of different things that you can do here. Different solutions have emerged over time, and most of them require dynamic allocation of the actual datum (boost::variant can avoid dynamically allocated memory by keeping the value inside the variant object itself; thanks @MSalters).
Pure C approach:
Store type information and a void pointer to memory that has to be interpreted according to the type information (usually an enum):
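As a rough sketch of that idea (written C-style but compiled as C++ so it matches the other snippets here), it could look like the following. The helper names make_int, make_string, make_nil, as_int, as_string and destroy are made up for illustration, and allocation failures are not handled:

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Tag that says how the untyped payload must be interpreted.
enum ValueType { Int, String, Nil };

// C-style dynamic value: a tag plus a pointer to heap memory.
struct Value {
    ValueType type;
    void*     data;   // int* for Int, char* for String, null for Nil
};

// Hypothetical constructors; malloc failure is ignored to keep the sketch short.
Value make_int(int n) {
    int* p = static_cast<int*>(std::malloc(sizeof(int)));
    *p = n;
    Value v = { Int, p };
    return v;
}

Value make_string(const char* s) {
    char* p = static_cast<char*>(std::malloc(std::strlen(s) + 1));
    std::strcpy(p, s);
    Value v = { String, p };
    return v;
}

Value make_nil() {
    Value v = { Nil, nullptr };
    return v;
}

// The tag has to be checked before the pointer is reinterpreted.
int as_int(const Value& v) {
    assert(v.type == Int);
    return *static_cast<const int*>(v.data);
}

const char* as_string(const Value& v) {
    assert(v.type == String);
    return static_cast<const char*>(v.data);
}

// Ownership is manual: exactly one owner must call this.
void destroy(Value& v) {
    std::free(v.data);   // free(nullptr) is a no-op
    v.data = nullptr;
    v.type = Nil;
}
```

Copying such a Value only copies a tag and a pointer, so passing it around is cheap. The price is that nothing prevents reading the payload with the wrong type, and ownership of the pointed-to memory has to be tracked by hand.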
In C++ you can improve this approach by using classes to simplify the usage, but more importantly you can go for more complex solutions and use existing libraries such as boost::any or boost::variant, which offer different solutions to the same problem.
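To give a feel for the library route, here is a minimal sketch that uses std::variant, the C++17 standard-library counterpart of boost::variant, rather than Boost itself (that substitution is mine, as is the helper name to_display). It assumes a C++17 compiler:

```cpp
#include <string>
#include <variant>

// std::monostate plays the role of Nil; the variant always holds exactly one alternative.
using Value = std::variant<std::monostate, int, std::string>;

// Hypothetical helper: render a value the way an interpreter's print might.
std::string to_display(const Value& v) {
    struct Visitor {
        std::string operator()(std::monostate) const { return "nil"; }
        std::string operator()(int n) const { return std::to_string(n); }
        std::string operator()(const std::string& s) const { return s; }
    };
    // std::visit dispatches on the alternative currently stored in v.
    return std::visit(Visitor{}, v);
}
```

A default-constructed Value holds the first alternative, so it starts out as nil. Compared with the enum-plus-members class in the question, the type tag and the payload can no longer get out of sync, and the visitor does not compile if one of the alternatives is left unhandled.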
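boost::any stores the value in dynamically allocated memory, through a pointer to a polymorphic holder class in a small hierarchy, and provides a cast (boost::any_cast) that downcasts to the concrete type. boost::variant instead keeps the value in storage embedded in the variant object, next to a discriminator that records which of the listed types is currently held.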
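The "pointer to a class in a hierarchy, plus a downcast" technique can be hand-rolled in a few lines. The sketch below is not how Boost implements it; HolderBase, Holder and AnyValue are hypothetical names, and it shares an immutable payload through std::shared_ptr where a real any-type would typically deep-copy:

```cpp
#include <memory>
#include <string>
#include <typeinfo>
#include <utility>

// Abstract base: the only type the container itself knows about.
struct HolderBase {
    virtual ~HolderBase() = default;
    virtual const std::type_info& type() const = 0;
};

// One concrete holder per stored type.
template <typename T>
struct Holder : HolderBase {
    explicit Holder(T v) : value(std::move(v)) {}
    const std::type_info& type() const override { return typeid(T); }
    T value;
};

class AnyValue {
public:
    AnyValue() = default;   // empty, i.e. "nil"

    template <typename T>
    explicit AnyValue(T v)
        : held_(std::make_shared<Holder<T>>(std::move(v))) {}

    bool empty() const { return !held_; }

    // Returns a pointer to the payload if it really is a T, otherwise nullptr.
    template <typename T>
    const T* get() const {
        if (!held_ || held_->type() != typeid(T)) return nullptr;
        // Safe downcast: the dynamic type was checked on the line above.
        return &static_cast<const Holder<T>*>(held_.get())->value;
    }

private:
    std::shared_ptr<const HolderBase> held_;
};
```

With this layout, copying an AnyValue only bumps a reference count, so a string is never reallocated when values are passed around. The trade-off is one heap allocation per value created and a virtual call on each type check.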