I’m interested in how the CLR implements calls like this:
abstract class A {
    public abstract void Foo<T, U, V>();
}
A a = ...
a.Foo<int, string, decimal>(); // <=== ?
Does this call cause some kind of hash map lookup, with the type parameter tokens as keys and the compiled generic method specializations (one shared by all reference types, and different code for each value type) as values?
I didn’t find much precise information about this, so much of this answer is based on the excellent paper on .NET generics from 2001 (even before .NET 1.0 came out!), one short note in a follow-up paper, and what I gathered from the SSCLI v. 2.0 source code (even though I wasn’t able to find the exact code for calling virtual generic methods).
Let’s start simple: how is a non-generic non-virtual method called? By directly calling the method code, so the compiled code contains the direct address. The compiler gets the method address from the method table (see the next paragraph). Can it be that simple? Well, almost. The fact that methods are JITed makes it a little more complicated: what is actually called is either code that compiles the method and only then executes it, if the method hasn’t been compiled yet, or a single instruction that directly calls the compiled code, if it already exists. I’m going to ignore this detail from here on.
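The "compile on first call, then call directly" behaviour described above can be modeled in a few lines of C#. This is only a toy sketch: `MethodSlot`, `Target` and `CompileCount` are hypothetical names invented for illustration, and a delegate stands in for what the runtime does with native code and patched addresses.

```csharp
using System;

// Toy model of one method slot; all names are hypothetical, not CLR internals.
class MethodSlot
{
    public int CompileCount { get; private set; }

    // What callers invoke. Starts as a stub, later points at the "compiled" code.
    public Func<int, int> Target { get; private set; }

    public MethodSlot(Func<Func<int, int>> compile)
    {
        Target = arg =>
        {
            CompileCount++;          // stands in for running the JIT once
            Target = compile();      // patch the slot so later calls skip the stub
            return Target(arg);
        };
    }
}

static class Program
{
    static void Main()
    {
        var slot = new MethodSlot(() => x => x * 2);
        Console.WriteLine(slot.Target(3));      // 6, first call goes through the stub
        Console.WriteLine(slot.Target(4));      // 8, second call hits the patched target
        Console.WriteLine(slot.CompileCount);   // 1
    }
}
```

The point of the model is only that the caller always goes through the same slot, and the slot's content changes after the first call.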
Now, how is a non-generic virtual method called? Similar to polymorphism in languages like C++, there is a method table accessible from the `this` pointer (reference). Each derived class has its own method table containing its methods. So, to call a virtual method, get the reference to `this` (passed in as a parameter), from there get the reference to the method table, look at the correct entry in it (the entry number is constant for a specific function), and call the code the entry points to. Calling methods through interfaces is slightly more complicated, but not interesting for us now.
Now we need to know about code sharing. Code can be shared between two “instances” of the same method if reference types in type parameters correspond to any other reference types, and value types are exactly the same. So, for example, `C<string>.M<int>()` shares code with `C<object>.M<int>()`, but not with `C<string>.M<byte>()`. There is no difference between type parameters of types and type parameters of methods. (The original paper from 2001 mentions that code can also be shared when both parameters are `struct`s with the same layout, but I’m not sure this is true in the actual implementation.)
Let’s make an intermediate step on our way to generic methods: non-generic methods in generic types. Because of code sharing, we need to get the type parameters from somewhere (e.g. for code like `new T[]`). For this reason, each instantiation of a generic type (e.g. `C<string>` and `C<object>`) has its own type handle, which contains the type parameters and also the method table. Ordinary methods can access this type handle (technically a structure confusingly called `MethodTable`, even though it contains more than just the method table) from the `this` reference. There are two kinds of methods that can’t do that: static methods and methods on value types. For those, the type handle is passed in as a hidden argument.
For non-virtual generic methods, the type handle is not enough, so they get a different hidden argument, `MethodDesc`, that contains the type parameters. Also, the compiler can’t store the instantiations in the ordinary method table, because that’s static. So it creates a second, different method table for generic methods, which is indexed by type parameters, and gets the method address from there if an entry with compatible type parameters already exists, or creates a new entry otherwise.
Virtual generic methods are now simple: the compiler doesn’t know the concrete type, so it has to use the method table at runtime. And the normal method table can’t be used, so it has to look in the special method table for generic methods. Of course, the hidden parameter containing type parameters is still present.
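To make the idea of a table indexed by type parameters more concrete, here is a rough model in C#. It is a sketch under assumptions, not the CLR's actual data structure: `InstantiationTable`, `Canonical` and `Lookup` are hypothetical names, the `"<ref>"` placeholder is just a stand-in for "any reference type", and `B` is an invented subclass of the `A` from the question (with `Foo` returning a `string` so the result can be printed).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

abstract class A
{
    public abstract string Foo<T, U, V>();
}

class B : A
{
    // typeof(T) has to work even if the body is shared between instantiations,
    // which is why the type arguments must reach the method somehow.
    public override string Foo<T, U, V>() =>
        typeof(T).Name + "," + typeof(U).Name + "," + typeof(V).Name;
}

// Toy model of the per-instantiation table; hypothetical, not the CLR's structure.
static class InstantiationTable
{
    static readonly Dictionary<string, int> Entries = new Dictionary<string, int>();

    // All reference types collapse to one placeholder; value types stay distinct.
    static string Canonical(Type t) => t.IsValueType ? t.Name : "<ref>";

    // Returns the id of the simulated compiled body, adding an entry on first use.
    public static int Lookup(Type receiver, string method, params Type[] typeArgs)
    {
        string key = receiver.Name + "." + method + "<" +
                     string.Join(",", typeArgs.Select(Canonical)) + ">";
        if (!Entries.TryGetValue(key, out int id))
        {
            id = Entries.Count;   // stands in for compiling a new body
            Entries.Add(key, id);
        }
        return id;
    }
}

static class Program
{
    static void Main()
    {
        A a = new B();
        Console.WriteLine(a.Foo<int, string, decimal>());   // Int32,String,Decimal

        Type r = a.GetType();
        int first  = InstantiationTable.Lookup(r, "Foo", typeof(int), typeof(string), typeof(decimal));
        int shared = InstantiationTable.Lookup(r, "Foo", typeof(int), typeof(object), typeof(decimal));
        int other  = InstantiationTable.Lookup(r, "Foo", typeof(byte), typeof(string), typeof(decimal));
        Console.WriteLine(first == shared);   // True: string and object share one body
        Console.WriteLine(first == other);    // False: int and byte need separate bodies
    }
}
```

In this model the key combines the runtime type of the receiver with the canonicalized type arguments, which mirrors the two things the text says are needed: the concrete type, known only at runtime, and the type parameters.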
One interesting tidbit learned while researching this: because the JITer is very lazy, the following (completely useless) code works:
The equivalent C++ code causes the compiler to give up with a stack overflow.
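A minimal sketch of the kind of code this is about, assuming the point is unbounded recursive generic instantiation. The `Lift` method and the `LazyInstantiation` class are hypothetical names chosen for illustration. Each recursion step asks for a new instantiation, so the set of possible instantiations is infinite, yet the program runs because instantiations are only created when a call actually reaches them.

```csharp
using System;
using System.Collections.Generic;

static class LazyInstantiation
{
    // Hypothetical example: every recursion step needs a new instantiation,
    // Lift<T> -> Lift<List<T>> -> Lift<List<List<T>>> -> ...
    public static object Lift<T>(int count) where T : new()
    {
        if (count == 0)
            return new T();
        return Lift<List<T>>(count - 1);
    }
}

static class Program
{
    static void Main()
    {
        object result = LazyInstantiation.Lift<int>(3);
        Console.WriteLine(result is List<List<List<int>>>);   // True
    }
}
```

A C++ template written the same way has to be instantiated at compile time, where the recursion never terminates because `count` is a runtime value.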