I am trying to build a Java to C++ trans-compiler (i.e. Java code goes in, semantically “equivalent” (more or less) C++ code comes out).
Garbage collection aside, the two languages are quite similar, so the overall process already works quite well. One issue, however, is generics, which do not exist in C++. Of course, the easiest way would be to perform erasure as the Java compiler does. However, the resulting C++ code should be pleasant to work with, so I would rather not lose the generic type information; i.e., it would be good if the C++ code still worked with List<X> instead of List. Otherwise, the C++ code would need explicit casts everywhere such generics are used, which is error-prone and inconvenient.
So, I am trying to find a way to somehow get a better representation for generics. Of course, templates seem to be a good candidate. Although they are something completely different (metaprogramming vs. compile-time only type enhancement), they could still be useful. As long as no wildcards are used, just compiling a generic class to a template works reasonably well. However, as soon as wildcards come into play, things get really messy.
For example, consider the following Java constructor of a list:
    class List<T> {
        List(Collection<? extends T> c) {
            this.addAll(c);
        }
    }
    // Usage
    Collection<String> c = ...;
    List<Object> l = new List<Object>(c);
How should this be compiled? I had the idea of using a brute-force ("chainsaw") reinterpret_cast between template instantiations. The example above could then be compiled like this:
    template<class T>   // T is a pointer type such as Object*
    class List {
    public:
        List(Collection<T> c) {
            this->addAll(c);
        }
    };
    // Usage
    Collection<String*> c = ...;
    List<Object*>* l = new List<Object*>(reinterpret_cast<Collection<Object*>&>(c));
The question, however, is whether this reinterpret_cast produces the expected behaviour. Of course, it is dirty. But will it work? Usually, List<Object*> and List<String*> should have the same memory layout, as their template parameter is only a pointer. But is this guaranteed?
Another solution I thought of would be to replace methods that use wildcards with template methods that introduce one template parameter per wildcard, i.e., compile the constructor to
    template<class T>
    class List {
    public:
        template<class S>
        List(Collection<S*> c) {
            this->addAll(c);
        }
    };
Of course, all other methods involving wildcards, such as addAll, would then also need template parameters. Another problem with this approach would be handling wildcards in class fields, for example. I cannot use templates here.
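To make this second idea concrete, here is a minimal sketch in which both the constructor and addAll become member templates. Object, String, Collection, and the demo_template_ctor helper are hypothetical stand-ins invented for the illustration, not actual translator output, and memory management is ignored entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the translated Java classes.
struct Object { virtual ~Object() = default; };
struct String : Object {};

template <class T>   // T is a pointer type such as String*
class Collection {
public:
    void add(T item) { items_.push_back(item); }
    const std::vector<T>& items() const { return items_; }
private:
    std::vector<T> items_;
};

template <class T>   // T is a pointer type such as Object*
class List {
public:
    List() = default;

    // Java: List(Collection<? extends T> c)
    template <class S>
    explicit List(const Collection<S*>& c) { addAll(c); }

    // Java: addAll(Collection<? extends T> c)
    template <class S>
    void addAll(const Collection<S*>& c) {
        // Relies on the implicit conversion S* -> T (e.g. String* -> Object*).
        for (S* item : c.items()) items_.push_back(item);
    }

    T get(std::size_t i) const { return items_[i]; }
    std::size_t size() const { return items_.size(); }
private:
    std::vector<T> items_;
};

// Hypothetical usage helper: builds a List<Object*> from a Collection<String*>.
std::size_t demo_template_ctor() {
    String a, b;
    Collection<String*> c;
    c.add(&a);
    c.add(&b);
    List<Object*> l(c);        // S is deduced as String
    assert(l.get(0) == &a);    // same pointer, now typed as Object*
    return l.size();
}
```

Here S is deduced from the argument, and an element type whose pointer does not convert to T fails to compile at the push_back, which roughly mirrors the ? extends T bound. The sketch does not address the field problem mentioned above.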
A third approach would be a hybrid one: A generic class is compiled to a template class (call it T<X>) and an erased class (call it E). The template class T<X> inherits from the erased class E so it is always possible to drop genericity by upcasting to E. Then, all methods containing wildcards would be compiled using the erased type while others could retain the full template type.
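A rough sketch of how this hybrid could look, under the assumption that every translated class derives from a common Object base. ErasedList plays the role of E and List<X> the role of T<X>; all names, including demo_hybrid, are hypothetical and chosen only for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the translated Java classes.
struct Object { virtual ~Object() = default; };
struct String : Object {};

// Erased class E: stores plain Object*, like Java after erasure.
class ErasedList {
public:
    virtual ~ErasedList() = default;

    // Java: addAll(Collection<? extends T> c), compiled against the erased type.
    void addAll(const ErasedList& other) {
        items_.insert(items_.end(), other.items_.begin(), other.items_.end());
    }
    std::size_t size() const { return items_.size(); }

protected:
    void addErased(Object* o) { items_.push_back(o); }
    Object* getErased(std::size_t i) const { return items_[i]; }

private:
    std::vector<Object*> items_;
};

// Template class T<X>: a typed facade over the erased storage.
template <class X>
class List : public ErasedList {
public:
    List() = default;

    // Wildcard parameter: accepts any list through the erased base class.
    explicit List(const ErasedList& c) { addAll(c); }

    // Non-wildcard methods keep the full template type.
    void add(X* x) { addErased(x); }
    X* get(std::size_t i) const { return static_cast<X*>(getErased(i)); }
};

// Hypothetical usage helper: copies a List<String> into a List<Object>.
std::size_t demo_hybrid() {
    String s1, s2;
    List<String> strings;
    strings.add(&s1);
    strings.add(&s2);
    List<Object> objects(strings);   // upcast to ErasedList happens implicitly
    assert(objects.get(0) == &s1);
    return objects.size();
}
```

The trade-off is visible in the constructor: because it takes the erased type, the compiler would also accept a List<Object> as the source for a List<String>, so the ? extends T bound is no longer checked on the C++ side and has to be trusted from the Java front end.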
What do you think about these methods? Where do you see their advantages and disadvantages?
Do you have any other thoughts on how wildcards could be implemented as cleanly as possible while keeping as much generic information in the code as possible?
If the goal is to represent Java semantics in C++, then do so in the most direct way. Do not use reinterpret_cast, as its purpose is to defeat the native semantics of C++. (And doing so between high-level types almost always results in a program that is allowed to crash.)
You should be using reference counting, or a similar mechanism such as a custom garbage collector (although that sounds unlikely under the circumstances). So these objects will all go to the heap anyway.
Put the generic List object on the heap, and use a separate class to access it as a List<String> or whatever. This way, the persistent object has the generic type that can handle any ill-formed means of accessing it that Java can express. The accessor class contains just a pointer, which you already have for reference counting (i.e. it subclasses the “native” reference, not an Object for the heap), and exposes the appropriately downcasted interface. You might even be able to generate the template for the accessor using the generics source code. If you really want to try.
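One possible reading of this suggestion, as a minimal sketch. It assumes std::shared_ptr as a stand-in for the reference-counting mechanism and holds the pointer as a member rather than subclassing a reference type; ListObject, ListRef, the as() method, and demo_accessor are hypothetical names introduced only for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the translated Java classes.
struct Object { virtual ~Object() = default; };
struct String : Object {};

// The persistent heap object: untyped, like the erased Java class.
struct ListObject : Object {
    std::vector<std::shared_ptr<Object>> items;
};

// The accessor: holds only a counted reference and exposes a typed interface.
template <class T>
class ListRef {
public:
    ListRef() : obj_(std::make_shared<ListObject>()) {}

    // Re-wrap the same heap object under another element type (no copy).
    // Unchecked, so generated code should only emit it where Java allowed it.
    template <class U>
    ListRef<U> as() const { return ListRef<U>(obj_); }

    void add(std::shared_ptr<T> item) { obj_->items.push_back(std::move(item)); }

    // Downcast on the way out, as erased Java code does implicitly.
    std::shared_ptr<T> get(std::size_t i) const {
        return std::static_pointer_cast<T>(obj_->items[i]);
    }

    std::size_t size() const { return obj_->items.size(); }

private:
    template <class U> friend class ListRef;   // lets as() build other views
    explicit ListRef(std::shared_ptr<ListObject> obj) : obj_(std::move(obj)) {}

    std::shared_ptr<ListObject> obj_;
};

// Hypothetical usage helper: views one heap list as List<String> and List<Object>.
bool demo_accessor() {
    ListRef<String> strings;
    auto s = std::make_shared<String>();
    strings.add(s);
    ListRef<Object> objects = strings.as<Object>();   // same ListObject, new view
    assert(objects.size() == 1);
    return objects.get(0) == s;
}
```

Because every accessor wraps the same untyped ListObject, switching views needs no reinterpret_cast and no copy; the cost is a static downcast in get, which is only as safe as the Java type checking that preceded the translation.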