Java's implementation of generics is an interesting one. Unlike some other mainstream languages, such as C#, Java does not carry generic type information through to runtime. In this article we will look at the design and implementation rationale behind Java generics, specifically type erasure.
The C++ Way
C++ has templates, which are roughly comparable to generics. During compilation, each instantiation of a class template with a concrete type T is expanded into a separate class.
So if the code uses both Stack<int> and Stack<string>, the compiler generates two separate Stack classes. Since the heavy lifting is done by the compiler, the type parameter itself costs nothing at runtime, although each instantiation adds to the size of the generated code. In fact, the runtime has no notion of generic types, because at execution time these are simply two different classes.
The C# Way
The C# compiler uses the generic type information to make classes strongly typed. Unlike C++, the C# compiler does not generate a separate class for each instantiation at compile time. Instead, the type information is passed on to the CLR in the IL. At runtime, the JIT creates the necessary code for each constructed type, such as Stack<int> and Stack<string>; in practice the CLR specializes code for value types such as int and shares one copy among reference types such as string. This means the IL needs extra metadata and instructions to encode generic type information. Having type information at runtime has many benefits, but the additional specialized code can increase pressure on the CPU's instruction cache.
The Java Way
In some sense, Java tries to strike a balance between the two approaches. In Java, the generic type information is erased by the compiler after type checking, so we still get strong type checking at compile time. Every use of a generic class, such as Stack<Integer> or Stack<String>, is then compiled against a single class in which T has been replaced with Object (or with the bound of T, if it has one). So at runtime only one class is handed over to the JVM. Since the runtime does not have to care about generic type information, there are no special bytecode instructions to deal with generics.
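The claim that only one class exists at runtime can be checked directly. The following is a minimal sketch; the `Stack<T>` class here is a hypothetical stand-in written for this illustration, not a class from any library.

```java
import java.util.ArrayList;
import java.util.List;

class SharedClassDemo {
    // Hypothetical minimal Stack<T>; it exists only to give us a generic class to instantiate.
    static class Stack<T> {
        private final List<T> items = new ArrayList<>();
        void push(T item) { items.add(item); }
        T pop() { return items.remove(items.size() - 1); }
    }

    // Returns true when both parameterizations are backed by the very same runtime class.
    static boolean sameRuntimeClass() {
        Stack<Integer> ints = new Stack<>();
        Stack<String> strings = new Stack<>();
        return ints.getClass() == strings.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass()); // prints "true"
    }
}
```

Both variables refer to instances of the same `Stack` class; the type arguments `Integer` and `String` exist only for the compiler.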
This decay of type information to Object by the compiler is called type erasure. Type erasure has a number of advantages and some odd disadvantages. We will discuss some of these towards the end of this article. For now, let's see why Java chose type erasure.
Why are Java generics implemented with type erasure?
This has to do with several constraints that Java had, one of them being migration cost. Generics were introduced in Java 5 (version 1.5), and the release carried the burden of safely migrating millions of lines of existing code. Java already had a huge set of non-generic libraries that used Object as their element type. For example, the Vector class treats every element it stores as an Object. With type erasure, a generic Vector can be written that erases to the original Vector implementation, so old and new code keep working together. Also, with type erasure the Java runtime (JVM) did not require modifications such as new instructions to handle generic types.
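As a rough illustration of the migration argument, the sketch below passes a `Vector<String>` to a method that only knows the raw, pre-generics `Vector` type. The method `legacyCount` is a hypothetical stand-in for old library code, not a real API.

```java
import java.util.Vector;

class MigrationDemo {
    // Stand-in for pre-generics library code: it only knows the raw Vector type.
    @SuppressWarnings("rawtypes")
    static int legacyCount(Vector v) {
        return v.size();
    }

    static int countViaLegacyApi() {
        Vector<String> names = new Vector<String>();
        names.add("Alice");
        names.add("Bob");
        // A Vector<String> is accepted where a raw Vector is expected; no conversion is needed.
        return legacyCount(names);
    }

    public static void main(String[] args) {
        System.out.println(countViaLegacyApi()); // prints 2
    }
}
```

Because `Vector<String>` and the raw `Vector` are the same class at runtime, old code compiled before generics existed can keep working with objects created by new generic code.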
What do Java generics really mean?
One way to look at a declaration like Book<String> book = new Book<String>(); is as two messages:
“Hey compiler! Here is an object of type Book. Make sure that none of the code using it violates what can be done when T is substituted with String. Once that is validated, remove T and replace it with Object everywhere.”
“Hey runtime! Here is an object of type Book (whose T has already been replaced with Object by the compiler). Just go and execute it blindly, like any other non-generic object!”
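The two messages can be observed in a small experiment. This is only a sketch: the `Book` class below, including its `setBook` method and `book` field, is assumed for illustration. A raw reference is used to switch off the compiler's checks, which shows that the runtime itself does not object to a wrong element type.

```java
class BlindRuntimeDemo {
    // Hypothetical Book class; the field and the setter are assumptions made for this sketch.
    static class Book<T> {
        private T book;
        void setBook(T book) { this.book = book; }
        T getBook() { return book; }
    }

    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean runtimeAcceptsWrongType() {
        Book<String> book = new Book<String>();
        // book.setBook(42);   // rejected by the compiler: an int is not a String

        Book raw = book;        // raw view of the same object: compile-time checks are bypassed
        raw.setBook(42);        // the JVM stores the Integer without complaint

        try {
            String s = book.getBook(); // the cast to String inserted by the compiler fails here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(runtimeAcceptsWrongType()); // prints "true"
    }
}
```

The failure surfaces only where the compiler inserted a cast back to `String`, not where the wrong value was stored.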
Java Generics Caveats
Consider a generic class Book<T> with a method getBook() that returns T. It compiles fine, because T is a valid return type. Now let's see what happens if we add a second method, also called getBook, that returns Object.
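A minimal sketch of such a class is given here. The field name, the setter, and the surrounding `OverloadDemo` class are assumptions; only `Book` and `getBook` come from the text. The second method is left commented out so that the listing itself stays compilable.

```java
class OverloadDemo {
    static class Book<T> {
        private T book;   // field name is an assumption

        public void setBook(T book) { this.book = book; }   // hypothetical setter

        public T getBook() { return book; }

        // Uncommenting the method below makes javac reject the class:
        // public Object getBook() { return book; }
    }

    public static void main(String[] args) {
        Book<String> b = new Book<String>();
        b.setBook("Effective Java");
        System.out.println(b.getBook()); // prints "Effective Java"
    }
}
```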
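This version does not compile, because after erasure the return type of the first method is also Object, so the class would contain two identical declarations of Object getBook(). (Java does not allow overloading on return type alone in any case; erasure additionally causes clashes between methods that differ only in a type-parameter argument, for instance a hypothetical pair setBook(T) and setBook(Object).)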
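The erased return type can be inspected with reflection. The sketch below assumes a `Book<T>` with a single `getBook()` method and asks the JVM which return type it sees; the wrapper class `ErasedReturnTypeDemo` is made up for this illustration.

```java
import java.lang.reflect.Method;

class ErasedReturnTypeDemo {
    static class Book<T> {
        private T book;   // field name is an assumption
        public T getBook() { return book; }
    }

    // Returns the return type that the JVM sees for getBook().
    static Class<?> runtimeReturnType() {
        try {
            Method m = Book.class.getDeclaredMethod("getBook");
            return m.getReturnType();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(runtimeReturnType()); // prints "class java.lang.Object"
    }
}
```

Although the source declares the return type as `T`, the compiled method returns `Object`, which is exactly the declaration a hand-written `Object getBook()` would produce.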
This is the classic duplicate-method failure: two methods with the same name and parameter list cannot coexist in one class, and javac reports:
Error:(14, 19) java: method getBook() is already defined in class com.company.Book
There are many more such caveats to Java generics. Most of them may not hurt a casual programmer, but it is always good to know what is going on one layer below.