Java generics and type erasure

Marcin Pietraszek

Hi!

Some time ago I encountered strange behaviour while using generics; a
small example is provided in this gist:

https://gist.github.com/977599

Could anybody explain why compilation fails in the second example (the
line commented with "compilation failure")? Do you know of any detailed
description of how and when type erasure works in Java?
 
John B. Matthews

Marcin Pietraszek said:
Some time ago I encountered strange behaviour while using generics; a
small example is provided in this gist:

https://gist.github.com/977599

Could anybody explain why compilation fails in the second example (the
line commented with "compilation failure")?

You neglected to specify the type parameter for foo2, specified in the
declaration Foo<T>. Without the actual type, <Boolean>, the compiler
can only infer that get() returns Object, as would have been the case
prior to generics:

import java.util.*;

public class Foo<T> {

    private Map<String, Integer> bar = new HashMap<String, Integer>();

    public static void main(String... args) {
        Foo<Boolean> foo1 = new Foo<Boolean>();
        Integer x1 = foo1.bar.get("x"); // ok

        Foo<Boolean> foo2 = new Foo<Boolean>(); // type argument supplied
        Integer x2 = foo2.bar.get("x"); // now compiles
    }
}

Do you know any detailed description on how and when type erasure
works in java?

"All of these parameterized types share the same class at runtime."

<http://java.sun.com/docs/books/jls/third_edition/html/classes.html#8.1.2>

See also, Bloch, ch. 5:

<http://java.sun.com/docs/books/effective/>
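
To make the JLS sentence quoted above concrete, here is a minimal sketch. The class name SharedRuntimeClass and the helper sameRuntimeClass are made up for illustration; the only API relied on is Object.getClass().

```java
import java.util.ArrayList;
import java.util.List;

class SharedRuntimeClass {
    // True when both objects were created from the same runtime class.
    static boolean sameRuntimeClass(Object a, Object b) {
        return a.getClass() == b.getClass();
    }

    public static void main(String[] args) {
        List<String> strings = new ArrayList<String>();
        List<Integer> integers = new ArrayList<Integer>();
        // Different parameterized types at compile time, one class at run time.
        System.out.println(sameRuntimeClass(strings, integers));
        System.out.println(strings.getClass().getName());
    }
}
```

Both lists report java.util.ArrayList as their class; the type arguments String and Integer are not part of the runtime class.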
 
Lew

You should post the code in the message instead of out of band.

import java.util.*;

public class Foo<T> {

    private Map<String, Integer> bar = new HashMap<String, Integer>();

    public static void main(String... args) {
        Foo<Boolean> foo1 = new Foo<Boolean>();
        Integer x1 = foo1.bar.get("x"); // ok

        Foo foo2 = new Foo<Boolean>();
        Integer x2 = foo2.bar.get("x"); // compilation failure
    }

}

John B. Matthews said:
You neglected to specify the type parameter for foo2, specified in the
declaration Foo<T>. Without the actual type, <Boolean>, the compiler
can only infer that get() returns Object, as would have been the case
prior to generics.

It's subtler than that. The generic parameter is on 'Foo', not the map. The
map type does not depend on the type parameter. So the question is why the
'get()' returns 'Object'. Naively, one would expect the type of 'bar' to
resolve to 'Map<String, Integer>' no matter what '<T>' is, or isn't.

The only thing I can think of is that leaving the type parameter out in the
containing class means that all bets are off, and the compiler gives up on
generics throughout that variable's depth. Thus it doesn't even bother to
parse the 'bar' parameters, defaulting thus to '<?,?>'. Consequently, the
type of the expression 'foo2.bar.get("x")' is 'Object'. The assignment target
of that expression is 'Integer', and that requires an explicit downcast,
omitted in the code along with the type parameter.

This is an object lesson (pun intended) in how bad it really can be to omit
the type parameter.

I haven't looked up chapter and verse on this reasoning yet. Anyone care to
rise to the challenge?

A couple of observations:

Type erasure has absolutely nothing to do with this. Type erasure happens at
run time. It will never create a compiler error.

The direct dot reference to a 'private' member is a bit dodgy, though allowed
in the class's own 'main()'.
 
Marcin Pietraszek

Lew said:
It's subtler than that. The generic parameter is on 'Foo', not the map. The
map type does not depend on the type parameter. So the question is why the
'get()' returns 'Object'. Naively, one would expect the type of 'bar' to
resolve to 'Map<String, Integer>' no matter what '<T>' is, or isn't.

The only thing I can think of is that leaving the type parameter out in the
containing class means that all bets are off, and the compiler gives up on
generics throughout that variable's depth. Thus it doesn't even bother to
parse the 'bar' parameters, defaulting thus to '<?,?>'. Consequently, the
type of the expression 'foo2.bar.get("x")' is 'Object'. The assignment target
of that expression is 'Integer', and that requires an explicit downcast,
omitted in the code along with the type parameter.

This is an object lesson (pun intended) in how bad it really can be to omit
the type parameter.

I haven't looked up chapter and verse on this reasoning yet.  Anyone care to
rise to the challenge?

To be honest, before asking on the group I tried to find the answer in
the JLS, unfortunately without any success -- I made the same
observations as you but didn't manage to find an exact explanation.
 
Ian Shef

Lew said:
It's subtler than that. The generic parameter is on 'Foo', not the map.
The map type does not depend on the type parameter. So the question is
why the 'get()' returns 'Object'. Naively, one would expect the type of
'bar' to resolve to 'Map<String, Integer>' no matter what '<T>' is, or
isn't.

The only thing I can think of is that leaving the type parameter out in
the containing class means that all bets are off, and the compiler gives
up on generics throughout that variable's depth.

Correct (although "gives up" is an unfair characterization). One of the
things that I learned recently about generics is that using a raw type,
e.g.
Foo foo2 = ...
causes the compiler to treat everything within that type as raw (with respect
to the particular variable that was declared with the raw type).
That is, any use of foo2 will now be treated as if everything within the Foo
class was defined as a raw type. Thus,
private Map<String, Integer> bar = new HashMap<String, Integer>();
now gets treated as if it was written
private Map bar = new HashMap();

Now (as far as the compiler is concerned) foo2.bar.get("x") produces an
Object and not the Integer that one naively might expect.

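
A minimal sketch of the rule just described, assuming a class shaped like the Foo from the gist. RawFieldDemo, readTyped and readRaw are hypothetical names, and the commented-out line is the assignment javac rejects.

```java
import java.util.HashMap;
import java.util.Map;

class RawFieldDemo<T> {
    // The field's declared type never mentions T.
    final Map<String, Integer> bar = new HashMap<String, Integer>();

    // Parameterized reference: bar is seen as Map<String, Integer>.
    static Integer readTyped(RawFieldDemo<Boolean> foo, String key) {
        return foo.bar.get(key);
    }

    // Raw reference: bar is seen as raw Map, so get() is typed Object.
    @SuppressWarnings("rawtypes")
    static Integer readRaw(RawFieldDemo foo, String key) {
        // Integer v = foo.bar.get(key);   // rejected: Object is not Integer
        Object value = foo.bar.get(key);
        return (Integer) value;            // explicit downcast required
    }

    public static void main(String[] args) {
        RawFieldDemo<Boolean> foo = new RawFieldDemo<Boolean>();
        foo.bar.put("x", 42);
        System.out.println(readTyped(foo, "x"));
        System.out.println(readRaw(foo, "x"));
    }
}
```

The same object is read both ways; only the static type of the reference changes what the compiler believes about bar.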
Thus it doesn't even
bother to parse the 'bar' parameters, defaulting thus to '<?,?>'.
Consequently, the type of the expression 'foo2.bar.get("x")' is
'Object'. The assignment target of that expression is 'Integer', and
that requires an explicit downcast, omitted in the code along with the
type parameter.

This is an object lesson (pun intended) in how bad it really can be to
omit the type parameter.

I haven't looked up chapter and verse on this reasoning yet. Anyone
care to rise to the challenge?
It better be someplace in the JLS but I have not looked. I prefer to use
Angelika Langer's Java Generics FAQs. FAQ_203 says (in part):

"Fields of a raw type have the type that they would have after type erasure."
See
<http://www.angelikalanger.com/GenericsFAQ/FAQSections/ParameterizedTypes.html#FAQ203>

That is somewhat subtle but describes the situation observed here. I have
seen a clearer explanation elsewhere recently but cannot recall where.
Sorry.
A couple of observations:

Type erasure has absolutely nothing to do with this. Type erasure
happens at run time. It will never create a compiler error.

Au contraire. The Java Tutorial says "When a generic type is instantiated,
the compiler translates those types by a technique called type erasure — a
process where the compiler removes all information related to type parameters
and type arguments within a class or method."
See
<http://download.oracle.com/javase/tutorial/java/generics/erasure.html>

I interpret this to mean that type erasure happens at compile time - not at
run time.
 
Lew

Ian Shef said:
Au contraire. The Java Tutorial says "When a generic type is instantiated,
the compiler translates those types by a technique called type erasure -- a
process where the compiler removes all information related to type parameters
and type arguments within a class or method."
See
<http://download.oracle.com/javase/tutorial/java/generics/erasure.html>

I interpret this to mean that type erasure happens at compile time - not at
run time.

OK, I see what you mean, but that's only after the compiler's enforcement of
the type parameters. If they don't fly, the erasure doesn't happen. So what
I meant, and said incorrectly, is that *to the compiler* the generic type
exists, but *to the runtime* it's already been erased. So the fact that the
type parameter has been erased shows up at run time, and the type information
in the parameter shows up at compile time.

You are correct. The difference between how you and I think of it is that I'm
thinking in terms of where you see the type-parameter errors, that is, on
input to the compiler. You're seeing where the actual erasure occurs, that
is, on output from the compiler.

I will be more rigorous in my explanation of that henceforth. Thanks for the
nuance.
 
Ian Shef

OK, I see what you mean, but that's only after the compiler's
enforcement of the type parameters. If they don't fly, the erasure
doesn't happen. So what I meant, and said incorrectly, that *to the
compiler*, the generic type exists, but *to the runtime* it's already
been erased. So the fact that the type parameter has been erased shows
up at run time, and the type information in the parameter shows up at
compile time.

Sounds like we are in violent agreement.

You are correct. The difference between how you and I think of it is
that I'm thinking in terms of where you see the type-parameter errors,
that is, on input to the compiler. You're seeing where the actual
erasure occurs, that is, on output from the compiler.

More violent agreement: Type erasure takes place at compile time and
becomes visible at run time.

I will be more rigorous in my explanation of that henceforth. Thanks
for the nuance.

OK. I didn't mean to get pedantic but my fingers ran away on their own.
Be rigorous if you want to; some will get the nuance (I'm glad you did) and
some won't.

By the way, the description of the particular situation examined here
(where instance fields of a type are treated as raw because the type itself
is used raw) seems to be covered by version 3 of the JLS in section 4.8.
I am not sure exactly which sentence in 4.8 covers this, but I am sure that
it is there because the examples show similar situations. Perhaps it is
covered by this:

"The type of a constructor (§8.8), instance method (§8.8, §9.4), or
non-static field (§8.3) M of a raw type C that is not inherited from its
superclasses or superinterfaces is the erasure of its type in the generic
declaration corresponding to C."
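
As a rough illustration of that sentence, the sketch below contrasts an instance method reached through a raw reference with a static field, which the same JLS section says keeps its declared type. RawMemberDemo, SHARED, firstNameViaRaw and sharedValue are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RawMemberDemo<T> {
    // Static member: keeps its declared generic type, even via the raw type name.
    static final Map<String, Integer> SHARED = new HashMap<String, Integer>();

    private final List<String> names = new ArrayList<String>();

    // Instance method: on a raw reference its return type is erased to raw List.
    List<String> names() {
        return names;
    }

    @SuppressWarnings("rawtypes")
    static String firstNameViaRaw(RawMemberDemo raw) {
        // String s = raw.names().get(0);   // rejected: get() is typed Object here
        return (String) raw.names().get(0);
    }

    static Integer sharedValue(String key) {
        return RawMemberDemo.SHARED.get(key); // no cast needed for the static field
    }

    public static void main(String[] args) {
        RawMemberDemo<Boolean> demo = new RawMemberDemo<Boolean>();
        demo.names().add("quux");
        SHARED.put("x", 7);
        System.out.println(firstNameViaRaw(demo));
        System.out.println(sharedValue("x"));
    }
}
```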
 
Lawrence D'Oliveiro

The only reason I can think of for not doing this (the logic seems simple
enough to implement) is that it turned out doing so would break legacy
code that used util collections' raw types. Was that why?

Yup. Much of the complexity attendant on introducing generics into Java was
precisely because of that need for backward compatibility.

Perhaps there should be a compile flag that turns on the legacy-compatible
behavior for use when compiling 1.4 and older sources, but which is off by
default?

What happens when you mix code compiled with that flag, with code that was
compiled without?
 
Susan Calvin

Yup. Much of the complexity attendant on introducing generics into Java
was precisely because of that need for backward compatibility.


What happens when you mix code compiled with that flag, with code that
was compiled without?

Why, nothing, of course, since generics don't exist at run-time. Both

Integer x = aFoo.m1.get("quux");

(in the file compiled with the flag) and

Integer x = (Integer)(aFoo.m1.get("quux"));

(in the file compiled without it) would compile to the same bytecode,
including a checkcast for Integer.

This is only an issue for source compatibility, not binary compatibility.
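
One way to observe the compiler-inserted cast without reading bytecode is to pollute a map through a raw reference and then read it with no cast in the source. This is only a sketch; HiddenCastDemo, pollute and readFailsWithClassCast are made-up names.

```java
import java.util.HashMap;
import java.util.Map;

class HiddenCastDemo {
    // Smuggles a String value into a Map<String, Integer> through a raw reference.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static void pollute(Map<String, Integer> map, String key) {
        Map raw = map;
        raw.put(key, "not an Integer");
    }

    // There is no cast in this source, yet the assignment can throw
    // ClassCastException because javac emits a checkcast for Integer.
    static boolean readFailsWithClassCast(Map<String, Integer> map, String key) {
        try {
            Integer x = map.get(key);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<String, Integer>();
        pollute(map, "quux");
        System.out.println(readFailsWithClassCast(map, "quux"));
    }
}
```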

And there are far worse problems with generics and especially with
autoboxing. For example:

int x = aFoo.m1.get("quux");

Guess what happens if "quux" is not found? There should probably be a
shorthand way to check for this -- maybe a variation on the ?: operator
that tests its left hand side for null, evaluates to it if it's not, and
evaluates to its right hand side otherwise, e.g.

int x = aFoo.m1.get("quux")?:-1

perhaps would make x -1 as a sentinel for "not found" in this instance.
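
The proposed ?: shorthand is not Java syntax, but the pitfall and a plain workaround can be sketched as follows. UnboxingDemo, lookupUnsafe and lookupOrDefault are hypothetical names; the -1 sentinel is the one suggested above.

```java
import java.util.HashMap;
import java.util.Map;

class UnboxingDemo {
    // Auto-unboxing a missing key: get() returns null, and unboxing null
    // throws NullPointerException.
    static int lookupUnsafe(Map<String, Integer> map, String key) {
        return map.get(key);
    }

    // Explicit null check with a caller-supplied sentinel.
    static int lookupOrDefault(Map<String, Integer> map, String key, int fallback) {
        Integer value = map.get(key);
        return value != null ? value : fallback;
    }

    public static void main(String[] args) {
        Map<String, Integer> m1 = new HashMap<String, Integer>();
        m1.put("present", 5);
        System.out.println(lookupOrDefault(m1, "present", -1));
        System.out.println(lookupOrDefault(m1, "quux", -1));
    }
}
```

On Java 8 and later, Map.getOrDefault(key, -1) covers the same case without a helper, though it postdates the Java 5 features discussed here.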

Actually, the Java 5 features have several rough spots that aren't easy
to smooth over. Java really should have been designed with generics and
better integration of primitive types into the type system from the
beginning. Now we have these ad hoc bandaid solutions that show visible
seams here and there and we're probably going to have to live with them
for the next forty years, just as we're living now with 40-year-old COBOL
code.
 
Lawrence D'Oliveiro

Why, nothing, of course, since generics don't exist at run-time.

If that were the case, then there would be no backward-compatibility issue,
would there? And the whole machinery of "raw" types could just disappear in
a puff of un-necessity.
 
Susan Calvin

If that were the case, then there would be no backward-compatibility
issue, would there? And the whole machinery of "raw" types could just
disappear in a puff of un-necessity.

No -- presumably, it would have caused problems compiling 1.4 source code
against 1.5's libraries, not running existing 1.4 bytecode linked to 1.5
libraries.
 
Lawrence D'Oliveiro

No -- presumably, it would have caused problems compiling 1.4 source code
against 1.5's libraries, not running existing 1.4 bytecode linked to 1.5
libraries.

In Java terms, what's the difference?
 
Esmond Pitt

In Java terms, what's the difference?

The difference is that generic type-signatures are present in .class
files for compilation purposes even though generics are erased to their
leftmost bounds at runtime, so that the compiler can enforce the semantics
of generics.
 
Lew

The difference is that generic type-signatures are present in .class files for
compilation purposes even though generics are erased to their leftmost bounds
at runtime, so that the compiler can enforce the semantics of generics.

This contradicts the statement that Ian Shef made upthread that type erasure
occurs in compilation. It seems I was correct after all, and that it really
does happen at runtime.
 
John B. Matthews

Lew said:
This contradicts the statement that Ian Shef made upthread that type
erasure occurs in compilation.

I don't see a contradiction. Since 1.5, the compiler puts Type and
GenericDeclaration information in the class file for access via
reflection, as shown using javap; but the information is not otherwise
used at runtime.
It seems I was correct after all, and that it really does happen at
runtime.

I'm still not seeing a disparity. The compiler interprets generic
declarations and stores them in the class file. At runtime, those types
are available to introspection but not checked for validity, a job
already done at compile time.

As you observed earlier, the problem is more subtle than my initial
response. I welcome any clarification.
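
For what it's worth, the two views of the same field can be compared with a small reflection sketch. SignatureDemo, erasedType and declaredTypeArguments are made-up names; Field.getType(), Field.getGenericType() and ParameterizedType.getActualTypeArguments() are the standard java.lang.reflect calls.

```java
import java.lang.reflect.Field;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

class SignatureDemo<T> {
    private Map<String, Integer> bar = new HashMap<String, Integer>();

    private static Field barField() {
        try {
            return SignatureDemo.class.getDeclaredField("bar");
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    // Erased view: the type the field has as far as executing bytecode goes.
    static Class<?> erasedType() {
        return barField().getType();
    }

    // Declared view: recovered from the generic signature kept in the class file.
    static Type[] declaredTypeArguments() {
        ParameterizedType type = (ParameterizedType) barField().getGenericType();
        return type.getActualTypeArguments();
    }

    public static void main(String[] args) {
        System.out.println(erasedType());
        System.out.println(Arrays.toString(declaredTypeArguments()));
    }
}
```

The erased view is plain java.util.Map, while the declared view still lists String and Integer; nothing at run time checks map contents against the latter.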
 
Esmond Pitt

This contradicts the statement that Ian Shef made upthread that type
erasure occurs in compilation. It seems I was correct after all, and
that it really does happen at runtime.

No, it doesn't contradict that statement at all. Read what I wrote
properly. Types are erased at compile time. They don't appear in the
bytecode. Look for yourself. They do appear in method signatures, for
the compiler. I said all that rather carefully.
 
Lew

Esmond said:
No, it doesn't contradict that statement at all. Read what I wrote properly.

What parts did you write properly, and what parts did you write improperly?

Types are erased at compile time. They don't appear in the bytecode. Look for
yourself. They do appear in method signatures, for the compiler. I said all
that rather carefully.

Here's how I will think of it:

Generics make a difference at compile time. They don't make a difference at
run time.

Then I can ignore what is and isn't in the bytecode.
 
Tom McGlynn

No, it doesn't contradict that statement at all. Read what I wrote
properly. Types are erased at compile time. They don't appear in the
bytecode. Look for yourself. They do appear in method signatures, for
the compiler. I said all that rather carefully.

Perhaps the confusion here is due to an ambiguity in the phrase
'generics are erased to ... at runtime'
which I could interpret in two ways. It can mean that something is
actively erasing them at runtime, or it can mean that they have the
static characteristic of having been erased to xxx at runtime. The
first would suggest that type erasure is happening at runtime while
the latter implies that erasure has already happened, i.e., at compile
time. While I gather you intended the latter, I think someone
inferring the former has a reasonable case too.

Regards,
Tom McGlynn
 
Esmond Pitt

Generics make a difference at compile time. They don't make a difference
at run time.

Correct.

Then I can ignore what is and isn't in the bytecode.

Unless you are asserting, and I quote, that 'it really does happen at
runtime'. If there is no bytecode that executes that action, it really
doesn't happen at runtime.
 
Lew

Unless you are asserting, and I quote, that 'it really does happen at
runtime'. If there is no bytecode that executes that action, it really doesn't
happen at runtime.

What I'm asserting is that there are no generics at runtime, and that there
are at compile time, so the state of things having been erased already happens
at runtime and the state of things not having yet been erased happens at
compile time. It follows from this that the erasure itself happens between
compile time and runtime, i.e., during the process of compilation as you and
Ian said.

The meaning of the assertion you cite was that the generics information that
is present in the bytecode is ignored at runtime, as you yourself said, so
that's the "it" that happens at runtime. My statement thereof was far from
clear. I should have indicated that then.

I am most emphatically not saying that it is not compilation that performs the
erasure. As I stated upthread, my perspective has to do with when generics
are relevant, not when they cease to be. They are relevant at compilation;
they are not relevant at runtime. It is in that sense that I think of erasure
as being a runtime phenomenon. I am more interested in the state of having
been erased vs. not having been erased than in the brief moment during which
the erasure happens. Pfft.

So generics are a compile-time phenomenon; the condition of them having been
erased is a runtime phenomenon.
 
