Deserialization bug: NullPointerException thrown during HashMap.hash()

M

Mike Smith

I am attempting to serialize objects that have sets, but there is a
NullPointerException thrown when deserializing. The exception is
thrown at a point where deserialization calls on one of the my class's
hashCode() methods. Though the fields are null at this point
deserialization, the object sent in had a non-null value in that field.
The deserialization process is attempting to put this object in a
HashMap before deserialization has provided the field values (which
tend to be essential to a hashCode() override!).

The problem can be reproduced with two simple classes and an instance
of each, wired together (code follows). An instance 'a' of A has a
field reference to an instance 'b' of B. b has a set with the single
instance a within, as well as a name field. b's name field is used to
generate its hashCode (as in a natural key). a's hashCode is derived
from b (it's name).

The result is a NullPointerException. Failures duplicated on both
Solaris-based and Mac OS X-based Java 1.4.2.

Here is the code example:

public class A implements java.io.Serializable {
private B b;

public A() {
}

public boolean equals(Object object) {
if (this == object) {
return true;
}
if (object == null || getClass() != object.getClass()) {
return false;
}
A other = (A) object;
return getB().equals(other.getB());
}

public int hashCode() {
return getB().hashCode(); // FAILS HERE in deserialization.
}

public void setB(B b) {
this.b = b;
}

public B getB() {
return b;
}
}

import java.util.Set;

public class B implements java.io.Serializable {
private Set aSet;

private String name;

public B() {
}

public B(String name) {
this.name = name;
}

public boolean equals(Object object) {
if (this == object) {
return true;
}
if (object == null || getClass() != object.getClass()) {
return false;
}
B other = (B) object;
return getName().equals(other.getName());
}

public int hashCode() {
return getName().hashCode();
}

public String getName() {
return name;
}

public void setASet(Set aSet) {
this.aSet = aSet;
}

public Set getASet() {
return aSet;
}
}

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io_ObjectInputStream;
import java.io_ObjectOutputStream;
import java.util.HashSet;

public class Main {

public Main() {
}

public static void main(String argv[]) {
A a = new A();
B b = new B("FOO");
a.setB(b);
HashSet aSet = new HashSet();
aSet.add(a);
b.setASet(aSet);

try {
FileOutputStream fileOut = new
FileOutputStream("test.srl");
ObjectOutputStream out = new ObjectOutputStream(fileOut);
out.writeObject(a);
fileOut.close();

FileInputStream fileIn = new FileInputStream("test.srl");
ObjectInputStream in = new ObjectInputStream(fileIn);
A rehydratedA = (A) in.readObject();
// Won't get this far, will fail above...
System.out.println(rehydratedA);
} catch (Exception anyException) {
anyException.printStackTrace();
}
}

}

The full stack trace:

java.lang.NullPointerException
at A.hashCode(A.java:26)
at java.util.HashMap.hash(HashMap.java:261)
at java.util.HashMap.put(HashMap.java:379)
at java.util.HashSet.readObject(HashSet.java:277)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:324)
at
java.io_ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:838)
at
java.io_ObjectInputStream.readSerialData(ObjectInputStream.java:1746)
at
java.io_ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
at java.io_ObjectInputStream.readObject0(ObjectInputStream.java:1274)
at
java.io_ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845)
at
java.io_ObjectInputStream.readSerialData(ObjectInputStream.java:1769)
at
java.io_ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
at java.io_ObjectInputStream.readObject0(ObjectInputStream.java:1274)
at
java.io_ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845)
at
java.io_ObjectInputStream.readSerialData(ObjectInputStream.java:1769)
at
java.io_ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
at java.io_ObjectInputStream.readObject0(ObjectInputStream.java:1274)
at java.io_ObjectInputStream.readObject(ObjectInputStream.java:324)
at Main.main(Main.java:32)

- Mike
 
C

Chris Uppal

Mike said:
I am attempting to serialize objects that have sets, but there is a
NullPointerException thrown when deserializing. The exception is
thrown at a point where deserialization calls on one of the my class's
hashCode() methods. Though the fields are null at this point
deserialization, the object sent in had a non-null value in that field.
The deserialization process is attempting to put this object in a
HashMap before deserialization has provided the field values (which
tend to be essential to a hashCode() override!).

That's nasty...

Looks like a bug in the implementation of HashSet.readObject() which assumes
that none of the objects in the map are still "under construction".

One way that you might be able to code around would be to keep a cache of the
hashCode() in instances of A which was serialised /before/ the B instance and
so would be available when the B's Set was being reconstituted. That might be
tricky to get right if B's hash is allowed to change.

Alternatively you could use a custom Set or Map that stored the original
indexes of its elements as part of its serialised form, and which could
therefore reconstitute itself without invoking its elements' hashCode() or
equals() methods.

-- chris
 
C

charles_n_may

If setB(...) can be called more than once on an instance of A, then A's
hashCode can change, which will wreak havoc if A has already been
placed in a HashSet (or HashMap). The hashCode shouldn't possibly
change through the lifetime of A, at least once it's placed in any
structure requiring its hashCode.

I recommend declaring A's instance of B to be final, provide the B
instance in A's constructor, and get rid of the setB method. Since A's
hashCode will depend on B's name, also make B's name final, provide it
in B's constructor, and get rid of the setName method.

Even Java's Integer class falls victim to this non-final declaration.
Integer wraps a non-final int primitive. Even though the int is
private, it can be accessed using some tricky reflection. That means
you could write code to add n different Integers to a Set, then iterate
over the set and use the to-remain-unexplained reflection trick to
change all elements' held values to be the same. You would end up with
a HashSet with size n, whose values are all equal!

The moral of the story is, don't make the fields upon which the
hashCode depends be changeable once the object is instantiated. Integer
is relatively safe because the wrapped value is private and there is no
setter for it. It's not fully safe because the wrapped value isn't
declared final. I recommend you use at least one, but preferably both,
approaches for your classes.

Mike said:
I am attempting to serialize objects that have sets, but there is a
NullPointerException thrown when deserializing. The exception is
thrown at a point where deserialization calls on one of the my class's
hashCode() methods. Though the fields are null at this point
deserialization, the object sent in had a non-null value in that field.
The deserialization process is attempting to put this object in a
HashMap before deserialization has provided the field values (which
tend to be essential to a hashCode() override!).

The problem can be reproduced with two simple classes and an instance
of each, wired together (code follows). An instance 'a' of A has a
field reference to an instance 'b' of B. b has a set with the single
instance a within, as well as a name field. b's name field is used to
generate its hashCode (as in a natural key). a's hashCode is derived
from b (it's name).

The result is a NullPointerException. Failures duplicated on both
Solaris-based and Mac OS X-based Java 1.4.2.

Here is the code example:

public class A implements java.io.Serializable {
private B b;

public A() {
}

public boolean equals(Object object) {
if (this == object) {
return true;
}
if (object == null || getClass() != object.getClass()) {
return false;
}
A other = (A) object;
return getB().equals(other.getB());
}

public int hashCode() {
return getB().hashCode(); // FAILS HERE in deserialization.
}

public void setB(B b) {
this.b = b;
}

public B getB() {
return b;
}
}

import java.util.Set;

public class B implements java.io.Serializable {
private Set aSet;

private String name;

public B() {
}

public B(String name) {
this.name = name;
}

public boolean equals(Object object) {
if (this == object) {
return true;
}
if (object == null || getClass() != object.getClass()) {
return false;
}
B other = (B) object;
return getName().equals(other.getName());
}

public int hashCode() {
return getName().hashCode();
}

public String getName() {
return name;
}

public void setASet(Set aSet) {
this.aSet = aSet;
}

public Set getASet() {
return aSet;
}
}

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io_ObjectInputStream;
import java.io_ObjectOutputStream;
import java.util.HashSet;

public class Main {

public Main() {
}

public static void main(String argv[]) {
A a = new A();
B b = new B("FOO");
a.setB(b);
HashSet aSet = new HashSet();
aSet.add(a);
b.setASet(aSet);

try {
FileOutputStream fileOut = new
FileOutputStream("test.srl");
ObjectOutputStream out = new ObjectOutputStream(fileOut);
out.writeObject(a);
fileOut.close();

FileInputStream fileIn = new FileInputStream("test.srl");
ObjectInputStream in = new ObjectInputStream(fileIn);
A rehydratedA = (A) in.readObject();
// Won't get this far, will fail above...
System.out.println(rehydratedA);
} catch (Exception anyException) {
anyException.printStackTrace();
}
}

}

The full stack trace:

java.lang.NullPointerException
at A.hashCode(A.java:26)
at java.util.HashMap.hash(HashMap.java:261)
at java.util.HashMap.put(HashMap.java:379)
at java.util.HashSet.readObject(HashSet.java:277)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:324)
at
java.io_ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:838)
at
java.io_ObjectInputStream.readSerialData(ObjectInputStream.java:1746)
at
java.io_ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
at java.io_ObjectInputStream.readObject0(ObjectInputStream.java:1274)
at
java.io_ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845)
at
java.io_ObjectInputStream.readSerialData(ObjectInputStream.java:1769)
at
java.io_ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
at java.io_ObjectInputStream.readObject0(ObjectInputStream.java:1274)
at
java.io_ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845)
at
java.io_ObjectInputStream.readSerialData(ObjectInputStream.java:1769)
at
java.io_ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
at java.io_ObjectInputStream.readObject0(ObjectInputStream.java:1274)
at java.io_ObjectInputStream.readObject(ObjectInputStream.java:324)
at Main.main(Main.java:32)

- Mike
 
C

Chris Uppal

The moral of the story is, don't make the fields upon which the
hashCode depends be changeable once the object is instantiated. Integer
is relatively safe because the wrapped value is private and there is no
setter for it. It's not fully safe because the wrapped value isn't
declared final. I recommend you use at least one, but preferably both,
approaches for your classes.

Pretty good advice in itself (though one often can't follow it). But you'll
notice that the OP's problem is not caused by the hash() changing. It is (IMO)
a genuine bug in the deserialisation code for HashSet.

-- chris
 
C

charles_n_may

On further examination I agree with you. I can think of two
workarounds:

1) Use a TreeSet instead of a HashSet.

2) If A.b is declared final and B.name is declared final (following my
earlier advice), then A's hashCode is known at instantiation, so A's
hashcode could be stored in another final field during the execution of
its constructor, rather than computed on the fly based on B in
A.hashCode(). The resulting hashCode is the same either way.
 
C

charles_n_may

For what it's worth, both workarounds are concrete implementations of
Chris' suggestions in message 2 of this thread...
 
S

Shin

IMHO, the problem is with the design of your code. You should avoid
any kind of circularity in your code (this almost always calls for
trouble). You can draw a diagram of your code. Consider
deserialization: it first call no-arg constructor; then initialize its
fields. In your code: deserialize a -> initialize field a.b ->
deserialize b -> initiaize b.aSet -> deserialize aSet -> (insert a)
compute hashCode of a -> rely on field a.b (that is still waiting to be
initialized in this stack)

So, redesign your code is my advice.

-Shin
 
C

charles_n_may

Also good advice to avoid circular references, but there are
circumstances where they are hard to avoid.

What bothers me about the original design in this thread is that it
models a many-to-one relationship between A and B, but the hashCode of
all instances of A must be the same for a fixed instance of B. That
means B has a HashSet of instances of A, all of which have the same
hashCode. Since the hashCode is the same for all entries in the Set, it
amounts to an overblown implementation of a List which avoids
duplications -- all efficiencies due to the hashing algorithm are lost.
Thus a HashSet is a poor choice to hold this Collection.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,770
Messages
2,569,586
Members
45,088
Latest member
JeremyMedl

Latest Threads

Top