serialisation panic

Roedy Green

I just woke up with some panicky thoughts about serialisation. Some
people dream about women; I dream about Java.

1. If a class does not have a no-arg constructor, how is it possible
for serialisation to reconstitute the object?

2. IIRC constructor initialisation does not happen but what about
STATIC fields? Do they get initialised? If so, how? Does it call
the <clinit> static-initialiser method directly?

3. How does initialisation set the private fields of an object?
 
Thomas Fritsch

Roedy said:
I just woke up with some panicky thoughts about serialisation. Some
people dream about women; I dream about Java.

Poor man ;-)

Roedy said:
1. If a class does not have a no-arg constructor, how is it possible
for serialisation to reconstitute the object?

It is not. See the API doc of Serializable:
<quote>
During deserialization, the fields of non-serializable classes will be
initialized using the public or protected no-arg constructor of the
class. A no-arg constructor must be accessible to the subclass that is
serializable.
</quote>

Roedy said:
2. IIRC constructor initialisation does not happen but what about
STATIC fields? Do they get initialised? If so, how? Does it call
the <clinit> static-initialiser method directly?

I would assume so.

Roedy said:
3. How does initialisation set the private fields of an object?

I guess it does so by some JNI code in Sun's DLLs. (Remember, JNI has
the power to set any fields, even the private and final ones.)
 
Zig

Thomas said:
It is not. See the API doc of Serializable:
<quote>
During deserialization, the fields of non-serializable classes will be
initialized using the public or protected no-arg constructor of the
class. A no-arg constructor must be accessible to the subclass that is
serializable.
</quote>

True for a class that is non-serializable. Though I think Roedy's question
was about how Serialization does it under the hood for serializable
classes?

For those, I assume Serialization is using the JNI function AllocObject:

<quote>
jobject AllocObject(JNIEnv *env, jclass clazz);

Allocates a new Java object without invoking any of the constructors for
the object. Returns a reference to the object.
</quote>

Thomas said:
I guess it does so by some JNI code in Sun's DLLs. (Remember, JNI has
the power to set any fields, even the private and final ones.)

Maybe, but Reflection can do the same I think (if wrapped into an
AccessController.doPrivileged call).
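
To illustrate the reflection point, here is a minimal sketch showing that plain reflection can write a private field once `setAccessible(true)` succeeds. The `Account` class and the `forceSet` helper are hypothetical names made up for this example. It makes no claim about what `ObjectInputStream` does internally, and it leaves out the `AccessController.doPrivileged` wrapper, which only matters when a security manager is in play.

```java
import java.lang.reflect.Field;

class ReflectionDemo {
    // Hypothetical class with a private field and no setter.
    static class Account {
        private int balance = 1;

        int balance() { return balance; }
    }

    // Hypothetical helper: overwrite a private int field by name.
    static void forceSet(Object target, String fieldName, int value)
            throws ReflectiveOperationException {
        Field field = target.getClass().getDeclaredField(fieldName);
        field.setAccessible(true); // suppress the normal access check
        field.setInt(target, value);
    }

    public static void main(String[] args) throws Exception {
        Account account = new Account();
        forceSet(account, "balance", 99);
        System.out.println(account.balance());
    }
}
```

Running `main` prints 99 even though `balance` is private and `Account` offers no way to change it.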

HTH,

-Zig
 
Roedy Green

Thomas said:
It is not. See the API doc of Serializable:
<quote>
During deserialization, the fields of non-serializable classes will be
initialized using the public or protected no-arg constructor of the
class. A no-arg constructor must be accessible to the subclass that is
serializable.
</quote>

But what about a serialisable class without a no-arg constructor?
 
Zig

Zig said:
True for a class that is non-serializable. Though I think Roedy's
question was about how Serialization does it under the hood for
serializable classes?

For those, I assume Serialization is using the JNI function AllocObject:

<quote>
jobject AllocObject(JNIEnv *env, jclass clazz);

Allocates a new Java object without invoking any of the constructors for
the object. Returns a reference to the object.
</quote>

Actually, I had to look this up.

From java.io.ObjectStreamClass.newInstance():

Creates a new instance of the represented class. If the class is
externalizable, invokes its public no-arg constructor; otherwise, if the
class is serializable, invokes the no-arg constructor of the first
non-serializable superclass. Throws UnsupportedOperationException if this
class descriptor is not associated with a class, if the associated class
is non-serializable or if the appropriate no-arg constructor is
inaccessible/unavailable.
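
The Externalizable half of that description can be checked with a small sketch. `Temperature` and its counter are hypothetical names invented for this example. The class has a public no-arg constructor, which the quoted text says is invoked, plus an ordinary one-arg constructor that application code would use.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

class ExternalizableDemo {
    // Hypothetical Externalizable class.
    static class Temperature implements Externalizable {
        static int noArgCalls = 0;
        private double celsius;

        // Must be public, otherwise deserialization fails.
        public Temperature() { noArgCalls++; }

        Temperature(double celsius) { this.celsius = celsius; }

        double celsius() { return celsius; }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeDouble(celsius);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            celsius = in.readDouble();
        }
    }

    static Temperature roundTrip(Temperature t) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(t);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Temperature) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Temperature copy = roundTrip(new Temperature(21.5));
        System.out.println(copy.celsius() + " / no-arg calls: " + Temperature.noArgCalls);
    }
}
```

The no-arg constructor runs exactly once, during `readObject()`, and `readExternal` then fills in the state.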

HTH,
 
Thomas Fritsch

Zig said:
True for a class that is non-serializable. Though I think Roedy's
question was about how Serialization does it under the hood for
serializable classes?

Oops, you're right. I confused non-serializable with serializable.

Zig said:
For those, I assume Serialization is using the JNI function AllocObject:

<quote>
jobject AllocObject(JNIEnv *env, jclass clazz);

Allocates a new Java object without invoking any of the constructors
for the object. Returns a reference to the object.
</quote>

Sounds very reasonable to me. Actually for classes like Boolean
(implements Serializable, but has no no-arg constructor) I see no other
way for (de)serialization than JNI-AllocObject.
 
Mike Schilling

Thomas said:
Oops, you're right. I confused non-serializable with serializable.
Sounds very reasonable to me. Actually for classes like Boolean
(implements Serializable, but has no no-arg-constructor) I see no
other way for (de)serialization than JNI-AllocObject.

Can I quibble here? (De)Serialization is one of the facilities in
Java that require native methods. (Another is synchronization via
wait()/notify()). That is, a JVM implementation must implement a way
of creating the object that doesn't involve calling a constructor
immediately thereafter, and make this mechanism available to
readObject(). It's possible that this is the JNI function
AllocObject. It's also possible that the two both call some
lower-level function. A third possibility is that the implementations
are wholly unrelated.
 
Roedy Green

Thomas said:
<quote>
During deserialization, the fields of non-serializable classes will be
initialized using the public or protected no-arg constructor of the
class. A no-arg constructor must be accessible to the subclass that is
serializable.
</quote>

I am baffled. How could you ever be reconstituting non-serialisable
classes?
 
Owen Jacobson

Roedy said:
I am baffled. How could you ever be reconstituting non-serialisable
classes?

Serializable classes can be derived from non-serializable classes.
Trivially, java.lang.Object is not serializable...

When reconstituting a serialized blob into an object, the most-derived
non-serializable class is reconstituted using Java language
mechanisms, which allows it to establish invariants, and then the
remaining data is reconstituted by directly manipulating fields on the
object.
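
The behaviour described above can be observed with a small experiment. This is only a sketch: `Base`, `Derived` and the counters are hypothetical names invented for the demonstration. `Derived` is serializable and has no no-arg constructor, while its non-serializable superclass `Base` has a protected one.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class ConstructionDemo {
    // Hypothetical non-serializable superclass.
    static class Base {
        static int baseCtorCalls = 0;
        int baseValue;

        protected Base() {
            baseCtorCalls++;
            baseValue = 42; // invariant established by ordinary construction
        }
    }

    // Hypothetical serializable subclass with no no-arg constructor.
    static class Derived extends Base implements Serializable {
        private static final long serialVersionUID = 1L;
        static int derivedCtorCalls = 0;
        private final String name;

        Derived(String name) {
            derivedCtorCalls++;
            this.name = name;
            baseValue = 7; // lives in Base, so default serialization does not save it
        }

        String name() { return name; }
    }

    static Derived roundTrip(Derived d) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(d);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Derived) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Derived copy = roundTrip(new Derived("roedy"));
        System.out.println("name restored: " + copy.name());
        System.out.println("baseValue after round trip: " + copy.baseValue);
        System.out.println("Base constructor calls: " + Base.baseCtorCalls);
        System.out.println("Derived constructor calls: " + Derived.derivedCtorCalls);
    }
}
```

After the round trip the private final `name` is back, `baseValue` is 42 rather than 7, the `Base` constructor has run twice (once for the original, once during deserialization) and the `Derived` constructor only once.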

Yes, Serialization cheats on construction. It's one of the reasons I
don't like it very much. :)

-o
 
Lew

Thomas said:
It is not. See the API doc of Serializable:
<quote>
During deserialization, the fields of non-serializable classes will be
initialized using the public or protected no-arg constructor of the
class. A no-arg constructor must be accessible to the subclass that is
serializable.
</quote>

This says "of non-serializable classes". It doesn't speak to serializable
classes.

AFAIK there is no problem deserializing an instance of a class that lacks a
no-arg constructor as long as the class implements java.io.Serializable.

Static fields are initialized when the class is loaded, which will have
happened already by the time you're trying to deserialize any instances.
I would assume so, since <clinit> is called at class-load-time.

But this has nothing to do with deserialization, of course.

Private fields are set via the special in-built deserialization mechanism, with possible help from
the readObject() and readResolve() methods. The deserializer pulls the values
of the private fields from the input stream.
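
A short sketch of both points, using a hypothetical `Counter` class: the static field belongs to the class and is never written to the stream, while the private final instance field is restored from it without any constructor of `Counter` running.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class StaticFieldDemo {
    // Hypothetical serializable class with one static and one instance field.
    static class Counter implements Serializable {
        private static final long serialVersionUID = 1L;
        static int instancesCreated = 0; // static: not part of the serialized form
        private final int id;            // private final: restored from the stream

        Counter(int id) {
            this.id = id;
            instancesCreated++;
        }

        int id() { return id; }
    }

    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] data = serialize(new Counter(5));
        Counter.instancesCreated = 100; // change the static after writing
        Counter copy = (Counter) deserialize(data);
        System.out.println("id: " + copy.id());
        System.out.println("instancesCreated: " + Counter.instancesCreated);
    }
}
```

The copy reports `id` 5, and `instancesCreated` stays at 100: reading the object neither restores the old static value nor bumps the counter through the constructor.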
 
Lew

Owen said:
Yes, Serialization cheats on construction. It's one of the reasons I
don't like it very much. :)

Joshua Bloch covers this extensively in /Effective Java/. He points out that
making a class implement Serializable is a very, very heavy responsibility.
You create a public interface to the class that subverts access controls, and
lock the (non-transient portion of the) class structure in for eternity if you
wish serialized instances to remain compatible.

No one should use Serializable for non-transitory persistence without reading
those items in /Effective Java/.
 
j1mb0jay

Lew said:
Joshua Bloch covers this extensively in /Effective Java/. He points out
that making a class implement Serializable is a very, very heavy
responsibility. You create a public interface to the class that subverts
access controls, and lock the (non-transient portion of the) class
structure in for eternity if you wish serialized instances to remain
compatible.

No one should use Serializable for non-transitory persistence without
reading those items in /Effective Java/.

If I wanted to pass a dataset over a socket, how would I do it without
making it 'Serializable'?

j1mb0jay
 
Lew

j1mb0jay said:
If I wanted to pass a dataset over a socket, how would I do it without
making it 'Serializable'?

You can write the values by any number of methods, using ordinary Streams or
Sockets. Serializable is for persisting objects (and their reference trees),
not data sets.

Some systems use RMI, IIOP or XML to transmit data.
 
j1mb0jay

Lew said:
You can write the values by any number of methods, using ordinary
Streams or Sockets. Serializable is for persisting objects (and their
reference trees), not data sets.

Some systems use RMI, IIOP or XML to transmit data.

I use XML elsewhere in my program. Is using a Serializable version of a
dataset frowned upon for poor performance?

j1mb0jay
 
Lew

j1mb0jay said:
I use XML elsewhere in my program. Is using a Serializable version of a
dataset frowned upon for poor performance?

It's not the best choice because Serializable has such devastating effects on
the maintenance cost of a class. Classes contain behavior as well as data.
If all you want to do is persist data, Serializable is waaaay overkill.

Performance is not the issue. Labor is. Flexibility is. Compatibility is.

Performance is far from the most important factor in software.
 
Zig

Lew said:
You can write the values by any number of methods, using ordinary
Streams or Sockets. Serializable is for persisting objects (and their
reference trees), not data sets.

Some systems use RMI, IIOP or XML to transmit data.

IIRC, to use RMI, one must use Serialization.

From:
http://java.sun.com/javase/6/docs/platform/rmi/spec/rmi-protocol4.html

<quote>
Call and return data in RMI calls are formatted using the Java Object
Serialization protocol
</quote>

If I'm not mistaken, when exchanging data through RMI, your values must
either be primitive, Serializable, or Remote (to which references will be
replaced by Serializable stubs at transmission time).

That said, at the moment I'm a bit too lazy to write an RMI client & server
to see what happens when you pass a non-serializable object through...
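
Short of a full RMI setup, the serialization layer on its own can be probed. The sketch below uses a hypothetical `Plain` class and `canSerialize` helper and shows `ObjectOutputStream` rejecting a non-serializable object with `NotSerializableException`. It does not involve RMI at all. With RMI the same failure would be expected to surface when the arguments are marshalled, typically wrapped in another exception, but that part is an assumption not tested here.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;

class NotSerializableDemo {
    // Hypothetical class that does not implement Serializable.
    static class Plain {
        int x = 1;
    }

    // Hypothetical helper: true if writeObject accepts the value.
    static boolean canSerialize(Object value) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(value);
            return true;
        } catch (NotSerializableException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("String:  " + canSerialize("text"));
        System.out.println("Integer: " + canSerialize(Integer.valueOf(3)));
        System.out.println("Plain:   " + canSerialize(new Plain()));
    }
}
```

`String` and `Integer` pass, while `Plain` is refused.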
 
Lew

Zig said:
IIRC, to use RMI, one must use Serialization.

From:
http://java.sun.com/javase/6/docs/platform/rmi/spec/rmi-protocol4.html

<quote>
Call and return data in RMI calls are formatted using the Java Object
Serialization protocol
</quote>

If I'm not mistaken, when exchanging data through RMI, your values must
either be primitive, Serializable, or Remote (to which references will
be replaced by Serializable stubs at transmission time).

That said, at the moment I'm a bit too lazy to write an RMI client &
server to see what happens when you pass a non-serializable object
through...

Good catch, thanks.

It remains that there are many protocols to transmit data or objects. Since
the OP's question was about data particularly, not object graphs, Serializable
is almost certainly overkill for their needs. They should use a
lighter-weight and safer protocol.
 
Roedy Green

Owen said:
Yes, Serialization cheats on construction. It's one of the reasons I
don't like it very much. :)

Perhaps they should have insisted serialisable classes have a no-arg
constructor, possibly private.

Are there circumstances where fields of non-serialisable superclasses
don't get saved/restored?
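
On the second question: with default serialization, fields declared in a non-serializable superclass are not written at all, and the superclass no-arg constructor sets them afresh on read. The Serializable API doc allows the subclass to take responsibility for saving and restoring accessible superclass state. A minimal sketch of that, with hypothetical `Base` and `Derived` classes and the documented private `writeObject`/`readObject` hooks:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SuperFieldDemo {
    // Hypothetical non-serializable superclass.
    static class Base {
        protected int baseValue;

        protected Base() { baseValue = -1; }
    }

    // Hypothetical subclass that saves the superclass field by hand.
    static class Derived extends Base implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String name;

        Derived(String name, int baseValue) {
            this.name = name;
            this.baseValue = baseValue;
        }

        String name() { return name; }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject(); // writes name
            out.writeInt(baseValue);  // extra: superclass state
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();   // restores name
            baseValue = in.readInt(); // overrides the -1 set by Base()
        }
    }

    static Derived roundTrip(Derived d) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(d);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Derived) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Derived copy = roundTrip(new Derived("x", 7));
        System.out.println(copy.name() + " " + copy.baseValue);
    }
}
```

Without the two private methods, `baseValue` would come back as -1. With them it comes back as 7.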

 
Roedy Green

Lew said:
You can write the values by any number of methods, using ordinary Streams or
Sockets. Serializable is for persisting objects (and their reference trees),
not data sets.

You can use DataOutputStream directly or use it to create packets you
send in a raw byte stream.
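
A minimal sketch of the packet approach, assuming a made-up record layout of an int id, a UTF string and a double (the `encode` and `decode` names are hypothetical). The byte array returned by `encode` is what would be written to the socket's output stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class DataStreamDemo {
    // Pack one record into a byte array; the field order is the wire format.
    static byte[] encode(int id, String name, double price) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(id);
            out.writeUTF(name);
            out.writeDouble(price);
        }
        return bytes.toByteArray();
    }

    // Unpack the record, reading the fields in the same order they were written.
    static String decode(byte[] packet) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(packet))) {
            int id = in.readInt();
            String name = in.readUTF();
            double price = in.readDouble();
            return id + "," + name + "," + price;
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] packet = encode(7, "widget", 2.5);
        System.out.println(packet.length + " bytes: " + decode(packet));
    }
}
```

The reader and writer only have to agree on the field order and types. No class needs to implement Serializable, and the format does not change when the class around it does.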
 
Roedy Green

j1mb0jay said:
I use XML elsewhere in my program. Is using a Serializable version of a
dataset frowned upon for poor performance?

see http://mindprod.com/jgloss/serialization.html
for the downsides.

The biggest problem is fragility. If you change any of your classes,
there is a good chance you will never again be able to read your data
files. It is a ROYAL pain to update the format of serialised files. You
need both the old and new formats. Yet both formats belong to a class
of the same name.

It is for "transient data" where you can afford to just toss your
files and recreate them from scratch or from some sort of formatted
files.

The huge advantage is you can write out a giant complicated tree of
data with one line of code that does not need to be maintained.
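
As a sketch of that one-line convenience, here is a hypothetical `Node` tree written with a single `writeObject` call. The explicit `serialVersionUID` is an addition of this example: when it is absent the runtime computes one from the class's details, which is one source of the fragility described above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class TreeDemo {
    // Hypothetical tree node; children are serialized along with the parent.
    static class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label) { this.label = label; }

        Node add(Node child) {
            children.add(child);
            return this;
        }

        // Number of nodes in this subtree, including this one.
        int size() {
            int count = 1;
            for (Node child : children) {
                count += child.size();
            }
            return count;
        }
    }

    static Node roundTrip(Node root) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(root); // the one line that writes the whole tree
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Node) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Node root = new Node("root")
                .add(new Node("a").add(new Node("a1")))
                .add(new Node("b"));
        Node copy = roundTrip(root);
        System.out.println(copy.label + " has " + copy.size() + " nodes");
    }
}
```

The copy comes back with all four nodes, and no per-field writing code had to be written or maintained.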
 
