Roedy Green wrote:
: On 17 Sep 2003 21:18:28 GMT, Robert Olofsson wrote or quoted :
: >Hm, no, I would not say that it is that. For readObject to work
: >correctly a class description is necessary.
: but why does the name of the class need to appear more than once in
: the file?
Exercise:
1) Create a file with serialized data from some objects you have now
(for example an instance of BuildingInfo(address, owner)).
2) Change a variable type (split address -> street and zip, or similar).
3) Serialize the new object to another file.
4) Try to read back both files.
The format of the class has changed, so you will not be able to read
the first file with a plain readObject. However, if you peek inside the
stream with a specially written readObject, you can see which version
the file holds, get the address, add some code that splits it the way
you want, and then create an instance of the new version.
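One way to write such a readObject is sketched below. It is only a
sketch under assumptions that are not in the exercise: both versions of
the class declare the same serialVersionUID (otherwise the read fails
before readObject is ever called), the old field is named "address" and
holds "street, zip", and the names MigrationSketch, splitAddress and
roundTrip are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;
import java.io.UncheckedIOException;

class MigrationSketch {
    // New version of the class: address has been split into street and zip.
    static class BuildingInfo implements Serializable {
        // Assumption: the old version declared the same value.
        private static final long serialVersionUID = 1L;
        String street;
        String zip;
        String owner;

        BuildingInfo(String street, String zip, String owner) {
            this.street = street;
            this.zip = zip;
            this.owner = owner;
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            ObjectInputStream.GetField fields = in.readFields();
            // Class description as it was written into the stream.
            ObjectStreamClass written = fields.getObjectStreamClass();
            if (written.getField("address") != null) {
                // Old layout: one combined field that has to be split.
                String[] parts = splitAddress((String) fields.get("address", null));
                street = parts[0];
                zip = parts[1];
            } else {
                // New layout: the fields are already separate.
                street = (String) fields.get("street", null);
                zip = (String) fields.get("zip", null);
            }
            owner = (String) fields.get("owner", null);
        }
    }

    // Hypothetical splitter: everything after the last comma is the zip.
    static String[] splitAddress(String address) {
        if (address == null) {
            return new String[] {null, null};
        }
        int comma = address.lastIndexOf(',');
        if (comma < 0) {
            return new String[] {address.trim(), ""};
        }
        return new String[] {
            address.substring(0, comma).trim(),
            address.substring(comma + 1).trim()
        };
    }

    // Serializes and deserializes in memory; exercises the new-layout branch.
    static BuildingInfo roundTrip(BuildingInfo info) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(info);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (BuildingInfo) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        BuildingInfo copy = roundTrip(new BuildingInfo("Main Street 1", "12345", "robo"));
        System.out.println(copy.street + " / " + copy.zip + " / " + copy.owner);
    }
}
```

The old-layout branch only runs when the file really was written by
the old class, so to try it you need a file from step 1 of the exercise.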
For this to work the class description needs to describe the type of
the variables and the name of the variables.
A class description also needs the class name since that is the id for
that structure.
That is why the stream format looks the way it does today.
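If you want to look at what a class description holds without decoding
a stream by hand, java.io.ObjectStreamClass exposes the same pieces:
class name, serialVersionUID, and the type and name of every
serializable field. A small sketch follows; DescriptorDemo and describe
are names I made up, and BuildingInfo here is just a stand-in for the
class from the exercise.

```java
import java.io.ObjectStreamClass;
import java.io.ObjectStreamField;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class DescriptorDemo {
    // Stand-in for the BuildingInfo class from the exercise.
    static class BuildingInfo implements Serializable {
        private static final long serialVersionUID = 1L;
        String address;
        String owner;
    }

    // Returns one entry per serializable field: "<type signature> <name>".
    static List<String> describe(Class<?> type) {
        ObjectStreamClass desc = ObjectStreamClass.lookup(type);
        List<String> lines = new ArrayList<>();
        for (ObjectStreamField field : desc.getFields()) {
            // getTypeString() is null for primitives, so use the type code there.
            String signature = field.isPrimitive()
                    ? String.valueOf(field.getTypeCode())
                    : field.getTypeString();
            lines.add(signature + " " + field.getName());
        }
        return lines;
    }

    public static void main(String[] args) {
        ObjectStreamClass desc = ObjectStreamClass.lookup(BuildingInfo.class);
        System.out.println(desc.getName() + " uid=" + desc.getSerialVersionUID());
        describe(BuildingInfo.class).forEach(System.out::println);
    }
}
```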
Ok, it should be possible to create a more optimized stream where the
class names are only stored once; however, you would then need to store
both an id for each String and the String itself. On average you would
probably save a few bytes (String def: <id> <length> <characters>;
class name: <string id>; variable type: L<string id>; an id would
probably be 4 bytes).
The cost of this optimization would be a bit of extra complexity in
the stream.
Serialized streams are not designed to be the most compressed form of
the data, only a compact binary format that is easy to read and write.
If you serialize ~100 objects of 4 classes, maybe you could save
~20-50 bytes, which is not a big win (feel free to try it and give me
the exact count). Each class description is written only once per
stream, so in this case you'd have 4 - 10 class descriptions (the
arrays you have will each have one, as will the java.util.HashSet you
have in your objects...; plain String values are written inline
without a class description). You will have some of the class names
twice, not a big deal.
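If you want to try it, a rough way to see that the class description
is paid for only once is to compare stream sizes for different object
counts. This is a sketch only: StreamSizeDemo and streamSize are
invented names, and BuildingInfo is a stand-in with two String fields.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class StreamSizeDemo {
    // Stand-in class with two String fields.
    static class BuildingInfo implements Serializable {
        private static final long serialVersionUID = 1L;
        final String address;
        final String owner;

        BuildingInfo(String address, String owner) {
            this.address = address;
            this.owner = owner;
        }
    }

    // Size in bytes of one stream holding `count` BuildingInfo objects.
    static int streamSize(int count) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            for (int i = 0; i < count; i++) {
                out.writeObject(new BuildingInfo("Street " + i, "Owner " + i));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        int one = streamSize(1);
        int two = streamSize(2);
        // The first object carries the class description; later ones refer back to it.
        System.out.println("1 object:    " + one + " bytes");
        System.out.println("2 objects:   " + two + " bytes (+" + (two - one) + ")");
        System.out.println("100 objects: " + streamSize(100) + " bytes");
    }
}
```

The second object should add far fewer bytes than the first one cost,
since only a back reference to the description is written for it.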
And as I have stated before, if you want your object streams to be even
more compact, wrap the stream in a GZIP(Input|Output)Stream, which
normally results in something like a 50% reduction in size (for the
data I have tried).
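A minimal sketch of the wrapping, in memory so it is easy to compare
sizes. GzipStreamDemo, serialize, deserialize and sample are names made
up for this example, and the reduction you get depends entirely on your
data.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipStreamDemo {
    // Stand-in class with two String fields.
    static class BuildingInfo implements Serializable {
        private static final long serialVersionUID = 1L;
        final String address;
        final String owner;

        BuildingInfo(String address, String owner) {
            this.address = address;
            this.owner = owner;
        }
    }

    // Writes one object, optionally through a GZIPOutputStream.
    static byte[] serialize(Serializable value, boolean gzip) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (OutputStream sink = gzip ? new GZIPOutputStream(bytes) : bytes;
             ObjectOutputStream out = new ObjectOutputStream(sink)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads one object back, optionally through a GZIPInputStream.
    static Object deserialize(byte[] data, boolean gzip) {
        try (InputStream source = gzip
                ? new GZIPInputStream(new ByteArrayInputStream(data))
                : new ByteArrayInputStream(data);
             ObjectInputStream in = new ObjectInputStream(source)) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Builds `count` sample objects with made-up values.
    static ArrayList<BuildingInfo> sample(int count) {
        ArrayList<BuildingInfo> list = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            list.add(new BuildingInfo("Street " + i, "Owner " + i));
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println("plain: " + serialize(sample(100), false).length + " bytes");
        System.out.println("gzip:  " + serialize(sample(100), true).length + " bytes");
    }
}
```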
/robo