One thread here said you can serialize an object and count the
serialized bytes to get the size of the object.
This is incorrect. I serialized a small class with one boolean and 2
chars and
got something crazy like 637 bytes for the size.
This has all kinds of problems.
First, objects may be larger in memory than serialized, due to
transient fields. As others noted, measuring size in memory is best
done by something like
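Runtime rt = Runtime.getRuntime();
System.gc();  // only a hint; the VM may ignore it
// totalMemory() and freeMemory() return long, so an int won't compile here
long usage = rt.totalMemory() - rt.freeMemory();
MyObject myObject = new MyObject(args);
System.gc();
long size = rt.totalMemory() - rt.freeMemory() - usage;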
This will still vary from VM to VM, and may not work perfectly.
System.gc() doesn't guarantee that the collector runs, so the result
can err high if temporary objects are created and discarded by the
MyObject constructor and the second System.gc() does nothing, and it
can err low if the first System.gc() does nothing and the second one
runs and collects some older garbage.
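One way to dampen that noise, which is my own addition rather than something from the thread, is to allocate many instances and divide. Here is a rough sketch; HeapEstimate, usedMemory and averageBytesPerInstance are made-up names, MyObject is a stand-in with one boolean and two chars, and the result is still only an estimate that depends on the VM.

```java
public class HeapEstimate {
    // Hypothetical stand-in for the class being measured.
    static class MyObject {
        boolean flag;
        char a;
        char b;
    }

    // Heap currently in use, after asking (not forcing) the VM to collect.
    static long usedMemory() {
        Runtime rt = Runtime.getRuntime();
        System.gc();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Allocates 'count' instances and returns the average heap growth per instance.
    static double averageBytesPerInstance(int count) {
        // Allocate the holder array before the baseline so it isn't counted.
        Object[] keep = new Object[count];
        long before = usedMemory();
        for (int i = 0; i < count; i++) {
            keep[i] = new MyObject();
        }
        long after = usedMemory();
        // Use the array after measuring so the instances stay reachable.
        if (keep[count - 1] == null) {
            throw new IllegalStateException("allocation failed");
        }
        return (after - before) / (double) count;
    }

    public static void main(String[] args) {
        System.out.println(averageBytesPerInstance(100_000) + " bytes per instance (estimate)");
    }
}
```

With a large count, a stray collection or a few temporary objects matter much less than they do for a single instance, though the number can still be off if the VM ignores System.gc().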
Measuring the size of serialized objects can be done more reliably,
since for an identical object it will be identical on all platforms
given the same version of the object's class. It may vary from
instance to instance depending on what objects it references or
contains via its member variables though. Still it will give you an
idea of how much disk space or bandwidth serialized instances will
consume in bulk.
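For reference, counting the serialized bytes of a single object can be done in memory without touching a file. This is a minimal sketch; SerializedSize and Small are hypothetical names, with Small standing in for the one-boolean, two-char class from the question.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class SerializedSize {
    // Hypothetical stand-in for the small class from the question.
    static class Small implements Serializable {
        private static final long serialVersionUID = 1L;
        boolean flag = true;
        char a = 'x';
        char b = 'y';
    }

    // Serializes obj onto its own fresh stream and returns the byte count.
    static int serializedSize(Object obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        System.out.println(serializedSize(new Small()) + " bytes");
    }
}
```

The count includes everything written to that stream, not just the field data, so expect it to be much larger than the five bytes of fields.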
But serialized output contains per-stream overhead (the stream header,
plus a full class descriptor the first time each class appears); this
will be most of your 637 bytes. I'd serialize an N-element array of
MyObjects and an (N+1)-element array of MyObjects, both with every
array cell containing a
MyObject (rather than null), and look at the difference in their file
sizes. I'd make the fields that would tend to reference shared objects
reference a single instance from all these MyObjects so that their
referents "don't count" in the final calculation, and the fields that
would tend to reference "owned" objects or "contained" ones reference
separate ones for each instance so that their referents do count. This
will give the best idea of scaling behavior when a large number of
MyObjects are serialized on a single stream. Your original figure of
637 bytes is, on the other hand, unfortunately exactly what you can
expect if each one is serialized on a separate stream.
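The N versus N+1 comparison can be sketched roughly as follows. PerInstanceSize, filled and marginalSize are hypothetical names, MyObject is again a stand-in with one boolean and two chars, and the arrays are serialized to memory rather than to files, which should give the same byte counts.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class PerInstanceSize {
    // Hypothetical stand-in for the class being measured.
    static class MyObject implements Serializable {
        private static final long serialVersionUID = 1L;
        boolean flag;
        char a;
        char b;
    }

    // Serializes obj onto its own fresh stream and returns the byte count.
    static int serializedSize(Object obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    // Builds an n-element array with a distinct, non-null MyObject in every cell.
    static MyObject[] filled(int n) {
        MyObject[] array = new MyObject[n];
        for (int i = 0; i < n; i++) {
            array[i] = new MyObject();
        }
        return array;
    }

    // Extra bytes one more MyObject costs when written on an already-open stream.
    static int marginalSize(int n) {
        return serializedSize(filled(n + 1)) - serializedSize(filled(n));
    }

    public static void main(String[] args) {
        System.out.println("alone on a stream: " + serializedSize(new MyObject()) + " bytes");
        System.out.println("marginal cost:     " + marginalSize(100) + " bytes");
    }
}
```

The marginal figure should come out far smaller than the standalone one, because the class descriptor is written once per stream and later instances only refer back to it. For a class with reference fields, filled() would need to set up the shared and owned referents as described above.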
Note that the suggested method of measurement should end up counting
a MyObject as the sum of the sizes of its fields, with each field of
reference type counted as the size of a pointer or some equivalent (a
back-reference handle, in the serialized case), plus the sizes of the
objects an instance references through such fields and actually
"owns".