Memory Footprint of an Object

Codedigestion

Peace,

I would like some help to be able to get the memory footprint of an
object that has been created. Effectively, I look to find the
footprint that a DOM Document of a parsed XML file takes in the memory.

God Bless,

shree
 
Arne Vajhøj

Codedigestion said:
I would like some help to be able to get the memory footprint of an
object that has been created. Effectively, I look to find the
footprint that a DOM Document of a parsed XML file takes in the memory.

My best idea is to read the used memory before and after
constructing the object.

See below for a primitive example.

Arne

package october;

public class SizeOf<T> {
    private final static int N = 1000000;

    // Approximate heap in use: total heap minus its free part.
    public static long mem() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        A[] arrA = new A[N];
        long m1 = mem();
        for (int i = 0; i < N; i++) {
            arrA[i] = new A();   // store each instance so all N stay reachable
        }
        long m2 = mem();
        System.out.println("sizeof A = " + (m2 - m1) * 1.0 / N);

        B[] arrB = new B[N];
        long m3 = mem();
        for (int i = 0; i < N; i++) {
            arrB[i] = new B();   // store each instance so all N stay reachable
        }
        long m4 = mem();
        System.out.println("sizeof B = " + (m4 - m3) * 1.0 / N);
    }
}

class A {
    public int a;
    public int b;
}

class B {
    public byte b1;
    public byte b2;
}
 
Bart

Arne Vajhøj wrote:
public class SizeOf<T> {

You don't seem to be using the generic type anywhere.

private final static int N = 1000000;
public static long mem() {
Runtime rt = Runtime.getRuntime();
return rt.totalMemory() - rt.freeMemory();

According to the docs freeMemory() returns an approximation. The docs
for totalMemory() also say this:

Note that the amount of memory required to hold an object of any given
type may be implementation-dependent.

Wouldn't that make your code non-portable if you rely on the object
size? Also, what happens if other threads allocate memory at the same
time?

Regards,
Bart.
 
Arne Vajhøj

Bart said:
You don't seem to be using the generic type anywhere.

That was a leftover from an unsuccessful attempt
to use generics.

It can just be removed.

According to the docs freeMemory() returns an approximation. The docs
for totalMemory() also say this:

Note that the amount of memory required to hold an object of any given
type may be implementation-dependent.

Wouldn't that make your code non-portable if you rely on the object
size? Also, what happens if other threads allocate memory at the same
time?

I think you missed the context.

If you write a test program to investigate how much
memory a certain object uses, then you do not start
other threads in that test program.

Yes, it is an approximation, but the poster's problem
was with XML DOM objects. I assume they are big, since otherwise
there would be no need to ask. So I cannot see any significance
in the decimals.

Arne
 
Bart

Arne said:
That was a leftover from an unsuccessful attempt
to use generics.

It can just be removed.


I think you missed the context.

If you write a test program to investigate how much
memory a certain object uses, then you do not start
other threads in that test program.

Oh, I thought you were suggesting a general-purpose solution for
finding memory footprint at run-time. My bad.

Yes, it is an approximation, but the poster's problem
was with XML DOM objects. I assume they are big, since otherwise
there would be no need to ask. So I cannot see any significance
in the decimals.

So creating a million objects of it, as you do above, is not a very
good idea then.

Regards,
Bart.
 
Patricia Shanahan

Bart said:
Arne Vajhøj wrote: ....

So creating a million objects of it, as you do above, is not a very
good idea then.

It will depend on just how big each object is. I read the choice of a
million for N as an example.

In practice, I would either start with big N and halve if the job takes
too long or uses too much memory, or start small and double until it
gives stable results.

Patricia
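
The "start small and double until it gives stable results" idea could be sketched roughly as follows. This is only a sketch under assumptions: the class and method names, the 5% tolerance, and the int[2] test object are made up for illustration, and System.gc() is merely a hint, so the numbers remain approximate.

```java
import java.util.function.Supplier;

class StableSizeEstimate {
    // Heap currently in use as reported by the JVM; an approximation only.
    static long usedMemory() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Average heap growth per instance while n instances are kept alive.
    static double bytesPerInstance(Supplier<?> factory, int n) {
        Object[] keep = new Object[n];      // allocated before the first reading
        System.gc();                        // only a hint; the JVM may ignore it
        long before = usedMemory();
        for (int i = 0; i < n; i++) {
            keep[i] = factory.get();
        }
        long after = usedMemory();
        if (keep[n - 1] == null) {          // also keeps the array reachable until here
            throw new IllegalStateException("factory returned null");
        }
        return (after - before) / (double) n;
    }

    // True when a and b differ by at most the given fraction of the larger one.
    static boolean withinTolerance(double a, double b, double fraction) {
        return Math.abs(a - b) <= fraction * Math.max(Math.abs(a), Math.abs(b));
    }

    // Doubles n from startN up to maxN until two successive estimates agree.
    static double estimate(Supplier<?> factory, int startN, int maxN, double fraction) {
        double previous = bytesPerInstance(factory, startN);
        for (int n = startN * 2; n <= maxN; n *= 2) {
            double current = bytesPerInstance(factory, n);
            if (withinTolerance(previous, current, fraction)) {
                return current;
            }
            previous = current;
        }
        return previous;                    // never settled: treat the value as rough
    }

    public static void main(String[] args) {
        // Hypothetical test object and limits, chosen only for the example.
        double size = estimate(() -> new int[2], 1_000, 1_024_000, 0.05);
        System.out.println("approx. bytes per int[2] = " + size);
    }
}
```

If the estimate never settles before maxN is reached, the last value is returned anyway, so it is worth printing the intermediate values when the result looks odd.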
 
Arne Vajhøj

Bart said:
So creating a million objects of it, as you do above, is not a very
good idea then.

Number of objects x object size needs to be big enough
to dwarf the uncertainty.

I used small objects.

Hey - it is not a perfect method. But I don't know any better.

Arne
 
Codedigestion

Arne said:
Number of objects x object size needs to be big enough
to dwarf the uncertainty.

I used small objects.

Hey - it is not a perfect method. But I don't know any better.

Arne

Peace,

I appreciate all of your responses, but is there not a more direct way?
Is there not a way to build a method to ask directly - the object
created how much memory it's taking up? The solutions that have been
recommended would not work because the number of times to build the
object would need to be different depending on the size of the XML file
that is parsed. For example, if you parsed a 10k document, you could
easily make thousands of instances and have no problem, but if you had
a 2MB file, you'd run into problems with just a few instances. So,
need something a bit more dynamic or 'safe', if possible.

Thanks,
God Bless,

shree
 
Arne Vajhøj

I appreciate all of your responses, but is there not a more direct way?
Is there not a way to build a method to ask directly - the object
created how much memory it's taking up? The solutions that have been
recommended would not work because the number of times to build the
object would need to be different depending on the size of the XML file
that is parsed. For example, if you parsed a 10k document, you could
easily make thousands of instances and have no problem, but if you had
a 2MB file, you'd run into problems with just a few instances. So,
need something a bit more dynamic or 'safe', if possible.

I do not know of any better method.

That does not necessarily mean that a better method does not
exist.

For XML files I would try 1000 x 10 KB XML, 100 x 100 KB XML and
10 x 1 MB XML and see if there is a reasonably constant factor
so that
size in memory = constant * size of XML file
and use that as a rule of thumb.

Arne
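
A rough sketch of that rule-of-thumb experiment, using the DOM parser that ships with the JDK. The synthetic test data, the class and method names, and the scaled-down copy counts (200, 20 and 2 rather than 1000, 100 and 10, to keep heap use modest) are assumptions for illustration; real XML files would replace syntheticXml.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class DomRatio {
    // Heap currently in use as reported by the JVM; an approximation only.
    static long usedMemory() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Hypothetical test data: <root> holding the given number of small <item> elements.
    static byte[] syntheticXml(int items) {
        StringBuilder sb = new StringBuilder("<root>");
        for (int i = 0; i < items; i++) {
            sb.append("<item id=\"").append(i).append("\">text</item>");
        }
        return sb.append("</root>").toString().getBytes(StandardCharsets.UTF_8);
    }

    static DocumentBuilder newBuilder() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder();
        } catch (Exception e) {
            throw new IllegalStateException("no DOM parser available", e);
        }
    }

    static Document parse(DocumentBuilder builder, byte[] xml) {
        try {
            return builder.parse(new ByteArrayInputStream(xml));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    // Heap growth divided by total XML bytes parsed, with every document kept alive.
    static double ratio(byte[] xml, int copies) {
        DocumentBuilder builder = newBuilder();
        Document[] keep = new Document[copies];
        parse(builder, xml);                 // warm-up so parser classes are loaded
        System.gc();                         // only a hint; the JVM may ignore it
        long before = usedMemory();
        for (int i = 0; i < copies; i++) {
            keep[i] = parse(builder, xml);
        }
        long after = usedMemory();
        if (keep[copies - 1] == null) {      // keeps the documents reachable until here
            throw new IllegalStateException("no document");
        }
        return (after - before) / ((double) copies * xml.length);
    }

    public static void main(String[] args) {
        // {copies, items}: roughly 10 KB, 100 KB and 1 MB of XML per document.
        int[][] plan = { {200, 400}, {20, 4_000}, {2, 40_000} };
        for (int[] p : plan) {
            byte[] xml = syntheticXml(p[1]);
            System.out.printf("%d x %d bytes: ratio %.1f%n", p[0], xml.length, ratio(xml, p[0]));
        }
    }
}
```

If the three ratios come out close to one another, their average is the constant in the rule of thumb; if they differ widely, the rule does not hold for that parser and data.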
 
Hendrik Maryns

Codedigestion wrote:
Peace,

I appreciate all of your responses, but is there not a more direct way?
Is there not a way to build a method to ask directly - the object
created how much memory it's taking up? The solutions that have been
recommended would not work because the number of times to build the
object would need to be different depending on the size of the XML file
that is parsed. For example, if you parsed a 10k document, you could
easily make thousands of instances and have no problem, but if you had
a 2MB file, you'd run into problems with just a few instances. So,
need something a bit more dynamic or 'safe', if possible.

Have a look here:
http://www.javaworld.com/javaworld/javaqa/2003-12/02-qa-1226-sizeof.html,
it works fairly well for me.

H.
--
Hendrik Maryns
http://tcl.sfs.uni-tuebingen.de/~hendrik/
==================
http://aouw.org
Ask smart questions, get good answers:
http://www.catb.org/~esr/faqs/smart-questions.html
 
Chris Uppal

Codedigestion said:
Is there not a way to build a method to ask directly - the object
created how much memory it's taking up?

No. Not in standard Java, and not in anything that runs on a standard JVM.

Why not ? I suspect that it's for two reasons: one is that the Java designers
didn't think of it -- which is understandable (for once), since I know of no
realistic language where you /can/ ask an arbitrary object how much space it
takes[*]. The other is that it is difficult, if not impossible, to provide a
useful answer.

Consider this code fragment:

String greeting = "hello";
String myth = greeting.substring(0, 4);

It would certainly be possible to ask the Strings referred to by 'greeting' or
'myth' how much space they took up. In a current 32-bit JVM from Sun, they
would each take up 24 bytes (8 bytes for each object header, 3 32-bit integer
fields, and 1 32-bit reference field). There is no way to ask those Strings
that, however (unless you use the debugging API); which is reasonable because
the information would not be very useful. Most people would expect the size of
a String to depend in some way on how many characters it contains -- but it
doesn't.

So, could Strings be clever and add up the sizes of their contained objects ?
No. They could certainly add in the size of the char[] array objects, but
that's no use because they don't know how much of that array is logically
"part of" the String itself, and how much is just shared with other Strings (or
StringBuffers, or StringBuilders). In the example, 'myth' and 'greeting' share
their internal char[] arrays -- they both refer to the same array, but they
"use" different sub-sections of it.

-- chris

[*] There may be some, of course, but I don't know of any. And, in case anyone
is thinking of C's sizeof operator:

void
someFunction(char *buffer)
{
/* now, how do you ask the buffer how big it is ? */
}
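
The double-counting problem with shared arrays can be illustrated with a small sketch. String internals vary between JDK versions, so this uses a hypothetical Slice class that mimics the layout described above (a char[] plus offset and count) rather than relying on what any particular String implementation does; all names here are invented for the example.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

class SharedArrayDemo {
    // Hypothetical stand-in for the String layout described above:
    // a view onto a char[] that other views may share.
    static final class Slice {
        final char[] value;
        final int offset;
        final int count;

        Slice(char[] value, int offset, int count) {
            this.value = value;
            this.offset = offset;
            this.count = count;
        }

        // Shares the backing array instead of copying it.
        Slice substring(int begin, int end) {
            return new Slice(value, offset + begin, end - begin);
        }

        @Override
        public String toString() {
            return new String(value, offset, count);
        }
    }

    // Naive "deep size": every slice is charged for its whole backing array.
    static long naiveChars(Slice... slices) {
        long total = 0;
        for (Slice s : slices) {
            total += s.value.length;
        }
        return total;
    }

    // Chars actually allocated: each distinct array is counted once.
    static long distinctChars(Slice... slices) {
        Set<char[]> seen = Collections.newSetFromMap(new IdentityHashMap<char[], Boolean>());
        long total = 0;
        for (Slice s : slices) {
            if (seen.add(s.value)) {
                total += s.value.length;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Slice greeting = new Slice("hello".toCharArray(), 0, 5);
        Slice myth = greeting.substring(0, 4);
        System.out.println(myth + " shares its array with " + greeting);
        System.out.println("naive total chars:    " + naiveChars(greeting, myth));
        System.out.println("distinct array chars: " + distinctChars(greeting, myth));
    }
}
```

Neither total answers "how big is myth?": the naive figure charges the shared array twice, and the distinct figure only makes sense for the pair as a whole.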
 
Codedigestion

Peace,

Hmm... Maybe it would help to share with all of y'all why I want to
find the object's size. I'm creating this program for myself (and
for whoever may find the results pertinent) to compare the
various XML parsers available for Java, such as
JDOM, dom4j, XOM, Xerces, Crimson, etc. I want to see the amount of
time required to parse various XML files and the memory footprint the
created document takes up. So far, it's been pretty easy to
indirectly calculate the parse time and the amount of memory the
complete parsing operation, including the Document, has taken. Unfortunately,
these are not ABSOLUTE results, but they can still be used when comparing
one XML parser to another.

To calculate the time, the start time and end time of the parse are
taken and the time to parse is derived. This is simple, it works, and
is fairly consistent.

To calculate the memory, the JVM total and free memory are
recorded before the parse and again after it. The used memory
before and after the parse is calculated from the two sets of
readings, and the difference gives an
'approximation' of the total memory used by the parse. Unfortunately,
this doesn't seem to be as reliable as the timing method above. There
seem to be many issues, for example garbage collection and
allocation/deallocation of additional memory by the JVM at various
times. At the end of the parse, after the memory has been
recorded, gc() is run.

Each XML file is parsed about 5 times, and it's interesting to see that
usually the VERY first parse is the slowest while the later ones seem
faster. Does anyone have any more ideas on other ways to go about
testing the different parsers and getting results, so as to make a
better comparison of the parsers with one another? Right now, I would
like to just compare the parsing abilities and have that requirement
fulfilled before testing the other features the parsers provide.

A final thought (I believe someone has mentioned this): what about
making the class serializable, storing ONLY the parsed
DOCUMENT to a serialized file, and then looking at the size of that
file? It may not ABSOLUTELY represent the memory footprint of the
document, but would it be closer than the other methods discussed thus
far? And wouldn't the results be more consistent? At least it might
make it easier to compare the parsers' document memory footprints with one another.

God Bless,

shree
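
For the timing side, a minimal sketch of the "parse about 5 times" loop that reports the first run separately from the later ones. The class name, the tiny placeholder document, and the choice of the median for the later runs are assumptions made for the example; the parser used here is the JDK's built-in DOM parser.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import javax.xml.parsers.DocumentBuilderFactory;

class ParseTimer {
    // Runs the task the given number of times and returns each duration in nanoseconds.
    static long[] timeRuns(Runnable task, int runs) {
        long[] nanos = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            nanos[i] = System.nanoTime() - start;
        }
        return nanos;
    }

    // Median of the values; less sensitive to one slow run than the mean.
    static long median(long[] values) {
        long[] sorted = values.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
    }

    public static void main(String[] args) {
        // Placeholder input; a real comparison would read the XML files under test.
        byte[] xml = "<root><item>text</item></root>".getBytes(StandardCharsets.UTF_8);
        Runnable parse = () -> {
            try {
                DocumentBuilderFactory.newInstance().newDocumentBuilder()
                        .parse(new ByteArrayInputStream(xml));
            } catch (Exception e) {
                throw new IllegalStateException("parse failed", e);
            }
        };
        long[] nanos = timeRuns(parse, 5);
        long[] later = Arrays.copyOfRange(nanos, 1, nanos.length);
        System.out.println("first parse:  " + nanos[0] + " ns");
        System.out.println("median later: " + median(later) + " ns");
    }
}
```

Reporting the first run on its own keeps one-off start-up costs, such as loading the parser's classes, from distorting the figure used to compare parsers.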



 
Chris Uppal

Codedigestion wrote:

To calculate the memory, the JVM total and free memory are
recorded before the parse and again after it. The used memory
before and after the parse is calculated from the two sets of
readings, and the difference gives an
'approximation' of the total memory used by the parse. Unfortunately,
this doesn't seem to be as reliable as the timing method above. There
seem to be many issues, for example garbage collection and
allocation/deallocation of additional memory by the JVM at various
times. At the end of the parse, after the memory has been
recorded, gc() is run.

As a general point, if the results of that sort of testing are not very
consistent, then even an exact measurement won't be
much use to other people, since they won't be able to depend on the numbers. So
if you can't easily get a number accurate to better than, say, 20%, then don't
worry. Make it clear that the numbers are very approximate, and that real
world use will vary a lot.

More specifically, I would do the following (Arne's technique, with a couple of
frills).

Parse one document, saving the result, and measuring memory delta as best you
can.
Parse the document again (from scratch), saving the result, and measuring
memory delta as best you can.
.....
Parse the document the Nth time, saving the result, and measuring memory delta
as best you can.

At each step you are adding another document to memory, and the delta should
approximate the actual use. The variance in the deltas should give an estimate
of the uncertainty. But...

Now throw all the parsed documents away (null out the references), /and/ throw
away the measured numbers. Re-run the test code from within the same program
run. Use the numbers from this second run. That should (i.e. "might") tend to
eliminate inconsistencies due to the JVM rearranging its memory (e.g. asking
for more from the OS) as the test progresses.

It is probably worth setting -Xms and -Xmx (to the same value) for the test. It
may reduce some variation, and shouldn't do any harm.
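
The procedure above might look something like this in code. It is a sketch only: the names are invented, a large byte array stands in for a parsed document, and the standard deviation is just one possible way to express the variance in the deltas.

```java
import java.util.Arrays;
import java.util.function.Supplier;

class RepeatedDelta {
    // Heap currently in use as reported by the JVM; an approximation only.
    static long usedMemory() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // One run: fill every slot of keep, recording the heap growth for each result.
    static long[] pass(Supplier<?> parse, Object[] keep) {
        long[] deltas = new long[keep.length];
        for (int i = 0; i < keep.length; i++) {
            long before = usedMemory();
            keep[i] = parse.get();          // saved, so earlier results stay in memory
            deltas[i] = usedMemory() - before;
        }
        return deltas;
    }

    // Runs the test twice in the same JVM and returns only the second run's deltas.
    static long[] secondRunDeltas(Supplier<?> parse, int n) {
        Object[] keep = new Object[n];
        pass(parse, keep);                  // first run: its numbers are thrown away
        Arrays.fill(keep, null);            // throw the parsed documents away
        System.gc();                        // only a hint; the JVM may ignore it
        return pass(parse, keep);
    }

    static double mean(long[] values) {
        double sum = 0;
        for (long v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    // Population standard deviation, used here as the uncertainty estimate.
    static double spread(long[] values) {
        double m = mean(values);
        double squares = 0;
        for (long v : values) {
            squares += (v - m) * (v - m);
        }
        return Math.sqrt(squares / values.length);
    }

    public static void main(String[] args) {
        // Stand-in for "parse the document from scratch"; replace with a real parse.
        Supplier<?> parse = () -> new byte[100_000];
        long[] deltas = secondRunDeltas(parse, 20);
        System.out.println("mean delta: " + mean(deltas) + " bytes");
        System.out.println("spread:     " + spread(deltas) + " bytes");
    }
}
```

Running it as, for example, java -Xms512m -Xmx512m RepeatedDelta fixes the heap size for the whole test; the 512m figure is arbitrary.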

A final thought (I believe someone has mentioned this): what about
making the class serializable, storing ONLY the parsed
DOCUMENT to a serialized file, and then looking at the size of that
file? It may not ABSOLUTELY represent the memory footprint of the
document, but would it be closer than the other methods discussed thus
far? And wouldn't the results be more consistent? At least it might
make it easier to compare the parsers' document memory footprints with one another.

The results might be more consistent, but I wouldn't trust them to be at all
indicative of actual memory consumption. The serialised format is not a simple
dump of a memory image. That is especially true of Strings -- and the
(attempted) efficient use of Strings is one of the things that XML parsers are
likely to differ in.

-- chris
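
A small sketch of why the serialized size says little about memory use, using Strings as the example. The class name and test values are made up; the sketch only counts the bytes that standard Java serialization writes, and makes no attempt to measure the heap.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class SerializedSize {
    // Number of bytes ObjectOutputStream writes for the given object graph.
    static int serializedSize(Serializable obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            return bytes.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String text = "hello";
        String[] shared = { text, text, text };   // one String, three references
        String[] copies = { new String(text), new String(text), new String(text) };

        System.out.println("one String:        " + serializedSize(text) + " bytes");
        System.out.println("3 refs, 1 String:  " + serializedSize(shared) + " bytes");
        System.out.println("3 equal copies:    " + serializedSize(copies) + " bytes");
    }
}
```

The stream stores characters in a compact encoding and writes a repeated object only once, followed by back-references, so the byte counts describe the stream format rather than the object headers, fields, and arrays that occupy the heap.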
 
