Java object member layout question

Discussion in 'Java' started by Jimmy Zhang, May 14, 2004.

  1. Jimmy Zhang

    Jimmy Zhang Guest

    Hi, I heard somewhere that a Java object incurs 8 bytes of memory
    overhead by default.
    It sounds to me like 8 bytes is two 32-bit pointers. One is probably a
    reference counter (correct me if I am wrong). The other one, I guess, is a
    pointer to the method table?

    Also, for an array the overhead is 16 bytes. Can someone explain what that
    space is for?

    I am also looking for a reference for the answers to my question. Should I go
    to the JVM spec for those?
    Cheers,
    Jimmy
    Jimmy Zhang, May 14, 2004
    #1

  2. Jimmy Zhang

    Chris Smith Guest

    Jimmy Zhang wrote:
    > Hi, I heard somewhere that a Java object incurs 8 bytes of memory
    > overhead by default.
    > It sounds to me like 8 bytes is two 32-bit pointers. One is probably a
    > reference counter (correct me if I am wrong). The other one, I guess, is a
    > pointer to the method table?
    >
    > Also, for an array the overhead is 16 bytes. Can someone explain what that
    > space is for?
    >
    > I am also looking for a reference for the answers to my question. Should I go
    > to the JVM spec for those?


    You need a clear understanding of the difference between specification
    and implementation. The JVM specification does not offer a single clue
    as to how much memory overhead is involved in a single object.
    Different implementations of that spec will probably incur different
    amounts of overhead. It's in fact completely meaningless to say that
    "Java" incurs 8 bytes per object of overhead.

    What you might say (and I don't know whether any of this is true) is
    that version 1.4.2_02 of the Sun J2SE SDK on Win32/x86 does that. Or
    that some version of the reference implementation of MIDP on the Palm
    does that. Or that some version of JRockit on some platform does that.
    And so on...

    That said, there are some aspects of the JVM and language/API specs that
    impact this, and there are some common implementation techniques that
    might give you a hint:

    Each object owns its behavior, and thus needs to identify its class
    somehow. This probably means accessing a function pointer table. It
    requires about 4 bytes on a 32-bit platform.

    Each object also needs an identity hashcode. If the object is
    stationary in memory then this can be its address, but that restricts
    the garbage collection options rather a lot, so an object may carry an
    identity hashcode with it, requiring an extra 4 bytes (regardless of
    platform).

    Each object may have a monitor. Since this is not used for the majority
    of objects, it's probable that the monitor is created on-the-fly when
    needed, but the object might reserve space for pointing to its monitor.

    Arrays need a length field, which must hold at least 31 bits (lengths
    are non-negative ints), and so is likely to occupy at least 4 bytes.

    There are also concerns with the size of references and padding, which
    are very likely to change the memory use of an object between platforms.
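
    As a rough illustration of how these pieces could add up, here is a
    minimal sketch. Every constant in it is an assumption about a
    hypothetical 32-bit JVM (4-byte class pointer, 4-byte hash/monitor word,
    4-byte array length, 8-byte alignment); none of it comes from the JVM
    specification, and real implementations may differ.

```java
// Back-of-envelope estimate only. All constants are assumptions about a
// hypothetical 32-bit JVM, not values defined by any specification.
final class HeaderEstimate {
    static final int CLASS_POINTER_BYTES = 4; // assumed: pointer to class / method table
    static final int HASH_OR_LOCK_BYTES = 4;  // assumed: identity hash and/or monitor word
    static final int ARRAY_LENGTH_BYTES = 4;  // assumed: 32-bit length field
    static final int ALIGNMENT = 8;           // assumed: objects start on 8-byte boundaries

    // Round size up to the next multiple of alignment (alignment must be a power of two).
    static int alignUp(int size, int alignment) {
        return (size + alignment - 1) & -alignment;
    }

    // Estimated footprint of a plain object with the given number of field bytes.
    static int objectBytes(int fieldBytes) {
        return alignUp(CLASS_POINTER_BYTES + HASH_OR_LOCK_BYTES + fieldBytes, ALIGNMENT);
    }

    // Estimated footprint of an array with the given length and element size.
    static int arrayBytes(int length, int elementBytes) {
        int header = CLASS_POINTER_BYTES + HASH_OR_LOCK_BYTES + ARRAY_LENGTH_BYTES;
        return alignUp(header + length * elementBytes, ALIGNMENT);
    }

    public static void main(String[] args) {
        System.out.println(objectBytes(0));   // 8:  header only
        System.out.println(objectBytes(4));   // 16: header + one int, padded
        System.out.println(arrayBytes(0, 4)); // 16: 12-byte array header, padded
    }
}
```

    Under these assumed numbers the sketch happens to reproduce the 8-byte
    and 16-byte figures from the question, but that only shows the figures
    are plausible for one layout, not that any particular JVM uses it.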

    --
    www.designacourse.com
    The Easiest Way to Train Anyone... Anywhere.

    Chris Smith - Lead Software Developer/Technical Trainer
    MindIQ Corporation
    Chris Smith, May 14, 2004
    #2

  3. Jimmy Zhang

    Roedy Green Guest

    On Thu, 13 May 2004 21:50:29 -0600, Chris Smith wrote or quoted:

    >Each object also needs an identity hashcode. If the object is
    >stationary in memory then this can be its address, but that restricts
    >the garbage collection options rather a lot, so an object may carry an
    >identity hashcode with it, requiring an extra 4 bytes (regardless of
    >platform).


    Needs or often has cached? Could it not compute it on the fly as
    needed?

    One of the great aha moments of my life came from reading Stroustrup's
    book where he explained how VTBLs worked to handle inheritance. I had
    tried vainly for days to invent the concept for my own language
    Abundance and gave up and used something fast but that required the
    programmer to set up the override tables explicitly.

    Java has a simpler time of it than C++ because you have no embedded
    objects, just references to objects.


    --
    Canadian Mind Products, Roedy Green.
    Coaching, problem solving, economical contract programming.
    See http://mindprod.com/jgloss/jgloss.html for The Java Glossary.
    Roedy Green, May 14, 2004
    #3
  4. Jimmy Zhang

    Roedy Green Guest

    On Thu, 13 May 2004 21:50:29 -0600, Chris Smith wrote or quoted:

    >There are also concerns with the size of references and padding, which
    >are very likely to change the memory use of an object between platforms.


    Also alignment. On some platforms you would want objects to start
    on 4-, 8- or 16-byte boundaries for speed.

    Roedy Green, May 14, 2004
    #4
  5. Jimmy Zhang

    Chris Smith Guest

    Roedy Green wrote:
    > On Thu, 13 May 2004 21:50:29 -0600, Chris Smith wrote or quoted:
    >
    > >Each object also needs an identity hashcode. If the object is
    > >stationary in memory then this can be its address, but that restricts
    > >the garbage collection options rather a lot, so an object may carry an
    > >identity hashcode with it, requiring an extra 4 bytes (regardless of
    > >platform).

    >
    > Needs or often has cached? Could it not compute it on the fly as
    > needed?


    I fail to see how you could compute an identity hashcode on the fly, if
    the object is mobile in memory (as in a copying collector). If you can
    inform me of this, I'd be quite interested.

    Chris Smith, May 14, 2004
    #5
  6. Jimmy Zhang

    Chris Uppal Guest

    Chris Smith wrote:

    > I fail to see how you could compute an identity hashcode on the fly, if
    > the object is mobile in memory (as in a copying collector). If you can
    > inform me of this, I'd be quite interested.


    There's a really beautiful technique described by Ole Agesen in "Space and
    Time-Efficient Hashing of Garbage-Collected Objects" which you can find at:

    <http://www.sunlabs.com/research/java-topics/pubs/>

    In essence objects use 2 bits of header data to support something very like
    on-the-fly generation of hash codes.

    When the object is born it has no hash code.

    When someone asks for the "identity" hash code it answers its /current/
    address, and sets one bit to say that it has done so.

    When the GC moves the object, it inspects the bit and, if it's set, expands
    the object by 32 bits as it copies it. This leaves room to store the old
    "identity" hash code in the object explicitly. The GC clears the first bit
    and sets a different one to mean that the object now has the expanded layout.

    So when an object is asked for its hash code it may answer its address, or the
    value from the extra slot depending on the header bits. Since most objects are
    never asked for their hash, this saves space.

    The technique can be extended to generate "identity" hash codes that are
    assigned by a pseudo-random number generator, for instance, rather than the
    object's address (which can tend to be rather ill-spread).
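
    The state transitions described above can be mimicked in a toy
    simulation. This is only a sketch of the idea as summarized in this post:
    the class, field and method names are invented, the "address" is just an
    int, and a real JVM would do this in the object header and the collector
    rather than in Java code.

```java
// Toy simulation of the two-header-bit scheme. All names are hypothetical;
// the enum stands in for the two header bits.
final class LazyHashObject {
    enum State { UNHASHED, HASHED_NOT_MOVED, HASHED_AND_MOVED }

    private State state = State.UNHASHED;
    private int address;   // simulated heap address
    private int savedHash; // simulated extra slot, meaningful only once HASHED_AND_MOVED

    LazyHashObject(int address) {
        this.address = address;
    }

    // Answers the current address until the object has been moved after hashing.
    int identityHash() {
        switch (state) {
            case UNHASHED:
                state = State.HASHED_NOT_MOVED; // record that the address has been exposed
                return address;
            case HASHED_NOT_MOVED:
                return address;
            default:
                return savedHash;
        }
    }

    // What a copying collector would do when relocating the object.
    void moveTo(int newAddress) {
        if (state == State.HASHED_NOT_MOVED) {
            savedHash = address; // "expand" the object and keep the old address
            state = State.HASHED_AND_MOVED;
        }
        address = newAddress;
    }

    // True only for objects that were hashed and later moved.
    boolean hasExtraSlot() {
        return state == State.HASHED_AND_MOVED;
    }
}
```

    An object that is never asked for its hash can be moved any number of
    times without growing; only one that was hashed and then moved pays for
    the extra slot.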

    -- chris
    Chris Uppal, May 14, 2004
    #6
  7. Jimmy Zhang

    Chris Smith Guest

    Chris Uppal wrote:
    > There's a really beautiful technique described by Ole Agesen in "Space and
    > Time-Efficient Hashing of Garbage-Collected Objects" which you can find at:
    >
    > <http://www.sunlabs.com/research/java-topics/pubs/>


    That is indeed very interesting.

    Chris Smith, May 14, 2004
    #7
  8. Jimmy Zhang

    Roedy Green Guest

    On Fri, 14 May 2004 07:11:00 -0600, Chris Smith wrote or quoted:

    >
    >I fail to see how you could compute an identity hashcode on the fly, if
    >the object is mobile in memory (as in a copying collector). If you can
    >inform me of this, I'd be quite interested.


    It could be its handle. It could be the Adlerian CRC of all the
    bytes in the object.
    Roedy Green, May 14, 2004
    #8
  9. Jimmy Zhang

    Chris Smith Guest

    > Chris Smith wrote:
    > >I fail to see how you could compute an identity hashcode on the fly, if
    > >the object is mobile in memory (as in a copying collector). If you can
    > >inform me of this, I'd be quite interested.


    Roedy Green wrote:
    > It could be its handle. It could be the Adlerian CRC of all the
    > bytes in the object.


    It could be its handle, if you assume a JVM implementation where handles
    exist (and I assume that by handle, you mean the double-indirection
    technique that was used for objects in versions 1.0 and 1.1 of Sun's
    JVM?)

    The CRC of the object's data could not be used for an identity hashcode,
    since the identity hashcode is guaranteed not to change even if all of
    the object's state actually does change. Hence the need for additional
    data in the first place.
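
    A small experiment makes the point concrete. This is a minimal sketch,
    assuming "Adlerian crc" means something like java.util.zip.Adler32 over
    the object's bytes; the helper name is made up for illustration.

```java
import java.util.zip.Adler32;

final class ChecksumVersusIdentity {
    // Hypothetical helper: Adler-32 checksum over the array's current contents.
    static long checksum(byte[] data) {
        Adler32 adler = new Adler32();
        adler.update(data, 0, data.length);
        return adler.getValue();
    }

    public static void main(String[] args) {
        byte[] state = {1, 2, 3, 4};
        long checksumBefore = checksum(state);
        int identityBefore = System.identityHashCode(state);

        state[0] = 42; // mutate the object's state

        // The content checksum follows the state; the identity hash does not.
        System.out.println("checksum changed: " + (checksumBefore != checksum(state)));
        System.out.println("identity hash changed: "
                + (identityBefore != System.identityHashCode(state)));
    }
}
```

    A hash derived from the contents tracks mutations, which is exactly what
    an identity hash must not do.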

    Chris Smith, May 14, 2004
    #9
  10. Jimmy Zhang

    Roedy Green Guest

    On Fri, 14 May 2004 11:09:55 -0600, Chris Smith wrote or quoted:

    >The CRC of the object's data could not be used for an identity hashcode,
    >since the identity hashcode is guaranteed not to change even if all of
    >the object's state actually does change. Hence the need for additional
    >data in the first place.


    Is this ordinary Object.hashCode or something else?

    Roedy Green, May 14, 2004
    #10
  11. Michael Borgwardt

    Chris Smith wrote:
    > The CRC of the object's data could not be used for an identity hashcode,
    > since the identity hashcode is guaranteed not to change even if all of
    > the object's state actually does change.


    There is no such guarantee, at least not for Java's System.identityHashCode()
    method. It merely gives the hashcode the object would give if it didn't
    override Object.hashCode() and for *that* method there are no guarantees,
    only the statement that "As much as is reasonably practical, the hashCode
    method defined by class Object does return distinct integers for distinct
    objects."
    Michael Borgwardt, May 14, 2004
    #11
  12. Jimmy Zhang

    Chris Smith Guest

    Michael Borgwardt wrote:
    > Chris Smith wrote:
    > > The CRC of the object's data could not be used for an identity hashcode,
    > > since the identity hashcode is guaranteed not to change even if all of
    > > the object's state actually does change.

    >
    > There is no such guarantee, at least not for Java's System.identityHashCode()
    > method. It merely gives the hashcode the object would give if it didn't
    > override Object.hashCode() and for *that* method there are no guarantees,
    > only the statement that "As much as is reasonably practical, the hashCode
    > method defined by class Object does return distinct integers for distinct
    > objects."


    The guarantee is hard to come by, but it is there. The contract for
    Object.hashCode says:

        Whenever it is invoked on the same object more than once during an
        execution of a Java application, the hashCode method must
        consistently return the same integer, provided no information used
        in equals comparisons on the object is modified.

    So to find the characteristics of the default implementation inherited
    from java.lang.Object, you are referred to the documentation for
    Object.equals, which says:

        The equals method for class Object implements the most
        discriminating possible equivalence relation on objects; that is,
        for any reference values x and y, this method returns true if and
        only if x and y refer to the same object (x==y has the value true).

    So the only information used for equals in the java.lang.Object class is
    the identity of the object, and not any of its state. The hashCode
    implementation from java.lang.Object, to comply with its contract, must
    therefore not return different integers for the same object over its
    lifetime.

    Chris Smith, May 14, 2004
    #12
  13. Jimmy Zhang

    Chris Smith Guest

    Roedy Green wrote:
    > On Fri, 14 May 2004 11:09:55 -0600, Chris Smith wrote or quoted:
    >
    > >The CRC of the object's data could not be used for an identity hashcode,
    > >since the identity hashcode is guaranteed not to change even if all of
    > >the object's state actually does change. Hence the need for additional
    > >data in the first place.

    >
    > Is this ordinary Object.hashCode or something else?


    It's the identity hash code... that is, the hash code that would be
    returned by Object.hashCode assuming that it's not overridden. It's
    still available, though, regardless of the overriding of
    Object.hashCode... by calling System.identityHashCode.
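
    For example, here is a minimal sketch with a made-up class (DemoPoint
    is purely illustrative) that overrides hashCode based on its state,
    while System.identityHashCode still reports the identity hash:

```java
// Hypothetical value-like class whose hashCode depends on its fields.
final class DemoPoint {
    int x;
    int y;

    DemoPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof DemoPoint && ((DemoPoint) o).x == x && ((DemoPoint) o).y == y;
    }

    @Override
    public int hashCode() {
        return 31 * x + y; // state-based, so it changes when the fields change
    }
}

final class IdentityHashDemo {
    public static void main(String[] args) {
        DemoPoint p = new DemoPoint(1, 2);
        int identity = System.identityHashCode(p);
        int stateBased = p.hashCode();

        p.x = 10; // mutate the state

        System.out.println(p.hashCode() != stateBased);            // true: overridden hash moved
        System.out.println(System.identityHashCode(p) == identity); // true: identity hash did not

        // For a class that does not override hashCode, the two agree.
        Object plain = new Object();
        System.out.println(plain.hashCode() == System.identityHashCode(plain)); // true
    }
}
```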

    Chris Smith, May 14, 2004
    #13