Why "lock" functionality is introduced for all the objects?

Discussion in 'Java' started by Alex J, Jun 28, 2011.

  1. Alex J

    I'm curious why Java designers once decided to allow every object to
    be lockable (i.e. allow using lock on those).
    I know, that out of such a design decision every Java object contain
    lock index, i.e. new Object() results in allocation of at least 8
    bytes where 4 bytes is object index and 4 bytes is lock index on 32-
    bit JVM.
    I think that it just inefficient waste of space, because not all the
    objects requires to be lockable/waitable.

    The better decision, IMHO, would be to introduce lock/wait mechanics
    for only, say, the Lockable descendants.
    The current approach seems to be very simple, but is the performance
    penalty so small for not be taken into an account?
    Eclipse uses tons of small objects, and I guess that is why it consumes
    so much memory, even though a significant part of that memory is never used.

    What do you think of it?
     
    Alex J, Jun 28, 2011
    #1

  2. B. L. Massingill

    OT "sic" (was Re: Why "lock" functionality is introduced for all the objects?

    In article <iuce66$2nh$>, Lew <> wrote:
    > Alex J wrote:
    > > I'm curious why Java designers once decided to allow every object to
    > > be lockable (i.e. [sic] allow using lock on those).


    "[sic]"? The only thing that seems wrong here to me is the absence
    of a comma after "i.e.". Have I missed an error here ....

    [ snip ]

    > > The better decision, IMHO, would be to introduce lock/wait mechanics
    > > for only, say, the Lockable descendants.

    >
    > Oh, yeah, your opinion is humble.


    Seemed that way to me. Your reply, Lew, seems a little testy even
    by your standards. Just sayin'. <shrug>

    [ snip ]

    --
    B. L. Massingill
    ObDisclaimer: I don't speak for my employers; they return the favor.
     
    B. L. Massingill, Jun 28, 2011
    #2

  3. Lew

    Re: OT "sic" (was Re: Why is "lock" functionality introduced for all objects?)

    B. L. Massingill wrote:
    > Lew wrote:
    >> Alex J wrote:
    >>> I'm curious why Java designers once decided to allow every object to
    >>> be lockable (i.e. [sic] allow using lock on those).

    >
    > "[sic]"? The only thing that seems wrong here to me is the absence
    > of a comma after "i.e.". Have I missed an error here ....


    That'd be it. Good eye.

    The trouble I had with "IMHO" is the guy spouting off about supposed
    inefficiencies in Java with regard to one of the most fundamental features of
    the language, and presuming to offer a so-called "better" solution out of sheer
    ignorance. That is the antithesis of humility.

    --
    Lew
    Honi soit qui mal y pense.
    http://upload.wikimedia.org/wikipedia/commons/c/cf/Friz.jpg
     
    Lew, Jun 28, 2011
    #3
  4. Lew wrote:

    > Alex J wrote:
    >> I'm curious why Java designers once decided to allow every object to
    >> be lockable (i.e. [sic] allow using lock on those).

    >
    > Because that makes it possible to do concurrent programming intrinsically.
    >


    Could you elaborate more on that?
    Do you mean there is no other way to do it?

    I find this question quite intriguing as well since it looks quite useless
    for example to be able to lock on java.lang.Integer instance (and it is
    strange for me that java.lang.Integer instance occupies much more memory as
    int). Surely a compromise must have been done taking into account various
    language features ("synchronized" keyword, lack of multiple inheritance,
    lack of closures) - but I am not that knowlegeable enough to explain this
    compromise in detail.

    >> I know, that out of such a design decision every Java object contain
    >> lock index, i.e. new Object() results in allocation of at least 8
    >> bytes where 4 bytes is object index and 4 bytes is lock index on 32-
    >> bit JVM.
    >> I think that it just inefficient waste of space, because not all the
    >> objects requires to be lockable/waitable.

    >
    > Well, that's your opinion.
    >


    It is not only his opinion - the size of object header is important
    especially on memory constrained devices. But not only - there is a reason
    why UseCompressedOops flag was introduced in 64 bit HotSpot JVM.
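
    As a quick, HotSpot-specific check of what a given JVM is actually doing,
    something like the sketch below can query the flag at runtime. It relies on
    the com.sun.management.HotSpotDiagnosticMXBean, so it is only an illustration
    for HotSpot, and the option exists only on 64-bit VMs:

    import java.lang.management.ManagementFactory;
    import com.sun.management.HotSpotDiagnosticMXBean;

    // Asks the running HotSpot JVM whether compressed object pointers are enabled.
    // HotSpot-only; on a 32-bit VM the option does not exist and getVMOption throws.
    public class CompressedOopsCheck {
        public static void main(String[] args) throws Exception {
            HotSpotDiagnosticMXBean hotspot = ManagementFactory.newPlatformMXBeanProxy(
                    ManagementFactory.getPlatformMBeanServer(),
                    "com.sun.management:type=HotSpotDiagnostic",
                    HotSpotDiagnosticMXBean.class);
            System.out.println(hotspot.getVMOption("UseCompressedOops"));
        }
    }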

    >> The better decision, IMHO, would be to introduce lock/wait mechanics
    >> for only, say, the Lockable descendants.

    >
    > Oh, yeah, your opinion is humble.
    >
    >> The current approach seems to be very simple, but is the performance
    >> penalty so small for not be taken into an account?

    >
    > Yes. Nonexistent, really.
    >


    I wouldn't say so - see:
    http://wikis.sun.com/display/HotSpotInternals/CompressedOops
    Especially the sentence:
    "Memory is pretty cheap, but these days bandwidth and cache is in short
    supply"

    --
    Michal
     
    Michal Kleczek, Jun 28, 2011
    #4
  5. Lew

    On 06/28/2011 12:41 PM, Michal Kleczek wrote:
    > Lew wrote:
    >
    >> Alex J wrote:
    >>> I'm curious why Java designers once decided to allow every object to
    >>> be lockable (i.e. [sic] allow using lock on those).

    >>
    >> Because that makes it possible to do concurrent programming intrinsically.
    >>

    >
    > Could you elaborate more on that?
    > Do you mean there is no other way to do it?


    No, I don't mean that, and I don't see how it follows from what I said.

    To elaborate, building monitors into 'Object' (and perforce its descendants)
    means that multi-threaded programming is intrinsic, that is, built in to the
    very nature of objects in Java. This makes the simple case for
    synchronization, well, simple. You create an object of arbitrary type and you
    can use it as a lock (strictly speaking, a Hoare monitor). This means that
    any class can achieve a measure of thread safety by the proper use of
    'synchronized (this)' or the implicit equivalents. The idiom is pervasive and
    extremely useful in Java. It is arguably one of the key aspects to Java's
    success as a language.
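
    The simple case looks something like this (a minimal sketch, with invented
    names, just to show the idiom):

    // Thread safety through the intrinsic monitor that every Java object carries.
    public class Counter {
        private int count;

        // A synchronized method locks 'this', the same as synchronized (this) { ... }
        public synchronized void increment() {
            count++;
        }

        public synchronized int current() {
            return count;
        }
    }

    // Any object can serve as the lock; a dedicated private lock object is a
    // common variant of the same idiom.
    class GuardedCounter {
        private final Object lock = new Object();
        private int count;

        public void increment() {
            synchronized (lock) {
                count++;
            }
        }
    }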

    > I find this question quite intriguing as well since it looks quite useless
    > for example to be able to lock on java.lang.Integer instance (and it is


    So don't lock on an Integer instance, then.

    > strange for me that java.lang.Integer instance occupies much more memory as


    "strange"? Based on what?

    'int' is a primitive, available for convenience because primitives are so
    useful without the overhead of objectness. That might not be an optimal
    decision, although many successful languages have made the same choice. (C,
    C++, C#, Java.) Given the long history of languages with the same dichotomy,
    I find it strange that you find it strange.

    > int). Surely a compromise must have been done taking into account various
    > language features ("synchronized" keyword, lack of multiple inheritance,
    > lack of closures) - but I am not that knowlegeable enough to explain this
    > compromise in detail.


    Java's supposed "lack" of closures, given the admittedly more verbose
    alternatives that actually do exist, poses little if any problem. Java does
    allow multiple inheritance of contract, just not of implementation, and
    actually that distinction makes for clean, elegant code. The 'synchronized'
    keyword depends for its utility on the very feature in dispute in this thread,
    namely the presence of a monitor in every object. Far from being a
    compromise, this is a key strength of the Java language.

    >>> I know, that out of such a design decision every Java object contain
    >>> lock index, i.e. new Object() results in allocation of at least 8
    >>> bytes where 4 bytes is object index and 4 bytes is lock index on 32-
    >>> bit JVM.
    >>> I think that it just inefficient waste of space, because not all the
    >>> objects requires to be lockable/waitable.

    >>
    >> Well, that's your opinion.
    >>

    >
    > It is not only his opinion - the size of object header is important
    > especially on memory constrained devices. But not only - there is a reason
    > why UseCompressedOops flag was introduced in 64 bit HotSpot JVM.



    OK, that's your opinion, too.

    >>> The better decision, IMHO, would be to introduce lock/wait mechanics
    >>> for only, say, the Lockable descendants.

    >>
    >> Oh, yeah, your opinion is humble.
    >>
    >>> The current approach seems to be very simple, but is the performance
    >>> penalty so small for not be taken into an account?

    >>
    >> Yes. Nonexistent, really.
    >>

    >
    > I wouldn't say so - see:
    > http://wikis.sun.com/display/HotSpotInternals/CompressedOops
    > Especially the sentence:
    > "Memory is pretty cheap, but these days bandwidth and cache is in short
    > supply"


    Show me the numbers. What penalty? Compared to what instead? If you give up
    a feature, you have to add it some other way - what would be the inefficiency
    of Java's approach compared to the alternative?

    And give us some measurements to support the claim of any "penalty".

    Don't forget that HotSpot saves memory as well as increases speed, depending
    on the optimizations involved at any given moment. Have you taken that into
    consideration?

    --
    Lew
    Honi soit qui mal y pense.
    http://upload.wikimedia.org/wikipedia/commons/c/cf/Friz.jpg
     
    Lew, Jun 28, 2011
    #5
  6. Stefan Ram

    Alex J <> writes:
    > What do you think of it?

    I do not think, but use a web search engine and find:

    http://c2.com/cgi/wiki?JavaObjectOverheadIsRidiculous
    http://www.trevorpounds.com/blog/?p=351

    And here is a rationale given for why every object has a lock:

    http://forums.oracle.com/forums/thread.jspa?threadID=1140765

    , that is, so that one can use »synchronized« on object
    methods (which stands for »synchronized( this )«).

    Stefan Ram, Jun 28, 2011
    #6
  7. Lew wrote:

    > On 06/28/2011 12:41 PM, Michal Kleczek wrote:
    >> Lew wrote:
    >>
    >>> Alex J wrote:
    >>>> I'm curious why Java designers once decided to allow every object to
    >>>> be lockable (i.e. [sic] allow using lock on those).
    >>>
    >>> Because that makes it possible to do concurrent programming
    >>> intrinsically.
    >>>

    >>
    >> Could you elaborate more on that?
    >> Do you mean there is no other way to do it?

    >
    > No, I don't mean that, and I don't see how it follows from what I said.
    >
    > To elaborate, building monitors into 'Object' (and perforce its
    > descendants) means that multi-threaded programming is intrinsic, that is,
    > built in to the
    > very nature of objects in Java. This makes the simple case for
    > synchronization, well, simple. You create an object of arbitrary type and
    > you
    > can use it as a lock (strictly speaking, a Hoare monitor). This means
    > that any class can achieve a measure of thread safety by the proper use of
    > 'synchronized (this)' or the implicit equivalents. The idiom is pervasive
    > and
    > extremely useful in Java. It is arguably one of the key aspects to Java's
    > success as a language.
    >


    The point is - I don't see a need to be able to use _every_ object as a
    monitor. But I have to pay the price (memory usage) of this possibility.

    >> I find this question quite intriguing as well since it looks quite
    >> useless for example to be able to lock on java.lang.Integer instance (and
    >> it is

    >
    > So don't lock on an Integer instance, then.
    >
    >> strange for me that java.lang.Integer instance occupies much more memory
    >> as

    >
    > "strange"? Based on what?
    >
    > 'int' is a primitive, available for convenience because primitives are so
    > useful without the overhead of objectness. That might not be an optimal
    > decision, although many successful languages have made the same choice.
    > (C,
    > C++, C#, Java.) Given the long history of languages with the same
    > dichotomy, I find it strange that you find it strange.
    >


    "Strange" is a wrong word - it simply is a PITA :)
    For example it makes it unfeasible to use java.util collections in numerical
    algorithms.
    C++ and C# have their own way of dealing with this - templates and generics
    without erasure.
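
    To make the Java side of that concrete, a rough sketch (not a benchmark;
    exact byte counts depend on the JVM): every element of the collection is a
    boxed Double with its own header, reached through a reference, while a
    primitive array stores the values back to back.

    import java.util.ArrayList;
    import java.util.List;

    public class BoxingFootprint {
        public static void main(String[] args) {
            int n = 1000000;

            double[] primitives = new double[n];           // 8 bytes per element, one header total
            List<Double> boxed = new ArrayList<Double>(n);  // per element: a reference plus a
                                                            // Double object (header + 8-byte payload)
            for (int i = 0; i < n; i++) {
                primitives[i] = i;
                boxed.add((double) i);  // autoboxing allocates a fresh Double for every value
            }
            System.out.println(primitives.length + " primitives vs " + boxed.size() + " boxes");
        }
    }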

    >> int). Surely a compromise must have been done taking into account various
    >> language features ("synchronized" keyword, lack of multiple inheritance,
    >> lack of closures) - but I am not that knowlegeable enough to explain this
    >> compromise in detail.

    >
    > Java's supposed "lack" of closures, given the admittedly more verbose
    > alternatives that actually do exist, poses little if any problem.
    > Java
    > does allow multiple inheritance of contract, just not of implementation,


    If Java had multiple implementation inheritance and closures, there would be
    no need for the "synchronized" keyword at all (the modifier could be added as
    syntactic sugar):

    // hypothetical syntax: Java with multiple implementation inheritance
    class Foo extends public Bar, private Monitor {
        void method() {
            // synchronized(...) is defined in the Monitor class and takes a closure
            synchronized({
                // do stuff
            });
        }
    }

    > and
    > actually that distinction makes for clean, elegant code. The
    > 'synchronized' keyword depends for its utility on the very feature in
    > dispute in this thread,
    > namely the presence of a monitor in every object. Far from being a
    > compromise, this is a key strength of the Java language.
    >


    Yet in the end the community seems to agree not to use "synchronized"
    directly but rather use classes from java.util.concurrent (namely Lock and
    Condition). So is this keyword really that important?
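
    For comparison, the explicit style looks roughly like this (a minimal sketch
    using java.util.concurrent.locks; it mirrors what wait/notify would do on an
    intrinsic monitor):

    import java.util.concurrent.locks.Condition;
    import java.util.concurrent.locks.Lock;
    import java.util.concurrent.locks.ReentrantLock;

    // A one-slot buffer guarded by an explicit Lock rather than an intrinsic monitor.
    public class OneSlotBuffer<T> {
        private final Lock lock = new ReentrantLock();
        private final Condition notEmpty = lock.newCondition();
        private final Condition notFull = lock.newCondition();
        private T item;

        public void put(T value) throws InterruptedException {
            lock.lock();
            try {
                while (item != null) {
                    notFull.await();     // plays the role of Object.wait()
                }
                item = value;
                notEmpty.signal();       // plays the role of Object.notify()
            } finally {
                lock.unlock();
            }
        }

        public T take() throws InterruptedException {
            lock.lock();
            try {
                while (item == null) {
                    notEmpty.await();
                }
                T value = item;
                item = null;
                notFull.signal();
                return value;
            } finally {
                lock.unlock();
            }
        }
    }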

    [snip]
    >>>> The current approach seems to be very simple, but is the performance
    >>>> penalty so small for not be taken into an account?
    >>>
    >>> Yes. Nonexistent, really.
    >>>

    >>
    >> I wouldn't say so - see:
    >> http://wikis.sun.com/display/HotSpotInternals/CompressedOops
    >> Especially the sentence:
    >> "Memory is pretty cheap, but these days bandwidth and cache is in short
    >> supply"

    >
    > Show me the numbers. What penalty?


    It is (almost) twice as much memory as it could be and twice as much GC
    cycles. Almost because in real world the number of objects that you need to
    synchronize on is way lower than the number of all objects you create.

    > Compared to what instead? If you
    > give up a feature, you have to add it some other way - what would be the
    > inefficiency of Java's approach compared to the alternative?
    >


    That is actually something I would like to hear as well and - as I read it -
    what OP asked for - the discussion of pros and cons of different approaches
    with some explanation of why it is done like this in Java.
    And your answer looks like "that's the way it is and it is the best way it
    can be done".

    --
    Michal
     
    Michal Kleczek, Jun 28, 2011
    #7
  8. Lew

    Michal Kleczek wrote:
    > Lew wrote:
    >> Show me the numbers. What penalty?

    >
    > It is (almost) twice as much memory as it could be and twice as much GC
    > cycles. Almost because in real world the number of objects that you need to


    Nonsense. It's an extra 4 bytes per object. Most objects are much larger
    than 4 bytes, so it's far, far less than "twice as much memory".

    Similarly there's no way four extra bytes per object double the number of GC
    cycles.

    Anyhow, when I ask you to show me the numbers, I mean real numbers, not
    made-up speculative nonsense. What Java version, host, OS, configuration and
    workload? What was the actual, measured, real, verified impact?

    Show me the numbers.

    > synchronize on is way lower than the number of all objects you create.
    >
    >> Compared to what instead? If you
    >> give up a feature, you have to add it some other way - what would be the
    >> inefficiency of Java's approach compared to the alternative?


    > That is actually something I would like to hear as well and - as I read it -
    > what OP asked for - the discussion of pros and cons of different approaches
    > with some explanation of why it is done like this in Java.
    > And your answer looks like "that's the way it is and it is the best way it
    > can be done".


    What makes it look like that? How do you get to misrepresent what I say and
    attribute different points to me from what I actually said?

    What I actually said was the answer to the OP's question - why was it done
    that way? There was a reason. Nowhere did I say or imply that it's the best
    way it can be done. In fact, I explicitly disavowed that interpretation.
    You're being dishonest here. Stop that.

    --
    Lew
    Honi soit qui mal y pense.
    http://upload.wikimedia.org/wikipedia/commons/c/cf/Friz.jpg
     
    Lew, Jun 28, 2011
    #8
  9. On 28/06/2011 2:13 PM, Lew wrote:
    > Michal Kleczek wrote:
    >> Lew wrote:
    >>> Show me the numbers. What penalty?

    >>
    >> It is (almost) twice as much memory as it could be and twice as much GC
    >> cycles. Almost because in real world the number of objects that you
    >> need to

    >
    > Nonsense. It's an extra 4 bytes per object. Most objects are much larger
    > than 4 bytes,


    Bullpuckey, other than that a nontrivial object is always at least 12
    bytes due to Java's bloated object headers plus alignment. Ignoring
    that, it's quite common to have lots of small objects at runtime -- from
    boxed primitives (4 bytes for most and 8 bytes for Longs and Doubles,
    plus object headers) to short strings (two bytes per character plus four
    for the length field = 8 for a two-letter word and 4 for an empty string
    -- again, plus object headers) and the occasional content-free
    pure-behavior object (abstract factories, strategy pattern
    implementations, event listeners, plus most of the things you'd use an
    anonymous inner class for ...). Small collections are a bit larger but
    an 8 byte object header is still likely to be a fair percentage; and
    their iterators may contain as little as a single integer index plus a
    single pointer (ArrayList's likely does) and so be the same size as a
    Long or a Double.

    And then there's all the objects with one or two reference type fields.
    Four bytes each, on a 32-bit JVM. You can't count the "nested" objects'
    own sizes, because they each have their own object header.

    Objects with many fields are quite a bit rarer than ones with only one
    or two fields. *Classes* with many fields are less common, and usually
    there will be fewer live instances at runtime.

    Ultimately, overhead fraction = header size divided by (header size plus
    average bytes held directly in fields), where most fields are at most 4
    bytes on 32-bit systems (long and double primitives excepted) and the
    average is over a *typical population of live objects in a running
    system* rather than over a set of, say, classes in use in a system.
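
    Plugging a few common shapes into that formula (assuming an 8-byte header
    and 4-byte fields on a 32-bit JVM, and ignoring alignment padding):

    // Back-of-the-envelope overhead fractions for typical object shapes,
    // assuming an 8-byte header and 4-byte fields, with no alignment padding.
    public class OverheadFraction {
        static double overhead(int headerBytes, int fieldBytes) {
            return (double) headerBytes / (headerBytes + fieldBytes);
        }

        public static void main(String[] args) {
            System.out.printf("Integer (one int field):        %.0f%%%n", 100 * overhead(8, 4));
            System.out.printf("Point (two int fields):         %.0f%%%n", 100 * overhead(8, 8));
            System.out.printf("ArrayList iterator (int + ref): %.0f%%%n", 100 * overhead(8, 8));
            System.out.printf("Object with ten 4-byte fields:  %.0f%%%n", 100 * overhead(8, 40));
        }
    }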
     
    supercalifragilisticexpialadiamaticonormalizeringe, Jun 28, 2011
    #9
  10. Lew

    On 06/28/2011 02:23 PM,
    supercalifragilisticexpialadiamaticonormalizeringelimatisticantations wrote:
    > On 28/06/2011 2:13 PM, Lew wrote:
    >> Michal Kleczek wrote:
    >>> Lew wrote:
    >>>> Show me the numbers. What penalty?
    >>>
    >>> It is (almost) twice as much memory as it could be and twice as much GC
    >>> cycles. Almost because in real world the number of objects that you
    >>> need to

    >>
    >> Nonsense. It's an extra 4 bytes per object. Most objects are much larger
    >> than 4 bytes,

    >
    > Bullpuckey, other than that a nontrivial object is always at least 12 bytes


    So 4 bytes overhead is less than 100%, as I said.

    > due to Java's bloated object headers plus alignment. Ignoring that, it's quite
    > common to have lots of small objects at runtime -- from boxed primitives (4
    > bytes for most and 8 bytes for Longs and Doubles, plus object headers) to
    > short strings (two bytes per character plus four for the length field = 8 for
    > a two-letter word and 4 for an empty string -- again, plus object headers) and


    Most strings in a typical program are non-empty and generally longer than two
    bytes. A good percentage are interned. Strings in many runtime contexts
    refer to substrings of those already in memory, saving overhead.

    Integer objects make up a small fraction of most programs. Many Integer
    instances are shared, especially if one follows best practices. Not a lot of
    memory pressure there.

    Double and Long, even fewer.

    > the occasional content-free pure-behavior object (abstract factories, strategy
    > pattern implementations, event listeners, plus most of the things you'd use an
    > anonymous inner class for ...). Small collections are a bit larger but an 8
    > byte object header is still likely to be a fair percentage; and their


    What percentage is "fair"? Surely less than 100%, as I claim.
    > iterators may contain as little as a single integer index plus a single
    > pointer (ArrayList's likely does) and so be the same size as a Long or a Double.
    >
    > And then there's all the objects with one or two reference type fields. Four
    > bytes each, on a 32-bit JVM. You can't count the "nested" objects' own sizes,
    > because they each have their own object header.
    >
    > Objects with many fields are quite a bit rarer than ones with only one or two
    > fields. *Classes* with many fields are less common, and usually there will be
    > fewer live instances at runtime.
    >
    > Ultimately, overhead fraction = header size divided by (header size plus
    > average bytes held directly in fields), where most fields are at most 4 bytes
    > on 32-bit systems (long and double primitives excepted) and the average is
    > over a *typical population of live objects in a running system* rather than
    > over a set of, say, classes in use in a system.


    You show only that the overhead of 4 bytes per object is less than 100% of the
    object's memory footprint, which is what I said.

    Which footprint can be reduced by HotSpot, to the point of pulling an object
    out of the heap altogether.

    Where are the numbers? Everyone's arguing from speculation. Show me the numbers.

    Real numbers. From actual runs. What is the overhead, really? Stop making
    shit up.

    Show me the numbers.

    --
    Lew
    Honi soit qui mal y pense.
    http://upload.wikimedia.org/wikipedia/commons/c/cf/Friz.jpg
     
    Lew, Jun 28, 2011
    #10
  11. Lew

    On 06/28/2011 01:33 PM, Stefan Ram wrote:
    > Alex J<> writes:
    >> What do you think of it?

    >
    > I do not think, but use a web search engine and find:
    >
    > http://c2.com/cgi/wiki?JavaObjectOverheadIsRidiculous


    Refers to Java 1.2.2. Things have changed significantly since then, including
    the loss of a word from object pointers.

    > http://www.trevorpounds.com/blog/?p=351
    >
    > . And here is a rationale given for why every object has a lock:
    >
    > http://forums.oracle.com/forums/thread.jspa?threadID=1140765
    >
    > , that is, so that one can use »synchronized« on object
    > methods (which stands for »synchronized( this )«).


    It is evident that Java's design introduces overhead. Duh. But it's not the
    wild claim of 100% overhead. That's just stupid.

    How much that overhead is in practice depends on HotSpot and what idioms would
    be needed to replace the lost feature of inbuilt synchronization.

    Given that Java's design does introduce a cost, the question remains - for
    what benefit? We give up some memory - did we save on developer cost? Did we
    save on runtime crashes? Did HotSpot optimize away the unused cruft?

    We need to know the real numbers. How much does Java's design cost an actual
    program, and what would it have cost that program to have lacked that design
    feature?

    People are throwing around terms like "bloated" but only focusing on half the
    cost-benefit analysis, picking numbers out of their butts, and exaggerating
    those numbers to boot. That can only lead to suboptimal decisions.

    --
    Lew
    Honi soit qui mal y pense.
    http://upload.wikimedia.org/wikipedia/commons/c/cf/Friz.jpg
     
    Lew, Jun 28, 2011
    #11
  12. On 28/06/2011 2:33 PM, Lew wrote:
    > On 06/28/2011 02:23 PM,
    > supercalifragilisticexpialadiamaticonormalizeringelimatisticantations
    > wrote:
    >> On 28/06/2011 2:13 PM, Lew wrote:
    >>> Michal Kleczek wrote:
    >>>> Lew wrote:
    >>>>> Show me the numbers. What penalty?
    >>>>
    >>>> It is (almost) twice as much memory as it could be and twice as much GC
    >>>> cycles. Almost because in real world the number of objects that you
    >>>> need to
    >>>
    >>> Nonsense. It's an extra 4 bytes per object. Most objects are much larger
    >>> than 4 bytes,

    >>
    >> Bullpuckey, other than that a nontrivial object is always at least 12
    >> bytes

    >
    > So 4 bytes overhead is less than 100%, as I said.


    I didn't dispute that. I disputed "most objects are much larger than 4
    bytes". Most objects are only a little bit larger than 4 bytes.

    > Most strings in a typical program are non-empty and generally longer
    > than two bytes.


    A lot longer, though, or only a little?

    > A good percentage are interned.


    Not in my experience.

    > Strings in many runtime contexts refer to substrings of those already
    > in memory, saving overhead.


    Not in my experience. And substring sharing is a three-edged sword, with
    two possible downsides:

    1. A small string hangs onto a much larger char array than is needed,
    the rest of which is unused but can't be collected.

    2. Small strings are even less efficient if one adds an offset as well
    as a length field to the string, besides the pointer to the char
    array.

    And let's not forget that a string incurs the object overhead twice,
    once for the string and once for the embedded array, assuming that array
    ISN'T (and it usually isn't) shared.

    So we're looking at one object header going along with 12 bytes of
    offset, length, pointer to array; then another going along with 4 bytes
    of length and 2 per character for the actual array contents. For an
    eight-character string we have 16 bytes of actual data and 32 bytes of
    overhead from two redundant (if the array isn't shared) length fields,
    an offset field, a pointer, and two 8-byte object headers. That's 33%
    meat and 67% fat, folks. For an EIGHT character string. A
    substrings-are-separate implementation fares somewhat better: eight byte
    object header, four byte pointer, eight byte object header, four byte
    length, array contents, for 24 rather than 32 bytes of cruft. Still 60%
    overhead. If Java had const and typedef and auxiliary methods so you
    could just declare that String is another name for const char[] and tack
    on the String methods, you'd get away with just 12 bytes of overhead:
    array object header and length field. Now the 8 character string is
    actually more than 50% meat instead of fat. Well, unless you count all
    the empty space between the probably-ASCII-bytes ... encoding them
    internally as UTF-8 would save a lot more space in the common case.
    Maybe we should assume that only about 60% of the space taken up by the
    actual chars in the string is non-wasted, presuming a low but nonzero
    prevalence of characters outside of ISO-8859-1; now a ten character
    string has four wasted bytes internally, plus the object header/various
    fields of overhead. Still somewhat icky.
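
    The accounting for that eight-character example, written out under the same
    assumptions (8-byte headers, 4-byte ints and references, no alignment
    padding):

    // Byte accounting for an 8-character String on a 32-bit JVM with the
    // shared-array (offset/count) layout; alignment padding is ignored.
    public class StringFootprint {
        public static void main(String[] args) {
            int chars = 8;
            int data = 2 * chars;                  // 16 bytes of actual character data
            int stringObject = 8 + 4 + 4 + 4;      // header + array reference + offset + count
            int charArray = 8 + 4 + 2 * chars;     // header + length field + contents
            int total = stringObject + charArray;  // 48 bytes altogether
            System.out.printf("data %d of %d bytes: %.0f%% meat, %.0f%% fat%n",
                    data, total, 100.0 * data / total, 100.0 * (total - data) / total);
        }
    }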

    Java strings are quite inefficient any way you slice 'em. But at least
    we can get their lengths in O(1) instead of O(n). Take *that*, C
    weenies! Oh, wait, most strings are short and it wouldn't take many
    cycles to find their lengths at O(n) anyway ...

    > Integer objects make up a small fraction of most programs. Many Integer
    > instances are shared, especially if one follows best practices. Not a
    > lot of memory pressure there.


    Not my experience again, not since 1.5. Before autoboxing was introduced
    you might have been right; now I expect there's a lot of "hidden" (i.e.,
    the capitalized classname doesn't appear in code much) use of boxed
    primitives, particularly in collections.
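
    A small sketch of the kind of "hidden" boxing I mean; only values in roughly
    the -128..127 range come out of the shared Integer cache, and everything
    else is a fresh allocation:

    import java.util.ArrayList;
    import java.util.List;

    public class HiddenBoxing {
        public static void main(String[] args) {
            List<Integer> values = new ArrayList<Integer>();
            for (int i = 0; i < 1000; i++) {
                values.add(i);  // autoboxes through Integer.valueOf; above 127 each
                                // call allocates a new Integer object on the heap
            }
            // The class name Integer barely appears in the code, yet 872 of these
            // 1000 adds created a new boxed object.
            System.out.println(values.size());
        }
    }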

    > You show only that the overhead of 4 bytes per object is less than 100%
    > of the object's memory footprint, which is what I said.


    Keep on attacking that straw man ...

    > Which footprint can be reduced by HotSpot, to the point of pulling an
    > object out of the heap altogether.


    ???

    > Where are the numbers? Everyone's arguing from speculation. Show me the
    > numbers.
    >
    > Real numbers. From actual runs. What is the overhead, really? Stop
    > making shit up.


    Stop accusing me of lying when I've done nothing of the sort.

    > Show me the numbers.


    http://c2.com/cgi/wiki?JavaObjectOverheadIsRidiculous

    People ran tests and found an 8 byte overhead per object, much as was
    claimed earlier in this thread. Oh, and that an array of java.awt.Points
    containing pairs of ints is 60% overhead and 40% actual integer values
    in the limit of many array elements -- so array-related overheads
    (object header, length field) go away. That suggests something eating
    another 4 bytes per object *on top of* the Points' object headers and
    integer values, showing that Point has some extra field in it taking up
    space.
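
    For anyone who wants numbers from their own JVM rather than from the wiki,
    a rough way to get them (Java 5 and later) is an instrumentation agent; the
    jar name and class name below are invented for the example:

    import java.lang.instrument.Instrumentation;

    // Loaded with: java -javaagent:sizer.jar YourApp
    // (the jar's manifest must declare Premain-Class: ObjectSizer)
    // Instrumentation.getObjectSize reports the VM's shallow size estimate,
    // object header included.
    public class ObjectSizer {
        private static volatile Instrumentation instrumentation;

        public static void premain(String agentArgs, Instrumentation inst) {
            instrumentation = inst;
        }

        public static long sizeOf(Object o) {
            return instrumentation.getObjectSize(o);
        }

        public static void main(String[] args) {
            // Only meaningful when the class is also loaded as an agent (see above).
            System.out.println("new Object():       " + sizeOf(new Object()));
            System.out.println("new int[0]:         " + sizeOf(new int[0]));
            System.out.println("Integer.valueOf(1): " + sizeOf(Integer.valueOf(1)));
            System.out.println("new java.awt.Point: " + sizeOf(new java.awt.Point(1, 2)));
        }
    }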

    And in regard to the original topic of this thread,

    http://c2.com/cgi/wiki?EveryObjectIsaMonitor

    raises some very good points. Forcing people to use the
    java.util.concurrent classes (while making "synchronized"
    exception-safely lock a ReentrantLock, or similar), or having objects
    be lockable only if they implemented a Lockable interface or inherited a
    Monitor class, would have resulted in code having to document its
    concurrency semantics and explicitly declare which objects, and which
    types of objects, were meant to be used as monitors. That
    more-self-documenting-code point, in which intended-to-be-locked is part
    of something's type and subject to type safety, was not raised in this
    thread. Until now.
     
    supercalifragilisticexpialadiamaticonormalizeringe, Jun 28, 2011
    #12
  13. On 28/06/2011 2:42 PM, Patricia Shanahan wrote:
    > Each String instance has the following fields:
    >
    > private final char value[];
    > private final int offset;
    > private final int count;
    > private int hash;
    >
    > There are 12 bytes in addition to the char array. The offset and count
    > fields allow quick sub-string construction, and hash is used to cache
    > the hashCode result.


    Oh, geez, even *more* overhead. And let's not forget the array has its
    own separate object header and length field!
     
    supercalifragilisticexpialadiamaticonormalizeringe, Jun 28, 2011
    #13
  14. markspace

    On 6/28/2011 12:34 PM, Patricia Shanahan wrote:

    > If, for your purposes, minimal memory use is very important, you may
    > want to consider other languages with other trade-offs.



    Or rolling your own, via the CharSequence interface, for example.
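
    For instance, a bare-bones ISO-8859-1-backed CharSequence, sketched under
    the assumption that one byte per character is enough for the data at hand
    (no interning, no hashing, nothing clever):

    // A compact CharSequence storing ISO-8859-1 text at one byte per character,
    // roughly halving the character-data footprint compared to String's char[].
    public final class ByteString implements CharSequence {
        private final byte[] bytes;

        public ByteString(String s) {
            bytes = new byte[s.length()];
            for (int i = 0; i < bytes.length; i++) {
                char c = s.charAt(i);
                if (c > 0xFF) {
                    throw new IllegalArgumentException("not ISO-8859-1: " + c);
                }
                bytes[i] = (byte) c;
            }
        }

        private ByteString(byte[] bytes) {
            this.bytes = bytes;
        }

        public int length() {
            return bytes.length;
        }

        public char charAt(int index) {
            return (char) (bytes[index] & 0xFF);
        }

        public CharSequence subSequence(int start, int end) {
            byte[] copy = new byte[end - start];
            System.arraycopy(bytes, start, copy, 0, copy.length);
            return new ByteString(copy);
        }

        public String toString() {
            char[] chars = new char[bytes.length];
            for (int i = 0; i < chars.length; i++) {
                chars[i] = (char) (bytes[i] & 0xFF);
            }
            return new String(chars);
        }
    }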
     
    markspace, Jun 28, 2011
    #14
  15. BGB

    On 6/28/2011 9:41 AM, Michal Kleczek wrote:
    > Lew wrote:
    >
    >> Alex J wrote:
    >>> I'm curious why Java designers once decided to allow every object to
    >>> be lockable (i.e. [sic] allow using lock on those).

    >>
    >> Because that makes it possible to do concurrent programming intrinsically.
    >>

    >
    > Could you elaborate more on that?
    > Do you mean there is no other way to do it?
    >
    > I find this question quite intriguing as well since it looks quite useless
    > for example to be able to lock on java.lang.Integer instance (and it is
    > strange for me that java.lang.Integer instance occupies much more memory as
    > int). Surely a compromise must have been done taking into account various
    > language features ("synchronized" keyword, lack of multiple inheritance,
    > lack of closures) - but I am not that knowlegeable enough to explain this
    > compromise in detail.
    >


    yeah...

    they made every object lockable but not every object cloneable, where
    one would think cloning would be generally a higher priority.


    I guess the alternative would require, say:

    class Foo implements Lockable {
        ...
    }

    or at least:

    synchronized class Foo {
    }

    although this could be confusing/misleading.


    >>> I know, that out of such a design decision every Java object contain
    >>> lock index, i.e. new Object() results in allocation of at least 8
    >>> bytes where 4 bytes is object index and 4 bytes is lock index on 32-
    >>> bit JVM.
    >>> I think that it just inefficient waste of space, because not all the
    >>> objects requires to be lockable/waitable.

    >>
    >> Well, that's your opinion.
    >>

    >
    > It is not only his opinion - the size of object header is important
    > especially on memory constrained devices. But not only - there is a reason
    > why UseCompressedOops flag was introduced in 64 bit HotSpot JVM.
    >



    there is also another possible implementation strategy (used in my own
    VM) where, as opposed to a fixed per-object overhead, lock/wait/... are
    implemented internally via a magic table:
    locked objects are added to a table, and waiting threads are
    suspended and added to a second table linked to the object in the first;
    when a status change occurs, the waiting threads are woken up, at
    which point they may do whatever (perform the operation, try to lock the
    object again, ...), and if they try to lock the object again and can't
    get a lock right away, they go back to sleep.

    this strategy trades off some performance for not having to store
    per-object lock state (especially given this is a relatively rare
    operation IME).
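
    translated into plain Java, the rough shape of such a side table is
    something like the sketch below (purely illustrative; a real VM does this
    natively and has to worry much harder about contention on the table and
    about cleaning up dead entries):

    import java.util.IdentityHashMap;
    import java.util.Map;
    import java.util.concurrent.locks.ReentrantLock;

    // A side table mapping objects (by identity) to lock records, so that objects
    // themselves carry no lock field; the table lookup is the price of the savings.
    public final class LockTable {
        private final Map<Object, ReentrantLock> locks =
                new IdentityHashMap<Object, ReentrantLock>();

        private synchronized ReentrantLock lockFor(Object o) {
            ReentrantLock lock = locks.get(o);
            if (lock == null) {
                lock = new ReentrantLock();
                locks.put(o, lock);
            }
            return lock;
        }

        public void lock(Object o) {
            lockFor(o).lock();
        }

        public void unlock(Object o) {
            lockFor(o).unlock();
            // a real implementation would also evict unused entries, and would use
            // weak references so the table does not keep otherwise-dead objects alive
        }
    }

    usage is then table.lock(obj); try { ... } finally { table.unlock(obj); }
    rather than a synchronized block.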


    >>> The better decision, IMHO, would be to introduce lock/wait mechanics
    >>> for only, say, the Lockable descendants.

    >>
    >> Oh, yeah, your opinion is humble.
    >>
    >>> The current approach seems to be very simple, but is the performance
    >>> penalty so small for not be taken into an account?

    >>
    >> Yes. Nonexistent, really.
    >>

    >
    > I wouldn't say so - see:
    > http://wikis.sun.com/display/HotSpotInternals/CompressedOops
    > Especially the sentence:
    > "Memory is pretty cheap, but these days bandwidth and cache is in short
    > supply"
    >


    yeah...

    another bigger problem though is when one has multiple memory-hungry
    apps competing for the available memory (say, several apps each
    eating 1GB-3GB of RAM, and only 4GB in the machine), and one is left
    "computing at the speed of swap", especially when one only has 5400RPM
    drives...

    in this case, smaller programs perform faster: even when the rest of
    the system is bogged down with "the red HD light of doom", the app still
    keeps going ok, mostly because its smaller footprint is less prone to
    getting swapped out, and because when it does get swapped, it can also
    pull things back in from disk much faster.
     
    BGB, Jun 28, 2011
    #15
  16. BGB

    On 6/28/2011 11:54 AM,
    supercalifragilisticexpialadiamaticonormalizeringelimatisticantations wrote:
    > On 28/06/2011 2:42 PM, Patricia Shanahan wrote:
    >> Each String instance has the following fields:
    >>
    >> private final char value[];
    >> private final int offset;
    >> private final int count;
    >> private int hash;
    >>
    >> There are 12 bytes in addition to the char array. The offset and count
    >> fields allow quick sub-string construction, and hash is used to cache
    >> the hashCode result.

    >
    > Oh, geez, even *more* overhead. And let's not forget the array has its
    > own separate object header and length field!



    going OT:

    these sorts of issues were one reason why in my own VM and
    custom-designed language, string types are built into the VM. this
    allows somewhat reducing the costs of storing a string (but, yes, many
    more string operations are O(n), such as getting the length or accessing
    a character by index...).

    many other types are built into the VM, and it also has "fixnum" and
    "flonum" types (basically, where an integer or floating-point value is
    encoded directly into the pointer, via tagging magic, allowing avoiding
    the overhead of using object-based boxes).

    as-is though, the per-object memory cost is a little steep (creating a
    simple class instance will take around 48 bytes, mostly header
    overhead...), partly related to some fancy features supported by the OO
    facilities (and to maintaining isolation between the OO facilities and
    the GC, which adds a layer of GC memory-object-headers).

    partly it is not as big of a killer though, as most common small types
    are built directly into the VM, rather than existing as classes or
    instances.


    or such...
     
    BGB, Jun 28, 2011
    #16
  17. Eric Sosman

    On 6/28/2011 4:43 PM, BGB wrote:
    > [...]
    > they made every object lockable but not every object cloneable, where
    > one would think cloning would be generally a higher priority.


    class MySingleton implements Cloneable {
        // What's wrong with this picture?
    }

    --
    Eric Sosman
     
    Eric Sosman, Jun 29, 2011
    #17
  18. BGB

    On 6/28/2011 5:43 PM, Eric Sosman wrote:
    > On 6/28/2011 4:43 PM, BGB wrote:
    >> [...]
    >> they made every object lockable but not every object cloneable, where
    >> one would think cloning would be generally a higher priority.

    >
    > class MySingleton implements Cloneable {
    > // What's wrong with this picture?
    > }
    >


    makes about as much sense as the ability to lock every object, including
    those where locking is not exactly sane or useful...

    to be strictly sensible, one would have to do something special for
    lockable objects as well as cloneable ones...


    it is like in ActionScript, which requires a special modifier for
    dynamic objects, even though one could allow people to freely shove
    custom fields and methods into instances of any object.

    it made more sense to support non-dynamic objects as well, and actually
    have these as the default case (whereas it probably would have been less
    effort implementation-wise to simply only have dynamic objects and
    optimize the non-dynamic use cases).


    or such...
     
    BGB, Jun 29, 2011
    #18
  19. On 28/06/2011 4:20 PM, Lew wrote:
    > On 06/28/2011 02:52 PM,
    > supercalifragilisticexpialadiamaticonormalizeringelimatisticantations
    > wrote:
    >> On 28/06/2011 2:33 PM, Lew wrote:
    >>> On 06/28/2011 02:23 PM,
    >>> supercalifragilisticexpialadiamaticonormalizeringelimatisticantations
    >>> wrote:
    >>>> On 28/06/2011 2:13 PM, Lew wrote:
    >>>>> Michal Kleczek wrote:
    >>>>>> Lew wrote:
    >>>>>>> Show me the numbers. What penalty?
    >>>>>>
    >>>>>> It is (almost) twice as much memory as it could be and twice as
    >>>>>> much GC
    >>>>>> cycles. Almost because in real world the number of objects that you
    >>>>>> need to
    >>>>>
    >>>>> Nonsense. It's an extra 4 bytes per object. Most objects are much
    >>>>> larger
    >>>>> than 4 bytes,
    >>>>
    >>>> Bullpuckey, other than that a nontrivial object is always at least 12
    >>>> bytes
    >>>
    >>> So 4 bytes overhead is less than 100%, as I said.

    >>
    >> I didn't dispute that. I disputed "most objects are much larger than 4
    >> bytes".
    >> Most objects are only a little bit larger than 4 bytes.

    >
    > And yet you go on and on and on about how much larger than 4 bytes they
    > are, yourself.


    No, I go on and on about how NOT very much larger than 4 bytes they are.

    >>> Most strings in a typical program are non-empty and generally longer
    >>> than two bytes.

    >>
    >> A lot longer, though, or only a little?

    >
    > According to you, a lot longer.


    I meant the actual character data.

    >>> A good percentage are interned.

    >>
    >> Not in my experience.

    >
    > In most of the programs I've seen they are. (Where "good" means
    > something large enough to notice.) String literals alone abound in all
    > the real-world Java code that I've seen. Dynamic string variables exist,
    > too, of course, and I'm not claiming that a majority are interned.


    The stuff I see tends to have a lot of non-literal strings, acquired
    from disk files, the database, or the network. And a lot of them are
    short strings, like short table names and little fragments of XML and SQL.

    > But the overhead of the monitor is still only 4 bytes, less than 100% of
    > the object size.


    You keep harping on that "less than 100%" thing as if that was the part
    in dispute. But I'd say that anything more than about 5% is certainly
    still a significant overhead. How would you like it if they raised sales
    taxes 5% in whatever place it is where you live?

    > Ergo the claim that the monitor doubles the allocation size is bogus.


    I never made or agreed with such a claim, so this is another straw man.

    > So that 4-byte overhead for a monitor is looking like less and less of a
    > problem by comparison.


    In much the way a 5% tax hike is less of a problem than a 100% tax hike.

    > Aren't you proving my point that objects are much larger than 4 bytes?


    No.

    > You're providing evidence for my point. Thanks.


    Bull puckey.

    >> Not my experience again, not since 1.5. Before autoboxing was
    >> introduced you
    >> might have been right; now I expect there's a lot of "hidden" (i.e., the
    >> capitalized classname doesn't appear in code much) use of boxed
    >> primitives,
    >> particularly in collections.

    >
    > Most of which are shared,


    Where's your numbers? Where's your data? What's good for the goose is
    good for the gander...

    > and best practice militates against autoboxing so that scenario you
    > describe represents bad programming.


    Who said it didn't? But it also represents common programming. The best
    programmers don't grind away at reams and reams of Java for BigSoftInc;
    the best programmers are hacking Lisp at Google's blue-sky division or
    working on AI at MIT's Media Lab or shit like that. The whole Java
    language is designed around the expectation that the stuff will be
    written by masses of corporate monkeys with
    adequate-to-somewhat-noteworthy programming skill and maintained by the
    guys that got the C-s and D+s in college, but still need gainful
    employment somewhere.

    >>> You show only that the overhead of 4 bytes per object is less than 100%
    >>> of the object's memory footprint, which is what I said.

    >>
    >> Keep on attacking that straw man ...

    >
    > You're bringing in the straw man.


    Bull puckey.

    > The OP claimed that monitors doubled the memory need for objects.


    What the OP claimed is not a point against me, because I cannot be held
    responsible for something someone else said. So that's irrelevant, i.e.
    it's a straw man, in this branch of the thread. I only argued that it's
    a significant percentage increase in the memory need, and I only did so
    when you made the blatantly false claim that the non-header size of
    objects tends to be much larger.

    > This is the point I addressed,


    Obviously not, since it is not a point I ever made, and you are
    following up to one of my posts to argue with me.

    > You have, in fact, provided substantial evidence for my claim that the
    > monitor presents far less than 100% overhead.


    A claim I didn't dispute. My claim was only that objects aren't
    typically as large as you claimed they were, and that the overhead is
    still significant, even if nowhere near as large as the OP claimed.

    > How is directly addressing the main point remotely classifiable as a
    > straw-man argument?


    Define "the main point"? I'd define it as "whatever my opponent asserted
    in the immediately preceding post", but obviously you're not using that
    definition. Instead you seem to be misattributing to me the more extreme
    position of the thread's OP, and then misguidedly attacking me for that.

    >>> Which footprint can be reduced by HotSpot, to the point of pulling an
    >>> object out of the heap altogether.

    >>
    >> ???

    >
    > It's called "enregistration", and it's one of the optimizations
    > available to HotSpot, as is instantiating an object on the stack instead
    > of the heap.


    More details, please, and references. Or, put more succinctly: [citation
    needed].

    >>> Where are the numbers? Everyone's arguing from speculation. Show me the
    >>> numbers.
    >>>
    >>> Real numbers. From actual runs. What is the overhead, really? Stop
    >>> making shit up.

    >>
    >> Stop accusing me of lying when I've done nothing of the sort.

    >
    > Yet you don't show the numbers.


    Neither do you.

    > What other conclusion can I draw?


    There are lots of other explanations; jumping instantly to the least
    charitable one, namely that your opponent is being outright dishonest,
    says something about your character that I find disturbing.

    > Tell verifiable truth if you don't want to be called to account for
    > fantasy facts.


    Tell that to the man in the mirror.

    > Don't get mad at me for pointing out your failure.


    ???

    I have no failures, so the above sentence is merely a philosophical
    thought-experiment, at least for now. :)

    >>> Show me the numbers.

    >>
    >> http://c2.com/cgi/wiki?JavaObjectOverheadIsRidiculous
    >>
    >> People ran tests and found an 8 byte overhead per object, much as was
    >> claimed
    >> earlier in this thread. Oh, and that an array of java.awt.Points
    >> containing
    >> pairs of ints is 60% overhead and 40% actual integer values in the
    >> limit of
    >> many array elements -- so array-related overheads (object header, length
    >> field) go away. That suggests something eating another 4 bytes per
    >> object *on
    >> top of* the Points' object headers and integer values, showing that
    >> Point has
    >> some extra field in it taking up space.

    >
    > yadda yadda yadda yadda yadda yadda


    So,

    1. You claimed my reason for not giving numbers earlier had to be
    dishonesty, but here you suggest another reason, which is that a) it
    would be effort and b) you'd just invent some long-winded excuse for
    ignoring them and sticking to your theory anyway, so it would be
    *wasted* effort.

    2. You went ahead and accused me (again!) of not having numbers and of
    being dishonest in a post that is subsequent to, and indeed in reply
    to, a post where I *did* include some numbers -- so *you* were
    dishonest.

    3. This means that going to the effort of digging up some numbers and
    posting them just to *shut you up* in your harping about my lack of
    numbers was also wasted effort!

    4. Which of course makes it even less likely that others will bother in
    the future when you demand hard data, having seen you react like this
    once already.

    > Michal Kleczek had written:
    > "It is (almost) twice as much memory as it could be and twice as much GC
    > cycles."


    Michal Kleczek does not speak for me, so it does not matter what he had
    written.

    > I said that was "nonsense", to which you replied "Bullpuckey"


    No, you specifically claimed "most objects are much larger than 4 bytes"
    to which I replied "bullpuckey".

    > then proceeded to demonstrate that I was correct.


    Horsefeathers.

    > And how is that a straw-man argument on my part, again?


    Because you are misattributing Kleczek's position to me, when my
    position is actually less extreme.

    > Given that I directly addressed that claim and you yourself provided
    > evidence for my point? Hm?


    I may have provided evidence for your claim that object overhead is less
    than 100% for a typical object, but not for your claim that most objects
    are "much larger than 4 bytes". A very large number of them are not; in
    fact almost all non-array, non-AWT, non-Swing objects tend to be not
    much larger than 4 bytes (not including the object header) and most
    arrays are wrapped in a containing class instance (ArrayList, HashMap,
    String) so get a triple whammy from two object headers, a pointer from
    the containing object to the array, and the array length field, which
    will come to 24 bytes rather than just 8 on a typical 32-bit system.
    That array needs to get fairly large -- 30 normal references, 60
    characters, 120 bytes -- before the overhead gets below 5% of the
    thing's total memory consumption.

    > I'm not defending the decision to make every object a monitor,


    Really? It sure as hell looks like you are, given that you argue
    vehemently against and border on flaming (I consider repeated
    insinuations that your opponents in debate may be intentionally lying to
    be bordering on flaming) anyone who suggests that that may have been a
    mistake.

    > other than to point out that it contributed mightily to Java's utility
    > and popularity.


    [citation needed]

    > But I am refuting the claim that the
    > monitor adds 100% to the object's memory footprint.


    If that were *all* you were doing I'd take no issue with it. But you
    also claimed:

    1. that "most objects are much larger than 4 bytes"; and
    2. that you think I might be being intentionally dishonest;

    and both of those speculations are pure and utter hogwash.

    > Meanwhile no one is showing me the numbers


    Utter hooey. That might have been true a couple of days ago but it
    hasn't been since 2:52 yesterday afternoon.

    > The addition of monitors to Java has a benefit.


    The addition of monitors to Java is not at issue here. No-one has
    claimed it should have shipped with no locking mechanisms at all.

    I will assume for the rest of that paragraph that you really meant "the
    making of every object a monitor".

    > Is it worth the cost? That depends on the actual cost, and the actual
    > benefit, quantification of which is not in evidence in this
    > discussion.


    That's a comparison of apples and oranges: design time benefits on the
    one hand and run time costs on the other.

    Of course, the design time benefits are reaped, for a given piece of
    code, only once, while any run time costs are incurred every time that
    code is run.

    So the design time benefits have to be huge, really, to outweigh run
    time costs for any piece of code that will be run very frequently and
    will still be in use for decades.

    This clearly is a criterion that includes a lot of Java's own older
    library code, which has been in use since the 90s and some of which is
    very frequently executed by JVMs all over the world.

    > A point you nor anyone else has yet to address, choosing instead to
    > divert with side issues and straw men.


    Horse manure.

    > That'd be you bringing in the straw man, not me, dude.


    Balderdash.

    > Show me the numbers.


    Been there, done that, got the T-shirt.
     
    supercalifragilisticexpialadiamaticonormalizeringe, Jun 29, 2011
    #19
  20. Lew

    Lew wrote:
    >> Ergo the claim that the monitor doubles the allocation size is bogus.


    supercalifragilisticexpialadiamaticonormalizeringelimatisticantations wrote:
    > I never made or agreed with such a claim, so this is another straw man.


    No, but you argued with me when I refuted that claim. So it's not any kind of
    straw man, and since there wasn't an earlier one, it couldn't be "another" one
    anyway.

    I also never said that you did make that claim, as such. But you disagreed
    with my refutation of it, putting you in that topic. And there are other
    people in the world besides yourself, you self-involved little man.

    > Where's your numbers? Where's your data? What's good for the goose is good for
    > the gander...


    I asked first. The one making the claim that there is 100% overhead, or any
    percent overhead, needs to substantiate the claim. I've already proven the
    100% claim false, as have you, but no one has proven any actual number. I
    haven't asserted an actual number, so have nothing to substantiate.

    Show me the numbers.

    > What the OP claimed is not a point against me, because I cannot be held
    > responsible for something someone else said. So that's irrelevant, i.e. it's a


    If you disagree with the refutation of that point, then you are on that topic,
    and you have an obligation to be responsible for that.

    > straw man, in this branch of the thread.


    You keep using that term. I am not sure that it means what you think it means.

    > Define "the main point"? I'd define it as "whatever my opponent asserted in


    Interesting that you frame this in terms of "opponents". We're not supposed
    to be opponents but partners in exploration of the truth. Apparently your
    purpose is to turn this into some kind of contest, and you hold an
    oppositional frame. I am interested in increasing knowledge here, not doing
    battle, so -

    PLONK.

    --
    Lew
    Honi soit qui mal y pense.
    http://upload.wikimedia.org/wikipedia/commons/c/cf/Friz.jpg
     
    Lew, Jun 29, 2011
    #20