Re: Write-to-disk cache via WeakReferences?

Discussion in 'Java' started by Paul J. Lucas, Aug 4, 2005.

  1. I've been looking at your code and thinking about it more.
    Presumably, the GC, when it needs to reclaim memory, will look
    for the biggest objects to get rid of. However, in your code,
    it's only the Proxy objects that the GC will be able to
    reclaim since they're the only objects that are softly
    reachable. Proxy objects are small so it will likely need to
    reclaim lots of them. This will cause the associated Resources
    to be written to disk, more than is probably necessary since
    the GC doesn't know that you intend to reclaim the memory
    for the Resources.

    So this solution is fairly sub-optimal, right?

    - Paul
     
    Paul J. Lucas, Aug 4, 2005
    #1

  2. Paul J. Lucas wrote:
    > I've been looking at your code and thinking about it more.
    > Presumably, the GC, when it needs to reclaim memory, will look
    > for the biggest objects to get rid of. However, in your code,
    > it's only the Proxy objects that the GC will be able to
    > reclaim since they're the only objects that are softly
    > reachable. Proxy objects are small so it will likely need to
    > reclaim lots of them. This will cause the associated Resources
    > to be written to disk, more than is probably necessary since
    > the GC doesn't know that you intend to reclaim the memory
    > for the Resources.


    No GC I am aware of considers object size.


    For some background, Sun's implementations of GC run something like this:

    The vast majority of objects are very short lived. Objects are initially
    allocated in the eden area. The allocation is very fast as it just bumps
    a pointer up (well, it is more complicated than that).

    To reclaim memory from eden, any objects that are still referenced
    are copied into a survivor area. When an object is copied, the references
    to it held in other objects are updated.

    There are two quite small survivor areas, and surviving objects get
    copied from one to the other as it runs out of memory. If not enough
    memory is reclaimed out of the survivor area or an object has been
    copied above a certain number of times, it gets copied to the tenured area.

    The tenured area is where you have all the really old objects. GC here
    is slow. The precise details are dependent upon which GC implementation
    is configured.

    Then there is the permanent area (the permanent generation), which has
    the JVM's class data and interned strings.
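
    The areas described above can be listed from inside a running JVM
    through the standard java.lang.management API. This is only a minimal
    sketch; MemoryPoolsSketch and poolNames are illustrative names, and the
    pool names printed depend on the JVM version and on which collector is
    configured, so no particular output should be expected.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: lists the memory pools the running JVM reports.
class MemoryPoolsSketch {

    /** Returns one "name (type)" entry per memory pool known to this JVM. */
    static List<String> poolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // getType() distinguishes heap pools from non-heap pools.
            names.add(pool.getName() + " (" + pool.getType() + ")");
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : poolNames()) {
            System.out.println(name);
        }
    }
}
```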


    GCs treat the reference in Reference objects specially. WeakReferences
    will be cleared as soon as possible. SoftReferences will be cleared if
    they haven't been used recently and there is little free memory
    (actually the default is something like, clear if the age in seconds is
    greater than the free memory in megabytes).
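
    The rule of thumb in the parentheses above can be written down as a toy
    model. This is not JVM code: SoftRefPolicySketch, wouldClear and
    MS_PER_FREE_MB are made-up names, and the constant of 1000 ms per free
    megabyte is an assumption matching the "seconds versus megabytes"
    description (HotSpot exposes a tunable for this as
    -XX:SoftRefLRUPolicyMSPerMB).

```java
// Toy model of the soft-reference clearing heuristic; not the JVM's own code.
class SoftRefPolicySketch {

    // Assumed default: each free megabyte buys one second of idle time.
    static final long MS_PER_FREE_MB = 1000;

    /**
     * Returns true if a soft reference that has not been used for idleMillis
     * would be cleared when freeMegabytes of heap are free.
     */
    static boolean wouldClear(long idleMillis, long freeMegabytes) {
        return idleMillis > freeMegabytes * MS_PER_FREE_MB;
    }

    public static void main(String[] args) {
        System.out.println(wouldClear(5_000, 64)); // false: plenty of free heap
        System.out.println(wouldClear(5_000, 2));  // true: idle 5 s, only 2 MB free
    }
}
```

    Under this model the size of the referent never enters the decision;
    only idle time and free heap do.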

    When cleared the Reference (including an implementation for finalisers)
    is added to a reference handler queue. A thread handling the queue runs
    at highest priority and feeds the user reference queues and the
    finalizer queue. A special type of object gets to run a finaliser type
    method in the reference handler thread for efficiency.

    Finalisers can then be run in finaliser threads (also running at max
    priority) and ReferenceQueues can pass on the dead References.
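
    How a ReferenceQueue hands over a dead Reference can be seen without
    waiting for the GC, because Reference.enqueue() lets code put a
    reference on its queue by hand. The sketch below does only that; it
    says nothing about when the collector itself would enqueue the
    reference. ReferenceQueueSketch and roundTrip are illustrative names.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

// Illustrative sketch: manual enqueue, so the result does not depend on GC timing.
class ReferenceQueueSketch {

    // Held in a static field so the referent stays strongly reachable
    // and the collector has no reason to enqueue the reference itself.
    static final Object REFERENT = new Object();

    /** Returns true if the queue hands back exactly the reference that was enqueued. */
    static boolean roundTrip() {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        WeakReference<Object> ref = new WeakReference<>(REFERENT, queue);

        boolean enqueued = ref.enqueue();   // what the reference handler would do
        Reference<?> polled = queue.poll(); // non-blocking; null if the queue is empty

        return enqueued && polled == ref && queue.poll() == null;
    }

    public static void main(String[] args) {
        System.out.println(roundTrip());
    }
}
```

    In a real cache, a thread would typically block in queue.remove() and
    do its cleanup for each Reference it receives.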


    So, no, size doesn't matter. How would you calculate size anyway? What
    matters is age and last use of SoftReference (I didn't show in my code,
    but you have to call get on the SoftReference itself to mark it as a use).
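
    The get call mentioned above might look like this in a small cache.
    This is a sketch under assumptions, not the code discussed earlier in
    the thread: SoftCacheSketch, its loader function and the loads counter
    are hypothetical, and the loader stands in for whatever reads a
    Resource back from disk.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical cache: values are softly reachable and reloaded on demand.
class SoftCacheSketch<K, V> {

    private final Map<K, SoftReference<V>> map = new HashMap<>();
    private final Function<K, V> loader; // assumed to recreate a value, e.g. from disk
    int loads = 0;                       // counts loader calls, for illustration only

    SoftCacheSketch(Function<K, V> loader) {
        this.loader = loader;
    }

    synchronized V get(K key) {
        SoftReference<V> ref = map.get(key);
        // Calling get() on the SoftReference itself is what marks it as used.
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            // Never cached, or cleared by the GC: recreate and cache again.
            value = loader.apply(key);
            loads++;
            map.put(key, new SoftReference<>(value));
        }
        return value;
    }

    public static void main(String[] args) {
        SoftCacheSketch<String, byte[]> cache = new SoftCacheSketch<>(k -> new byte[1024]);
        byte[] first = cache.get("a");
        byte[] second = cache.get("a");
        // 'first' is still strongly held here, so the entry cannot have been cleared.
        System.out.println(first == second);
        System.out.println(cache.loads);
    }
}
```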

    Tom Hawtin
    --
    Unemployed English Java programmer
    http://jroller.com/page/tackline/
     
    Thomas Hawtin, Aug 4, 2005
    #2

  3. Thomas Hawtin wrote:

    > So, no, size doesn't matter. How would you calculate size anyway?


    Doesn't some sort of object traversal have to be done anyway to
    determine how reachable an object is? You could add size as you
    traverse.

    It seems silly to reclaim 100 small objects via SoftReferences
    when reclaiming say 1 large object would do the job. When your
    boat is sinking, you want to throw the heaviest object
    overboard first.

    > What matters is age and last use of SoftReference (I didn't show in my code,
    > but you have to call get on the SoftReference itself to mark as a use).


    But the point about reclaiming more than is needed is still true,
    right? And in this case, that means unnecessary disk I/O.

    - Paul
     
    Paul J. Lucas, Aug 4, 2005
    #3
  5. Paul J. Lucas wrote:
    > Thomas Hawtin wrote:
    >
    >
    >>So, no, size doesn't matter. How would you calculate size anyway?

    >
    >
    > Doesn't some sort of object traversal have to be done anyway to
    > determine how reachable an object is? You could add size as you
    > traverse.


    To determine the total size of the object graph referenced by the object
    would mean traversing those objects that are not strongly reachable, and
    hence don't need to be traversed in the normal course of events.

    Also you need to be aware of what an object references. For a start,
    every object will reference its class, which will reference its class
    loader, the class loader's ancestors, all the classes loaded by those
    loaders and all the objects those classes reference.

    > It seems silly to reclaim 100 small objects via SoftReferences
    > when reclaiming say 1 large object would do the job. When your
    > boat is sinking, you want to throw the heaviest object
    > overboard first.


    GC runs more efficiently with plenty of free memory. Therefore there
    isn't really a notion of sufficient memory. An OutOfMemoryError can be
    thrown even with a substantial proportion of the heap freed.

    It doesn't cost anything (other than Reference handling) to reclaim an
    object. It costs to keep an object.

    How much would it cost to recreate an object? No idea. We can guess that
    it is roughly proportional to the object's size. Therefore the size of
    the object is irrelevant.

    >>What matters is age and last use of SoftReference (I didn't show in my code,
    >>but you have to call get on the SoftReference itself to mark as a use).

    >
    > But the point about reclaiming more than is needed is still true,
    > right? And in this case, that means unnecessary disk I/O.


    You certainly want to be careful. However, it doesn't actually matter
    which object is referenced by the SoftReference.

    Tom Hawtin
    --
    Unemployed English Java programmer
    http://jroller.com/page/tackline/
     
    Thomas Hawtin, Aug 4, 2005
    #5
  6. Paul J. Lucas

    Guest

    Thomas Hawtin wrote:
    >
    > GCs treat the reference in Reference objects specially. WeakReferences
    > will be cleared as soon as possible. SoftReferences will be cleared if
    > they haven't been used recently and there is little free memory
    > (actually the default is something like, clear if the age in seconds is
    > greater than the freed memory in megabytes).
    >


    As a slight tangent, it has been my experience that SoftReferences are
    cleared any time you call gc(), regardless of how much free memory there
    is. I had a cache of soft references and my own thread that would call
    gc() itself during idle times so free memory would not fill up with
    dead objects that would need to be reclaimed during peak times. Every
    time it ran all the soft references would go away. Thus I abandoned
    the soft reference cache. They really need to be reclaimed only after
    the first pass of gc is completed and there is still too little memory.
    I could fill up 1 gig of memory with dead objects, call gc(), get all
    that memory back, have plenty of memory now, and my soft reference
    cache is wiped out for no reason.
     
    , Aug 5, 2005
    #6