Memoize and GC

Discussion in 'Ruby' started by Intransition, Apr 13, 2010.

  1. Intransition

    Intransition Guest

    I have a quick question. I just re-read:

    http://www.engineyard.com/blog/2010/memoization-and-id2ref/

    I am wondering about garbage collection. Am I right to think that
    using

        def foo
          @foo ||= something
        end

    will mean that @foo will be GC'd when the object is GC'd, but if I do

        FOO = []

        def foo
          FOO[object_id] ||= something
        end

    Then the memoized entry will just hang around forever?

    If so, then how do we do it without polluting the instance var space
    (as in example 2), but still get the garbage collection (as in example
    1)?

    Thanks.
     
    Intransition, Apr 13, 2010
    #1

  2. On 04/13/2010 06:26 PM, Intransition wrote:
    > I have a quick question. I just re-read:
    >
    > http://www.engineyard.com/blog/2010/memoization-and-id2ref/
    >
    > I am wondering about garbage collection. Am I right to think that
    > using
    >
    >     def foo
    >       @foo ||= something
    >     end
    >
    > will mean that @foo will be GC'd when the object is GC'd, but if I do
    >
    >     FOO = []
    >
    >     def foo
    >       FOO[object_id] ||= something
    >     end
    >
    > Then the memoized entry will just hang around forever?
    >
    > If so, then how do we do it without polluting the instance var space
    > (as in example 2), but still get the garbage collection (as in example
    > 1)?


    I would live with polluting the instance var space. Btw, I usually do
    not use memoize but rather an explicit Hash with default_proc. That way
    I have control over where I place the Hash and hence how long it lives.

    Of course you can hack something together using #object_id and an
    external Hash but then you also need to create a finalizer and soon
    things get overly complicated. If you have issues with polluting the
    instance variable namespace you can a) put all memoize containers into a
    single instance variable (typically a Hash) and / or b) generate
    instance variable names which are likely to be conflict free.
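    The explicit Hash with a default_proc that Robert describes can be
    sketched like this (a minimal illustration, not his actual code; the
    class and method names here are made up):

    ```ruby
    # Memoization via an explicit Hash with a default_proc. The cache
    # lives wherever the Hash lives, so it is garbage collected together
    # with its owner -- no global constant involved.
    class PriceList
      def initialize
        # On a cache miss the default_proc computes and stores the value.
        @prices = Hash.new { |cache, item| cache[item] = expensive_lookup(item) }
      end

      def price(item)
        @prices[item]
      end

      private

      # Stand-in for a costly computation (purely hypothetical).
      def expensive_lookup(item)
        item.to_s.length * 10
      end
    end
    ```

    Because @prices is an ordinary instance variable, the cache dies with
    the PriceList instance, which is exactly the GC behavior of example 1.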

    Kind regards

    robert

    --
    remember.guy do |as, often| as.you_can - without end
    http://blog.rubybestpractices.com/
     
    Robert Klemme, Apr 13, 2010
    #2

  3. On Tue, Apr 13, 2010 at 6:26 PM, Intransition <> wrote:
    > I have a quick question. I just re-read:
    >
    > http://www.engineyard.com/blog/2010/memoization-and-id2ref/
    >
    > I am wondering about garbage collection. Am I right to think that
    > using
    >
    >     def foo
    >       @foo ||= something
    >     end
    >
    > will mean that @foo will be GC'd when the object is GC'd, but if I do
    >
    >     FOO = []
    >
    >     def foo
    >       FOO[object_id] ||= something
    >     end
    >
    > Then the memoized entry will just hang around forever?
    >
    > If so, then how do we do it without polluting the instance var space
    > (as in example 2), but still get the garbage collection (as in example
    > 1)?
    >
    > Thanks.


    One way could be using a closure that redefines the method:

    irb(main):019:0> class A
    irb(main):020:1>   def a
    irb(main):021:2>     puts "expensive calculation of a..."
    irb(main):022:2>     a = "complex value"
    irb(main):023:2>     self.class.send(:define_method, :a) do
    irb(main):024:3*       a
    irb(main):025:3>     end
    irb(main):026:2>     a
    irb(main):027:2>   end
    irb(main):028:1> end
    => nil
    irb(main):029:0> obj = A.new
    => #<A:0xb7c9ef20>
    irb(main):030:0> obj.a
    expensive calculation of a...
    => "complex value"
    irb(main):031:0> obj.a
    => "complex value"

    Jesus.
     
    Jesús Gabriel y Galán, Apr 13, 2010
    #3
  4. On 4/13/10, Intransition <> wrote:
    > will mean that @foo will be GC'd when the object is GC'd, but if I do
    >
    >     FOO = []
    >
    >     def foo
    >       FOO[object_id] ||= something
    >     end


    I have to assume that you meant FOO to be a hash... tho an array will
    work here... sorta.

    But rather than either an array or hash, what if you could write:

        FOO = Cache.new

    That is, FOO would behave mostly like a hash, but unused entries would
    age out after a while. Now if only someone would invent an appropriate
    Cache class.
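    For what it's worth, a minimal sketch of such a Cache class might look
    like the following (illustrative only; it relies on Ruby Hashes
    preserving insertion order, so 1.9+, and it ages entries out by count
    rather than by wall-clock time):

    ```ruby
    # A minimal LRU cache: behaves mostly like a Hash, but evicts the
    # least recently used entry once it grows past max_size.
    class LRUCache
      def initialize(max_size)
        @max_size = max_size
        @store = {}
      end

      def [](key)
        return nil unless @store.key?(key)
        # Delete and re-insert to mark the entry as most recently used.
        value = @store.delete(key)
        @store[key] = value
      end

      def []=(key, value)
        @store.delete(key)
        @store[key] = value
        # Evict the oldest (first) entry when over capacity.
        @store.delete(@store.first.first) if @store.size > @max_size
        value
      end

      def size
        @store.size
      end
    end
    ```

    With `FOO = LRUCache.new(1000)`, memoized entries for dead objects
    would eventually be pushed out instead of hanging around forever.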
     
    Caleb Clausen, Apr 13, 2010
    #4
  5. Intransition

    Intransition Guest

    On Apr 13, 1:28 pm, Caleb Clausen <> wrote:
    > On 4/13/10, Intransition <> wrote:
    >
    > > will mean that @foo will be GC'd when the object is GC'd, but if I do
    > >
    > >     FOO = []
    > >
    > >     def foo
    > >       FOO[object_id] ||= something
    > >     end
    >
    > I have to assume that you meant FOO to be a hash... tho an array will
    > work here... sorta.
    >
    > But rather than either an array or hash, what if you could write:
    >
    >     FOO = Cache.new
    >
    > That is, FOO would behave mostly like a hash, but unused entries would
    > age out after a while. Now if only someone would invent an appropriate
    > Cache class.


    Ah, very good thought! An LRUCache would do pretty well here.

    Well then, that leads me to another project. My implementation of
    LRUCache (lrucache gem) kind of sucks in that it only handles max
    size, so I am looking for a new one to replace it. Any
    recommendations?
     
    Intransition, Apr 13, 2010
    #5
  6. 2010/4/13 Intransition <>:
    >
    > On Apr 13, 1:28 pm, Caleb Clausen <> wrote:
    >> On 4/13/10, Intransition <> wrote:
    >>
    >> > will mean that @foo will be GC'd when the object is GC'd, but if I do
    >> >
    >> >     FOO = []
    >> >
    >> >     def foo
    >> >       FOO[object_id] ||= something
    >> >     end
    >>
    >> I have to assume that you meant FOO to be a hash... tho an array will
    >> work here... sorta.
    >>
    >> But rather than either an array or hash, what if you could write:
    >>
    >>     FOO = Cache.new
    >>
    >> That is, FOO would behave mostly like a hash, but unused entries would
    >> age out after a while. Now if only someone would invent an appropriate
    >> Cache class.
    >
    > Ah, very good thought! An LRUCache would do pretty well here.
    >
    > Well then, that leads me to another project. My implementation of
    > LRUCache (lrucache gem) kind of sucks in that it only handles max
    > size, so I am looking for a new one to replace it. Any
    > recommendations?


    What else do you want to handle? My implementation [1] also only
    considers size, but I have also seen implementations that consider
    insertion or access time. Personally I dislike those solutions
    because they might throw away stuff too early and make the
    implementation more complex. The main point (resource control) is
    already achieved by limiting the size of an LRU map.

    Kind regards

    robert

    [1] http://github.com/rklemme/muppet-laboratories/blob/master/lib/lruhash.rb


    --
    remember.guy do |as, often| as.you_can - without end
    http://blog.rubybestpractices.com/
     
    Robert Klemme, Apr 14, 2010
    #6
  7. Robert Dober

    Robert Dober Guest

    2010/4/13 Jesús Gabriel y Galán <>:
    > On Tue, Apr 13, 2010 at 6:26 PM, Intransition <> wrote:
    >> I have a quick question. I just re-read:
    >>
    >> http://www.engineyard.com/blog/2010/memoization-and-id2ref/
    >>
    >> I am wondering about garbage collection. Am I right to think that
    >> using
    >>
    >>     def foo
    >>       @foo ||= something
    >>     end
    >>
    >> will mean that @foo will be GC'd when the object is GC'd, but if I do
    >>
    >>     FOO = []
    >>
    >>     def foo
    >>       FOO[object_id] ||= something
    >>     end
    >>
    >> Then the memoized entry will just hang around forever?
    >>
    >> If so, then how do we do it without polluting the instance var space
    >> (as in example 2), but still get the garbage collection (as in example
    >> 1)?
    >>
    >> Thanks.
    >
    > One way could be using a closure that redefines the method:
    >
    > irb(main):019:0> class A
    > irb(main):020:1>   def a
    > irb(main):021:2>     puts "expensive calculation of a..."
    > irb(main):022:2>     a = "complex value"
    > irb(main):023:2>     self.class.send(:define_method, :a) do

    I like the closure idea, but I guess the memoizing got lost a little bit ;).
    Thus I would write:

    def a
      _a = nil
      self.class.module_eval do
        define_method(:a) { _a ||= "expensive calculation" }
        define_method(:_invalidate_a) { _a = nil }
      end
      a
    end

    Cheers
    R.

    --
    Learning without thought is labor lost; thought without learning is perilous.
    --- Confucius
     
    Robert Dober, Apr 14, 2010
    #7
  8. Intransition

    Intransition Guest

    On Apr 14, 7:02 am, Robert Klemme <> wrote:

    > What else do you want to handle? My implementation [1] also only
    > considers size, but I have also seen implementations that consider
    > insertion or access time. Personally I dislike those solutions
    > because they might throw away stuff too early and make the
    > implementation more complex. The main point (resource control) is
    > already achieved by limiting the size of an LRU map.


    Fair enough, I just thought it would be nice to have the *option* of a
    time-based cache.
     
    Intransition, Apr 14, 2010
    #8
  9. 2010/4/14 Intransition <>:
    >
    > On Apr 14, 7:02 am, Robert Klemme <> wrote:
    >
    >> What else do you want to handle? My implementation [1] also only
    >> considers size, but I have also seen implementations that consider
    >> insertion or access time. Personally I dislike those solutions
    >> because they might throw away stuff too early and make the
    >> implementation more complex. The main point (resource control) is
    >> already achieved by limiting the size of an LRU map.
    >
    > Fair enough, I just thought it would be nice to have the *option* of a
    > time-based cache.


    Definitely! But this would be a different beast for me. One might
    be able to implement this by using an LRUHash or by inheriting from it,
    but I would not combine that in a single class. Another option could
    be to have a modular approach where the container gets a
    DeletionPolicy that is responsible for deciding when or what to
    delete. The interface could look like:

    interface DeletionPolicy
      def insert(key, val)
      def remove(key, val)
      def access(key, val)
      def next_delete # yields key
    end

    Just a quick hack, though.
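    One way the sketched interface might be filled in (a hypothetical LRU
    policy written for this thread, not taken from any existing library):

    ```ruby
    # A DeletionPolicy following the interface sketched above: tracks
    # access order and nominates the least recently used key for deletion.
    # Uses Ruby's insertion-ordered Hash as an access-ordered set.
    class LRUDeletionPolicy
      def initialize
        @order = {}
      end

      def insert(key, _val)
        @order[key] = true
      end

      def remove(key, _val)
        @order.delete(key)
      end

      def access(key, _val)
        # Move the key to the back of the access order.
        @order.delete(key)
        @order[key] = true
      end

      # The least recently used key, or nil when empty.
      def next_delete
        @order.first && @order.first.first
      end
    end
    ```

    The container would call insert/remove/access as it mutates and
    consult next_delete whenever it decides to evict, keeping the eviction
    strategy pluggable.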

    Kind regards

    robert

    --
    remember.guy do |as, often| as.you_can - without end
    http://blog.rubybestpractices.com/
     
    Robert Klemme, Apr 15, 2010
    #9
  10. Intransition

    Intransition Guest

    On Apr 15, 3:16 am, Robert Klemme <> wrote:
    > 2010/4/14 Intransition <>:
    >
    > > On Apr 14, 7:02 am, Robert Klemme <> wrote:
    > >
    > >> What else do you want to handle? My implementation [1] also only
    > >> considers size, but I have also seen implementations that consider
    > >> insertion or access time. Personally I dislike those solutions
    > >> because they might throw away stuff too early and make the
    > >> implementation more complex. The main point (resource control) is
    > >> already achieved by limiting the size of an LRU map.
    >
    > > Fair enough, I just thought it would be nice to have the *option* of a
    > > time-based cache.
    >
    > Definitely! But this would be a different beast for me. One might
    > be able to implement this by using an LRUHash or by inheriting from it,
    > but I would not combine that in a single class. Another option could
    > be to have a modular approach where the container gets a
    > DeletionPolicy that is responsible for deciding when or what to
    > delete. The interface could look like:
    >
    > interface DeletionPolicy
    >   def insert(key, val)
    >   def remove(key, val)
    >   def access(key, val)
    >   def next_delete # yields key
    > end
    >
    > Just a quick hack, though.


    I kind of like that idea.
     
    Intransition, Apr 15, 2010
    #10
  11. On 4/13/10, Intransition <> wrote:
    > On Apr 13, 1:28 pm, Caleb Clausen <> wrote:
    >> That is, FOO would behave mostly like a hash, but unused entries would
    >> age out after a while. Now if only someone would invent an appropriate
    >> Cache class.

    >
    > Ah, very good thought! An LRUCache would do pretty well here.
    >
    > Well then, that leads to me to another project. My implementation of
    > LRUCache (lrucache gem) kind of sucks in that it only handles max
    > size, so I am looking for a new one to replace it. Any
    > recommendations?


    Last time I went looking for a cache class, I found several written in
    Ruby, but I really needed something that would be super-fast, so it
    had to be written in C. I did write about half of a Ruby hash class in
    C before giving up.... For most use cases, tho, presumably including
    this one, a Ruby implementation would be fine. Having a customizable
    age-out policy is the only other nice-to-have I can think of.
     
    Caleb Clausen, Apr 16, 2010
    #11
  12. Phrogz

    Phrogz Guest

    On Apr 13, 11:28 am, Caleb Clausen <> wrote:
    > But rather than either an array or hash, what if you could write:
    >
    >   FOO=Cache.new
    >
    > That is, FOO would behave mostly like a hash, but unused entries would
    > age out after a while. Now if only someone would invent an appropriate
    > Cache class.


    Undocumented and without a test framework, I offer you a class I wrote
    a while ago that's still live and in use a couple of years later. It's
    not exactly what you're asking for, but close. (It returns string-based
    keys for accessing the objects because my usage has it running
    in a separate process from the main app.) I reasonably hate the
    decision I made to return values versus arrays based on number of
    arguments, and it's not as DRY as it could be, but it's working :)


    # Example usage
    #   class SituationServer < Phrogz::Librarian
    #     ...
    #   end
    #
    #   CLEANING_SCHEDULE = 60*60 # Clean up once an hour
    #   MAX_STALENESS     = 60*60 # Remove items not used in the last hour
    #
    #   @server = SituationServer.new
    #   Thread.new{
    #     loop {
    #       sleep CLEANING_SCHEDULE
    #       removed = @server.remove_stale( MAX_STALENESS )
    #       puts "Just removed #{removed} stale item#{:s unless removed==1}..."
    #     }
    #   }

    module Phrogz; end
    class Phrogz::Librarian
      def initialize( key_size=4 )
        @key_size   = key_size
        @max_keys   = 16 ** key_size
        @serializer = "%0#{key_size}x"
        @library    = {}
        @stored     = {}
        @access     = {}
      end

      def store( *objects )
        keys = objects.map{ |object|
          key,obj = @library.find{ |k,o| o==object }
          unless key
            # FIXME: bail if @library.size >= @max_keys, or else this will
            # infinite loop
            # TODO: come up with a better strategy for finding a free key
            begin
              key = @serializer % rand( @max_keys )
            end while @library.has_key?( key )
            @library[ key ] = object
            @stored[ key ]  = Time.now
          end
          key
        }
        keys.length == 1 ? keys.first : keys
      end
      alias_method :<<, :store

      def fetch( *keys )
        now = Time.now # So that multiple accesses are exactly synchronized
        objects = keys.map{ |key|
          if @library.has_key?( key )
            @access[ key ] = now
            @library[ key ]
          end
        }
        objects.length == 1 ? objects.first : objects
      end

      def size
        @library.size
      end

      def storage_time( *keys )
        results = @stored.values_at( *keys )
        results.length == 1 ? results.first : results
      end

      def age( *keys )
        results = @stored.values_at( *keys ).map{ |t| Time.now - t }
        results.length == 1 ? results.first : results
      end

      def last_access( *keys )
        results = @access.values_at( *keys )
        results.length == 1 ? results.first : results
      end

      def staleness( *keys )
        results = @access.values_at( *keys ).map{ |t| Time.now - t }
        results.length == 1 ? results.first : results
      end

      def discard( *keys )
        results = keys.map{ |key|
          @stored.delete(key)
          @access.delete(key)
          @library.delete(key)
        }
        results.length == 1 ? results.first : results
      end

      def remove_stale( max_staleness )
        now = Time.now
        stale_keys = @access.select{ |key,time| (now-time) > max_staleness }.keys
        stale_keys.each{ |key|
          @stored.delete( key )
          @access.delete( key )
          object = @library.delete( key )
          yield object if block_given?
        }
        stale_keys.length
      end

      def remove_old( max_age )
        now = Time.now
        stale_keys = @stored.select{ |key,time| (now-time) > max_age }.keys
        stale_keys.each{ |key|
          @stored.delete( key )
          @access.delete( key )
          object = @library.delete( key )
          yield object if block_given?
        }
        stale_keys.length
      end
    end
     
    Phrogz, Apr 18, 2010
    #12
