why does this code leak?

ara howard · Jan 9, 2008

cfp2:~ > cat a.rb
#! /usr/bin/env ruby

require 'net/http'

3.times do
  Net::HTTP.start('www.google.com') do |http|
    http_response = http.get '/'
  end

  p ObjectSpace.each_object(Net::HTTPResponse){}
end

cfp2:~ > ruby a.rb
1
2
3

why is this leaking Net::HTTPResponse objects?

a @ http://codeforpeople.com/

tsuraan · Jan 9, 2008

why is this leaking Net::HTTPResponse objects?

I don't know, but jruby has the exact same behaviour. Maybe that helps?

Jan Dvorak · Jan 9, 2008

why is this leaking Net::HTTPResponse objects?

It does not leak; the objects are just not released immediately by the GC. Try
running it 1000 times (preferably against a local server).

Jan

Michael Bevilacqua-Linn · Jan 9, 2008

Why are you assuming it's leaking without giving the GC a chance to run?

require 'net/http'

3.times do
  Net::HTTP.start('www.google.com') do |http|
    http_response = http.get '/'
  end

  p ObjectSpace.each_object(Net::HTTPResponse){}
end

p "Force GC"
ObjectSpace.garbage_collect
p ObjectSpace.each_object(Net::HTTPResponse){}

C:\>ruby test.rb
1
2
3
"Force GC"
0
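
For readers unfamiliar with the counting idiom used throughout this thread, here is a minimal sketch. It relies on ObjectSpace.each_object returning the number of objects it visited; the Probe class and the live_count helper are made up for illustration and appear nowhere in the posts above. What the second count shows depends on the interpreter and its GC, so no output is promised for it.

```ruby
# Hypothetical class and helper, used only for this sketch.
class Probe; end

# ObjectSpace.each_object returns how many objects it yielded,
# so an empty block turns it into a counter.
def live_count(klass)
  ObjectSpace.each_object(klass) {}
end

held = Array.new(5) { Probe.new }   # strong references keep these alive
GC.start
puts live_count(Probe)              # the five held instances are still counted

held = nil                          # drop the references
GC.start
puts live_count(Probe)              # often lower now, but the GC makes no promise
```

A count that is non-zero right after allocation says nothing about a leak; only a count that keeps growing across forced GC runs does.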

MBL

ara howard · Jan 9, 2008

Why are you assuming it's leaking without giving the GC a chance to
run?

because i'm an idiot and pasted the wrong buffer. here is the code:

#
# distilled behaviour from dike.rb
#
class Object
  Methods = instance_methods.inject(Hash.new){|h, m| h.update m => instance_method(m)}
end

class Class
  Methods = instance_methods.inject(Hash.new){|h, m| h.update m => instance_method(m)}

  def new *a, &b
    object = Methods["new"].bind(self).call *a, &b
  ensure
    ObjectSpace.define_finalizer object, finalizer
  end

  def finalizer
    lambda{}
  end
end

#
# the above makes this code leak, but *only* Net::HTTPOK objects
#
require "net/http"

def leak!
  Net::HTTP.start("localhost") do |http|
    puts http.get('/').code
  end
end

3.times {
  puts "---"
  p ObjectSpace.each_object(Net::HTTPResponse){}
  leak!
  GC.start
}

a @ http://codeforpeople.com/

ara howard · Jan 9, 2008

It does not leak, objects are just not immediately released by the
GC. Try it
running 1000 times (on local server preferably).

as i already posted - i posted the wrong code. here is an even more
distilled version:

cfp2:~ > cat a.rb
#
# distilled behaviour from dike.rb
#
class Class
  Finalizer = lambda {}

  def leak_free_finalizer
    Finalizer
  end

  def leaky_finalizer
    lambda{}
  end

  def finalizer
    %r/leak/ =~ ARGV.first ? leaky_finalizer : leak_free_finalizer
  end

  def new *a, &b
    object = allocate
    object.send :initialize, *a, &b
    object
  ensure
    ObjectSpace.define_finalizer object, finalizer
  end
end

#
# the above makes this code leak iff ARGV has "leak" in it
#
require "yaml"

7.times {
  GC.start

  y Array.name => ObjectSpace.each_object(Array){}

  Array.new
}

cfp2:~ > ruby a.rb
---
Array: 21
---
Array: 49
---
Array: 58
---
Array: 58
---
Array: 58
---
Array: 58
---
Array: 58

cfp2:~ > ruby a.rb leak
---
Array: 21
---
Array: 38
---
Array: 54
---
Array: 67
---
Array: 79
---
Array: 91
---
Array: 103

so.... why does installing a static finalizer work ok, but a dynamic
one leaks memory!?

i'm nice and confused now.

a @ http://codeforpeople.com/

ara howard · Jan 9, 2008

I'd say you picked the wrong class for your tests - apparently a
lambda uses an array somehow. This is what I see from the modified
script (attached):

hmmm. doing this in irb suggests not:

list = []
GC.start; p ObjectSpace.each_object(Array){}; list << lambda{}
GC.start; p ObjectSpace.each_object(Array){}; list << lambda{}
GC.start; p ObjectSpace.each_object(Array){}; list << lambda{}
GC.start; p ObjectSpace.each_object(Array){}; list << lambda{}

the number of Arrays will remain static. hrrrmmm....

a @ http://codeforpeople.com/

dan yoder · Jan 9, 2008

so.... why does installing a static finalizer work ok, but a dynamic
one leaks memory!?

My guess is that the lambda is keeping the scope of the new invocation
around, which includes a reference to the newly created array.
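
A small sketch of what this guess means at the language level, assuming a modern Ruby (Binding#local_variable_get needs 2.1 or later). The method name build_with_inline_finalizer is hypothetical. It only shows that a lambda created in the same method body as a local variable can reach that variable through its binding; it does not prove what the 1.8 interpreter retained internally.

```ruby
# Hypothetical helper: the lambda is created in the same frame as `object`.
def build_with_inline_finalizer
  object = Object.new
  finalizer = lambda {}   # closes over this method's locals, including `object`
  [object, finalizer]
end

object, finalizer = build_with_inline_finalizer

# The lambda's binding still exposes the local it was created next to.
captured = finalizer.binding.local_variable_get(:object)
puts captured.equal?(object)   # prints true
```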

Would it be a better test not to use Array (instances of which seem
to be created also by ObjectSpace / YAML)? Ex:

class Foo ; end
7.times {
  GC.start
  count = ObjectSpace.each_object(Foo) {}
  p "Count: #{count}"
  Foo.new
}

-Dan

dan yoder · Jan 9, 2008

Here's an even simpler example. You don't need anything but the
following to demonstrate the problem.

class Foo
  Finalizer = lambda{}
  def initialize
    ObjectSpace.define_finalizer(self,Finalizer)
  end
end
def test
  10.times do
    GC.start
    count = ObjectSpace.each_object(Foo) {}
    p "Count: #{count}"
    Foo.new
  end
end
test # -> no leak
"Count: 0"
"Count: 1"
"Count: 2"
"Count: 2"
"Count: 2"
etc.

Now, re-open Foo and add an inline finalizer.

class Foo
  def initialize
    ObjectSpace.define_finalizer(self,lambda{})
  end
end
test # -> now it leaks
"Count: 1"
"Count: 2"
"Count: 3"
"Count: 4"
etc.

I realize the scope of the lambda invocation is different in this
example, but since the behavior is so similar, I thought it likely
pointed to the same underlying issue.

Dan

ara howard · Jan 9, 2008

I realize the scope of the lambda invocation is different in this
example, but since the behavior is so similar, I thought it likely
pointed to the same underlying issue.

i think it's actually some strange interaction with yaml. check this
out:

cfp2:~ > cat a.rb
class Class
  def finalizer
    lambda{}
  end

  def new *a, &b
    object = allocate
    object.send :initialize, *a, &b
    object
  ensure
    ObjectSpace.define_finalizer object, finalizer
  end
end

class Foo; end
class Bar < Foo; end

c = Array

if ARGV.detect{|arg| arg["leak"]}
  require "yaml"
  7.times {
    GC.start
    y c.name => ObjectSpace.each_object(c){}
    c.new
  }
else
  7.times {
    GC.start
    puts "---"
    puts "#{ c.name }: #{ ObjectSpace.each_object(c){} }"
    c.new
  }
end

cfp2:~ > ruby a.rb
---
Array: 6
---
Array: 11
---
Array: 14
---
Array: 18
---
Array: 20
---
Array: 20
---
Array: 20

cfp2:~ > ruby a.rb leak
---
Array: 21
---
Array: 38

Rick DeNatale · Jan 9, 2008

I realize the scope of the lambda invocation is different in this
example, but since the behavior is so similar, I thought it likely
pointed to the same underlying issue.

i think it's actually some strange interaction with yaml. check this
out: ...
if ARGV.detect{|arg| arg["leak"]}
require "yaml"
7.times {
GC.start
y c.name => ObjectSpace.each_object(c){}
c.new
}
else
7.times {
GC.start
puts "---"
puts "#{ c.name }: #{ ObjectSpace.each_object(c){} }"
c.new
}
end

cfp2:~ > ruby a.rb

cfp2:~ > ruby a.rb leak

Not sure how you got there Ara, I don't see where the OP ever mentioned YAML.

I think the key is where the lambda is created. The lambda is
capturing the binding.

In the first case the lambda is being created in the binding context of
the class, and in particular self is the class.

In the second case, the lambda is being created in the binding context
of the new instance, and self is that new instance, so the lambda in
the finalizer is hanging on to it.

That's my theory anyway.
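
A minimal sketch of the distinction being drawn here, assuming Ruby 2.2 or later for Binding#receiver. The Widget class and its two methods are invented for illustration. It shows only which object a lambda sees as self, depending on where it is created.

```ruby
# Hypothetical class, used only to compare the two creation sites.
class Widget
  # Created inside an instance method: self is the instance.
  def instance_lambda
    lambda {}
  end

  # Created inside a class method: self is the class.
  def self.class_lambda
    lambda {}
  end
end

widget = Widget.new
puts widget.instance_lambda.binding.receiver.equal?(widget)   # prints true
puts Widget.class_lambda.binding.receiver.equal?(Widget)      # prints true
```

A finalizer whose self is the object it is supposed to finalize keeps that object reachable for as long as the finalizer itself is registered.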

Robert Klemme · Jan 9, 2008

I realize the scope of the lambda invocation is different in this
example, but since the behavior is so similar, I thought it likely
pointed to the same underlying issue.

i think it's actually some strange interaction with yaml. check this
out: ..
if ARGV.detect{|arg| arg["leak"]}
require "yaml"
7.times {
GC.start
y c.name => ObjectSpace.each_object(c){}
c.new
}
else
7.times {
GC.start
puts "---"
puts "#{ c.name }: #{ ObjectSpace.each_object(c){} }"
c.new
}
end

cfp2:~ > ruby a.rb

cfp2:~ > ruby a.rb leak

Not sure how you got there Ara, I don't see where the OP ever mentioned YAML.

Rick, OP == ara => true.

I think the key is where the lambda is created. The lambda is
capturing the binding.

In his revisited example (msg id
(e-mail address removed)) the non-leaky finalizer
had self bound to Class and the leaky one to the particular class
instance. I believe Ara's confusion stems from the question of how a class
instance can keep instances in memory.

In the first case the lambda is being created in the binding context of
the class, and in particular self is the class.

In the second case, the lambda is being created in the binding context
of the new instance, and self is that new instance, so the lambda in
the finalizer is hanging on to it.

That's not true for the posting I mentioned above and also not for the
last one. There was only one finalizer but one branch used yaml while
the other did not.

Kind regards

robert

Rick DeNatale · Jan 9, 2008

Ah, missed that.

Anyway ara re-asked the question on ruby-core and Matz answered him
pretty much the same way I did.

ara howard · Jan 9, 2008

Ah, missed that.

Anyway ara re-asked the question on ruby-core and Matz answered him
pretty much the same way I did.

sort of - i still don't see how

1) the 'object' ended up being bound. it was a local var of another
function. makes no sense.

2) why this happened for only say, 1 of 100000 objects

i still think it's a bug, but i have a workaround - see dike
announcement.

a @ http://codeforpeople.com/

Rick DeNatale · Jan 9, 2008

sort of - i still don't see how

1) the 'object' ended up being bound. it was a local var of another
function. makes no sense.

I think that this is the code you are talking about right?

class Class
  Finalizer = lambda { }

  def leak_free_finalizer
    Finalizer
  end

  def leaky_finalizer
    # What's self here? It's the same as for the finalizer method
    # which
    lambda{}
  end

  def finalizer
    %r/leak/ =~ ARGV.first ? leaky_finalizer : leak_free_finalizer
  end

  def new *a, &b
    object = allocate
    object.send :initialize, *a, &b
    object
  ensure
    ObjectSpace.define_finalizer object, finalizer
  end
end

Looking at eval.c, it looks like lambda actually copies information
from the invocation stack, not just from the current frame. In the
leaky finalizer case we have the following on the stack when
leaky_finalizer is called, with the binding represented in hash
notation.

leaky_finalizer  :self => Class
finalizer        :self => Class
new              :self => Class, :object => the new Array instance
code which called Array.new

The leak free finalizer lambda was created once at a time when no
instance to be finalized was on the stack.

ara howard · Jan 9, 2008

Looking at eval.c, it looks like lambda actually copies information
from the invocation stack, not just from the current frame. In the
leaky finalizer case we have the following on the stack when
leaky_finalizer is called, with the binding represented in hash
notation.

leaky_finalizer  :self => Class
finalizer        :self => Class
new              :self => Class, :object => the new Array instance
code which called Array.new

The leak free finalizer lambda was created once at a time when no
instance to be finalized was on the stack.

yeah i think that may be true - but it doesn't make sense. the

def leaky_finalizer
lambda{}
end

is the current paradigm for preventing lambdas from enclosing a
reference to an object. what you are saying is that this call
encloses a local variable from another (the calling one in this case)
function.

how would that not be a bug? why enclose a variable that cannot
possibly be reached in the code that is run? this seems, to me, just like this
code

void *leak(void) { return malloc(42); }

i just cannot imagine why lambda would crawl up the stack outside the
current function. if that is true then they are useless and *any*
invocation means every object in memory at the time of creation can
never be freed while that lambda exists. doesn't that seem
excessive? also i use tons of lambdas in code that does not leak so
this just seems impossible.

nevertheless you may be right!
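
A hedged sketch of the language-level side of this argument, assuming Ruby 2.2 or later for Binding#local_variables. The method build is hypothetical; leaky_finalizer mirrors the helper from the thread. It shows that a lambda created in a helper method cannot name the caller's local variable. It says nothing about what the 1.8 interpreter kept reachable internally, which is the actual point under dispute.

```ruby
# Mirrors the helper from the thread: the lambda is created in its own frame.
def leaky_finalizer
  lambda {}
end

# Hypothetical caller that owns the local variable `object`.
def build
  object = Object.new
  [object, leaky_finalizer]
end

object, finalizer = build

# The lambda's binding belongs to leaky_finalizer, which has no locals,
# so the caller's `object` is not visible through it.
puts finalizer.binding.local_variable_defined?(:object)   # prints false
puts finalizer.binding.local_variables.inspect            # prints []
```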

cheers.

a @ http://codeforpeople.com/

Rick DeNatale · Jan 9, 2008

yeah i think that may be true - but it doesn't make sense. the

def leaky_finalizer
lambda{}
end

is the current paradigm for preventing lambdas from enclosing a
reference to an object. what you are saying is that this call
encloses a local variable from another (the calling one in this case)
function.

Well I'm positing that based on a rather superficial read of eval.c.

how would that not be a bug? why enclose a variable that cannot
possibly be reached in the code that is run?

Well I suspect that it's because no real analysis is done of what's
inside the block when a proc is created, so the assumption is that the
entire binding is needed. The Smalltalk compilers I recall would
produce different types of block objects depending on whether or not
the block contained references to variables outside the block, and/or
contained a return.

Now why it goes back down the stack, if it indeed does, I'm not sure.
Perhaps it has something to do with the lambda vs. Proc.new
differences. I think that Proc.new and lambda/proc both use the
proc_alloc function where this seems to be happening.

In any event this is probably more a topic for ruby-core so I'm
cross-posting this reply there.

this seems, to me, just like this
code

void *leak(void) { return malloc(42); }

i just cannot imagine why lambda would crawl up the stack outside the
current function. if that is true then they are useless and *any*
invocation means every object in memory at the time of creation can
never be freed while that lambda exists. doesn't that seem
excessive? also i use tons of lambdas in code that does not leak so
this just seems impossible.

nevertheless you may be right!

Or not <G>

Robert Dober · Jan 10, 2008

Just trying to answer the question of whether it is a bug by making a minimal
version of the leaking code (and correcting Ara's terrible bug in how
to write 7) and running it with 1.8 and 1.9:
591/92 > cat leak.rb
# vim: sw=2 ts=2 ft=ruby expandtab tw=0 nu syn:
#
class Foo
  def initialize
    ObjectSpace.define_finalizer self, lambda{}
  end
end

(42/6).times {
  GC.start

  p "Foo" => ObjectSpace.each_object(Foo){}

  Foo.new
}

robert@roma:~/log/ruby/theory 13:19:00
592/93 > ruby leak.rb
{"Foo"=>0}
{"Foo"=>1}
{"Foo"=>2}
{"Foo"=>3}
{"Foo"=>4}
{"Foo"=>5}
{"Foo"=>6}
robert@roma:~/log/ruby/theory 13:19:06
593/94 > ruby1.9 leak.rb
{"Foo"=>0}
{"Foo"=>1}
{"Foo"=>1}
{"Foo"=>1}
{"Foo"=>1}
{"Foo"=>1}
{"Foo"=>1}

I do not know if this is good enough to say it is a bug in 1.8, but I
would somehow suspect so.

Cheers
Robert

ara howard · Jan 10, 2008

Just trying to answer the question of whether it is a bug by making a minimal
version of the leaking code (and correcting Ara's terrible bug in how
to write 7) and running it with 1.8 and 1.9:
591/92 > cat leak.rb
# vim: sw=2 ts=2 ft=ruby expandtab tw=0 nu syn:
#
class Foo
  def initialize
    ObjectSpace.define_finalizer self, lambda{}
  end
end

(42/6).times {
  GC.start

<snip>

nice! ;-)

so i posted something like this over on ruby-core, which i'll add for
posterity:

"

to add a note to the end of the thread the fix to my problem was
essentially this

class Class
  New = instance_method :new
  Objects = Hash.new
  Destroy = lambda{|object_id| Objects.delete object_id}

  def new *a, &b
    object = allocate
    Objects[object.object_id] = caller
    object.send :initialize, *a, &b
    object
  ensure
    ObjectSpace.define_finalizer object, Destroy
  end
end

and that

class Class
def destroy
lambda{}
end

....
ObjectSpace.define_finalizer object, destroy
....
end

perhaps i have not explained this adequately, but i still feel this is
a bug. the 'self' that is enclosed is *never* 'object' and that self
has no reference, save a local variable in another function, that
refers to 'object'.

in any case i have a workaround and dike.rb is better than ever (see
[ANN] on ruby-talk) so thanks all for the input!

"
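
A sketch of the same bookkeeping idea without patching Class#new, for readers who want to try it in isolation. The AllocationLog module and its constant and method names are hypothetical, not part of dike.rb. The finalizer is a single lambda stored in a constant, so it is created once, with the module as self, and never inside a method that has the tracked object in scope.

```ruby
# Hypothetical module; records where each tracked object was created.
module AllocationLog
  Sites = {}

  # One shared finalizer, created once at module level.
  # Finalizers are called with the object_id of the collected object.
  Forget = lambda { |object_id| Sites.delete(object_id) }

  def self.track(object)
    Sites[object.object_id] = caller        # remember the allocation site
    ObjectSpace.define_finalizer(object, Forget)
    object
  end
end

tracked = AllocationLog.track(Object.new)
puts AllocationLog::Sites.key?(tracked.object_id)   # prints true

# Call the finalizer by hand to show what the GC would trigger later.
AllocationLog::Forget.call(tracked.object_id)
puts AllocationLog::Sites.key?(tracked.object_id)   # prints false
```

Keying the table by object_id rather than by the object itself matters: holding the object as a hash key would keep it alive just as surely as a leaky finalizer.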

a @ http://codeforpeople.com/

Tim Pease · Jan 10, 2008

Just trying to answer the question of whether it is a bug by making a minimal
version of the leaking code (and correcting Ara's terrible bug in how
to write 7) and running it with 1.8 and 1.9:
591/92 > cat leak.rb
# vim: sw=2 ts=2 ft=ruby expandtab tw=0 nu syn:
#
class Foo
  def initialize
    ObjectSpace.define_finalizer self, lambda{}
  end
end

(42/6).times {
  GC.start

  p "Foo" => ObjectSpace.each_object(Foo){}

  Foo.new
}

When you create the lambda, what is the value of "self" inside the
lambda?

The answer is that it is going to be the object in which the lambda
was created. In the code above, this would be the object that you are
trying to finalize -- i.e. an instance of Foo. Since the lambda has a
reference to the Foo instance, that instance will always be marked by
the GC, and hence, it will never be garbage collected.

You can verify this by adding a puts statement inside the lambda ...

$ cat a.rb
class Foo
  def initialize
    ObjectSpace.define_finalizer self, lambda {puts self.object_id}
  end
end

10.times {
  GC.start
  Foo.new
  p "Foo" => ObjectSpace.each_object(Foo){}
}

$ ruby a.rb
{"Foo"=>1}
{"Foo"=>2}
{"Foo"=>3}
{"Foo"=>4}
{"Foo"=>5}
{"Foo"=>6}
{"Foo"=>7}
{"Foo"=>8}
{"Foo"=>9}
{"Foo"=>10}

The object ID is never printed; hence, the finalizer is never called.

Now let's define the finalizer lambda outside the scope of the
instance we are trying to finalize. This prevents the lambda from
having a reference to the Foo instance.

$ cat a.rb
Finalizer = lambda do |object_id|
  puts object_id
end

class Foo
  def initialize
    ObjectSpace.define_finalizer self, Finalizer
  end
end

10.times {
  GC.start
  Foo.new
  p "Foo" => ObjectSpace.each_object(Foo){}
}

$ ruby a.rb
{"Foo"=>1}
89480
{"Foo"=>1}
{"Foo"=>2}
89480
{"Foo"=>2}
84800
{"Foo"=>2}
89480
{"Foo"=>2}
84800
{"Foo"=>2}
89480
{"Foo"=>2}
84800
{"Foo"=>2}
{"Foo"=>3}
84780
84800
89480

You'll notice that the Foo instance count does not grow (yes, it is
shown as non-zero at the end of the program). But you'll also notice
that the finalizer is called exactly 10 times. Even though the last
Foo instance count shows 3 objects remaining, they are cleaned up as
shown by their object IDs being printed out by our finalizer.

The lesson here is that you always need to create your finalizer Proc
at the Class level, not at the instance level.
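
One common way to follow this advice when the finalizer needs per-object data is to build it in a class method and pass in only the values it needs. The sketch below assumes Ruby 2.2 or later for the binding checks in the usage lines; ScratchFile and finalizer_for are hypothetical names chosen for illustration.

```ruby
# Hypothetical class: the finalizer needs the path, but not the instance.
class ScratchFile
  # Built at class level: self is ScratchFile, and the only captured
  # local is `path`, so the proc holds no reference to any instance.
  def self.finalizer_for(path)
    proc { |_object_id| puts "cleaning up #{path}" }
  end

  def initialize(path)
    @path = path
    ObjectSpace.define_finalizer(self, self.class.finalizer_for(path))
  end
end

file = ScratchFile.new("/tmp/example")

# Inspect a finalizer built the same way to see what it closes over.
finalizer = ScratchFile.finalizer_for("/tmp/example")
puts finalizer.binding.receiver.equal?(ScratchFile)   # prints true
puts finalizer.binding.local_variables.inspect        # prints [:path]
```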

The ruby garbage collector is conservative, but it will clean up after
you just fine.

Blessings,
TwP
