Ruby Specification


Jeff Mitchell

Joel VanderWerf said:
In my understanding, ruby makes no guarantees about memory management,
except that unreferenced objects will eventually get recycled. A ruby
specification should state explicitly that, aside from this guarantee,
behavior is unspecified.

Of course, if programmers don't read the spec, they can still get bitten...

Technically there is not even a guarantee that an unreferenced object
will be recycled. Data on the stack can collide with a non-immediate
VALUE, causing the GC to believe a reference still exists after the
"real" references are gone.

Of course the probability of such a collision is small, but it doesn't
hurt to know it is there. For long-running daemons using many objects
and/or ruby C code using lots of stack space it may possibly be
something to consider, though I haven't looked at the numbers yet.
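To make that failure mode concrete, here is a small standalone C
sketch of what a conservative scan does. It is purely illustrative
(not Ruby's GC code), and it uses a simulated stack array instead of
the real machine stack: any word whose bit pattern happens to match a
heap object's address keeps that object alive, even if the word is
really just an integer.

    /* Illustration of a conservative scan producing a false positive.
     * Not Ruby's GC; the "stack" here is just an array of words.      */
    #include <stdio.h>
    #include <stdint.h>
    #include <stdlib.h>

    enum { HEAP_SLOTS = 4, STACK_WORDS = 8 };

    static void *heap[HEAP_SLOTS];   /* pretend object heap            */
    static int   marked[HEAP_SLOTS]; /* mark bits set by the scan      */

    /* Conservative scan: any word equal to a heap address is treated
     * as a reference, because pointers and integers look the same.    */
    static void conservative_scan(const uintptr_t *stack, size_t nwords)
    {
        for (size_t w = 0; w < nwords; w++)
            for (int i = 0; i < HEAP_SLOTS; i++)
                if (stack[w] == (uintptr_t)heap[i])
                    marked[i] = 1;
    }

    int main(void)
    {
        for (int i = 0; i < HEAP_SLOTS; i++)
            heap[i] = malloc(16);

        uintptr_t stack[STACK_WORDS] = { 0 };
        /* No live reference to heap[2] exists, but one "stack" word
         * happens to carry the same bit pattern.                      */
        stack[3] = (uintptr_t)heap[2];

        conservative_scan(stack, STACK_WORDS);

        for (int i = 0; i < HEAP_SLOTS; i++)
            printf("object %d: %s\n", i,
                   marked[i] ? "retained (false positive possible)"
                             : "collectable");
        return 0;
    }

Ruby's collector scans the real C stack (and registers) in the same
conservative way, which is why a stale or coincidental bit pattern can
keep an object alive longer than expected.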
 

Joel VanderWerf

Jeff said:
Technically there is not even a guarantee that an unreferenced object
will be recycled. Data on the stack can collide with a non-immediate
VALUE, causing the GC to believe a reference still exists after the
"real" references are gone.

Of course the probability of such a collision is small, but it doesn't
hurt to know it is there. For long-running daemons using many objects
and/or ruby C code using lots of stack space it may possibly be
something to consider, though I haven't looked at the numbers yet.

True. I was forgetting that the GC is conservative. So "eventually" may
mean "before the interpreter exits".
 

Lothar Scholz

Hello Jeff,


JM> Of course the probability of such a collision is small, but it doesn't
JM> hurt to know it is there. For long-running daemons using many objects
JM> and/or ruby C code using lots of stack space it may possibly be
JM> something to consider, though I haven't looked at the numbers yet.

Arachno Ruby, as a larger program, uses the Boehm-Demers-Weiser GC.
That collector considers not only everything on the stack but also
everything in the heap as possible pointers, so many things can be
wrongly treated as still-in-use data. But that does not seem to be
the problem: compared with the exact internal GC of SmartEiffel, the
overhead from unreleased data seems to be around 20%. The much more
serious problem with the Boehm-Demers-Weiser GC is its high internal
heap fragmentation; it only seems to stabilize after reaching about
five times the dataset size. From previous posts here, I believe the
Ruby GC is working well in that respect.

So if we ever change the internal memory management in Ruby (I guess
we won't), then I hope it will also include a compacting GC. In the
meantime, a lazy mark with a write barrier and a lazy sweep would be
a good thing to add. I have found that the GC can give you a huge
time penalty because it simply runs too often and there is no way to
customize this.
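For readers who have not used it: the sketch below shows roughly how
the Boehm-Demers-Weiser collector is dropped in as a malloc
replacement, using its public <gc.h> interface (typically linked with
-lgc). It is only an illustration, not code from Arachno Ruby or
SmartEiffel; the pointer-free GC_MALLOC_ATOMIC variant is what lets
you stop the collector from scanning plain binary data as if it
contained pointers.

    /* Minimal example of the Boehm-Demers-Weiser GC used in place of
     * malloc.  Build with something like:  cc demo.c -lgc             */
    #include <stdio.h>
    #include <gc.h>

    int main(void)
    {
        GC_INIT();                      /* initialize the collector    */

        /* Ordinary GC-managed allocation: its contents are scanned
         * conservatively for anything that looks like a pointer.      */
        int **table = GC_MALLOC(100 * sizeof *table);

        /* "Atomic" allocation: the collector is told this block never
         * contains pointers, so its contents are not scanned at all.  */
        char *pixels = GC_MALLOC_ATOMIC(1 << 20);
        pixels[0] = 0;

        for (int i = 0; i < 100; i++)
            table[i] = GC_MALLOC(sizeof **table);  /* no free() needed */

        GC_gcollect();                  /* force a full collection     */
        printf("heap size after collection: %lu bytes\n",
               (unsigned long)GC_get_heap_size());
        return 0;
    }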
 

Gully Foyle

Lothar said:
Hello Jeff,


JM> Of course the probability of such a collision is small, but it doesn't
JM> hurt to know it is there. For long-running daemons using many objects
JM> and/or ruby C code using lots of stack space it may possibly be
JM> something to consider, though I haven't looked at the numbers yet.

Arachno Ruby, as a larger program, uses the Boehm-Demers-Weiser GC.
That collector considers not only everything on the stack but also
everything in the heap as possible pointers, so many things can be
wrongly treated as still-in-use data. But that does not seem to be
the problem: compared with the exact internal GC of SmartEiffel, the
overhead from unreleased data seems to be around 20%. The much more
serious problem with the Boehm-Demers-Weiser GC is its high internal
heap fragmentation; it only seems to stabilize after reaching about
five times the dataset size. From previous posts here, I believe the
Ruby GC is working well in that respect.

So if we ever change the internal memory management in Ruby (I guess
we won't), then I hope it will also include a compacting GC. In the
meantime, a lazy mark with a write barrier and a lazy sweep would be
a good thing to add. I have found that the GC can give you a huge
time penalty because it simply runs too often and there is no way to
customize this.

Fascinating stuff. You sound MUCH smarter than Lothar of the Hill
People (played by Mike Myers). And your Arachno Ruby IDE looks really
cool too.

IMHO, it would be good to leverage all the research and millions
already invested by Sun, IBM, et al. in improving Java's GC. I think
Java provides some command-line options related to garbage collection.

No need to reinvent the wheel--unless the existing wheel isn't round. :)

I think Ruby 2.0 is very lucky to be looking into bytecode generation
and gc today because of all the work done recently by others.
 

Hal Fulton

Gully said:
Fascinating stuff. You sound MUCH smarter than Lothar of the Hill
People (played by Mike Myers). And your Arachno Ruby IDE looks really
cool too.

Haha, now, play nice. Lothar is a sharp guy. Though LOL at Mike Myers.

Gully said:
IMHO, it would be good to leverage all the research and millions
already invested by Sun, IBM, et al. in improving Java's GC. I think
Java provides some command-line options related to garbage collection.

No need to reinvent the wheel--unless the existing wheel isn't round. :)

But is it? Personally, GC is one of those areas where I'm completely
ignorant. I need to delve into SICP one of these days... it's on my
to-do list. ;)


Hal
 

Ruben

At Wed, 21 Jul 2004 14:05:43 +0900, Lothar said:
Arachno Ruby, as a larger program, uses the Boehm-Demers-Weiser GC.
That collector considers not only everything on the stack but also
everything in the heap as possible pointers, so many things can be
wrongly treated as still-in-use data.

Do you use it as a drop-in replacement for "malloc"? Because, from
what I remember, this GC provides several functions to give it more
precise information about the data in your heap. I think there are
functions that tell it to allocate, for example, a chunk of memory
for binary data which you, as the programmer, guarantee will never
contain pointers.

Lothar said:
But that does not seem to be the problem: compared with the exact
internal GC of SmartEiffel, the overhead from unreleased data seems
to be around 20%. The much more serious problem with the
Boehm-Demers-Weiser GC is its high internal heap fragmentation; it
only seems to stabilize after reaching about five times the dataset
size. From previous posts here, I believe the Ruby GC is working well
in that respect.

This behavior depends to a great extent on the behavior of your
application. For BDW, fragmentation could only be improved by having
a better allocation strategy.

Lothar said:
So if we ever change the internal memory management in Ruby (I guess
we won't), then I hope it will also include a compacting GC. In the
meantime, a lazy mark with a write barrier and a lazy sweep would be
a good thing to add. I have found that the GC can give you a huge
time penalty because it simply runs too often and there is no way to
customize this.

Compacting is only possible if you know the root set and have exact
information about which cells contain a pointer and which cells
don't. AFAIK, a pure compacting collector isn't possible with Ruby
because of the C extensions... or because of scanning the raw stack
to find pointers in the root set(?)
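That requirement is easy to see in miniature. The sketch below is
purely illustrative (it is not from Ruby or BDW): a hand-written
is_pointer flag stands in for the exact layout information a precise
collector would have, and moving an object is only safe because every
word known to be a reference can be rewritten, while an ambiguous
word with the same bit pattern must be left alone.

    /* Why compaction needs exact pointer information (illustrative). */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <stdint.h>

    /* A cell whose first word may or may not be a pointer.  The
     * is_pointer flag plays the role of exact type information.       */
    typedef struct {
        uintptr_t word;       /* pointer to another cell, or plain data */
        int       is_pointer; /* exact layout info the collector needs  */
    } cell;

    /* Move `old` to a new address and patch every known reference.    */
    static cell *compact_move(cell *old, cell *roots[], size_t nroots)
    {
        cell *copy = malloc(sizeof *copy);
        memcpy(copy, old, sizeof *copy);
        for (size_t i = 0; i < nroots; i++)
            if (roots[i]->is_pointer && roots[i]->word == (uintptr_t)old)
                roots[i]->word = (uintptr_t)copy; /* safe: known pointer */
        free(old);
        return copy;
    }

    int main(void)
    {
        cell *target = malloc(sizeof *target);
        target->word = 42;
        target->is_pointer = 0;

        cell ref  = { (uintptr_t)target, 1 }; /* a real reference       */
        cell fake = { (uintptr_t)target, 0 }; /* same bits, just data   */
        cell *roots[] = { &ref, &fake };

        target = compact_move(target, roots, 2);
        printf("ref  now points to %p (patched)\n", (void *)ref.word);
        printf("fake still holds   %#lx (left alone)\n",
               (unsigned long)fake.word);
        return 0;
    }

With only a conservative scan, the collector could not tell ref from
fake, so it could not safely rewrite either of them, and therefore
could not move the object at all.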

Ruben
 

Lothar Scholz

Hello Ruben,

R> Do you use it as a drop-in replacement for "malloc"?

Yes, I only changed malloc and calloc.

R> Because, from what I remember, this GC provides several functions
R> to give it more precise information about the data in your heap. I
R> think there are functions that tell it to allocate, for example, a
R> chunk of memory for binary data which you, as the programmer,
R> guarantee will never contain pointers.

Right, it can completely avoid scanning heap data if you give a type
descriptor for the allocated memory that tells the GC where it must
look for embedded pointers. But this requires a larger hack in the
SmartEiffel compiler, and I hope that the new 2.0 release will fix
the bugs in their internal GC.
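For what it's worth, the typed-allocation interface in the
Boehm-Demers-Weiser collector looks roughly like the sketch below
(gc_typed.h); treat the details as an illustration of the idea rather
than a tested recipe. You describe which words of an object may hold
pointers, and the collector skips everything else when scanning.

    /* Rough sketch of bdwgc typed allocation: the descriptor tells the
     * collector exactly which words may hold pointers.                */
    #include <stdio.h>
    #include <gc.h>
    #include <gc_typed.h>

    struct node {
        struct node *next;    /* the only pointer field                */
        double       data[8]; /* plain binary payload, never scanned   */
    };

    int main(void)
    {
        GC_INIT();

        /* One bit per word of the struct; set the bit for the word
         * that actually contains a pointer.                           */
        GC_word bitmap[GC_BITMAP_SIZE(struct node)] = { 0 };
        GC_set_bit(bitmap, GC_WORD_OFFSET(struct node, next));
        GC_descr descr = GC_make_descriptor(bitmap,
                                            GC_WORD_LEN(struct node));

        struct node *head = GC_malloc_explicitly_typed(sizeof *head,
                                                       descr);
        head->next = GC_malloc_explicitly_typed(sizeof *head, descr);
        head->data[0] = 3.14;

        printf("allocated two typed nodes, heap size %lu bytes\n",
               (unsigned long)GC_get_heap_size());
        return 0;
    }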

R> This behavior depends to a great extent on the behavior of your
R> application. For BDW, fragmentation could only be improved by
R> having a better allocation strategy.

Right, I heard about the same numbers from other people on the
SmartEiffel mailing list.

R> Compacting is only possible if you know the root set and have exact
R> information about which cells contain a pointer and which cells
R> don't. AFAIK, a pure compacting collector isn't possible with Ruby
R> because of the C extensions... or because of scanning the raw stack
R> to find pointers in the root set(?)

Right, it is impossible with the current design; it would need a lot
of changes to external C extensions, and Ruby threads would also need
their own stacks instead of being mixed with the C stack. It's just
that when you talk about a complete rewrite, like the OP's wish for a
"rubycc", then it should be discussed. matz has already pointed out
that he doesn't care about fragmentation and will not add something
like this to the official Ruby implementation.
 
