Ruby Specification


Jeff Mitchell

Joel VanderWerf said:
In my understanding, ruby makes no guarantees about memory management,
except that unreferenced objects will eventually get recycled. A ruby
specification should state explicitly that, aside from this guarantee,
behavior is unspecified.

Of course, if programmers don't read the spec, they can still get bitten...

Technically there is not even a guarantee that an unreferenced object
will be recycled. Data on the stack can collide with a non-immediate
VALUE, causing the GC to believe a reference still exists after the
"real" references are gone.

Of course the probability of such a collision is small, but it doesn't
hurt to know it is there. For long-running daemons using many objects
and/or ruby C code using lots of stack space it may possibly be
something to consider, though I haven't looked at the numbers yet.
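To make that failure mode concrete, here is a small standalone C
sketch of what a conservative scan does. It is purely illustrative
(not Ruby's GC code), and it uses a simulated stack array instead of
the real machine stack: any word whose bit pattern happens to match a
heap object's address keeps that object alive, even if the word is
really just an integer.

    /* Illustration of a conservative scan producing a false positive.
     * Not Ruby's GC; the "stack" here is just an array of words.      */
    #include <stdio.h>
    #include <stdint.h>
    #include <stdlib.h>

    enum { HEAP_SLOTS = 4, STACK_WORDS = 8 };

    static void *heap[HEAP_SLOTS];   /* pretend object heap            */
    static int   marked[HEAP_SLOTS]; /* mark bits set by the scan      */

    /* Conservative scan: any word equal to a heap address is treated
     * as a reference, because pointers and integers look the same.    */
    static void conservative_scan(const uintptr_t *stack, size_t nwords)
    {
        for (size_t w = 0; w < nwords; w++)
            for (int i = 0; i < HEAP_SLOTS; i++)
                if (stack[w] == (uintptr_t)heap[i])
                    marked[i] = 1;
    }

    int main(void)
    {
        for (int i = 0; i < HEAP_SLOTS; i++)
            heap[i] = malloc(16);

        uintptr_t stack[STACK_WORDS] = { 0 };
        /* No live reference to heap[2] exists, but one "stack" word
         * happens to carry the same bit pattern.                      */
        stack[3] = (uintptr_t)heap[2];

        conservative_scan(stack, STACK_WORDS);

        for (int i = 0; i < HEAP_SLOTS; i++)
            printf("object %d: %s\n", i,
                   marked[i] ? "retained (false positive possible)"
                             : "collectable");
        return 0;
    }

Ruby's collector scans the real C stack (and registers) in the same
conservative way, which is why a stale or coincidental bit pattern can
keep an object alive longer than expected.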
 

Joel VanderWerf

Jeff said:
Technically there is not even a guarantee that an unreferenced object
will be recycled. Data on the stack can collide with a non-immediate
VALUE, causing the GC to believe a reference still exists after the
"real" references are gone.

Of course the probability of such a collision is small, but it doesn't
hurt to know it is there. For long-running daemons using many objects
and/or ruby C code using lots of stack space it may possibly be
something to consider, though I haven't looked at the numbers yet.

True. I was forgetting that the GC is conservative. So "eventually" may
mean "before the interpreter exits".
 

Lothar Scholz

Hello Jeff,


JM> Of course the probability of such a collision is small, but it doesn't
JM> hurt to know it is there. For long-running daemons using many objects
JM> and/or ruby C code using lots of stack space it may possibly be
JM> something to consider, though I haven't looked at the numbers yet.

Arachno Ruby, as a larger program, uses the Boehm-Demers-Weiser GC.
That collector considers not only everything on the stack but also
everything in the heap as possible pointers, so many things can be
wrongly treated as still-in-use data. But that does not seem to be
the problem: compared with the exact internal GC of SmartEiffel, the
overhead from unreleased data seems to be around 20%. The much more
serious problem with the Boehm-Demers-Weiser GC is its high internal
heap fragmentation; it only seems to stabilize after reaching about
five times the dataset size. From previous posts here, I believe the
Ruby GC is working well in that respect.

So if we ever change the internal memory management in Ruby (I guess
we won't), then I hope it will also include a compacting GC. In the
meantime, a lazy mark with a write barrier and a lazy sweep would be
a good thing to add. I have found that the GC can give you a huge
time penalty because it simply runs too often and there is no way to
customize this.
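For readers who have not used it: the sketch below shows roughly how
the Boehm-Demers-Weiser collector is dropped in as a malloc
replacement, using its public <gc.h> interface (typically linked with
-lgc). It is only an illustration, not code from Arachno Ruby or
SmartEiffel; the pointer-free GC_MALLOC_ATOMIC variant is what lets
you stop the collector from scanning plain binary data as if it
contained pointers.

    /* Minimal example of the Boehm-Demers-Weiser GC used in place of
     * malloc.  Build with something like:  cc demo.c -lgc             */
    #include <stdio.h>
    #include <gc.h>

    int main(void)
    {
        GC_INIT();                      /* initialize the collector    */

        /* Ordinary GC-managed allocation: its contents are scanned
         * conservatively for anything that looks like a pointer.      */
        int **table = GC_MALLOC(100 * sizeof *table);

        /* "Atomic" allocation: the collector is told this block never
         * contains pointers, so its contents are not scanned at all.  */
        char *pixels = GC_MALLOC_ATOMIC(1 << 20);
        pixels[0] = 0;

        for (int i = 0; i < 100; i++)
            table[i] = GC_MALLOC(sizeof **table);  /* no free() needed */

        GC_gcollect();                  /* force a full collection     */
        printf("heap size after collection: %lu bytes\n",
               (unsigned long)GC_get_heap_size());
        return 0;
    }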
 

Gully Foyle

Lothar said:
Hello Jeff,


JM> Of course the probability of such a collision is small, but it doesn't
JM> hurt to know it is there. For long-running daemons using many objects
JM> and/or ruby C code using lots of stack space it may possibly be
JM> something to consider, though I haven't looked at the numbers yet.

Arachno Ruby, as a larger program, uses the Boehm-Demers-Weiser GC.
That collector considers not only everything on the stack but also
everything in the heap as possible pointers, so many things can be
wrongly treated as still-in-use data. But that does not seem to be
the problem: compared with the exact internal GC of SmartEiffel, the
overhead from unreleased data seems to be around 20%. The much more
serious problem with the Boehm-Demers-Weiser GC is its high internal
heap fragmentation; it only seems to stabilize after reaching about
five times the dataset size. From previous posts here, I believe the
Ruby GC is working well in that respect.

So if we ever change the internal memory management in Ruby (I guess
we won't), then I hope it will also include a compacting GC. In the
meantime, a lazy mark with a write barrier and a lazy sweep would be
a good thing to add. I have found that the GC can give you a huge
time penalty because it simply runs too often and there is no way to
customize this.

Fascinating stuff. You sound MUCH smarter than Lothar of the Hill
People (played by Mike Myers). And your Arachno Ruby IDE looks really
cool too.

IMHO, it would be good to leverage all the research and millions
already invested by Sun, IBM, et al. in improving Java's GC. I think
Java provides some command-line options related to garbage collection.

No need to reinvent the wheel--unless the existing wheel isn't round. :)

I think Ruby 2.0 is very lucky to be looking into bytecode generation
and gc today because of all the work done recently by others.
 

Hal Fulton

Gully said:
Fascinating stuff. You sound MUCH smarter than Lothar of the Hill
People (played by Mike Myers). And your Arachno Ruby IDE looks really
cool too.

Haha, now, play nice. Lothar is a sharp guy. Though LOL at Mike Myers.

Gully said:
IMHO, it would be good to leverage all the research and millions
already invested by Sun, IBM, et al. in improving Java's GC. I think
Java provides some command-line options related to garbage collection.

No need to reinvent the wheel--unless the existing wheel isn't round. :)

But is it? Personally, GC is one of those areas where I'm completely
ignorant. I need to delve into SICP one of these days... it's on my
to-do list. ;)


Hal
 

Ruben

At Wed, 21 Jul 2004 14:05:43 +0900, Lothar said:
Arachno Ruby, as a larger program, uses the Boehm-Demers-Weiser GC.
That collector considers not only everything on the stack but also
everything in the heap as possible pointers, so many things can be
wrongly treated as still-in-use data.

Do you use it as a drop-in replacement for "malloc"? Because, from
what I remember, this GC provides several functions to give it more
precise information about the data in your heap. I think there are
functions that tell it to allocate, for example, a chunk of memory
for binary data which you, as the programmer, guarantee will never
contain pointers.

Lothar said:
But that does not seem to be the problem: compared with the exact
internal GC of SmartEiffel, the overhead from unreleased data seems
to be around 20%. The much more serious problem with the
Boehm-Demers-Weiser GC is its high internal heap fragmentation; it
only seems to stabilize after reaching about five times the dataset
size. From previous posts here, I believe the Ruby GC is working well
in that respect.

This behavior depends to a great extent on the behavior of your
application. For BDW, fragmentation could only be improved by having
a better allocation strategy.

Lothar said:
So if we ever change the internal memory management in Ruby (I guess
we won't), then I hope it will also include a compacting GC. In the
meantime, a lazy mark with a write barrier and a lazy sweep would be
a good thing to add. I have found that the GC can give you a huge
time penalty because it simply runs too often and there is no way to
customize this.

Compacting is only possible if you know the root set and have exact
information about which cells contain a pointer and which cells
don't. AFAIK, a pure compacting collector isn't possible with Ruby
because of the C extensions... or because of scanning the raw stack
to find pointers in the root set(?)
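That requirement is easy to see in miniature. The sketch below is
purely illustrative (it is not from Ruby or BDW): a hand-written
is_pointer flag stands in for the exact layout information a precise
collector would have, and moving an object is only safe because every
word known to be a reference can be rewritten, while an ambiguous
word with the same bit pattern must be left alone.

    /* Why compaction needs exact pointer information (illustrative). */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <stdint.h>

    /* A cell whose first word may or may not be a pointer.  The
     * is_pointer flag plays the role of exact type information.       */
    typedef struct {
        uintptr_t word;       /* pointer to another cell, or plain data */
        int       is_pointer; /* exact layout info the collector needs  */
    } cell;

    /* Move `old` to a new address and patch every known reference.    */
    static cell *compact_move(cell *old, cell *roots[], size_t nroots)
    {
        cell *copy = malloc(sizeof *copy);
        memcpy(copy, old, sizeof *copy);
        for (size_t i = 0; i < nroots; i++)
            if (roots[i]->is_pointer && roots[i]->word == (uintptr_t)old)
                roots[i]->word = (uintptr_t)copy; /* safe: known pointer */
        free(old);
        return copy;
    }

    int main(void)
    {
        cell *target = malloc(sizeof *target);
        target->word = 42;
        target->is_pointer = 0;

        cell ref  = { (uintptr_t)target, 1 }; /* a real reference       */
        cell fake = { (uintptr_t)target, 0 }; /* same bits, just data   */
        cell *roots[] = { &ref, &fake };

        target = compact_move(target, roots, 2);
        printf("ref  now points to %p (patched)\n", (void *)ref.word);
        printf("fake still holds   %#lx (left alone)\n",
               (unsigned long)fake.word);
        return 0;
    }

With only a conservative scan, the collector could not tell ref from
fake, so it could not safely rewrite either of them, and therefore
could not move the object at all.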

Ruben
 

Lothar Scholz

Hello Ruben,

R> Do you use it as a drop-in replacement for "malloc"?

Yes, I only changed malloc and calloc.

R> Because, from what I remember, this GC provides several functions
R> to give it more precise information about the data in your heap. I
R> think there are functions that tell it to allocate, for example, a
R> chunk of memory for binary data which you, as the programmer,
R> guarantee will never contain pointers.

Right, it can completely avoid scanning heap data if you give a type
descriptor for the allocated memory that tells the GC where it must
look for embedded pointers. But this requires a larger hack in the
SmartEiffel compiler, and I hope that the new 2.0 release will fix
the bugs in their internal GC.
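For what it's worth, the typed-allocation interface in the
Boehm-Demers-Weiser collector looks roughly like the sketch below
(gc_typed.h); treat the details as an illustration of the idea rather
than a tested recipe. You describe which words of an object may hold
pointers, and the collector skips everything else when scanning.

    /* Rough sketch of bdwgc typed allocation: the descriptor tells the
     * collector exactly which words may hold pointers.                */
    #include <stdio.h>
    #include <gc.h>
    #include <gc_typed.h>

    struct node {
        struct node *next;    /* the only pointer field                */
        double       data[8]; /* plain binary payload, never scanned   */
    };

    int main(void)
    {
        GC_INIT();

        /* One bit per word of the struct; set the bit for the word
         * that actually contains a pointer.                           */
        GC_word bitmap[GC_BITMAP_SIZE(struct node)] = { 0 };
        GC_set_bit(bitmap, GC_WORD_OFFSET(struct node, next));
        GC_descr descr = GC_make_descriptor(bitmap,
                                            GC_WORD_LEN(struct node));

        struct node *head = GC_malloc_explicitly_typed(sizeof *head,
                                                       descr);
        head->next = GC_malloc_explicitly_typed(sizeof *head, descr);
        head->data[0] = 3.14;

        printf("allocated two typed nodes, heap size %lu bytes\n",
               (unsigned long)GC_get_heap_size());
        return 0;
    }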

R> This behavior depends to a great extent on the behavior of your
R> application. For BDW, fragmentation could only be improved by
R> having a better allocation strategy.

Right, I heard about the same numbers from other people on the
SmartEiffel mailing list.

R> Compacting is only possible if you know the root set and have exact
R> information about which cells contain a pointer and which cells
R> don't. AFAIK, a pure compacting collector isn't possible with Ruby
R> because of the C extensions... or because of scanning the raw stack
R> to find pointers in the root set(?)

Right, it is impossible with the current design; it would need a lot
of changes to external C extensions, and Ruby threads would also need
their own stacks instead of being mixed with the C stack. It's just
that when you talk about a complete rewrite, like the OP's wish for a
"rubycc", then it should be discussed. matz has already pointed out
that he doesn't care about fragmentation and will not add something
like this to the official Ruby implementation.
 
