GC run at end of script execution - order in which objects are claimed?

  • Thread starter Tilman Sauerbeck

Tilman Sauerbeck

Hi,
when Ruby finishes executing code, it invokes the GC to reclaim all
remaining objects.

Is the order, in which the GC claims the objects at this point, defined
at all or does it claim them in undefined order?

If it's defined, how does it decide whether object A or object B should
be claimed first?

Background:
In my Ruby extension, there are "parent" objects and "child" objects.
The "parent" object must not be freed if there are still "child" objects
alive. I enforce this policy by using rb_gc_mark(). This seems to work,
cause there are no problems as long as the ruby script executes.

However, when I quit the application, the GC seems to break this policy.

Thanks in advance for any hints,
 

ts

T> Is the order, in which the GC claims the objects at this point, defined
T> at all or does it claim them in undefined order?

It's like the keys of a hash: there is a defined order, but you can't
predict it :)

T> In my Ruby extension, there are "parent" objects and "child" objects.
T> The "parent" object must not be freed if there are still "child" objects
T> alive. I enforce this policy by using rb_gc_mark(). This seems to work,
T> cause there are no problems as long as the ruby script executes.

You are just lucky :)

One day, probably, you'll have a problem.

T> However, when I quit the application, the GC seems to break this policy.

You must write your extension in such a way that the children are freed
before the parent. Never expect that Ruby will do this for you.


Guy Decoux
 

Yukihiro Matsumoto

Hi,

In message "Re: GC run at end of script execution - order in which objects are claimed?"

|In my Ruby extension, there are "parent" objects and "child" objects.
|The "parent" object must not be freed if there are still "child" objects
|alive. I enforce this policy by using rb_gc_mark(). This seems to work,
|cause there are no problems as long as the ruby script executes.
|
|However, when I quit the application, the GC seems to break this policy.

Yes, GC calls destructors (dfree) in random order.

Questions:

Is it OK to terminate the process without any freeing?
Or is it mandatory to destruct objects in certain order?

matz.
 

Tilman Sauerbeck

ts said:
> T> Is the order, in which the GC claims the objects at this point, defined
> T> at all or does it claim them in undefined order?
>
> It's like the keys of a hash: there is a defined order, but you can't
> predict it :)

Okay.

> T> However, when I quit the application, the GC seems to break this policy.
>
> You must write your extension in such a way that the children are freed
> before the parent. Never expect that Ruby will do this for you.

I'm confused - I thought I'm telling Ruby in what order the objects
should be freed by the way I'm calling rb_gc_mark().

If I cannot define the order in which the objects are freed by the
rb_gc_mark() function calls, how else could I do that?

I have a feeling there will be a lot of problems if I hack my extension
so that it doesn't do anything in the object-free callback.
I'd have to keep track of objects that Ruby thinks are freed but that
still exist in C space... and free them at some point. Ugh.

Please shed some more light on this.
 

Tilman Sauerbeck

Yukihiro Matsumoto said:
> Hi,
>
> In message "Re: GC run at end of script execution - order in which objects are claimed?"
>
> |In my Ruby extension, there are "parent" objects and "child" objects.
> |The "parent" object must not be freed if there are still "child" objects
> |alive. I enforce this policy by using rb_gc_mark(). This seems to work,
> |cause there are no problems as long as the ruby script executes.
> |
> |However, when I quit the application, the GC seems to break this policy.
>
> Questions:
>
> Is it OK to terminate the process without any freeing?

That would be very hackish IMHO :/

> Or is it mandatory to destruct objects in certain order?

Since the random free'ing is causing serious memory management problems
in my extension, I'd say "yes". Although I think there must be some
not-so-hard way to avoid this; there must have been extension developers
who had the very same problem.
 

ts

T> I'm confused - I thought I'm telling Ruby in what order the objects
T> should be freed by the way I'm calling rb_gc_mark().

T> If I cannot define the order in which the objects are freed by the
T> rb_gc_mark() function calls, how else could I do that?

I can *only* give you an example

svg% cat b.rb
#!/usr/bin/ruby
require 'bz2'
io = File.open('b.rb.bz2', 'w')
bz2 = BZ2::Writer.new(io)
IO.foreach('b.rb') do |line|
  bz2.puts line
end
svg%

svg% b.rb
svg%

svg% bunzip2 < b.rb.bz2
#!/usr/bin/ruby
require 'bz2'
io = File.open('b.rb.bz2', 'w')
bz2 = BZ2::Writer.new(io)
IO.foreach('b.rb') do |line|
  bz2.puts line
end
svg%

To make this example work, `bz2' must be freed before `io' because it must
flush its buffer.

bz2.c is written in such a way that it doesn't depend on the order in
which the objects are freed.
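
At the script level, the same dependency can be sidestepped by closing things explicitly in the required order instead of leaving it to the GC. The following is only a minimal sketch of that alternative, not how bz2.c handles it internally (the post does not show that). It assumes Zlib::GzipWriter from the standard library as a stand-in for BZ2::Writer; the helper name `write_compressed` and the file name are made up for illustration.

```ruby
require 'zlib'
require 'tmpdir'

# Hypothetical helper: compress `lines` into `path`, releasing the writer
# before the file it wraps, so nothing depends on GC ordering.
def write_compressed(path, lines)
  io = File.open(path, 'wb')
  gz = Zlib::GzipWriter.new(io)   # stand-in for BZ2::Writer from the post
  begin
    lines.each { |line| gz.puts line }
  ensure
    begin
      gz.finish                   # flush the compressor; leaves io open
    ensure
      io.close                    # then close the underlying file
    end
  end
end

path = File.join(Dir.tmpdir, "ordered_close_#{Process.pid}.gz")
write_compressed(path, ["first line", "second line"])

# Read the file back to confirm the buffered data reached the disk.
restored = Zlib::GzipReader.open(path) { |reader| reader.read }
File.delete(path)
```

Because the writer is finished before the file is closed, the buffered data is flushed regardless of the order in which the two objects are later collected.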


Guy Decoux
 

Yukihiro Matsumoto

Hi,

In message "Re: GC run at end of script execution - order in which objects are claimed?"

|> Is it OK to terminate the process without any freeing?
|
|That would be very hackish IMHO :/

I'm not sure how hackish it is. Basically all resources are
reclaimed by the operating system at the process termination. If you
have something that can not be reclaimed automatically by the OS,
you can not expect GC to handle it automagically either.

|> Or is it mandatory to destruct objects in certain order?
|
|Since the random free'ing is causing serious memory management problems
|in my extension, I'd say "yes". Although I think there must be some
|not-so-hard way to avoid this; there must have been extension developers
|who had the very same problem.

Unfortunately I cannot think of any "not-so-hard way" to free them in
the order you expect. Freeing in the "right" order would increase the
process termination cost by orders of magnitude, I suspect. I
don't think it's worth the cost.

The memory will vanish with the process anyway, unless your
"memory" resides outside of the process, e.g. in a database.

Don't get me wrong. I'm saying that I don't know how to satisfy your
requirement (yet), not refusing you or your idea.

matz.
 

Yukihiro Matsumoto

Hi,

In message "Re: GC run at end of script execution - order in which objects are claimed?"

|Since the random free'ing is causing serious memory management problems
|in my extension, I'd say "yes". Although I think there must be some
|not-so-hard way to avoid this; there must have been extension developers
|who had the very same problem.

Oh, I got an idea. How about making a parent free its child
objects eagerly in its dfree function, then free itself? You can use some
kind of flag to prevent a double free.
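
The ownership logic of this idea can be modelled in plain Ruby. This is only a sketch: in a real extension the logic would live in the C dfree callbacks, and the class names `Parent` and `Child`, the `free` and `adopt` methods, and the `log` array are all hypothetical names introduced for illustration.

```ruby
# Pure-Ruby model of "parent frees its children first, a flag prevents a
# double free". All names here are hypothetical.
class Child
  def initialize(parent, name, log)
    @name = name
    @log = log
    @freed = false
    parent.adopt(self)
  end

  # Safe to call more than once: the flag turns repeat calls into no-ops.
  def free
    return if @freed
    @freed = true
    @log << "child #{@name}"
  end
end

class Parent
  def initialize(log)
    @log = log
    @children = []
    @freed = false
  end

  def adopt(child)
    @children << child
  end

  # Release every child eagerly, then the parent's own resource.
  def free
    return if @freed
    @freed = true
    @children.each(&:free)
    @log << "parent"
  end
end

log = []
parent = Parent.new(log)
a = Child.new(parent, "a", log)
b = Child.new(parent, "b", log)

# Simulate the collector calling the free hooks in an arbitrary order,
# including a repeated call on the parent.
[b, parent, a, parent].each(&:free)
```

Whatever order the hooks are called in, every child is released before the parent and nothing is released twice. In C, the flag would presumably have to live in memory that outlives the freed resource, for example in the wrapper struct.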

matz.
 

Glenn Parker

Yukihiro said:
> I'm not sure how hackish it is. Basically all resources are
> reclaimed by the operating system at the process termination. If you
> have something that can not be reclaimed automatically by the OS,
> you can not expect GC to handle it automagically either.

A counter-example would be creating a named pipe, or a shared memory
segment, or a large tmp file that should be removed at program exit. If
an application relies on finalizers or other semi-non-deterministic
tricks to accomplish the cleanup, then it could easily leave system
resources allocated. IMHO, such an application is flawed, since there
are much better ways to guarantee cleanup at the end of an application.
It's a matter of setting expectations.

I also like exiting without going through a complete GC at the end of
execution. It's faster and avoids the tricky problems with the ordering
of destructors. But, I'm also willing to mandate the use of at_exit()
for critical cleanup tasks.
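
A minimal sketch of at_exit-based cleanup, assuming a scratch file stands in for a resource the OS will not reclaim by itself; the file name and the `cleanup` lambda are hypothetical.

```ruby
require 'tmpdir'

# Hypothetical scratch file standing in for a named pipe, shared memory
# segment or large tmp file that must not outlive the program.
scratch = File.join(Dir.tmpdir, "scratch_#{Process.pid}.tmp")
File.write(scratch, "work in progress")

# The cleanup is idempotent, so calling it early and again at exit is fine.
cleanup = lambda do
  File.delete(scratch) if File.exist?(scratch)
end

# Register the cleanup explicitly instead of relying on a finalizer.
at_exit(&cleanup)
```

The handler runs when the interpreter exits normally or through an unhandled exception; it will not run if the process is killed outright (for example with SIGKILL), so it sets an expectation rather than a guarantee.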
 

Yukihiro Matsumoto

Hi,

In message "Re: GC run at end of script execution - order in which objects are claimed?"

|> I'm not sure how hackish it is. Basically all resources are
|> reclaimed by the operating system at the process termination. If you
|> have something that can not be reclaimed automatically by the OS,
|> you can not expect GC to handle it automagically either.
|
|A counter-example would be creating a named pipe, or a shared memory
|segment, or a large tmp file that should be removed at program exit. If
|an application relies on finalizers or other semi-non-deterministic
|tricks to accomplish the cleanup, then it could easily leave system
|resources allocated. IMHO, such an application is flawed, since there
|are much better ways to guarantee cleanup at the end of an application.
| It's a matter of setting expectations.

I think we are saying the same thing. My "something that can not be
reclaimed automatically" is your "a named pipe, or a shared memory
segment, or a large tmp file that should be removed at program exit",
right?

matz.
 

Glenn Parker

Yukihiro said:
> Hi,
>
> In message "Re: GC run at end of script execution - order in which objects are claimed?"
>
> |> I'm not sure how hackish it is. Basically all resources are
> |> reclaimed by the operating system at the process termination. If you
> |> have something that can not be reclaimed automatically by the OS,
> |> you can not expect GC to handle it automagically either.
> |
> |A counter-example would be creating a named pipe, or a shared memory
> |segment, or a large tmp file that should be removed at program exit. If
> |an application relies on finalizers or other semi-non-deterministic
> |tricks to accomplish the cleanup, then it could easily leave system
> |resources allocated. IMHO, such an application is flawed, since there
> |are much better ways to guarantee cleanup at the end of an application.
> | It's a matter of setting expectations.
>
> I think we are saying the same thing. My "something that can not be
> reclaimed automatically" is your "a named pipe, or a shared memory
> segment, or a large tmp file that should be removed at program exit",
> right?

Yes, we are saying the same thing.
 

Tobias Peters

Yukihiro said:
> Is it OK to terminate the process without any freeing?

In the past, I made use of the end-of-process freeing for testing
purposes, i.e. I tested that I registered correct "free" functions for
extension objects. Since the GC is conservative, the only reliable way
to do so was to start a separate ruby process, let it terminate, and
evaluate its debugging printouts.

> Or is it mandatory to destruct objects in certain order?

Unnecessary, IMHO.
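
The separate-process testing approach described above might look roughly like this. It is only a sketch: a Ruby-level finalizer registered with ObjectSpace.define_finalizer stands in for the extension's C "free" function, and the `Resource` class, the `reporter` method and the "freed ..." debug lines are hypothetical.

```ruby
require 'open3'
require 'rbconfig'

# Script run in the child process. The finalizer stands in for the
# extension's "free" function and prints a debug line when it runs.
child_script = <<~'RUBY'
  class Resource
    def initialize(name)
      # Build the proc in a class method so it does not capture `self`.
      ObjectSpace.define_finalizer(self, self.class.reporter(name))
    end

    def self.reporter(name)
      proc { puts "freed #{name}" }
    end
  end

  # Still referenced when the script ends; finalized at process exit.
  KEEP = [Resource.new("a"), Resource.new("b")]
RUBY

# Start a separate ruby process, let it terminate, collect its printouts.
output, status = Open3.capture2(RbConfig.ruby, '-e', child_script)

# The order of the lines is not defined, so sort before comparing.
freed = output.lines.map(&:chomp).sort
```

Sorting the captured lines keeps the check independent of the order in which the objects were finalized, which is exactly the part that cannot be relied on.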

Tobias
 

Tilman Sauerbeck

ts said:
> T> I'm confused - I thought I'm telling Ruby in what order the objects
> T> should be freed by the way I'm calling rb_gc_mark().
>
> T> If I cannot define the order in which the objects are freed by the
> T> rb_gc_mark() function calls, how else could I do that?
>
> I can *only* give you an example
>
> [snip]
>
> To make this example work, `bz2' must be freed before `io' because it must
> flush its buffer.
>
> bz2.c is written in such a way that it doesn't depend on the order in
> which the objects are freed.

Thanks, I'll consider jumping through all of these hoops to make it work.
 
