JRuby disabling ObjectSpace: what implications?

  • Thread starter Charles Oliver Nutter

Robert Klemme

In general, though, we haven't explored JVMTI because we want JRuby to
be the best production environment for deploying apps, and nobody will
EVER turn on JVMTI on their production servers.

Well, it depends on the overhead and on the invocation model. I
assumed you would be starting a JVM per process but your other remarks
sound more like there is one JVM for JRuby programs...
Is it really ok? You need to remember that JRuby opens up the
possibility of running many, many applications in the same process, as
well as asynchronous algorithms with true parallel threads. We can't
expect people to cripple all that so they can walk EVERY object in the
system. "Stop the world" is awful when you start breaking the ability to
do many things in parallel, as you can in JRuby.

Ok, I see I need to dive further into JRuby before I discuss this further. :)
But it may be that for cases where each_object is needed, this is a
reasonable thing to do. I think if someone were to submit an
implementation of each_object that uses JVMTI, we would certainly accept
it :)

Hint, hint... :)
The problem is not so much that the object references move as that you
would have to lock the memory locations for some period of time to be
able to walk the object table. And I think that's *bad* especially when
we're looking at JRuby allowing folks to run dozens of apps in the same
process and memory space out of the box. We can't lock things down like
that.

I don't understand this remark of yours. If you implement this in Java
land (as you did apparently with WeakReferences) then there is no need
to lock anything. You just traverse the list (or a copy of the list)
and if a ref has been set to null you do not pass it to the callback.
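A minimal sketch of the Java-land traversal described above, assuming objects are tracked through a list of WeakReferences. The names `WeakObjectTable`, `track` and `eachObject` are hypothetical and this is not JRuby's actual implementation; it also assumes Java 8 or later for `Consumer`.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical tracker for illustration only, not JRuby's ObjectSpace code.
final class WeakObjectTable {
    private final List<WeakReference<Object>> refs = new ArrayList<>();

    // Record an object without keeping it alive; the weak reference is
    // returned so callers (and tests) can inspect or clear it.
    synchronized WeakReference<Object> track(Object obj) {
        WeakReference<Object> ref = new WeakReference<>(obj);
        refs.add(ref);
        return ref;
    }

    // Walk a copy of the list and skip entries whose referent is gone.
    // The lock is held only while copying, not while the callback runs.
    int eachObject(Consumer<Object> callback) {
        List<WeakReference<Object>> snapshot;
        synchronized (this) {
            snapshot = new ArrayList<>(refs);
        }
        int visited = 0;
        for (WeakReference<Object> ref : snapshot) {
            Object obj = ref.get(); // strong local reference while the callback runs
            if (obj != null) {
                callback.accept(obj);
                visited++;
            }
        }
        return visited;
    }
}
```

A GC that happens during the loop can only turn more `ref.get()` results into null; it cannot invalidate the object currently held in `obj`. Note that this sketch never prunes cleared references, which a long-running tracker would need to do.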

If it is some kind of native code (possibly via JNI or other
interfaces), more care probably has to be taken, although I'd assume
that JNI takes care of this (i.e. once the callback is invoked with a
non-null argument, the object stays alive until after the callback
returns, unless you clear that reference of course).

Traversal during #each_object in that respect is similar to traversal
through an ordinary collection - during that a GC can occur just the
same but that does not affect the traversal in any way.

What am I missing?

Kind regards

robert
 

Charles Oliver Nutter

mortee said:
Charles said:
Actually, we do that a bit already. For example, we do not track arrays
constructed during argument processing, since they are typically
transient. The problem is that we could only choose to track all Ruby
objects, for example...which would cripple other JRuby apps running in
the same process.
[...]

The problem is not so much that the object references move as that you
would have to lock the memory locations for some period of time to be
able to walk the object table. And I think that's *bad* especially when
we're looking at JRuby allowing folks to run dozens of apps in the same
process and memory space out of the box. We can't lock things down like
that.

Sorry for the extremely uninitiated and naive question - but when you're
about to enumerate each object in an application, aren't you interested
only in this application's objects anyway? So why would you have to lock
anything about the other ruby apps in the same process? Is that kind of
distinguishing objects impossible on the GC/enumeration level?

As far as I know there's no way to have JVMTI enumerate only objects
created by a specific application in a given JVM. So any sort of
ObjectSpace impl based on it would have to take that into consideration.

- Charlie
 

Charles Oliver Nutter

I think of each_object as very much an MRI implementation feature that
the rest of us implementors struggle to implement. Because of this, the
community and core members of each implementation really need to begin
discussing whether each_object is a Ruby feature or an MRI feature.

That's actually a really good point. each_object is more a feature of an
individual implementation's memory model than a general feature that can
be applied to every Ruby implementation. In many cases, like ours, you
simply don't have control over that memory model enough to provide a
real each_object implementation (and _id2ref requires tricks too, but
it's at least bounded and explicit). So it may be fair to say that
each_object is an MRI feature we emulate, but cannot simulate well
enough for it to translate appropriately.
 

Robert Klemme

2007/10/29 said:
mortee said:
Charles said:
Actually, we do that a bit already. For example, we do not track arrays
constructed during argument processing, since they are typically
transient. The problem is that we could only choose to track all Ruby
objects, for example...which would cripple other JRuby apps running in
the same process.
[...]

The problem is not so much that the object references move as that you
would have to lock the memory locations for some period of time to be
able to walk the object table. And I think that's *bad* especially when
we're looking at JRuby allowing folks to run dozens of apps in the same
process and memory space out of the box. We can't lock things down like
that.

Sorry for the extremely uninitiated and naive question - but when you're
about to enumerate each object in an application, aren't you interested
only in this application's objects anyway? So why would you have to lock
anything about the other ruby apps in the same process? Is that kind of
distinguishing objects impossible on the GC/enumeration level?

As far as I know there's no way to have JVMTI enumerate only objects
created by a specific application in a given JVM. So any sort of
ObjectSpace impl based on it would have to take that into consideration.

Hm, if you host different applications in the same JVM you probably
need separate class loaders anyway to separate changes on classes.
Maybe you can use that to partition the heap. Alternatively you could
use IterateOverObjectsReachableFromObject() and start from main. Just
a few wild guesses.

Btw, the issue with stopping the world would still not go away.
Too bad. A possible solution would be to implement the callback in a
way that it places all references in a Java collection. Only after it
finishes is the Ruby land callback invoked for each instance. The
downside is that you need more space (i.e. for the collection, which
could become largish), but the plus side is that you do not have any
overhead (other than that incurred by JVMTI) during "normal" operation and
you can limit the stop-the-world time to just the copying phase, which
might be acceptable. Charles, what do you think?
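The two-phase idea above could be sketched roughly as follows. `HeapWalker` is a hypothetical Java interface standing in for whatever heap iteration facility is used; the real JVMTI API is native and does not look like this. The sketch only shows the ordering: copy references first, run the user callback afterwards.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for a heap iteration facility such as JVMTI.
// Assumption: walk() is the part that runs while the world is stopped.
interface HeapWalker {
    void walk(Consumer<Object> visitor);
}

final class TwoPhaseEachObject {
    // Phase 1: only copy references into a buffer during the walk.
    // Phase 2: invoke the (possibly slow) user callback after the walk ends.
    static int eachObject(HeapWalker walker, Consumer<Object> userCallback) {
        List<Object> buffer = new ArrayList<>();
        walker.walk(buffer::add);      // keep this phase as short as possible
        for (Object obj : buffer) {    // the walk has finished by this point
            userCallback.accept(obj);
        }
        return buffer.size();
    }
}
```

The cost is the extra memory for the buffer, and every buffered object is strongly referenced until the method returns.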

Kind regards

robert
 

Helder Ribeiro

Actually, we do that a bit already. For example, we do not track arrays
constructed during argument processing, since they are typically
transient. The problem is that we could only choose to track all Ruby
objects, for example...which would cripple other JRuby apps running in
the same process.

In general, though, we haven't explored JVMTI because we want JRuby to
be the best production environment for deploying apps, and nobody will
EVER turn on JVMTI on their production servers.



I was referring to non-JVMTI solutions, but you're right, JVMTI does
provide this capability.


Is it really ok? You need to remember that JRuby opens up the
possibility of running many, many applications in the same process, as
well as asynchronous algorithms with true parallel threads. We can't
expect people to cripple all that so they can walk EVERY object in the
system. "Stop the world" is awful when you start breaking the ability to
do many things in parallel, as you can in JRuby.

But it may be that for cases where each_object is needed, this is a
reasonable thing to do.

Exactly. I think that each_object rarely has to go into production
code, but is very handy (and, to be honest, just fun, really) in
debugging/testing/experimenting. For those types of situations, I don't
really think a "stop the world" approach is so terrible. I find it
less of a disturbance than having this off-code switch.


I think if someone were to submit an
 

charles.nutter

Charles said:
As some of you may have heard, we're considering disabling
ObjectSpace.each_object by default in JRuby.

I brought this up at RubyConf, and got about 50% of people saying "I
agree" and 50% of people saying "I do not agree". As it stands now, we
will proceed with having ObjectSpace.each_object disabled by default in
JRuby 1.1 final. See the rest of this thread for the backstory and notes
on test/unit.

The folks who disagree appear to only disagree on principle, rather than
based on any real demonstrable problem with turning each_object off. On
the other hand, the folks who want to disable it have real-world
concerns: performance on the apps they're running. Until there's a
compelling, real-world, non-ideological reason to leave each_object
enabled by default, it will be disabled in JRuby (enable with +O flag or
jruby.objectspace.enabled=true property).

This change is already there in 1.1b1, released on Friday evening.

- Charlie
 

charles.nutter

Robert said:
Btw, the issue with stopping the world would still not go away.
Too bad. A possible solution would be to implement the callback in a
way that it places all references in a Java collection. Only after it
finishes is the Ruby land callback invoked for each instance. The
downside is that you need more space (i.e. for the collection, which
could become largish), but the plus side is that you do not have any
overhead (other than that incurred by JVMTI) during "normal" operation and
you can limit the stop-the-world time to just the copying phase, which
might be acceptable. Charles, what do you think?

It's certainly possible to do this, but it would probably need to create
a giant strong-referenced list of objects for iteration. Part of my hard
rules for implementing ObjectSpace is that it MUST NOT interfere with an
object's normal lifecycle.
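To illustrate the lifecycle concern, here is a small sketch in plain Java (nothing JRuby-specific, all names hypothetical). It shows that an object whose only remaining strong reference is the iteration list stays reachable, so the collector cannot reclaim it while the list is held.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

final class StrongListLifecycle {
    // Returns true if the object is still reachable through the list
    // after the creator's own reference has been dropped.
    static boolean keptAliveByList() {
        Object obj = new Object();
        WeakReference<Object> probe = new WeakReference<>(obj);

        List<Object> strongList = new ArrayList<>();
        strongList.add(obj);   // the "giant strong-referenced list"

        obj = null;            // the application is done with the object
        System.gc();           // only a hint; the list still pins the object

        Object viaProbe = probe.get();
        return viaProbe != null && viaProbe == strongList.get(0);
    }
}
```

With a list of weak references instead, the object would become eligible for collection as soon as the application dropped it, which is the property the hard rule is protecting.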

- Charlie
 

charles.nutter

mortee said:
Speaking of multiple cases of possible class-specific instance
tracking... isn't it possible to register your interest in some such
class at some point explicitly from program code? Then any class
could be made enumerable.

Yes, that is possible...but it solves only part of the problem. Just
having ObjectSpace.each_object enableable through a flag allows it to be
fully functional when you want it and out of the way the rest of the time.
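For comparison, the opt-in registration mortee describes might look something like the sketch below. Everything here is hypothetical (`OptInTracker`, `register`, `onAllocate`, `eachInstance`); it assumes the runtime calls `onAllocate` from its allocation path, and it matches exact classes only, not subclasses.

```java
import java.lang.ref.WeakReference;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical opt-in tracker; names are illustrative, not JRuby API.
final class OptInTracker {
    private final Map<Class<?>, List<WeakReference<Object>>> tracked =
            new ConcurrentHashMap<>();

    // Program code registers interest in a class before creating instances.
    void register(Class<?> type) {
        tracked.putIfAbsent(type, new CopyOnWriteArrayList<>());
    }

    // Assumed to be called on allocation; a single map lookup for
    // unregistered classes. Returns whether the object is being tracked.
    boolean onAllocate(Object obj) {
        List<WeakReference<Object>> refs = tracked.get(obj.getClass());
        if (refs == null) {
            return false;
        }
        refs.add(new WeakReference<>(obj));
        return true;
    }

    // Enumerate the live instances of a registered class.
    <T> int eachInstance(Class<T> type, Consumer<? super T> callback) {
        List<WeakReference<Object>> refs = tracked.get(type);
        if (refs == null) {
            return 0;
        }
        int visited = 0;
        for (WeakReference<Object> ref : refs) {
            Object obj = ref.get();
            if (obj != null) {
                callback.accept(type.cast(obj));
                visited++;
            }
        }
        return visited;
    }
}
```

This covers only instances created after registration and only for classes someone thought to register, which is the "part of the problem" it solves.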

- Charlie
 
