Problem with weak references on OS X 10.3

Caleb Clausen

I am having problems with weak references. The program below exhibits
the problem:

100_000.times{|n|
o=Object.new;
i=o.__id__;
o2=ObjectSpace._id2ref(i);
o.equal? o2 or raise "o=#{o}, i=#{"%x"%i}, o2=#{o2.inspect}, n=#{n}"
}

The exception should never be raised. On my OS X 10.3.9 system (and at
least 1 other) it does get eventually raised after a few hundred
iterations using ruby 1.8 and 1.9. With the (apple-supplied) ruby 1.6,
it does not happen. Tests on several Windows and Linux systems have
never observed a problem, using ruby 1.8 and 1.9. I don't know if it's
a problem on OS X 10.4; I don't have access to any 10.4 systems.

The problem seems to be in the call to __id__. Usually, it works
correctly, but every once in a while it returns the id of some random
symbol. Does anyone know why this is happening?
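
The same check can also be written so that it records every mismatch instead of raising on the first one, which makes it easier to see how often the round trip fails and what comes back. This is only a sketch of the test above; the helper name `id_round_trip_mismatches` is hypothetical and not part of the original script.

```ruby
# Hypothetical helper: returns [n, id, inspected_result] for every iteration
# in which the id does not round-trip to the same object.
def id_round_trip_mismatches(iterations)
  mismatches = []
  iterations.times do |n|
    o  = Object.new
    i  = o.__id__
    o2 = ObjectSpace._id2ref(i)
    # o is still referenced here, so o2 is expected to be the same object.
    mismatches << [n, i, o2.inspect] unless o.equal?(o2)
  end
  mismatches
end

bad = id_round_trip_mismatches(100_000)
if bad.empty?
  puts "all ids round-tripped"
else
  puts "#{bad.size} mismatches, first: #{bad.first.inspect}"
end
```

On an interpreter without the problem the list should stay empty.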
 
Robert Klemme

Caleb Clausen said:
I am having problems with weak references. The program below exhibits
the problem:

100_000.times{|n|
o=Object.new;
i=o.__id__;
o2=ObjectSpace._id2ref(i);
o.equal? o2 or raise "o=#{o}, i=#{"%x"%i}, o2=#{o2.inspect}, n=#{n}"
}

The exception should never be raised. On my OS X 10.3.9 system (and at
least 1 other) it does get eventually raised after a few hundred
iterations using ruby 1.8 and 1.9. With the (apple-supplied) ruby 1.6,
it does not happen. Tests on several Windows and Linux systems have
never observed a problem, using ruby 1.8 and 1.9. I don't know if it's
a problem on OS X 10.4; I don't have access to any 10.4 systems.

The problem seems to be in the call to __id__. Usually, it works
correctly, but every once in a while it returns the id of some random
symbol. Does anyone know why this is happening?

I'm a bit confused: where are the WeakReferences your subject mentions?
Also, on my 1.8.3 on cygwin this runs without a problem. If the code throws
then I presume there is a problem with the Ruby interpreter you use
(platform induced int overflow?).

Kind regards

robert
 
Caleb Clausen

I'm a bit confused: where are the WeakReferences your subject mentions?
Also, on my 1.8.3 on cygwin this runs without a problem. If the code throws
then I presume there is a problem with the Ruby interpreter you use
(platform induced int overflow?).

The call to __id__ creates the weak reference. Anyway, I consider it a
weak reference, even though there's no WeakRef involved; perhaps you
don't. (__id__ is what WeakRef uses internally.)

I now see that I also get the problem with my ruby 1.6 _if_ I run the
test program within irb; without irb, it runs without problems.

I've also tried a variant that creates an actual WeakRef (calling
WeakRef.new and #__getobj__ instead of __id__ and
ObjectSpace._id2ref); it does not (AFAICT) get the same error, but
instead a different one, which also seems like it shouldn't happen.
Here's the modified script:


require 'weakref'
100_000.times{|n|
o=Object.new;
i=WeakRef.new o;
o2=ObjectSpace._id2ref(i.__getobj__);
o.equal? o2 or raise "o=#{o}, i=#{"%x"%i}, o2=#{o2.inspect}, n=#{n}"
}

And the error I get:
weakref_bug.rb:5:in `_id2ref': cannot convert Object into Integer (TypeError)
from weakref_bug.rb:5
from weakref_bug.rb:2:in `times'
from weakref_bug.rb:2

I agree that it does seem to be a problem with the interpreter.
 
Robert Klemme

2006/2/4, Caleb Clausen said:
The call to __id__ creates the weak reference. Anyway, I consider it a
weak reference, even though there's no WeakRef involved; perhaps you
don't. (__id__ is what WeakRef uses internally.)

You're right - I don't. Object#__id__ returns an object id.

I now see that I also get the problem with my ruby 1.6 _if_ I run the
test program within irb; without irb, it runs without problems.

I would not count on IRB in such circumstances - especially if local
variables are involved. IRB does certain things differently there. Did
you only test in IRB or also in a Ruby script?

I've also tried a variant that creates an actual WeakRef (calling
WeakRef.new and #__getobj__ instead of __id__ and
ObjectSpace._id2ref); it does not (AFAICT) get the same error, but
instead a different one, which also seems like it shouldn't happen.
Here's the modified script:


require 'weakref'
100_000.times{|n|
o=Object.new;
i=WeakRef.new o;
o2=ObjectSpace._id2ref(i.__getobj__);
o.equal? o2 or raise "o=#{o}, i=#{"%x"%i}, o2=#{o2.inspect}, n=#{n}"
}

And the error I get:
weakref_bug.rb:5:in `_id2ref': cannot convert Object into Integer (TypeError)
from weakref_bug.rb:5
from weakref_bug.rb:2:in `times'
from weakref_bug.rb:2

I agree that it does seem to be a problem with the interpreter.

Not so fast. This error you are seeing is absolutely expected:
i.__getobj__ returns the original instance. If that is not an object
id (which it isn't in your case) it's not a legal argument for
ObjectSpace._id2ref().

You probably wanted o2=i.__getobj__

Since you keep a reference to o all the time in the block,
ObjectSpace._id2ref must always return the same instance. *If* you
actually see the error you claimed you saw initially then there's
something seriously broken. At the moment I rather suspect it's some
other issue (such as testing in IRB). I'd also try to use brackets
around the equality test - just to be sure that precedence doesn't
come into play.
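
A minimal sketch of the corrected WeakRef round trip, assuming only the standard `weakref` library: `__getobj__` hands back the referent itself, so it is compared directly and never passed to `ObjectSpace._id2ref`. The variable names are illustrative.

```ruby
require 'weakref'

o   = Object.new
ref = WeakRef.new(o)

# __getobj__ returns the original instance, not an object id.
o2 = ref.__getobj__

# o is still strongly referenced in this scope, so the referent cannot have
# been collected and both checks are expected to hold.
puts o.equal?(o2)
puts ref.weakref_alive?
```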

robert
 
Caleb Clausen

I would not count on IRB in such circumstances - especially if local
variables are involved. IRB does certain things differently there. Did
you only test in IRB or also in a Ruby script?

It happens running it with plain ruby (no irb) on my ruby 1.8 (and
1.9). I only mentioned it because irb does seem to be required to
create the problem on my ruby 1.6. Irb is not the problem; it doesn't
treat local variables that differently.

Not so fast. This error you are seeing is absolutely expected:
i.__getobj__ returns the original instance. If that is not an object
id (which it isn't in your case) it's not a legal argument for
ObjectSpace._id2ref().

You probably wanted o2=i.__getobj__

Uh-oh. You're right. Too much monkey code and hack, not enough look and think.

(After hurriedly fixing my test...) Ok, so if I _correctly_ use
WeakRefs, there is no problem. That is interesting, and I'd sure like
to know why, because it's not obvious to me. I'm going to investigate
this deeper, and see if I isolate the difference that lets WeakRef
work.

Since you keep a reference to o all the time in the block,
ObjectSpace._id2ref must always return the same instance. *If* you
actually see the error you claimed you saw initially then there's
something seriously broken. At the moment I rather suspect it's some
other issue (such as testing in IRB). I'd also try to use brackets
around the equality test - just to be sure that precedence doesn't
come into play.

I'm pretty sure about the precedence of or, but just in case, I tried
it with more parens. It's still broken.
 
Robert Klemme

Caleb Clausen said:
It happens running it with plain ruby (no irb) on my ruby 1.8 (and
1.9). I only mentioned it because irb does seem to be required to
create the problem on my ruby 1.6. Irb is not the problem; it doesn't
treat local variables that differently.


Uh-oh. You're right. Too much monkey code and hack, not enough look
and think.

(After hurriedly fixing my test...) Ok, so if I _correctly_ use
WeakRefs, there is no problem. That is interesting, and I'd sure like
to know why, because it's not obvious to me. I'm going to investigate
this deeper, and see if I isolate the difference that lets WeakRef
work.


I'm pretty sure about the precedence of or, but just in case, I tried
it with more parens. It's still broken.

I was also, but sometimes it's better to explicitly rule out potential
sources of error.

I have to admit I still cannot believe that you actually saw the results you
claimed to see initially. Can anybody verify this on Mac OS please? I
don't have a Mac around otherwise I'd do it. I've attached an equivalent
version of the script.

Kind regards

robert
 
Mauricio Fernandez

Caleb Clausen said:
100_000.times{|n|
o=Object.new;
i=o.__id__;
o2=ObjectSpace._id2ref(i);
o.equal? o2 or raise "o=#{o}, i=#{"%x"%i}, o2=#{o2.inspect}, n=#{n}"
}

The exception should never be raised. On my OS X 10.3.9 system (and at
least 1 other) it does get eventually raised after a few hundred
iterations using ruby 1.8 and 1.9. With the (apple-supplied) ruby 1.6,
it does not happen. Tests on several Windows and Linux systems have
never observed a problem, using ruby 1.8 and 1.9. I don't know if it's
a problem on OS X 10.4; I don't have access to any 10.4 systems.

The problem seems to be in the call to __id__. Usually, it works
correctly, but every once in a while it returns the id of some random
symbol. Does anyone know why this is happening?

I can reproduce on ruby 1.8.4 (2005-12-24) [powerpc-darwin7.9.0]:

o=#<Object:0x1d421c>, i=ea10e, o2=:reject, n=448 (RuntimeError)

It looks like the object id wrapped in some way and now points to a
symbol? Clearly looks like a bug.

0x1d421c.to_s(2) # => "111010100001000011100"
0xea10e.to_s(2) # => "11101010000100001110"
0xea10e.class # => Fixnum
(2 * 0xea10e).to_s(2) # => "111010100001000011100"

So far so good.

Now, in gc.c:

static VALUE
id2ref(obj, id)
    VALUE obj, id;
{
    unsigned long ptr, p0;

    rb_secure(4);
    p0 = ptr = NUM2ULONG(id);
    if (ptr == Qtrue) return Qtrue;
    if (ptr == Qfalse) return Qfalse;
    if (ptr == Qnil) return Qnil;
    if (FIXNUM_P(ptr)) return (VALUE)ptr;
    if (SYMBOL_P(ptr) && rb_id2name(SYM2ID((VALUE)ptr)) != 0) {
        return (VALUE)ptr;
    }

(SYMBOL_FLAG == 0x0e)

NUM2ULONG is rb_num2ulong, which calls rb_num2long, which uses FIX2LONG.
id was 111010100001000011101b and ptr becomes 11101010000100001110b, which
matches the SYMBOL_FLAG.
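
The bit arithmetic described above can be replayed in plain Ruby. This is only a model of the tagging, not interpreter code: the helper names are hypothetical, `SYMBOL_FLAG` is the value quoted above, and `FIXNUM_FLAG = 0x01` is assumed from the 1.8 headers rather than taken from this thread.

```ruby
FIXNUM_FLAG = 0x01 # assumed: low tag bit of a Fixnum VALUE in ruby 1.8
SYMBOL_FLAG = 0x0e # as quoted in the thread

# Hypothetical model: a Fixnum n is stored as the word (n << 1) | FIXNUM_FLAG.
def fixnum_to_value(n)
  (n << 1) | FIXNUM_FLAG
end

# Hypothetical model of FIX2LONG: drop the tag bit again.
def value_to_long(value)
  value >> 1
end

# Hypothetical model of SYMBOL_P: the low byte of the word equals SYMBOL_FLAG.
def symbol_tagged?(word)
  (word & 0xff) == SYMBOL_FLAG
end

address  = 0x1d421c                 # from "#<Object:0x1d421c>"
id       = address / 2              # 0xea10e, the id that was reported
id_value = fixnum_to_value(id)      # the VALUE handed to id2ref
ptr      = value_to_long(id_value)  # what NUM2ULONG(id) yields

printf("id VALUE = %b\n", id_value)
printf("ptr      = %b\n", ptr)
puts "ptr passes the symbol tag check: #{symbol_tagged?(ptr)}"
```

The two binary strings printed are the ones quoted above, and the low byte of `ptr` is 0x0e, so the tag check in the excerpt succeeds before the value is ever treated as an object address.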

I'd conjecture that the above works on Linux because glibc's malloc() always
returns 8-byte aligned memory addresses, which doesn't seem to be the case in
OSX:

0x1d421c % 8 # => 4
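
The alignment argument can be checked with a small scan. The sketch assumes the id is the address divided by two, as in the arithmetic earlier in this post, and the helper name is hypothetical. It only shows which offsets could produce the 0x0e tag; it says nothing about what any particular malloc() actually returns or how the interpreter lays out its heap.

```ruby
# Hypothetical helper: would the id derived from this address (address / 2)
# carry the symbol tag 0x0e in its low byte?
def id_looks_like_symbol?(address)
  ((address / 2) & 0xff) == 0x0e
end

# The low byte of address / 2 depends only on address % 512, so one
# 512-byte window of 4-byte-aligned offsets covers every case.
colliding = (0...512).step(4).select { |offset| id_looks_like_symbol?(offset) }
colliding.each do |offset|
  printf("offset 0x%x, offset %% 8 = %d\n", offset, offset % 8)
end

# No 8-byte-aligned offset is affected.
puts (0...512).step(8).none? { |offset| id_looks_like_symbol?(offset) }
```

The only colliding offset in the window is 0x1c, which is 4 modulo 8, matching the low bits of 0x1d421c.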

Another possibility would be that the address space for the data segment
used in OSX is lower than on Linux, so the SYM2ID matches an existent
symbol:

RUBY_PLATFORM # => "i686-linux"
Object.new.inspect # => "#<Object:0xb7d44d7c>"
0xb7d44d7c >> 9 # => 6023718
# we shouldn't have 6 million symbols
0x1d421c >> 9 # => 3745
# but 4000 are indeed possible

The relevant code hasn't changed between 1.6 and 1.8; could it be that the
Apple-supplied 1.6 binary was built specially to use 8-byte alignment, or
that the memory layout has changed in the meantime?

If so, possible fixes would include:
* modifying the configure to use the magic options
* using posix_memalign or such
 
Logan Capaldo

I have to admit I still cannot believe that you actually saw the
results you claimed to see initially. Can anybody verify this on
Mac OS please? I don't have a Mac around otherwise I'd do it.
I've attached an equivalent version of the script.

logan:/Users/logan/Projects/Ruby Experiments% ruby idref.rb
idref.rb:7: 152: #<Object:0x1e861c> :$@ - 1000206 1000206 (RuntimeError)
from idref.rb:3
logan:/Users/logan/Projects/Ruby Experiments% ruby -v
ruby 1.8.4 (2005-12-24) [powerpc-darwin8.4.0]
logan:/Users/logan/Projects/Ruby Experiments% uname -a
Darwin Logan-Capaldos-Computer.local 8.4.0 Darwin Kernel Version 8.4.0: Tue Jan 3 18:22:10 PST 2006; root:xnu-792.6.56.obj~1/RELEASE_PPC Power Macintosh powerpc



 
