slow method searching?

D

David Garamond

$ time ruby -e'a=(1..200000).to_a; a.classx'
-e:1: undefined method `classx' for #<Array:0x4027c18c> (NoMethodError)

real 0m3.768s
user 0m1.640s
sys 0m0.010s

$ time ruby -e'a=(1..200000).to_a; b=nil; b.classx'
-e:1: undefined method `classx' for nil:NilClass (NoMethodError)

real 0m0.288s
user 0m0.140s
sys 0m0.010s

$ time ruby -e'a=(1..400000).to_a; a.classx'
-e:1: undefined method `classx' for #<Array:0x4027c18c> (NoMethodError)

real 0m9.448s
user 0m3.980s
sys 0m0.050s

$ time ruby -e'a=(1..400000).to_a; b=[]; b.classx'
-e:1: undefined method `classx' for []:Array (NoMethodError)

real 0m0.550s
user 0m0.260s
sys 0m0.020s

$ time ruby -e'a=(1..1000000).to_a; a.classx'
-e:1: undefined method `classx' for #<Array:0x4027c18c> (NoMethodError)

real 0m36.652s
user 0m16.040s
sys 0m0.090s

$ time ruby -e'a=(1..1000000).to_a; b=1; b.classx'
-e:1: undefined method `classx' for 1:Fixnum (NoMethodError)

real 0m1.593s
user 0m0.690s
sys 0m0.010s


Why does it take longer and longer for Ruby to throw NoMethodError for
large arrays? What is going on here? This happens with hash too.
 
D

Dave Thomas

Why does it take longer and longer for Ruby to throw NoMethodError for
large arrays? What is going on here? This happens with hash too.

Perhaps the time is being spent constructing the arrays?

Cheers

Dave
 
S

Sean O'Dell

You're timing how long it takes the OS to load the time program, the Ruby
interpreter, Ruby to load your script and then Ruby to run your code. The OS
is adding variance to your final times, probably. Try putting all of your
tests in a single Ruby script file. Place a pause or a gets in Ruby code to
allow the OS to stop all possible hard disk activity before commencing the
actual tests.

Sean O'Dell
 
L

Lennon Day-Reynolds

Dave,

The arrays are being built in both cases; the question is why it takes
to long to determine that a method is missing on a large array. It's
also not just lazy construction of the arrays or anything, as methods
that *do* exist execute as quickly on large arrays as on empty ones.

Lennon
 
L

Lennon Day-Reynolds

Sean,

Multiple runs of the same tests, and grouping the tests differently,
give me the same type of results David was seeing. The load and memory
issues between invocations can make some difference, certainly, should
not be the source of a 1000% variance. Also, if it were just OS-level
variation, any one test run multiple times should display the same
behavior.

Lennon
 
D

David Garamond

Dave said:
Perhaps the time is being spent constructing the arrays?

Note that I'm comparing these two:

$ time ruby -e'a=(1..200000).to_a; a.classx'; # slow
$ time ruby -e'a=(1..200000).to_a; b=nil; b.classx'; # fast

Also I've tried this in irb and after constructing the array, each
subsequent call to a.classx or a.foo (any other unknown methods) are
very slow, even repeatedly. So probably the problem is not in method
caching, but somehow ruby iterates or checks each element in the array?

Btw, this is ruby 1.8.1 (2004-04-30) [i686-linux-gnu]
 
G

George Ogata

David Garamond said:
Why does it take longer and longer for Ruby to throw NoMethodError for
large arrays? What is going on here? This happens with hash too.

Ok, my turn to guess... :)


g@crash:~$ time ruby -e 'a=(1..2000000).to_a; a.classx'
-e:1: undefined method `classx' for #<Array:0x401cfebc> (NoMethodError)

real 0m6.100s
user 0m5.897s
sys 0m0.104s
g@crash:~$ time ruby -e 'a=(1..2000000).to_a; begin a.classx; rescue NoMethodError; end'

real 0m0.458s
user 0m0.449s
sys 0m0.008s


I'm guessing the time is spent "stringifying" the array when
outputting exception message. Note that for shorter arrays, the array
is actually printed.


g@crash:~$ time ruby -e 'a=(1..2).to_a; a.classx'
-e:1: undefined method `classx' for [1, 2]:Array (NoMethodError)

real 0m0.005s
user 0m0.003s
sys 0m0.001s


I don't know what the criteria is for when to print the object and
when not to; perhaps it's based on the length of the string in which
case the large array is stringified only to have the string thrown
away.
 
C

Carlos

$ time ruby -e'a=(1..200000).to_a; a.classx'
-e:1: undefined method `classx' for #<Array:0x4027c18c> (NoMethodError)

real 0m3.768s
user 0m1.640s
sys 0m0.010s

$ time ruby -e'a=(1..200000).to_a; b=nil; b.classx'
-e:1: undefined method `classx' for nil:NilClass (NoMethodError)

In my computer, there is a noticeable delay between the "-e:1" and the rest
of the exception message.

During that time, there is a lot of "mremap" system calls whose meaning I
don't know (GC? OS swapping(?)? A very long longjmp(!?)?).

Before I paste the last lines of the strace, which show the mremaps, I will
write yet another paragraph (this), because if I'm going to waste internet
bandwidth with non-answers, I'll do it good :).

write(2, "-e:1", 4) = 4
brk(0) = 0x8158000
brk(0x8187000) = 0x8187000
mmap2(NULL, 270336, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
= 0x402d1000
brk(0) = 0x8187000
brk(0) = 0x8187000
brk(0x816b000) = 0x816b000
brk(0) = 0x816b000
mremap(0x402d1000, 270336, 536576, MREMAP_MAYMOVE) = 0x402d1000
mremap(0x402d1000, 536576, 536576, MREMAP_MAYMOVE) = 0x402d1000
mremap(0x402d1000, 536576, 1069056, MREMAP_MAYMOVE) = 0x402d1000
mremap(0x402d1000, 1069056, 1069056, MREMAP_MAYMOVE) = 0x402d1000
mremap(0x402d1000, 1069056, 2134016, MREMAP_MAYMOVE) = 0x402d1000
write(2, ": ", 2) = 2
write(2, "undefined method lassx\' for #<"..., 49) = 49
write(2, " (", 2) = 2
write(2, "NoMethodError", 13) = 13
write(2, ")\n", 2) = 2
exit_group(1) = ?

Good luck.
 
M

Mauricio Fernández

$ time ruby -e'a=(1..1000000).to_a; a.classx'
-e:1: undefined method `classx' for #<Array:0x4027c18c> (NoMethodError)

real 0m36.652s
user 0m16.040s
sys 0m0.090s

$ time ruby -e'a=(1..1000000).to_a; b=1; b.classx'
-e:1: undefined method `classx' for 1:Fixnum (NoMethodError)

real 0m1.593s
user 0m0.690s
sys 0m0.010s


Why does it take longer and longer for Ruby to throw NoMethodError for
large arrays? What is going on here? This happens with hash too.

When the exception is printed, a string representation of the associated
object is created by calling #inspect. See name_err_mesg_to_str in
error.c:

d = rb_protect(rb_inspect, obj, 0);
if (NIL_P(d) || RSTRING(d)->len > 65) {

inspect is always executed, and the result is only discarded if bigger
than 65 chars. It would be trivial to add a test for T_ARRAY or T_HASH
and avoid calling inspect if the number of elements is bigger than some
N, in addition to (and before) the 65-chars check.

BTW, one oddity in the implementation: ruby uses internally a constant
named "message" (lower case), which you cannot access from Ruby neither as
NameError::message nor with NameError.const_get("message"), however
NameError::message
Data


There's several such tricks in the implementation (like instance
variables without @, etc) ;-)

--
Running Debian GNU/Linux Sid (unstable)
batsman dot geo at yahoo dot com

People disagree with me. I just ignore them.
-- Linus Torvalds, regarding the use of C++ for the Linux kernel
 
L

Lennon Day-Reynolds

I think I found it:

eval.c from ruby-1.8.1, in 'method_missing', lines 4908-4910
nargv = ALLOCA_N(VALUE, argc+1);
nargv[0] = ID2SYM(id);
MEMCPY(nargv+1, argv, VALUE, argc);

At that point, the 'argv' array includes the object on which the
method lookup failed. So, the 'method_missing' throw includes a bulk
memory copy of the object contents. I'm not sure why it would work
that way instead of just returning another reference to the same
object, but there must be some reason.

Lennon
 
M

Mauricio Fernández

Why does it take longer and longer for Ruby to throw NoMethodError for
large arrays? What is going on here? This happens with hash too.

When the exception is printed, a string representation of the associated
object is created by calling #inspect. See name_err_mesg_to_str in
error.c: [...]
BTW, one oddity in the implementation: ruby uses internally a constant
named "message" (lower case), which you cannot access from Ruby neither as
NameError::message nor with NameError.const_get("message"), however
NameError::message
Data

Just for a laugh, here's some additional confirmation of the above:

batsman@tux-chan:/tmp$ cat dfghdfg.rb
def test_exc(code)
s = Time.new
eval code
rescue NameError
puts "Eval: #{code}"
puts "Message: #{$!.message}"
puts "Took #{Time.new-s} seconds."
end

test_exc %{ a=(1..400000).to_a; a.classx }
test_exc %{ a=(1..400000).to_a; b=[]; b.classx }

puts "=" * 80
puts "Redefining #to_str for exception messages..."
puts

klass = nil
ObjectSpace.each_object(Class){|c| klass = c if c.ancestors.include?(Data) && c!= Data}
klass.send:)define_method, :to_str) { "booooo spooky NameError" }

test_exc %{ a=(1..400000).to_a; a.classx }
test_exc %{ a=(1..400000).to_a; b=[]; b.classx }

batsman@tux-chan:/tmp$ ruby dfghdfg.rb
Eval: a=(1..400000).to_a; a.classx
Message: undefined method `classx' for #<Array:0x401c941c>
Took 1.691099 seconds.
Eval: a=(1..400000).to_a; b=[]; b.classx
Message: undefined method `classx' for []:Array
Took 0.106893 seconds.
================================================================================
Redefining #to_str for exception messages...

Eval: a=(1..400000).to_a; a.classx
Message: booooo spooky NameError
Took 0.115443 seconds.
Eval: a=(1..400000).to_a; b=[]; b.classx
Message: booooo spooky NameError
Took 0.110813 seconds.


--
Running Debian GNU/Linux Sid (unstable)
batsman dot geo at yahoo dot com

Debian is like Suse with yast turned off, just better. :)
-- Goswin Brederlow
 
E

Eric Hodel

--yRXsXhSYq7voLSEz
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
You're timing how long it takes the OS to load the time program, the
Ruby interpreter, Ruby to load your script and then Ruby to run your
code. The OS is adding variance to your final times, probably.

The OS does not add 35 seconds of variance.
Try putting all of your tests in a single Ruby script file. Place a
pause or a gets in Ruby code to allow the OS to stop all possible hard
disk activity before commencing the actual tests.

Hint: This is similar, possibly related, to ruby-core:2661

--=20
Eric Hodel - (e-mail address removed) - http://segment7.net
All messages signed with fingerprint:
FEC2 57F1 D465 EB15 5D6E 7C11 332A 551C 796C 9F04


--yRXsXhSYq7voLSEz
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (FreeBSD)

iD8DBQFA5fPZMypVHHlsnwQRAuMRAKCdv51p/Xsd1qvYlLCsm/loBpELQACfclFF
6VzE+qd+D4YETDCloHimNbk=
=Qdci
-----END PGP SIGNATURE-----

--yRXsXhSYq7voLSEz--
 
D

David Garamond

George said:
g@crash:~$ time ruby -e 'a=(1..2000000).to_a; a.classx'
-e:1: undefined method `classx' for #<Array:0x401cfebc> (NoMethodError)

real 0m6.100s
user 0m5.897s
sys 0m0.104s
g@crash:~$ time ruby -e 'a=(1..2000000).to_a; begin a.classx; rescue NoMethodError; end'

real 0m0.458s
user 0m0.449s
sys 0m0.008s


I'm guessing the time is spent "stringifying" the array when
outputting exception message. Note that for shorter arrays, the array
is actually printed.

I guess you're right.

$ time ruby -e'a=(1..1000000).to_a; a.classx'
-e:1: undefined method `classx' for #<Array:0x4027c18c> (NoMethodError)

real 0m19.999s
user 0m16.680s
sys 0m0.060s

$ time ruby -e'a=(1..1000000).to_a; class Array; def inspect;
"Array[#{self.length}]" end end; a.classx'
-e:1: undefined method `classx' for Array[1000000]:Array (NoMethodError)

real 0m0.877s
user 0m0.710s
sys 0m0.040s

So I think what needs to be done is make Array#inspect smarter, right?
There's no way a 1-mil element array will have a string representation
of < 65 chars.
 
D

David Garamond

David said:
So I think what needs to be done is make Array#inspect smarter, right?
There's no way a 1-mil element array will have a string representation
of < 65 chars.

What about:

a) adding an optional argument to inspect, which is the maximum number
of chars wanted.

or (easier):

b) the default method_missing doesn't use inspect, instead just print
"#<#{o.class}:#{o.id.to_s(16)}>".

Btw, it's not o.id actually, what is it?
 
D

daz

David said:
<snip>

b) the default method_missing doesn't use inspect, instead just print
"#<#{o.class}:#{o.id.to_s(16)}>".

Btw, it's not o.id actually, what is it?

## 2 * id

o = Object.new
oid = o.object_id

puts (oid ).to_s(16) #-> 183e26a
puts (oid << 1).to_s(16) #-> 307c4d4
p o #-> #<Object:0x307c4d4>


I guess it runs through some fixnum conversion code but
it's not broken by it, so why change it :-?


daz
 
P

Paul Brannan

a) adding an optional argument to inspect, which is the maximum number
of chars wanted.

This will break classes that define their own inspect method.

Paul
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,777
Messages
2,569,604
Members
45,234
Latest member
SkyeWeems

Latest Threads

Top