Ruby internals & other questions

Ralph Shnelvar

Is there a document or website that describes how Ruby works?

For instance ...



y=0
1_000_000.times {|x| y+=x}



(1) Does the block get compiled a million times?

(2) What's the best Ruby way to do a sum from 1 to 1_000_000?

(3) Is there a difference in speed between IRB.exe and ruby.exe in
executing the above code?

(4) In IRB, what's the best way to time the code above?
 
Rob Biedenharn

Is there a document or website that describes how Ruby works?

Yes, plenty. You could start with ruby-lang.org or ask Google for
JRuby, Rubinius, MacRuby, Maglev, or IronRuby and seek out the source
code yourself.
For instance ...



y=0
1_000_000.times {|x| y+=x}



(1) Does the block get compiled a million times?
Well, "compiled" makes the answer tricky, but the answer is basically
"no." It gets executed a million times, but the compilation or AST
production will only happen once (ignoring what JRuby's JIT might do).
(2) What's the best Ruby way to do a sum from 1 to 1_000_000
Use a formula and don't actually do a "sum". (That is true in any
language.)
(3) Is there a difference in speed between IRB.exe and ruby.exe in
executing the above code?
Probably not significant.
(4) In IRB, whats the best way to time the code, above?
Benchmark it. (part of Ruby's standard library)


-Rob

Rob Biedenharn http://agileconsultingllc.com
(e-mail address removed)
 
Marnen Laibow-Koser

Ralph said:
Is there a document or website that describes how Ruby works?

Well, every interpreter could be implemented a bit differently. There
*is* a spec, although I don't know a URL offhand.
For instance ...



y=0
1_000_000.times {|x| y+=x}



(1) Does the block get compiled a million times?

Certainly not! The block is an object, and is passed as an argument to
the times method.
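
A minimal sketch of what "the block is an object" means in practice: the block is captured once as a Proc and that same object is then called on every iteration. The helper name call_many is made up for this illustration and is not part of Ruby.

```ruby
# Hypothetical helper: captures the block as a Proc and records which
# object is invoked on each iteration.
def call_many(n, &blk)
  seen_ids = []
  n.times do |i|
    seen_ids << blk.object_id  # the same Proc object every time
    blk.call(i)
  end
  seen_ids.uniq.size
end

y = 0
distinct_procs = call_many(1_000) { |x| y += x }
puts distinct_procs  # number of distinct Proc objects that were called
puts y               # sum of 0..999
```

This only shows that one block object is reused; how that object is represented internally (AST, bytecode, JIT output) still depends on the implementation.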
(2) What's the best Ruby way to do a sum from 1 to 1_000_000

Gauss's method:
def triangular(n)
  (n + 1) * n / 2
end
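
The formula comes from pairing the first and last terms: 1 + n, 2 + (n - 1), and so on, which gives n / 2 pairs that each add up to n + 1. A quick sanity check, sketched with a brute-force helper whose name (brute_force_sum) is invented here:

```ruby
# Closed-form sum of 1..n, as in the post above.
def triangular(n)
  (n + 1) * n / 2
end

# Hypothetical brute-force version, used only to cross-check the formula.
def brute_force_sum(n)
  total = 0
  1.upto(n) { |i| total += i }
  total
end

[1, 2, 10, 100, 1_000].each do |n|
  raise "mismatch for n=#{n}" unless triangular(n) == brute_force_sum(n)
end

puts triangular(1_000_000)
```

Integer division is safe here because one of n and n + 1 is always even.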
(3) Is there a difference in speed between IRB.exe and ruby.exe in
executing the above code?

exe? What are you, one of those weird Windows users? :)

I don't see why there would be a difference -- it's the same
interpreter. But why not test it?
(4) In IRB, whats the best way to time the code, above?

I think the standard library includes a Benchmark module for this.

Best,
 
Robert Klemme

2009/11/25 Ralph Shnelvar said:
Is there a document or website that describes how Ruby works?

For instance ...



y=0
1_000_000.times {|x| y+=x}



(1) Does the block get compiled a million times?

Of course not.
(2) What's the best Ruby way to do a sum from 1 to 1_000_000

Same as in other languages: http://en.wikipedia.org/wiki/Arithmetic_series
(3) Is there a difference in speed between IRB.exe and ruby.exe in
executing the above code?

Try it out. Benchmark is your friend.
(4) In IRB, whats the best way to time the code, above?

irb(main):004:0> require 'benchmark'
=> true
irb(main):005:0> Benchmark.measure { sleep 2 }
=> #<Benchmark::Tms:0x100d2068 @label="", @real=2.0, @cstime=0.0,
@cutime=0.0, @stime=0.0, @utime=0.0, @total=0.0>
irb(main):006:0> Benchmark.measure { sleep 2 }.to_s
=> " 0.000000 0.000000 0.000000 ( 2.000000)\n"
irb(main):007:0>

Cheers

robert
 
Aldric Giacomoni

Ralph said:
y=0
1_000_000.times {|x| y+=x}

y = (1..1_000_000).inject { |a, b| a + b }

More idiomatic, though maybe yours is easier to read at first.
If you're not sure what it does, read the rdoc, then ask us questions
(rdocs aren't -always- crystal clear).
To compare which one is faster, a Google search for the keywords
"ruby, benchmark, bmbm" will help.

To time the code, the simplest method (though not the most elegant) is
to do this:
#beginning of code
t0 = Time.now
#all of code
puts "This took #{Time.now - t0} seconds."
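
A variation on the same idea, sketched under the assumption of a Ruby new enough to have Process.clock_gettime (2.1 or later). Time.now follows the wall clock, which can be adjusted while the code runs; a monotonic clock does not have that problem. The helper name timed is hypothetical.

```ruby
# Hypothetical helper: returns [result, elapsed_seconds] for the block.
def timed
  t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  [result, Process.clock_gettime(Process::CLOCK_MONOTONIC) - t0]
end

sum, seconds = timed do
  y = 0
  1_000_000.times { |x| y += x }
  y
end

puts "Sum #{sum} took #{seconds} seconds."
```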
 
Kirk Haines

[Note: parts of this message were removed to make it a legal post.]

Is there a document or website that describes how Ruby works?

For instance ...

You have the source. That's often the best way to really understand how a
given implementation works.


y=0
1_000_000.times {|x| y+=x}


(1) Does the block get compiled a million times?

No. The block's source is only evaluated once. Different implementations
will represent it differently, but the end result is that the block becomes
an evaluated thing that will then be executed a million times.

(2) What's the best Ruby way to do a sum from 1 to 1_000_000

Math is your friend. The sum of the integers from 1 to n is:
(n**2 + n) / 2

def sum(n)
  (n**2 + n) / 2
end

However, in that code you wrote, you are actually summing the numbers from
zero to 999999. Fixnum#times starts at 0. You probably actually want
something like 1.upto(1000000). Math is still the best way to do it,
though.
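
The off-by-one is easy to see by running both loops side by side. A small sketch (variable names are illustrative only):

```ruby
n = 1_000_000

from_times = 0
n.times { |x| from_times += x }    # x runs 0, 1, ..., n - 1

from_upto = 0
1.upto(n) { |x| from_upto += x }   # x runs 1, 2, ..., n

puts from_times              # sum of 0..999_999
puts from_upto               # sum of 1..1_000_000
puts from_upto - from_times  # the two results differ by exactly n
```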

(3) Is there a difference in speed between IRB.exe and ruby.exe in
executing the above code?

Not really. irb is just a bunch of code to facilitate a sort of ruby
shell. The code is all still executed by Ruby, though.

(4) In IRB, whats the best way to time the code, above?
Read up on the 'benchmark' library. There are several ways to use it.

One way:

require 'benchmark'

Benchmark.bm {|bm| bm.report {y = 0; 1000000.times {|n| y += n}}}
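
Another of those ways, sketched here with the standard library's Benchmark.bmbm: it runs every report twice, a rehearsal pass followed by the real one, to reduce the effect of warm-up and garbage collection on the numbers. The labels and the smaller loop count are arbitrary choices for this example.

```ruby
require 'benchmark'

n = 100_000  # smaller than the thread's 1_000_000 so the sketch runs quickly

# bmbm prints a rehearsal table and a final table, and returns the
# Benchmark::Tms objects from the final pass.
results = Benchmark.bmbm do |job|
  job.report('times')  { y = 0; n.times { |x| y += x } }
  job.report('inject') { (0...n).inject { |a, b| a + b } }
end

results.each { |tms| puts tms.real }
```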


Kirk Haines
 
Rob Biedenharn

y = (1..1_000_000).inject { |a, b| a + b }

More idiomatic, though maybe yours is easier to read at first.
If you're not sure what it does, read the rdoc, then ask us questions
(rdocs aren't -always- crystal clear).
To compare which one is faster, a google search with the keywords :
"ruby, benchmark, bmbm" will help.


...but not equivalent:
y=0
1_000_000.times {|x| y+=x }

y = (0..999_999).inject {|a,b| a+b}

Although I wouldn't necessarily call this use of inject idiomatic.

-Rob

Rob Biedenharn http://agileconsultingllc.com
(e-mail address removed)
 
Kirk Haines

[Note: parts of this message were removed to make it a legal post.]

y = (1..1_000_000).inject { |a, b| a + b }

More idiomatic, though maybe yours is easier to read at first.

Ugh. His, while wrong for what he is trying to do (sum from 1 to 1000000)
is vastly superior to using inject like that. It's not idiomatic. It's
obtuse.

If one really wants to figure it out iteratively:

y = 0; 1.upto(1000000) {|x| y += x}

or

y = 0; (1..1000000).each {|x| y += x}

Both are easier to read at first, at second, and at 1000000 viewings than
using inject is. Additionally, inject has no advantage with regard to
either execution speed or object creation (less object creation is generally
better). There is no point in using it in a case like this. Inject is
whiz-bang cool, and sometimes seems like an elegant solution, but it usually
makes code slower and harder to read when people use it.


Kirk Haines
 
David Masover

Is there a document or website that describes how Ruby works?

You're going to have to get a lot more specific.
y=0
1_000_000.times {|x| y+=x}

(1) Does the block get compiled a million times?

Implementation-specific, but I doubt it.
(2) What's the best Ruby way to do a sum from 1 to 1_000_000

What do you mean by "best"? Your way won't give that sum, by the way -- it
adds the numbers from 0 to 999_999, not from 1 to 1_000_000.

This is probably the most idiomatic way:

(1..1_000_000).inject(&:+)

But if you mean the fastest way, I would guess it would be something like
this, in pure ruby:

y = 0
i = 0
while i < 1_000_000
  i += 1
  y += i
end

Realistically, though, Ruby probably isn't the best language. Inline C might
be better:

require 'inline'
class Foo
  inline do |builder|
    builder.c <<-END
      long sum(long max) {
        long result = 0;
        long i;
        for(i=1; i<=max; i++) {
          result += i;
        }
        return result;
      }
    END
  end
end
puts Foo.new.sum(1_000_000)

To be fair, this takes longer on my system, but I think that's because the C
compiler is being run each time. I'm sure there's a way to avoid that, but I
haven't looked. This is also going to be much more difficult for you on
Windows than, well, any other platform.

But you should keep some things in mind -- this is a really arbitrary
benchmark, of the sort that you'd never actually use in real code.
Try this instead:

n = 1_000_000
n*(n+1)/2

The way to be faster in any language is to improve your algorithm -- and your
algorithm is much more likely to be a bottleneck than the language in
question. That's why I use Ruby in the first place.
(3) Is there a difference in speed between IRB.exe and ruby.exe in
executing the above code?

Maybe. If both are from the same version of Ruby, there shouldn't be anything
significant. You could test it, though.
(4) In IRB, whats the best way to time the code, above?

The simplest way is:

require 'benchmark'
require 'foo' # if you do the RubyInline example I gave
Benchmark.bm do |x|
  x.report { Foo.new.sum(1_000_000) }
  # the inner block uses |i| so it cannot clobber the outer x
  x.report { y = 0; 1_000_001.times {|i| y += i} }
  x.report { (1..1_000_000).inject(&:+) }
  x.report { n = 1_000_000; n*(n+1)/2 }
end

That'll work anywhere, though it's going to be a bit cumbersome in irb.
Someone else may have a "best" way.

I haven't run this test, though. I have no plans to, unless someone really
wants to claim that any of the loops are faster than those three integer
operations.
 
David Masover

Ugh. His, while wrong for what he is trying to do (sum from 1 to 1000000)
is vastly superior to using inject like that. It's not idiomatic. It's
obtuse.

I find this is actually significantly easier to read, though it's probably
because I've been doing it for a while:

(1..1_000_000).inject(&:+)

I'd much rather have a #sum method on enumerable, but that's almost as
concise, though it takes a bit to explain why it works.

But even spelling it out, it's pretty clear if you use descriptive variables:

(1..1_000_000).inject{|sum, i| sum + i}

I don't think that's less readable, except for the fact that you have to
understand how inject works.
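
For comparison, the spellings discussed here all produce the same number. Ruby later gained the wished-for method: Enumerable#sum arrived in Ruby 2.4, well after this thread. The sketch below treats its presence as an assumption about the Ruby in use and falls back to inject otherwise.

```ruby
range = (1..1_000_000)

with_symbol = range.inject(:+)
with_block  = range.inject { |sum, i| sum + i }

# Enumerable#sum is an assumption about the Ruby version (2.4 or later);
# older interpreters fall back to the inject result.
with_sum = range.respond_to?(:sum) ? range.sum : with_symbol

puts with_symbol
puts with_block
puts with_sum
```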
inject has no advantage with regard to
either execution speed or object creation

But it does have the theoretical advantage of fitting exactly the map/reduce
pattern -- inject _is_ reduce, by definition and by alias. It's overkill here,
and I'm probably over my head, but my understanding of why map/reduce is
efficient:

In theory, map lets you spread your dataset to up to n machines, where n is
the number of items in your dataset, and let each machine perform whatever
calculation was called for in the map. Once you've already done that, reduce
makes sense -- have each machine perform the reduce (inject) function, passing
the result to the next machine, rather than having to aggregate the result of
the map into a single location.

In reality, none of this really applies to Ruby, at least not to the standard
map/inject methods. But it's worth thinking about, and probably good practice
for the manycore monstrosities of the future.
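
The shape of that idea can be imitated in plain Ruby, although nothing below runs in parallel: the range is cut into chunks, each chunk is summed on its own, and the partial sums are then reduced to one total. The chunk size is arbitrary and only stands in for "one piece of work per machine".

```ruby
n = 1_000_000
chunk_size = 100_000  # arbitrary; stands in for one machine's share

# "Map" step: each chunk is summed independently of the others.
partial_sums = (1..n).each_slice(chunk_size).map { |chunk| chunk.inject(0, :+) }

# "Reduce" step: combine the partial results into a single total.
total = partial_sums.inject(0, :+)

puts partial_sums.size
puts total
```

Because addition is associative, the chunks could be summed in any order or on different workers and the final reduce would still give the same total.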
 
Ralph Shnelvar

MLK> Gauss's method:
MLK> def triangular(n)
MLK> (n + 1) * n / 2
MLK> end

Ugh ... you _know_ that is not what I meant!
 
Aldric Giacomoni

Ralph said:
MLK> Gauss's method:
MLK> def triangular(n)
MLK> (n + 1) * n / 2
MLK> end

Ugh ... you _know_ that is not what I meant!

Ralph, you do have to be careful. I got yelled at for answering the
questions you _meant_ to ask because they weren't the answers to the
questions you _did_ ask ;)
 
Robert Klemme

y = (1..1_000_000).inject { |a, b| a + b }

There is an issue with this approach. The proper implementation for the
general case of summing any number of values would look like this:

y = (1..1_000_000).inject(0) { |a, b| a + b }

Kind regards

robert


PS: Yes, I deliberately left out the explanation what the issue is. :)
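
For readers who want to work it out, one likely candidate for the issue is the empty collection: without an initial value, inject has nothing to start from. A minimal sketch:

```ruby
empty = []

without_seed = empty.inject { |a, b| a + b }
with_seed    = empty.inject(0) { |a, b| a + b }

p without_seed  # nil
p with_seed     # 0
```

A caller that expects a number and then does arithmetic on the result would fail on the nil, while the seeded form always returns something that can be added to.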
 
Marnen Laibow-Koser

Ralph said:
MLK> Gauss's method:
MLK> def triangular(n)
MLK> (n + 1) * n / 2
MLK> end

Ugh ... you _know_ that is not what I meant!

Why not? You'd rather do that iteratively? What kind of masochist are
you? :D

Best,
 
Ralph Shnelvar

MLK> Why not? You'd rather do that iteratively? What kind of masochist are
MLK> you?

Next time I'll be more of a masochist and lay out a more arbitrary set
of criteria so that the focus of what I am trying to understand moves
elsewhere.
 
Aldric Giacomoni

Ralph said:
Next time I'll be more of a masochist and lay out a more arbitrary set
of criteria so that the focus of what I am trying to understand moves
elsewhere.

Actually, and with all friendly intent, please read this:
http://catb.org/~esr/faqs/smart-questions.html
It's a pretty interesting read. Since I read it, the following has
happened to me several times:
1) I have a problem/question to which I can't find the answer
2) I go to a forum / newsgroup and start to write the problem.
3) In the course of laying out the problem, I think of ways to explain
exactly the problem (how it occurs, how to repeat it, etc)
4) I find the solution, close the tab, and continue working.

If you want the right answer, ask the right question. It can be a bit
tricky at first, but it's quite a useful skill.
 
Benoit Daloze

[Note: parts of this message were removed to make it a legal post.]

Ah?

realtime { p (1..1_000_000).inject {|s,e| s + e} }
=> 0.15593600273132324

realtime { p (1..1_000_000).inject(&:+) }
=> 0.1163489818572998

realtime { p (1..1_000_000).inject(:+) }
=> 0.08730292320251465

realtime { y=0; (1..1_000_000).each {|x| y+=x}; p y }
=> 0.12649822235107422

realtime { y=0; 1.upto(1_000_000) {|x| y+=x}; p y }
=> 0.12580513954162598

realtime { y=0;i=0; while(i<1_000_000); i+=1;y+=i; end; p y }
=> 0.05157589912414551

realtime { y=0; for i in (1..1_000_000); y+=i; end; p y }
=> 0.13959503173828125


It seems that inject(:+), which is a bit less idiomatic, is clearly the
better way to do this with inject.
I also think inject is fast and light on memory (no outside local
variables are needed; that's its power).

The while loop is the fastest, but it looks completely awful.
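
A small harness in the same spirit, for anyone who wants to repeat the comparison on their own machine. It is only a sketch: the labels are made up, a single run of each variant is noisy, and the ranking may differ between Ruby versions and implementations. It also checks that every variant computes the same sum.

```ruby
require 'benchmark'

n = 1_000_000

# Each variant returns the sum of 1..n so the results can be compared.
variants = {
  'inject block' => -> { (1..n).inject { |s, e| s + e } },
  'inject(:+)'   => -> { (1..n).inject(:+) },
  'each'         => -> { y = 0; (1..n).each { |x| y += x }; y },
  'while'        => -> { y = 0; i = 0; while i < n; i += 1; y += i; end; y }
}

timings = {}
results = {}
variants.each do |label, code|
  timings[label] = Benchmark.realtime { results[label] = code.call }
end

# Print the variants from fastest to slowest for this particular run.
timings.sort_by { |_, secs| secs }.each do |label, secs|
  puts format('%-14s %.4f s', label, secs)
end
```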


(1) Does the block get compiled a million times?
No

(2) What's the best Ruby way to do a sum from 1 to 1_000_000
=> The most "Ruby way" to sum the values in an Array is inject

(3) Is there a difference in speed between IRB.exe and ruby.exe in
executing the above code?
Let's see:
IRB > realtime { p (1..1_000_000).inject(:+) }
=> 0.08730292320251465
RUBY > ruby test.rb
0.08849906921386719

It's the same. IRB even looks slightly better here.

(4) In IRB, whats the best way to time the code, above?
require "benchmark"
include Benchmark
p realtime { p (1..1_000_000).inject(:+) }

If you want more details, look at the Benchmark module.
require "benchmark"
include Benchmark
bm { |b|
  b.report("mytest") { (1..1_000_000).inject(:+) }
}
 
