ruby in 50 milliseconds or less

Joel VanderWerf

If you use ruby 1.8 for quick command line tasks, and you use gems, you
may notice that the interpreter has an execution overhead that is small
but noticeable and irritating when repeated often enough.

$ time RUBYOPT='' ruby -e 1
ruby -e 1 0.01s user 0.00s system 105% cpu 0.011 total
$ time RUBYOPT='rubygems' ruby -e 1
RUBYOPT='rubygems' ruby -e 1 0.58s user 0.06s system 94% cpu 0.675 total

This is greatly improved in 1.9, which has gems built in.

$ time RUBYOPT='rubygems' ruby19 -e 1
RUBYOPT='rubygems' ruby19 -e 1 0.02s user 0.01s system 48% cpu 0.067 total

An order of magnitude improvement makes the delay much more acceptable,
but if you're working with 1.8, that's not an option.

So here's a hack for 1.8 that restores the speed of bare-metal ruby but
still lets you use gems. What it does is redefine Kernel#require to try
loading things without rubygems, but fall back to using rubygems when
there is a load failure.

Put the file in a dir on your $LOAD_PATH, and set RUBYOPT to reference
it, as shown below. *Note:* I haven't tested this widely yet. It may
break libraries that do their own hacking with require or rescue LoadError
for their own devious purposes. I advise not using this hack in
production code without careful testing.

$ cat gem-fallback.rb
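module Kernel
  # Keep a handle on the original require before redefining it.
  req = method :require
  define_method :require do |*args|
    begin
      # Fast path: try loading without rubygems.
      req.call(*args)
    rescue LoadError
      # Restore the original require so this wrapper runs at most once,
      # then load rubygems (which installs its own require) and retry.
      Kernel.module_eval do
        define_method(:require, &req)
      end
      require 'rubygems'
      require(*args)
    end
  end
end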

$ time RUBYOPT='rgem-fallback' ruby -e 1
RUBYOPT='rgem-fallback' ruby -e 1 0.01s user 0.00s system 71% cpu 0.011 total

$ time RUBYOPT='rgem-fallback' ruby -e "require 'tagz'"
RUBYOPT='rgem-fallback' ruby -e "require 'tagz'" 0.60s user 0.07s system 79% cpu 0.850 total
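
The try-cheap-first, fall-back-once pattern can be shown in isolation without touching Kernel#require. The sketch below is only an illustration: FallbackDemo, lookup, plain_lookup, and expensive_setup are hypothetical names, and the two hashes stand in for what can be found without and with rubygems.

```ruby
module FallbackDemo
  FAST = { "set"  => :stdlib }  # hypothetical: found without any setup
  SLOW = { "tagz" => :gem }     # hypothetical: found only after the setup

  @setup_runs = 0
  @index = FAST.dup

  class << self
    attr_reader :setup_runs

    # Plain lookup; raises LoadError when nothing is found, as require does.
    def plain_lookup(name)
      @index.fetch(name) { raise LoadError, "no such file to load -- #{name}" }
    end

    # Expensive one-time setup (stands in for require 'rubygems').
    def expensive_setup
      @setup_runs += 1
      @index.merge!(SLOW)
    end

    # Wrapper: try the cheap path; on failure, replace itself with the
    # plain lookup, run the setup once, and retry.
    def lookup(name)
      plain_lookup(name)
    rescue LoadError
      singleton_class.send(:alias_method, :lookup, :plain_lookup)
      expensive_setup
      lookup(name)
    end
  end
end
```

As long as every lookup succeeds on the cheap path, the setup cost is never paid. After the first miss the wrapper is gone, so later misses raise LoadError directly instead of repeating the setup.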
 
Brian Candler

Or simpler: only put "require 'rubygems'" at the top of scripts which
use rubygems.

(Obviously less convenient than using RUBYOPT of course, but your script
may be more portable)
 
Roger Pack

Looks nice, but it's solving a different problem, isn't it? It appears
that you're actually speeding up the gem loading process. My hack only
makes a difference if you're running a script that doesn't use gems at
all.

Put them together and it's a win in both cases!

It's genius! :)
=r
 
Joel VanderWerf

Brian said:
Or simpler: only put "require 'rubygems'" at the top of scripts which
use rubygems.

(Obviously less convenient than using RUBYOPT of course, but your script
may be more portable)

Except I'd rather not have to guess/remember which things are installed
as gems, and do it correctly on each host I'm running the script on. So
the require hack figures that out for me. (We do some embedded work on
smartphones, gumstix, and geode, so some of our systems don't use gems
at all.)
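
For comparison, the per-script version of the same idea is the familiar rescue idiom, which also works on hosts where the library is not installed as a gem. This is a minimal sketch: require_with_gem_fallback is a hypothetical helper name, and 'json' is only an example library.

```ruby
# Hypothetical helper: try a plain require first, and load rubygems
# only if the plain require fails.
def require_with_gem_fallback(name)
  require name
rescue LoadError
  require 'rubygems'  # harmless if rubygems is already loaded
  require name        # a second LoadError propagates to the caller
end

require_with_gem_fallback 'json'
puts JSON.generate('ok' => true)
```

The drawback is the one already mentioned: every script has to carry this boilerplate, whereas the RUBYOPT hack applies it once for all scripts.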
 
Joel VanderWerf

Here's an extreme example where this makes a huge difference:

I have a dir tree with large numbers of small gps log files, in CSV
format, and I want to use ruby -a (autosplit) to work with them.

With RUBYOPT=rgem-fallback (or of course RUBYOPT=''):

$ time find . -type f -exec ruby -F, -ane '$F' {} \;
RUBYOPT='' find . -type f -exec ruby -F, -ane '$F' {} \; 2.06s user 1.67s system 39% cpu 9.431 total

With RUBYOPT=rubygems:

$ time find . -type f -exec ruby -F, -ane '$F' {} \;
find . -type f -exec ruby -F, -ane '$F' {} \; 219.02s user 61.52s system 93% cpu 4:59.26 total

Of course, awk would probably be even faster, but ...
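
The cost above is paid once per file because find starts a new interpreter each time. Another way to avoid it, staying in ruby, is to walk the tree in a single process. A minimal sketch, assuming comma-separated lines; each_csv_row is a hypothetical helper name, and unlike ruby -a it chomps the newline before splitting:

```ruby
require 'find'

# Hypothetical helper: yields the comma-split fields of every line of
# every regular file under root, all within one interpreter process.
def each_csv_row(root)
  Find.find(root) do |path|
    next unless File.file?(path)
    File.foreach(path) do |line|
      yield line.chomp.split(',')
    end
  end
end
```

Calling each_csv_row('.') { |fields| ... } then plays the role of the -ane body, and the startup overhead is paid only once no matter what RUBYOPT contains.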
 
Joel VanderWerf

Roger said:

On Linux, faster_rubygems seems to have even more of an impact than on
the Windows installation you benchmarked. With about 250 gems installed:

$ time ruby examples/require_rubygems_normal.rb
done
ruby examples/require_rubygems_normal.rb 0.57s user 0.05s system 85% cpu 0.726 total

$ time ruby examples/require_fast_start.rb
done
ruby examples/require_fast_start.rb 0.04s user 0.02s system 46% cpu 0.121 total

Very nice!

I had been thinking of something along similar lines: locating all gem
lib dirs. But instead of pushing them all on $:, the idea was to set up a
single dir with symlinks to all the gem lib dirs. I expect it would be
faster because it would offload more of the path search to the
filesystem, rather than to ruby.
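
A rough sketch of how that symlink dir could be built, under one interpretation of the idea: link the top-level entries of each lib dir into a single directory and put only that directory on $:. build_link_dir is a hypothetical helper name, the list of lib dirs is assumed to be gathered elsewhere, and the first-one-wins handling of name clashes is an arbitrary choice.

```ruby
require 'fileutils'

# Hypothetical helper: symlink every top-level entry of each lib dir
# into link_dir. On a name clash the first lib dir wins.
def build_link_dir(lib_dirs, link_dir)
  FileUtils.mkdir_p(link_dir)
  lib_dirs.each do |lib|
    Dir.children(lib).each do |entry|
      target = File.join(link_dir, entry)
      next if File.exist?(target) || File.symlink?(target)
      File.symlink(File.expand_path(entry, lib), target)
    end
  end
  link_dir
end
```

The clash handling is the weak spot: two gems that both ship a top-level directory with the same name cannot share one symlink, so a real version would need to merge such directories or fall back to the normal load path.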
 
Joel VanderWerf

An update, in case anyone uses this: the sinatra gem uses some black
magic involving #caller, and the presence of this additional require
method on the call stack will confuse sinatra into thinking it is not in
"run" mode, so it will not parse ARGV. You can fix this by setting a
constant when loading sinatra, as shown below. (To reiterate, I don't
recommend this for production code. This is mostly for fast startup when
using ruby from the command line. For production code, I am using the
crown tool that I announced a few weeks ago[1].)

module Kernel
  req = method :require
  define_method :require do |*args|
    begin
      req.call(*args)
    rescue LoadError => ex
      # Restore the original require, then fall back to rubygems.
      Kernel.module_eval do
        define_method(:require, &req)
      end
      require 'rubygems'
      if args.grep(/sinatra/).any?
        # Tell sinatra to skip this file when it inspects #caller.
        pat = /gem-fallback.rb/
        if defined?(RUBY_IGNORE_CALLERS)
          RUBY_IGNORE_CALLERS << pat
        else
          RUBY_IGNORE_CALLERS = [pat]
        end
      end
      require(*args)
    end
  end
end
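
To see why the extra frame matters, here is a simplified illustration of caller filtering. It is not sinatra's actual code: first_relevant_frame and the sample frames are made up for the example, and the only point is that an unfiltered wrapper frame hides the script that did the require.

```ruby
# Hypothetical, simplified caller filter; not sinatra's actual code.
# Returns the first frame that matches none of the ignore patterns.
def first_relevant_frame(frames, ignore_patterns)
  frames.find { |frame| ignore_patterns.none? { |pat| frame =~ pat } }
end

# Made-up stack: the wrapper's frame sits above the user's script.
frames = [
  "/usr/lib/ruby/gem-fallback.rb:12:in `require'",
  "app.rb:1"
]

unfiltered = first_relevant_frame(frames, [])
filtered   = first_relevant_frame(frames, [/gem-fallback.rb/])
```

Here unfiltered is the gem-fallback.rb frame, while filtered is "app.rb:1", which is the effect that adding the pattern to the ignore list is meant to have.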

[1] http://github.com/vjoel/crown/tree/master
 
Robert Dober

Here's an extreme example where this makes a huge difference:

I have a dir tree with large numbers of small gps log files, in CSV format,
and I want to use ruby -a (autosplit) to work with them.

With RUBYOPT=rgem-fallback (or of course RUBYOPT=''):

$ time find . -type f -exec ruby -F, -ane '$F' {} \;
RUBYOPT='' find . -type f -exec ruby -F, -ane '$F' {} \; 2.06s user 1.67s system 39% cpu 9.431 total

With RUBYOPT=rubygems:

$ time find . -type f -exec ruby -F, -ane '$F' {} \;
find . -type f -exec ruby -F, -ane '$F' {} \; 219.02s user 61.52s system 93% cpu 4:59.26 total

Of course, awk would probably be even faster, but ...
... that would mean using the right tool for the right task ;)
Sorry, couldn't resist. This, however, does not mean that your
contribution is not valuable, because Ruby will be the right tool
often enough, and even here, maybe you have a team where everybody
knows Ruby but few know awk...
Cheers
Robert


--
If you tell the truth you don't have to remember anything.
 
