ruby in 50 milliseconds or less

Joel VanderWerf · Jul 18, 2009

If you use ruby 1.8 for quick command line tasks, and you use gems, you
may notice that the interpreter has an execution overhead that is small
but noticeable and irritating when repeated often enough.

$ time RUBYOPT='' ruby -e 1
ruby -e 1 0.01s user 0.00s system 105% cpu 0.011 total
$ time RUBYOPT='rubygems' ruby -e 1
RUBYOPT='rubygems' ruby -e 1 0.58s user 0.06s system 94% cpu 0.675 total

This is greatly improved in 1.9, which has gems built in.

$ time RUBYOPT='rubygems' ruby19 -e 1
RUBYOPT='rubygems' ruby19 -e 1 0.02s user 0.01s system 48% cpu 0.067 total

An order of magnitude improvement makes the delay much more acceptable,
but if you're working with 1.8, that's not an option.

So here's a hack for 1.8 that restores the speed of bare-metal ruby but
still lets you use gems. What it does is redefine Kernel#require to try
loading things without rubygems, but fall back to using rubygems when
there is a load failure.

Put the file in a dir on your $LOAD_PATH, and set RUBYOPT to reference
it, as shown below. *Note:* I haven't tested this widely yet. It may
break libraries that do their own hacking with require or use LoadError
for their own devious purposes. I advise not using this hack in
production code without careful testing.

$ cat gem-fallback.rb
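module Kernel
  # Capture the original (pre-rubygems) require as a Method object.
  req = method :require

  define_method :require do |*args|
    begin
      req.call(*args)
    rescue LoadError
      # First failure: put the original require back, load rubygems
      # (which installs its own require), then retry the load.
      Kernel.module_eval do
        define_method(:require, &req)
      end
      require 'rubygems'
      require(*args)
    end
  end
end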

$ time RUBYOPT='rgem-fallback' ruby -e 1
RUBYOPT='rgem-fallback' ruby -e 1 0.01s user 0.00s system 71% cpu 0.011 total

$ time RUBYOPT='rgem-fallback' ruby -e "require 'tagz'"
RUBYOPT='rgem-fallback' ruby -e "require 'tagz'" 0.60s user 0.07s system 79% cpu 0.850 total

Roger Pack · Jul 18, 2009

Joel said:
If you use ruby 1.8 for quick command line tasks, and you use gems, you
may notice that the interpreter has an execution overhead that is small
but noticeable and irritating when repeated often enough.

I've noticed this too.
My solution: a fake gem_prelude

Great minds think alike.
It would be interesting to time things tho.
http://github.com/rogerdpack/faster_rubygems/tree/master
Cheers!
=r

Brian Candler · Jul 18, 2009

Or simpler: only put "require 'rubygems'" at the top of scripts which
use rubygems.

(Obviously less convenient than using RUBYOPT of course, but your script
may be more portable)

Joel VanderWerf · Jul 18, 2009

Roger said:
http://github.com/rogerdpack/faster_rubygems/tree/master

Looks nice, but it's solving a different problem, isn't it? It appears
that you're actually speeding up the gem loading process. My hack only
makes a difference if you're running a script that doesn't use gems at all.

Put them together and it's a win in both cases!

Roger Pack · Jul 18, 2009

Joel said:
Looks nice, but it's solving a different problem, isn't it? It appears
that you're actually speeding up the gem loading process. My hack only
makes a difference if you're running a script that doesn't use gems at
all.

Put them together and it's a win in both cases!

It's genius!

=r

Joel VanderWerf · Jul 18, 2009

Brian said:
Or simpler: only put "require 'rubygems'" at the top of scripts which
use rubygems.

(Obviously less convenient than using RUBYOPT of course, but your script
may be more portable)

Except I'd rather not have to guess/remember which things are installed
as gems, and do it correctly on each host I'm running the script on. So
the require hack figures that out for me. (We do some embedded work on
smartphones, gumstix, and geode, so some of our systems don't use gems
at all.)

Joel VanderWerf · Jul 18, 2009

Here's an extreme example where this makes a huge difference:

I have a dir tree with large numbers of small gps log files, in CSV
format, and I want to use ruby -a (autosplit) to work with them.

With RUBYOPT=rgem-fallback (or of course RUBYOPT=''):

$ time find . -type f -exec ruby -F, -ane '$F' {} \;
RUBYOPT='' find . -type f -exec ruby -F, -ane '$F' {} \; 2.06s user
1.67s system 39% cpu 9.431 total

With RUBYOPT=rubygems:

$ time find . -type f -exec ruby -F, -ane '$F' {} \;
find . -type f -exec ruby -F, -ane '$F' {} \; 219.02s user 61.52s
system 93% cpu 4:59.26 total

Of course, awk would probably be even faster, but ...

Joel VanderWerf · Jul 18, 2009

Roger said:
http://github.com/rogerdpack/faster_rubygems/tree/master

On Linux, faster_rubygems seems to have even more of an impact than on
the Windows installation you benchmarked. With about 250 gems installed:

$ time ruby examples/require_rubygems_normal.rb
done
ruby examples/require_rubygems_normal.rb 0.57s user 0.05s system 85%
cpu 0.726 total

$ time ruby examples/require_fast_start.rb
done
ruby examples/require_fast_start.rb 0.04s user 0.02s system 46% cpu
0.121 total

Very nice!

I had been thinking of something along similar lines, locating all gem lib
dirs. But instead of pushing them all on $:, the idea was to set up a
single dir with symlinks to all the gem lib dirs. I expect it would be
faster because it would offload more of the path search to the
filesystem, rather than to ruby.
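
A rough sketch of the symlink idea, for illustration only. The helper name link_gem_libs and the two demo gems are made up, and it assumes a modern Ruby (Dir.children needs 2.5 or later) on a filesystem that supports symlinks. It links the top-level entries of each gem lib dir into one directory, so that only that directory has to go on $LOAD_PATH:

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper (not part of faster_rubygems): symlink every
# top-level entry of each gem lib dir into a single directory.
def link_gem_libs(lib_dirs, target_dir)
  FileUtils.mkdir_p(target_dir)
  lib_dirs.each do |lib_dir|
    Dir.children(lib_dir).each do |entry|
      link = File.join(target_dir, entry)
      # On a name clash the first gem wins; a real tool would need a policy.
      next if File.exist?(link) || File.symlink?(link)
      File.symlink(File.join(lib_dir, entry), link)
    end
  end
  target_dir
end

# Demo with two made-up gems in a temp directory.
Dir.mktmpdir do |root|
  lib_dirs = %w[fallback_demo_a-1.0 fallback_demo_b-2.1].map do |gem_dir|
    lib = File.join(root, 'gems', gem_dir, 'lib')
    FileUtils.mkdir_p(lib)
    name = gem_dir.split('-').first
    File.write(File.join(lib, "#{name}.rb"), "#{name.upcase} = :loaded\n")
    lib
  end

  merged = link_gem_libs(lib_dirs, File.join(root, 'merged'))
  $LOAD_PATH.unshift(merged) # one path entry instead of one per gem
  require 'fallback_demo_a'
  require 'fallback_demo_b'
  puts Dir.children(merged).sort.inspect
end
```

This ignores version selection entirely: with several versions of one gem installed, whichever lib dir is linked first shadows the others, so it is only a starting point.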

Joel VanderWerf · Sep 12, 2009

An update, in case anyone uses this: the sinatra gem uses some black
magic involving #caller, and the presence of this additional require
method on the call stack will confuse sinatra into thinking it is not in
"run" mode and it will not parse ARGV. You can fix this by setting a
constant when loading sinatra, as shown below. (To reiterate, I don't
recommend this for production code. This is mostly for fast startup when
using ruby from the command line. For production code, I am using the
crown tool that I announced a few weeks ago[1].)

module Kernel
  req = method :require

  define_method :require do |*args|
    begin
      req.call(*args)
    rescue LoadError => ex
      Kernel.module_eval do
        define_method(:require, &req)
      end
      require 'rubygems'
      if args.grep(/sinatra/).any?
        # Register this file so sinatra can ignore it when it inspects #caller.
        pat = /gem-fallback.rb/
        if defined?(RUBY_IGNORE_CALLERS)
          RUBY_IGNORE_CALLERS << pat
        else
          RUBY_IGNORE_CALLERS = [pat]
        end
      end
      require(*args)
    end
  end
end

[1] http://github.com/vjoel/crown/tree/master
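
The #caller problem can be seen in miniature without sinatra. In this toy sketch, CallerDemo and its method names are invented for illustration. It wraps a method the same way the hack wraps require and compares the backtraces the wrapped code sees:

```ruby
# Toy stand-in for a library that inspects its own call stack.
module CallerDemo
  def self.load_feature
    caller # the backtrace as the "library" sees it
  end

  original = method(:load_feature)

  # Wrap the original with a block-defined method, as the hack does
  # with Kernel#require.
  define_singleton_method(:wrapped_load_feature) { original.call }
end

direct  = CallerDemo.load_feature
wrapped = CallerDemo.wrapped_load_feature

puts "frames seen when called directly:  #{direct.length}"
puts "frames seen through the wrapper:   #{wrapped.length}"
puts "extra frames added by the wrapper: #{wrapped.length - direct.length}"
```

Any code that decides what to do from the first few entries of caller will see the wrapper's frames first, which is why an ignore pattern for gem-fallback.rb is needed.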

Robert Dober · Sep 13, 2009

Joel said:
Here's an extreme example where this makes a huge difference:

I have a dir tree with large numbers of small gps log files, in CSV format,
and I want to use ruby -a (autosplit) to work with them.

With RUBYOPT=rgem-fallback (or of course RUBYOPT=''):

$ time find . -type f -exec ruby -F, -ane '$F' {} \;
RUBYOPT='' find . -type f -exec ruby -F, -ane '$F' {} \; 2.06s user 1.67s
system 39% cpu 9.431 total

With RUBYOPT=rubygems:

$ time find . -type f -exec ruby -F, -ane '$F' {} \;
find . -type f -exec ruby -F, -ane '$F' {} \; 219.02s user 61.52s system
93% cpu 4:59.26 total

Of course, awk would probably be even faster, but ...

... that would mean using the right tool for the right task

Sorry couldn't resist. This however does not mean that your
contribution is not very valuable, because Ruby will be the right tool
often enough and even here, maybe you have a team where everybody
knows Ruby but few know awk....
Cheers
Robert

--
If you tell the truth you don't have to remember anything.

Joel VanderWerf · Sep 13, 2009

Robert said:
... that would mean using the right tool for the right task

and where's the fun in that!
