very slow IO (STDIN.gets and puts) on Linux, ruby 1.8.2_pre3

MiG · Mar 10, 2005

Why is Ruby 2x slower in IO than php or bash?

data.dat is an 80 MB file with 5,000,000 lines. I use Linux with 2 GB RAM
(tested on another PC with similar results).

--------------------

test.php:
#!/usr/bin/php
<? while (fgets(STDIN)); ?>

$ time ./test.php < data.dat
./test.php < data.dat 5,59s user 0,19s system 88% cpu 6,516 total

--------------------

test.rb:
#!/usr/bin/ruby
while gets
end

$ time ./test.rb < data.dat
./test.rb < data.dat 11,51s user 0,31s system 86% cpu 13,598 total

Florian Gross · Mar 10, 2005

MiG said:
Why is Ruby 2x slower in IO than php or bash?

data.dat is 80 MB file with 5000000 lines. I use Linux, 2GB RAM (tested
on another pc with similar result).

--------------------

test.php:
#!/usr/bin/php
<? while (fgets(STDIN)); ?>

$ time ./test.php < data.dat
/test.php < data.dat 5,59s user 0,19s system 88% cpu 6,516 total

--------------------

test.rb:
#!/usr/bin/ruby
while gets
end

$ time ./test.rb < data.dat
/test.rb < data.dat 11,51s user 0,31s system 86% cpu 13,598 total

Perhaps also try io.read() -- I think it will be faster.
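
A minimal sketch of that comparison, assuming a current Ruby rather than 1.8.2. The helper names and the StringIO sample are made up for illustration; with real data you would pass STDIN or an open File instead. Whether read actually comes out faster will depend on the Ruby version and the file, so time it yourself.

```ruby
require "stringio"

# Count lines by calling gets once per line (one String per line).
def count_lines_with_gets(io)
  count = 0
  count += 1 while io.gets
  count
end

# Count lines by reading the whole stream in one call, then scanning it.
def count_lines_with_read(io)
  data = io.read
  count = data.count("\n")
  # A final line without a trailing newline still counts as a line.
  count += 1 unless data.empty? || data.end_with?("\n")
  count
end

sample = "alpha\nbeta\ngamma\n" # small stand-in for data.dat
gets_count = count_lines_with_gets(StringIO.new(sample))
read_count = count_lines_with_read(StringIO.new(sample))
puts "gets: #{gets_count} lines, read: #{read_count} lines"
```

Note that read holds the whole input in memory at once, which is fine for 80 MB on a 2 GB machine but is a different trade-off from gets.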

Ben Giddings · Mar 10, 2005

MiG said:
Why is Ruby 2x slower in IO than php or bash?

data.dat is 80 MB file with 5000000 lines. I use Linux, 2GB RAM (tested
on another pc with similar result).

--------------------

test.php:
#!/usr/bin/php
<? while (fgets(STDIN)); ?>

$ time ./test.php < data.dat
./test.php < data.dat 5,59s user 0,19s system 88% cpu 6,516 total

--------------------

test.rb:
#!/usr/bin/ruby
while gets
end

$ time ./test.rb < data.dat
./test.rb < data.dat 11,51s user 0,31s system 86% cpu 13,598 total

English is so much worse than Japanese! When I try to count to one
million in English it takes me 3.42 days, but when I try it in Japanese,
it only takes me 3.12 days!

Obviously, that means English is the worse language. Why does English
suck so bad?!?

-----

In other words: your benchmark is really dumb. That isn't practical
code, and trying to draw any conclusions from it is silly. For Ruby to
be considered fast, how much time should it take to read and discard a
line of text 5 kagillion times? Btw, I found a way to optimize your code:

deleteme.rb
#!/usr/bin/ruby
exit(0)

ben% time ruby deleteme.rb
ruby deleteme.rb 0.00s user 0.00s system 102% cpu 0.006 total

I'm still working on getting it to run in less than 0.004 total.

Ben

Florian Frank · Mar 10, 2005

MiG said:
test.rb:
#!/usr/bin/ruby
while gets
end

$ time ./test.rb < data.dat
./test.rb < data.dat 11,51s user 0,31s system 86% cpu 13,598 total

Well, Ruby assigns the line string to $_ if you use gets that way, so
Ruby has to construct an object for every line. Perhaps PHP doesn't do that?
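
A small sketch that shows both effects, assuming a current Ruby. The helper name inspect_gets is hypothetical, and the pipe is only an in-process stand-in for STDIN:

```ruby
# Reads two lines and reports what gets returned and what $_ held.
def inspect_gets(io)
  first = io.gets
  last_line_after_first = $_
  second = io.gets
  {
    sets_last_line: last_line_after_first.equal?(first),   # $_ is the returned String
    fresh_object_per_line: !first.equal?(second),          # each call builds a new String
    lines: [first, second]
  }
end

reader, writer = IO.pipe # in-process stand-in for STDIN
writer.write("first\nsecond\n")
writer.close
result = inspect_gets(reader)
reader.close
p result
```

This only demonstrates that the assignment and the per-line allocation happen; it says nothing about how much of the measured time they account for.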

MiG · Mar 11, 2005

1. I have NOTHING against Ruby, it is my favorite language
2. Is it wrong to ask?
3. My dumb benchmark: I used real data. If you have 2GB of free RAM and
use an 80MB file, is it wrong? It's the same if you have 1MB RAM and use
a smaller file. I used the real data I have, that's all. It behaves the
same way with smaller files.
4. Thank you for excellent humour.

MiG

MiG · Mar 11, 2005

So the solution is maybe to use getc and parse lines on my own...
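
Roughly what that would look like, as a sketch only. It assumes a current Ruby, where getc returns a one-character String (in 1.8 it returns an integer), and the method name is made up. It makes one method call per character, so I would not expect it to beat gets without measuring.

```ruby
require "stringio"

# Yields each line of io, assembling lines one character at a time.
def each_line_via_getc(io)
  line = String.new
  while (char = io.getc)
    line << char
    if char == "\n"
      yield line
      line = String.new
    end
  end
  yield line unless line.empty? # final line without a trailing newline
end

lines = []
each_line_via_getc(StringIO.new("one\ntwo\nthree")) { |l| lines << l }
p lines
```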

MiG

gabriele renzi · Mar 11, 2005

MiG said:

So the solution is maybe to use getc and parse lines on my own...

Or maybe use one of the standard methods for iterating over lines, such as
open('file').each do |x| stuff(x) end
This would not set $_ (I don't think setting it slows things down that
much, but who knows).
Once you have stuff() in place, you can re-check whether there is a difference.
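
A sketch of that approach, assuming a current Ruby. The body of stuff is a hypothetical stand-in for the real per-line work, and File.foreach is swapped in for open('file').each because it closes the file when iteration ends:

```ruby
require "tempfile"

# Hypothetical stand-in for the real per-line work ("stuff" above).
def stuff(line)
  line.chomp.length
end

# Sums stuff(line) over every line of the file at path.
def process_file(path)
  total = 0
  File.foreach(path) { |line| total += stuff(line) }
  total
end

Tempfile.create("data") do |f| # small stand-in for data.dat
  f.write("ab\ncde\n")
  f.flush
  puts process_file(f.path)
end
```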

Navindra Umanee · Mar 11, 2005

MiG said:
So the solution is maybe to use getc and parse lines on my own...

Maybe you're missing the point.

The two programs aren't doing the same amount of work; your benchmarks
aren't equivalent. If you change the PHP benchmark slightly, you'll
likely see PHP is just as slow as Ruby.

[navindra@dot /tmp]$ time php -r 'while (fgets(STDIN));' < FILE
8.421u 2.334s 0:26.53 40.5% 0+0k 0+0io 2pf+0w
[navindra@dot /tmp]$ time ruby -e 'while gets;end' < FILE
11.676u 2.586s 0:39.44 36.1% 0+0k 0+0io 11pf+0w
[navindra@dot /tmp]$ time php -r 'while ($blah=fgets(STDIN));' < FILE
10.680u 2.372s 0:37.83 34.4% 0+0k 0+0io 10pf+0w

Cheers,
Navin.

Tom Willis · Mar 11, 2005

Navindra Umanee said:
The two programs aren't doing the same amount of work; your benchmarks
aren't equivalent. If you change the PHP benchmark slightly, you'll
likely see PHP is just as slow as Ruby.

Here are my results on a 14.5 MB file; Ruby wins.

twillis:~$ time ruby -e 'while gets;end'< HL7Audit.csv

real 0m1.481s
user 0m0.924s
sys 0m0.095s
twillis:~$ time php -r 'while($blah=fgets(STDIN));'< HL7Audit.csv

real 0m2.327s
user 0m1.001s
sys 0m0.083s

Ben Giddings · Mar 11, 2005

MiG said:
1. I have NOTHING against Ruby, it is my favorite language
2. Is it wrong to ask?
3. My dumb benchmark: I used real data. If you have 2GB of free RAM and
use an 80MB file, is it wrong? It's the same if you have 1MB RAM and use
a smaller file. I used the real data I have, that's all. It behaves the
same way with smaller files.
4. Thank you for excellent humour.

I'm glad you see the humour. I was a little harsh, but I was having a
bad day, sorry.

Really, the benchmark isn't meaningful. You need to do something
with the data you're reading. It doesn't matter if it's an 80MB file or
a 10 byte file. If you're simply reading the data and discarding it,
you aren't doing anything. For the measurement to be meaningful, you
actually need to *do something*.

Would you expect these two applications to take the same amount of time:

#!/usr/bin/env ruby

1000.times do
  # do nothing
end

------

#!/usr/bin/env ruby

1000.times do
  # rand with no argument returns a float in [0, 1); rand(1.0) always returns 0
  num = Math.sin(rand)
  if num < 0.0
    num += 1.0
  else
    num -= 1.0
  end
end

Both programs are essentially equivalent. Neither actually *does*
anything. If the second one ran slower, could you really draw any
conclusions about the speed of Ruby's math operations?

In fact, it may be that Ruby's IO is slower than other languages. If
Ruby were even close to the speed of C I'd be stunned. Ruby has to
construct an object with every line it reads. C just stuffs things
blindly into an array. The problem is that your sample doesn't test
Ruby's IO capabilities. In the end, your sample code does absolutely
nothing.

If you want to benchmark Ruby's IO, try doing something like writing a
program to concatenate a number of files, or even just to copy a file.
Open one file for writing, and then open a file for reading, read
something from the input file, write to the output file.
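
Something along these lines, as a rough sketch assuming a current Ruby. The method name, the temp files, and the generated sample data are all placeholders; substitute real input and output paths to get numbers that mean anything.

```ruby
require "benchmark"
require "tempfile"

# Copies src to dst line by line, so each line is read and then written.
def copy_by_line(src_path, dst_path)
  File.open(src_path, "rb") do |input|
    File.open(dst_path, "wb") do |output|
      input.each_line { |line| output.write(line) }
    end
  end
end

Tempfile.create("src") do |src|
  Tempfile.create("dst") do |dst|
    src.write("sample line\n" * 100_000) # stand-in for real data
    src.flush
    seconds = Benchmark.realtime { copy_by_line(src.path, dst.path) }
    puts format("copied %d bytes in %.3fs", File.size(dst.path), seconds)
  end
end
```

Because every line is written back out, the reads can't be treated as dead work, and the same loop is easy to port to PHP for a like-for-like comparison.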

In any case, until the slowness of Ruby's IO proves to be a problem in
actual use, why do you care how it fares on a benchmark?

Ben
