Ruby/Odeum vs. Lucene Performance


Zed A. Shaw

Hi All,

At the risk of starting a major flame war and giving Java player-haters
more fuel for their ire, I've done a performance comparison between
Ruby/Odeum and Lucene:

http://www.zedshaw.com/projects/ruby_odeum/performance.html

Please don't take this as a "Java sucks Ruby rulez" posting, or that
I've done any sort of scientific analysis here. I'm a professional Java
developer so I'm agnostic about the language wars. It's simply intended
to answer a common question I get related to Ruby/Odeum.

For the people who can't or won't read, the test is informal and shows
that Ruby/Odeum is about 10 times faster when doing a search.

Comments are welcome.

Zed A. Shaw
 

ES

On 15/5/2005, Zed A. Shaw said:
For the people who can't or won't read, the test is informal and shows
that Ruby/Odeum is about 10 times faster when doing a search.

Woo hoo! Java sux0r, Ruby rules!
Zed A. Shaw


P.S. Good library :)
 

Robert Feldt

Zed A. Shaw said:
For the people who can't or won't read, the test is informal and shows
that Ruby/Odeum is about 10 times faster when doing a search.

Even though performance comparisons are really hard to get right/fair,
this is a very nice indication. I will probably use Ruby/Odeum in an
ongoing project.

Thanks,

Robert
 

Stephen Kellett

Zed A. Shaw said:
For the people who can't or won't read, the test is informal and shows
that Ruby/Odeum is about 10 times faster when doing a search.

You should be able to compare the Ruby/JVM startup times by writing
minimal apps for each that are effectively

void main()
{
}

Run each 1000 times and compare.
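The suggestion above can be sketched as a small timing harness. This is an illustrative helper (the name `mean_startup_seconds` and the discard-first-run policy are assumptions, not anything from either project), and it only measures wall-clock process launch time:

```ruby
# Hypothetical harness: launch a command with an (effectively) empty
# program many times and average the wall-clock launch time.
def mean_startup_seconds(cmd, runs: 10)
  times = runs.times.map do
    t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    system(*cmd, out: File::NULL, err: File::NULL)
    Process.clock_gettime(Process::CLOCK_MONOTONIC) - t0
  end
  times.shift # discard the first, cold-cache run, as suggested above
  times.sum / times.size
end

# e.g. compare mean_startup_seconds(["ruby", "-e", ""]) against
# mean_startup_seconds(["java", "Main"]) for an empty Main class.
```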

I think 5 times is far too few when you are relying on the OS to load
stuff etc. You should discard the first time, as all subsequent times
will most likely bring your DLLs/SOs from cache.

For what it's worth, on my 1GHz Athlon Windows XP box when I run Java
Performance Validator and Ruby Performance Validator, I get the
impression that Ruby startup time is longer than JVM startup time. But
then again there is all the time of the injected stub from JPV/RPV as
well, and maybe the RPV stub is taking longer, not Ruby.

If Ruby startup time is longer than JVM startup time, that means you are
doing an even better job than you thought :). This wouldn't surprise me
as the Java String class is not built in, it's JIT'd, whereas in Ruby the
string support is builtin.

Stephen
 

Zed A. Shaw

You should be able to compare the Ruby/JVM startup times by writing
minimal apps for each that are effectively

void main()
{
}

Run each 1000 times and compare.

Actually, I have a confession to make in that I anticipated this and set
a trap. :)

The first thing is that there's no statistical basis for "1000 times".
You actually want to run the test several times in a series of sample
runs and then determine the common ramp-up time from a cold start.
Otherwise you'll never know if the few times you ran your "1000 times"
test were just flukes or not.

The second thing is that your simple main() for both systems actually
isn't the "start-up time" since there is complexity in the class loader,
hotspot JIT compilers, Ruby source translation, etc. All you are
testing is the time it takes to load your one little main function.

The actual way to test without the JVM and Ruby start-up times is to do
the timing inside the JVM rather than outside. In other words, have a
test case that just runs 1000 times and measure either the total time to
do the one run, or average and standard deviation of each measurement.
Again, when you do the test this way you have to figure out the common
ramp-up time for the system so that you can remove them from the test
case as outliers later.
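A minimal sketch of that in-process approach, assuming a Ruby harness (the helper name `time_in_process` and the fixed warm-up count are my own illustration): run the operation repeatedly inside one interpreter so start-up cost is excluded, drop the ramp-up samples, and report mean and standard deviation.

```ruby
# Time a block many times inside a single process, discarding the first
# few (warm-up) samples as ramp-up outliers, then return the mean and
# sample standard deviation of the remaining measurements.
def time_in_process(warmup: 3, samples: 20)
  raw = (warmup + samples).times.map do
    t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield
    Process.clock_gettime(Process::CLOCK_MONOTONIC) - t0
  end
  kept = raw.drop(warmup)
  mean = kept.sum / kept.size
  var  = kept.sum { |t| (t - mean)**2 } / (kept.size - 1)
  [mean, Math.sqrt(var)]
end
```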

But, of course all of this would take way too long. I'll let Lucene
folks go through that pain if they feel the need. :)

I think 5 times is far too few when you are relying on the OS to load
stuff etc. You should discard the first time, as all subsequent times
will most likely bring your DLLs/SOs from cache.

You're right in a way, but your idea that it is only the "first time"
isn't quite right, as the ramp-up period can vary between runs.

FYI, I did the mean of 5 samples after running a few to get rid of
ramp-up. I just "eye-balled" the ramp-up, so don't quote me on the
validity at all.

Also, there's solid statistics behind only doing a few samples, but I
didn't use any of those techniques. I believe entire industries have
been founded on papers with only 3 samples. :)

For what it's worth, on my 1GHz Athlon Windows XP box when I run Java
Performance Validator and Ruby Performance Validator, I get the
impression that Ruby startup time is longer than JVM startup time. But
then again there is all the time of the injected stub from JPV/RPV as
well, and maybe the RPV stub is taking longer, not Ruby.

Interesting.

If Ruby startup time is longer than JVM startup time, that means you are
doing an even better job than you thought :). This wouldn't surprise me
as the Java String class is not built in, it's JIT'd, whereas in Ruby the
string support is builtin.

I don't know, the JVM JIT really punishes command line tools to death,
and it's such a pain to turn it off. JIT rocks for long running
processes, but a test like this is probably being seriously punished.
That's why I was a bit sheepish about the 10 times faster claim without
specifically saying that I wanted to include start-up time since I'm
writing a CLI tool.

Thanks for the comments.

Zed
 

Stephen Kellett

Zed A. Shaw said:
The first thing is that there's no statistical basis for "1000 times".

There is. The error is smaller. If you don't believe me you need to
examine why pollsters always ask at least 1000 potential voters their
opinion. The error rate is +/- 3% with a sample size of approx 1000
voters. Ask 10 people and predict the election result and your error
will be much greater than 3%. The pollsters are in it to make money
predicting outcomes. If they could get away with 5 or 10 samples, they
would. It would be more profitable. They don't do it that way.
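The pollsters' +/- 3% figure comes from the standard error of a proportion: at the worst case p = 0.5, the 95% margin of error is roughly 0.98/sqrt(n). A quick illustration (the helper name is mine, not anything from the thread):

```ruby
# 95% margin of error when estimating a proportion p from n samples.
# At p = 0.5 this is the pollsters' worst case.
def margin_of_error(n, p = 0.5)
  1.96 * Math.sqrt(p * (1 - p) / n)
end

margin_of_error(1000) # roughly 0.031, i.e. +/- 3%
```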

Also, having written timing analysis programs and deliberately written
in options to allow me to run the test once, 10 times, a million times,
whatever, I notice that the more tests you run, the more the errors from
the fast runs and the slow runs get averaged out, and you get closer to
the real result.

I disagree with you.
You actually want to run the test several times in a series of sample
runs and then determine the common ramp-up time from a cold start.

Well, if you want to do that, I hope your cold start includes a reboot
of the machine. You don't want anything in the cache.
Otherwise you'll never know if the few times you ran your "1000 times"
test were just flukes or not.

I think you misunderstand me. I mean you need to run your test 1000
times, not put your test in a loop for 1000 times and run it a few
times. If you are doing that from a cold start (after boot up) I can see
why you wouldn't want to do that :). If you are just doing it from the
command line, wrap it in a shell script that executes ruby/jvm 1000 times.

The number doesn't have to be 1000; it needs to be something large
enough to make the error small enough to be discountable. You decide
what is discountable for your purposes.
Also, there's solid statistics behind only doing a few samples, but I
didn't use any of those techniques. I believe entire industries have
been founded on papers with only 3 samples. :)

I think you are mistaken. Back to the pollsters again...
You are voting Liberal, he is voting Conservative and she is voting
Labour. So what's the result of the election? :)

Stephen
(For American readers, replace with Ralph Nader, Republican and
Democrat).
 

Steven Jenkins

Stephen said:
There is. The error is smaller. If you don't believe me you need to
examine why pollsters always ask at least 1000 potential voters their
opinion. The error rate is +/- 3% with a sample size of approx 1000
voters. Ask 10 people and predict the election result and your error
will be much greater than 3%. The pollsters are in it to make money
predicting outcomes. If they could get away with 5 or 10 samples, they
would. It would be more profitable. They don't do it that way.

True (mostly), but irrelevant. Those statistics apply to problems of
estimating proportions, but this isn't one.

Characterizing the performance of systems like this can be expressed as
a simple linear regression problem:

t = a + bx + e

where

t = runtime
a = fixed overhead (startup, teardown, etc.)
b = runtime per 'size' unit
x = size of request or returned data
e = random error

Choose N values of x and observe their corresponding t values. Estimate
a and b using standard regression techniques.

The "goodness" (i.e., the variance) of the estimates of a and b depends
on the variance of e and the value of N. If var(e) is small, you can get
good estimates of a and b with small N. In particular, if var(e) = 0,
you can get perfect estimates of a and b with N = 2.
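The "standard regression techniques" above amount to ordinary least squares. A minimal sketch in Ruby (the helper name `fit_overhead_and_rate` is my own; the model is exactly the t = a + bx + e from the post):

```ruby
# Least-squares estimates of a (fixed overhead) and b (runtime per size
# unit) from N observed (x, t) pairs, per the model t = a + b*x + e.
def fit_overhead_and_rate(xs, ts)
  n  = xs.size.to_f
  mx = xs.sum / n
  mt = ts.sum / n
  b  = xs.zip(ts).sum { |x, t| (x - mx) * (t - mt) } /
       xs.sum { |x| (x - mx)**2 }
  a  = mt - b * mx
  [a, b]
end
```

With var(e) = 0 the fit is exact, matching the observation that N = 2 noiseless points pin down a and b perfectly.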

If I needed 1000 samples to get good estimates of performance of an
information system, I'd stop trying to overcome that with large numbers
and figure out why randomness plays such a large role in the performance
of my system.

Steve
 

Stephen Kellett

Steven Jenkins said:
and figure out why randomness plays such a large role in the performance
of my system.

The execution of other programs running on your multi-process capable
computer. Hence my rationale.

Stephen
 

Steven Jenkins

Stephen said:
The execution of other programs running on your multi-process capable
computer. Hence my rationale.

Your rationale invoked an analogy to estimation of proportions, which,
as I noted, does not apply to the problem at hand. The mere existence of
random disturbances does not imply the need to run 1000 tests.

The right way to do it is to calculate the variances, or better still,
the confidence intervals of the performance estimates. Pardon my
bluntness, but anything else is hand-waving.
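For concreteness, a confidence interval for a mean runtime can be sketched like this (an illustrative helper using the normal approximation, which is reasonable for moderately large samples; the name is mine):

```ruby
# Approximate 95% confidence interval for the mean of a set of runtime
# samples, using the normal approximation 1.96 * sd / sqrt(n).
def confidence_interval(samples)
  n    = samples.size.to_f
  mean = samples.sum / n
  sd   = Math.sqrt(samples.sum { |t| (t - mean)**2 } / (n - 1))
  half = 1.96 * sd / Math.sqrt(n)
  [mean - half, mean + half]
end
```

If the interval is already narrow after a handful of samples, collecting a thousand more buys little.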

Steve
 
