For performance, write it in C


Chad Perrin

I suspect it depends on what you're doing...

To clarify: I meant "on average" or "in general". Obviously, there will
be instances where Java will outperform Haskell or, for that matter,
even C -- just as there are times Perl can outperform C, for an
equivalent amount of invested programmer time, et cetera. I suspect the
same is true even of Ruby, despite its comparatively crappy execution
speed. That doesn't change the fact that in the majority of cases,
Haskell will outperform most other languages. It is, after all, the C
of functional programming.

I'm interested to know more about that.
Could you elaborate? A reference would do.

I'm having difficulty finding citations for this that actually explain
anything, but the short and sloppy version is as follows:

Because imperative-style programming "won" the programming paradigm
battle back in the antediluvian days of programming, processors have
over time been oriented more and more toward efficient execution of code
written in that style. When a new processor design and instruction set
proves more efficient at executing code, it is because it has been
better architected for the software that will actually run on it -- it
handles the instructions it will most often be given with greater
alacrity. Since almost all programs written today are written in
imperative, rather than functional, style, this means that processors
are optimized for execution of imperative code (or, more specifically,
execution of binaries compiled from imperative code).

As a result, functional programming languages operate at a slight
compilation efficiency disadvantage -- a disadvantage that has been
growing for decades. There are off-hand remarks all over the web about
how functional programming languages supposedly do not compile as
efficiently as imperative programming languages, but these statements
only tell part of the story: the full tale is that functional
programming languages do not compile as efficiently on processors
optimized for imperative-style programming.

We are likely heading into an era where that will be less strictly the
case, however, and functional languages will be able to start catching
up, performance-wise. Newer programming languages are beginning to get
further from their imperative roots, incorporating more characteristics
of functional-style languages (think of Ruby's convergence on Lisp, for
instance). For now, however, O'Caml and, even more so, Haskell are at
a disadvantage because their most efficient execution environment isn't
available on our computers.
 

Chad Perrin

Something else to consider is the ease with which Ruby extensions can
be written in C. The first time I tried, I had something running in 20
minutes.
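
A minimal extension is only a handful of lines. Something like this
sketch against Ruby's C API should work (the module, function, and file
names here are all hypothetical):
-----------------------------------------------------------
/* greet.c -- a tiny Ruby extension sketch (hypothetical names). */
#include "ruby.h"

/* Greet.hello("world") => "hello, world" */
static VALUE hello(VALUE self, VALUE name)
{
    VALUE result = rb_str_new2("hello, ");
    return rb_str_concat(result, name);
}

/* Called by Ruby when the library is require'd. */
void Init_greet(void)
{
    VALUE mGreet = rb_define_module("Greet");
    rb_define_module_function(mGreet, "hello", hello, 1);
}
-----------------------------------------------------------
Build it with a two-line extconf.rb (require 'mkmf' and
create_makefile('greet')), run "ruby extconf.rb && make", and then
require 'greet' from Ruby.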

Though if I was going to choose a (single) language for raw
performance I'd try to go with Pascal or Ada.

Pascal's sort of an iffy proposition for me, in comparison with C. I'm
simply not sure that it can be optimized as thoroughly as C, in any
current implementations. According to its spec, it can probably
outperform C if implemented well, and Borland Delphi does a reasonably
good job of that, but it has received considerably less attention from
compiler programmers over time and as such is probably lagging in
implementation performance. It's kind of a mixed bag, and I'd like to
get more data on comparative performance characteristics than I
currently have.

Ada, on the other hand -- for circumstances in which it is most commonly
employed (embedded systems, et cetera), it does indeed tend to kick C's
behind a bit. That may have more to do with compiler optimization than
language spec, though.
 

Chad Perrin

This is a great post, and should at least be posted to a blog somewhere so
the masses who don't know about USENET can still find it on Google!

This list is not only on USENET, for what it's worth.
 

Ashley Moran

This recent mania for VMs is irksome to me. The same benefits can be
had from a JIT compiler, without the attendant downsides of a VM (such
as greater persistent memory usage, et cetera).

Chad,

I'm late to this conversation but I've been interested in Ruby
performance lately. I just had to write a script to process about
1-1.5GB of CSV data (no major calculations, but it involves about 20
million rows, or something in that region). The Ruby implementation
I wrote takes about 2.5 hours to run -- I think memory management is
the main issue, as the manual garbage-collection run I added after
each file takes several minutes for the larger sets of data. As
you can imagine, I am more than eager for YARV/Rite.

Anyway, my question really is: I thought a VM was a prerequisite
for JIT? Is that not the case? And if the YARV VM is not the way to
go, what is?

Ashley
 

Chad Perrin

Peter said:
I will run your Ruby version and the Java version that I write and post
the results here. Give us a week or so as I have other things to be doing.

Hmm, in a week this discussion will be over (ok, it will reappear some time
soon, but nevertheless) and everybody has swallowed your points.

$ ruby -v
ruby 1.8.4 (2005-12-24) [i386-mingw32]

$ time ruby latin.rb 5 > latin.txt

real 0m4.703s
user 0m0.015s
sys 0m0.000s

(this is a 2.13GHz PentiumM, 1GB RAM, forget the user and sys timings, but
'real' is for real, this is WinXP)

Holy crap, that's fast.

--
CCD CopyWrite Chad Perrin [ http://ccd.apotheon.org ]
"A script is what you give the actors. A program
is what you give the audience." - Larry Wall
 

Ron M

Charles said:
I'll lob a couple of grenades and then duck for cover.

- Write it in C is as valid as write it in Java (as someone else
mentioned).

Not really. In C you can quite easily use inline assembly
to use your chip's MMX/SSE/VIS/AltiVec extensions, and if
you need more, interface to your GPU if you want to use it
as a coprocessor.

I don't know of any good way of doing those in Java except
by writing native extensions in C or directly with an assembler.

Last I played with Java it didn't have a working cross-platform
mmap, and if that's still true, the awesome NArray+mmap Ruby code
floating around is a good real-world example of this flexibility.
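
Even without raw asm syntax, C compilers expose those vector units
through intrinsics. A sketch of the idea, assuming an x86 compiler
that provides <xmmintrin.h> (gcc, for instance):
-----------------------------------------------------------
/* Add two float arrays four lanes at a time with SSE intrinsics. */
#include <xmmintrin.h>

void add_floats(float *dst, const float *a, const float *b, int n)
{
    int i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);   /* load 4 unaligned floats */
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(dst + i, _mm_add_ps(va, vb));
    }
    for (; i < n; i++)                     /* scalar tail */
        dst[i] = a[i] + b[i];
}
-----------------------------------------------------------
Nothing in pure Java gives you that kind of control.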
 

ara.t.howard

Just to show the beauty of ruby:
-----------------------------------------------------------
require 'rubygems'
require 'permutation'
require 'set'

$size = (ARGV.shift || 5).to_i

$perms = Permutation.new($size).map{|p| p.value}
$out = $perms.map{|p| p.map{|v| v+1}.join}
$filter = $perms.map do |p|
  s = SortedSet.new
  $perms.each_with_index do |o, i|
    o.each_with_index {|v, j| s.add(i) if p[j] == v}
  end && s.to_a
end

$latins = []
def search lines, possibs
  return $latins << lines if lines.size == $size
  possibs.each do |p|
    search lines + [p], (possibs - $filter[p]).subtract(lines.last.to_i..p)
  end
end

search [], SortedSet[*(0...$perms.size)]

$latins.each do |latin|
  $perms.each do |perm|
    perm.each{|p| puts $out[latin[p]]}
    puts
  end
end
-----------------------------------------------------------
(does someone have a nicer/even faster version?)

would you please run that on your machine?
perhaps you have to do a "gem install permutation"
(no I don't think it's faster than your C code, but
it should beat the perl version)
If you really really want that performance boost then take the following
advice very seriously - "Write it in C".

Agreed, 100%. For those who want speed, speed, and nothing
else, there is hardly a better way.

thanks

Simon

harp:~ > time ruby latin.rb 5 > 5.out
real 0m11.170s
user 0m10.840s
sys 0m0.040s

harp:~ > uname -srm
Linux 2.4.21-40.EL i686

harp:~ > cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 2
model name : Intel(R) Pentium(R) 4 CPU 2.40GHz
stepping : 7
cpu MHz : 2386.575
cache size : 512 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
bogomips : 4757.91

harp:~ > ruby -v
ruby 1.8.4 (2005-12-01) [i686-linux]


not too shabby. definitely not worth the hassle for 5 seconds of c.

-a
 

Francis Cianfrocca

Ashley said:
performance lately. I just had to write a script to process about
1-1.5GB of CSV data (No major calculations, but it involves about 20
million rows, or something in that region).

I've had tremendous results optimizing Ruby programs that process huge
piles of text. There is a range of "tricks" you can use to keep Ruby
from wasting memory, which is its real downfall. If it's possible, given
your application, to process your CSV text in such a way that you don't
store any transformations of the whole set in memory at once, you'll go
orders of magnitude faster. You can even try to break your computation
up into multiple stages, and stream the intermediate results out to
temporary files. As ugly as that sounds, it will be far faster.
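
For instance, here's a sketch of the streaming style (the file names
and the ID tweak are made up, just to show the shape of it):
-----------------------------------------------------------
# Process a big CSV one line at a time; only the current row is
# ever held in memory. Assumes no quoted, comma-containing cells.
require 'zlib'

Zlib::GzipWriter.open('out.csv.gz') do |out|
  File.foreach('in.csv') do |line|
    row = line.chomp.split(',')
    row[0] = (row[0].to_i + 1_000_000).to_s  # e.g. re-base an ID column
    out.puts(row.join(','))
  end
end
-----------------------------------------------------------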

In regard to the whole conversation on this thread: at the end of the
day, absolute performance only matters if you can put a dollar amount on
it. That makes language comparisons without context essentially
meaningless.

In regard to YARV: I get a creepy feeling about anything that is
considered by most of the world to be the prospective answer to all
their problems. And as a former language designer, I have some reasons
to believe that a VM will not be Ruby's performance panacea.
 

Chad Perrin

I'm late to this conversation but I've been interested in Ruby
performance lately. I just had to write a script to process about
1-1.5GB of CSV data (no major calculations, but it involves about 20
million rows, or something in that region). The Ruby implementation
I wrote takes about 2.5 hours to run -- I think memory management is
the main issue, as the manual garbage-collection run I added after
each file takes several minutes for the larger sets of data. As
you can imagine, I am more than eager for YARV/Rite.

Anyway, my question really is: I thought a VM was a prerequisite
for JIT? Is that not the case? And if the YARV VM is not the way to
go, what is?

The canonical example for comparison, I suppose, is the Java VM vs. the
Perl JIT compiler. In Java, the source is compiled to bytecode and
stored. In Perl, the source remains in source form, and is stored as
ASCII (or whatever). When execution happens with Java, the VM actually
interprets the bytecode. Java bytecode is compiled for a virtual
computer system (the "virtual machine"), which then runs the code as
though it were native binary compiled for this virtual machine. That
virtual machine is, from the perspective of the OS, an interpreter,
however. Thus, Java is generally half-compiled and half-interpreted,
which speeds up the interpretation process.

When execution happens in Perl 5.x, on the other hand, a compiler runs
at execution time, compiling executable binary code from the source. It
does so in stages, however, to allow for the dynamic runtime effects of
Perl to take place -- which is one reason the JIT compiler is generally
preferable to a compiler of persistent binary executables in the style
of C. Perl is, thus, technically a compiled language, and not an
interpreted language like Ruby.

Something akin to bytecode compilation could be used to improve upon the
execution speed of Perl programs without diverging from the
JIT-compilation execution it currently uses and also without giving up
any of the dynamic runtime capabilities of Perl. This would involve
running the first (couple of) pass(es) of the compiler to produce a
persistent binary compiled file with the dynamic elements still left in
an uncompiled form, to be JIT-compiled at execution time. That would
probably grant the best performance available for a dynamic language,
and would avoid the overhead of a VM implementation. It would, however,
require some pretty clever programmers to implement in a sane fashion.

I'm not entirely certain that would be appropriate for Ruby, considering
how much of the language ends up being dynamic in implementation, but it
bothers me that it doesn't even seem to be up for discussion. In fact,
Perl is heading in the direction of a VM implementation with Perl 6,
despite the performance successes of the Perl 5.x compiler. Rather than
improve upon an implementation that is working brilliantly, they seem
intent upon tossing it out and creating a different implementation
altogether that, as far as I can see, doesn't hold out much hope for
improvement. I could, of course, be wrong about that, but that's how it
looks from where I'm standing.

It just looks to me like everyone's chasing VMs. While the nontrivial
problems with Java's VM are in many cases specific to the Java VM (the
Smalltalk VMs have tended to be rather better designed, for instance),
there are still issues inherent in the VM approach as currently
envisioned, and as such it leaves sort of a bad taste in my mouth.

I think I've rambled. I'll stop now.
 

ara.t.howard

In regard to YARV: I get a creepy feeling about anything that is
considered by most of the world to be the prospective answer to all
their problems. And as a former language designer, I have some reasons
to believe that a VM will not be Ruby's performance panacea.

one of the reasons i've been pushing so hard for an msys-based ruby is that
having a 'compilable' ruby on all platforms might open up development on
jit-type things like ruby inline - which is pretty dang neat.

2 cts.

-a
 

Chad Perrin

I've had tremendous results optimizing Ruby programs that process huge
piles of text. There is a range of "tricks" you can use to keep Ruby
from wasting memory, which is its real downfall. If it's possible, given
your application, to process your CSV text in such a way that you don't
store any transformations of the whole set in memory at once, you'll go
orders of magnitude faster. You can even try to break your computation
up into multiple stages, and stream the intermediate results out to
temporary files. As ugly as that sounds, it will be far faster.

One of these days, I'll actually know enough Ruby to be sure of what
language constructs work for what purposes in terms of performance. I
rather suspect there are prettier AND better-performing options than
using temporary files to store data during computation, however.
 

Francis Cianfrocca

Ron said:
Not really. In C you can quite easily use inline assembly
to use your chip's MMX/SSE/VIS/AltiVec extensions, and if
you need more, interface to your GPU if you want to use it
as a coprocessor.

I don't know of any good way of doing those in Java except
by writing native extensions in C or directly with an assembler.

Last I played with Java it didn't have a working cross-platform
mmap, and if that's still true, the awesome NArray+mmap Ruby code
floating around is a good real-world example of this flexibility.


Your point about Java here is very well-taken. I'd add that you don't
even really need to drop into asm to get most of the benefits you're
talking about. C compilers are really very good at optimizing, and I
think you'll get nearly all of the available performance benefits from
well-written C alone. (I've written at least a million lines of
production asm code in my life, as well as a pile of commercial
compilers for various languages.) It goes back to economics again. Very
few applications gain enough incremental value from the extra 5-10%
performance boost of hand-tuned asm to justify its vastly higher cost
(development, maintenance, and loss of portability). A tiny number of
pretty unusual apps (graphics processing, perhaps) will get a lot more
than 10% from asm.

The performance increment in going from Ruby to C is in *many* cases a
lot more than 10%; in fact, it can easily be 10,000%.
 

Francis Cianfrocca

Chad said:
One of these days, I'll actually know enough Ruby to be sure of what
language constructs work for what purposes in terms of performance. I
rather suspect there are prettier AND better-performing options than
using temporary files to store data during computation, however.

Ashley was talking about 1GB+ datasets, iirc. I'd love to see an
in-memory data structure (Ruby or otherwise) that can slug a few of
those around without breathing hard. And on most machines, you're going
through the disk anyway with a dataset that large, as it thrashes your
virtual memory. So why not take advantage of the tunings that are built
into the I/O channel?

If I'm using C, I always handle datasets that big with the kernel vm
functions -- generally faster than the I/O functions. I don't know how to
do that portably in Ruby (yet).
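
In C that approach is only a few lines. A sketch, assuming POSIX mmap
(error handling abbreviated):
-----------------------------------------------------------
/* Count lines in a huge file through the kernel VM instead of read(). */
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

long count_lines(const char *path)
{
    int fd = open(path, O_RDONLY);
    struct stat st;
    if (fd < 0 || fstat(fd, &st) < 0)
        return -1;
    char *p = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED)
        return -1;
    long n = 0;
    for (off_t i = 0; i < st.st_size; i++)
        if (p[i] == '\n')
            n++;               /* the kernel pages data in as needed */
    munmap(p, st.st_size);
    close(fd);
    return n;
}
-----------------------------------------------------------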
 

Isaac Gouy

Chad said:
On Wed, Jul 26, 2006 at 11:29:06PM +0900, Ryan McGovern wrote: -snip-
For those keen on functional programming syntax, Haskell is a better
choice than Java for performance: in fact, the only thing keeping
Haskell from performing as well as C, from what I understand, is the
current state of processor design. Similarly, O'Caml is one of the
fastest non-C languages available: it consistently, in a wide range of
benchmark tests and real-world anecdotal comparisons, executes "at least
half as quickly" as C, which is faster than it sounds.

For those keen on functional programming, Clean produces small fast
executables.

The OP is right, though: if execution speed is your top priority, use C.
Java is an also-ran -- what people generally mean when they say that
Java is almost as fast as C is that a given application written in both
C and Java "also runs in under a second" in Java, or something to that
effect. While that may be true, there's a significant difference
between 0.023 seconds and 0.8 seconds (for hypothetical example).

That sounds wrong to me - I hear positive comments about Java
performance for long-running programs, not for programs that run in
under a second.
 

Hal Fulton

Isaac said:
That sounds wrong to me - I hear positive comments about Java
performance for long-running programs, not for programs that run in
under a second.

JIT is the key to a lot of that. Performance depends greatly on
the compiler, the JVM, the algorithm, etc.

I won a bet once from a friend. We wrote comparable programs in
Java and C++ (some arbitrary math in a loop running a bazillion
times).

With defaults on both compiles, the Java was actually *faster*
than the C++. Even I didn't expect that. But as I said, this
sort of thing is highly dependent on many different factors.


Hal
 

Ashley Moran

Ashley was talking about 1GB+ datasets, iirc. I'd love to see an
in-memory data structure (Ruby or otherwise) that can slug a few of
those around without breathing hard. And on most machines, you're going
through the disk anyway with a dataset that large, as it thrashes your
virtual memory. So why not take advantage of the tunings that are built
into the I/O channel?

If I'm using C, I always handle datasets that big with the kernel vm
functions -- generally faster than the I/O functions. I don't know how to
do that portably in Ruby (yet).


I think the total data size is about 1.5GB, but the individual files
are smaller, the largest being a few hundred MB. The most rows in a
file is ~15,000,000 I think. The server I run it on has 2GB RAM (an
Athlon 3500+ running FreeBSD/amd64, so the hardware is not really an
issue)... it can get all the way through without swapping (just!)

The processing is pretty trivial, and mainly involves incrementing
some ID columns so we can merge datasets together, adding a text
column to the start of every row, and eliminating a few duplicates.
The output file is gzipped (sending the output of CSV::Writer through
GzipWriter). I could probably rewrite it so that most files are
output a line at a time, and call out to the command line gzip. Only
the small files *need* to be stored in RAM for duplicate removal,
others are guaranteed unique. At the time I didn't think using RAM
would give such a huge performance hit (lesson learnt).

I might also look into Kirk's suggestion of FasterCSV. If all this
doesn't improve things, there's always the option of going dual-core
and forking to do independent files.

However... the script can be run at night so even in its current
state it's acceptable. It will only need serious work if we start
adding many more datasets into the routine (we're using two out of a
conceivable 4 or 5, I think). In that case we could justify buying a
faster CPU if it got out of hand, rather than rewrite it in C. But
that's more a reflection of hardware prices than my wages :)

I have yet to write anything in Ruby that was less than twice as fast to
code as it would have been in bourne-sh/Java/whatever, never mind
twice as fun or maintainable. I recently rewrote an 830 line Java/
Hibernate web service client as 67 lines of Ruby, in about an hour.
With that kind of productivity, performance can go to hell!

Ashley
 

ara.t.howard

I think the total data size is about 1.5GB, but the individual files are
smaller, the largest being a few hundred MB. The most rows in a file is
~15,000,000 I think. The server I run it on has 2GB RAM (an Athlon 3500+
running FreeBSD/amd64, so the hardware is not really an issue)... it can get
all the way through without swapping (just!)

The processing is pretty trivial, and mainly involves incrementing some ID
columns so we can merge datasets together, adding a text column to the start
of every row, and eliminating a few duplicates. The output file is gzipped
(sending the output of CSV::Writer through GzipWriter). I could probably
rewrite it so that most files are output a line at a time, and call out to
the command line gzip. Only the small files *need* to be stored in RAM for
duplicate removal, others are guaranteed unique. At the time I didn't think
using RAM would give such a huge performance hit (lesson learnt).

I might also look into Kirk's suggestion of FasterCSV. If all this doesn't
improve things, there's always the option of going dual-core and forking to
do independent files.

However... the script can be run at night so even in its current state it's
acceptable. It will only need serious work if we start adding many more
datasets into the routine (we're using two out of a conceivable 4 or 5, I
think). In that case we could justify buying a faster CPU if it got out of
hand, rather than rewrite it in C. But that's more a reflection of hardware
prices than my wages :)

I have yet to write anything in Ruby that was less than twice as fast to code as
it would have been in bourne-sh/Java/whatever, never mind twice as fun or
maintainable. I recently rewrote an 830 line Java/Hibernate web service
client as 67 lines of Ruby, in about an hour. With that kind of
productivity, performance can go to hell!

i process tons of big csv files and use this approach:

- parse the first line, remember cell count

- foreach line:
  - attempt parsing using a simple split; if that fails, fall back to csv.rb
    methods


something like

n_fields = nil

f.each do |line|
  fields = line.split %r/,/            # fast path: naive comma split
  n_fields ||= fields.size

  if fields.size != n_fields           # a cell probably contained a comma
    fields = parse_with_csv_lib line   # fall back to the real csv parser
  end

  ...
end

this obviously won't work with csv files that have cells spanning lines, but
for simple stuff it can speed up parsing in a huge way.

-a
 

Francis Cianfrocca

Charles said:
I would challenge the Ruby
community at large to expect more from Ruby proper before giving up the
dream of highly-performant Ruby code and plunging into the C.

Much depends on what is wanted from the language. My friends know me for
a person who will gladly walk a very long way to get an incremental
performance improvement in any program. But I don't dream of
highly-performant Ruby code. I dream of highly-scalable applications
that can work with many different kinds of data seamlessly and link
business people and their customers together in newer, faster, more
secure ways than have ever been imagined before. I want to be able to
turn almost any kind of data, wherever it is, into actionable
information and combine it flexibly with any other data. I want to be
able to simply drop any piece of new code into a network and
automatically have it start working with other components in the
(global) network. I want a language system that can gracefully and
powerfully model all of these new kinds of interactions without
requiring top-down analysis of impossibly large problem domains and
rigid program-by-contract regimes. Ruby has unique characteristics,
among all other languages that I know, that qualify it for a first
approach to my particular dream. Among these are the excellent
metaprogramming support, the open classes, the adaptability to tooling,
and (yes) the generally-acceptable performance.

If one's goal is to get a program that will take the least amount of
time to plow through some vector mathematics problem, then by all means
let's have the language-performance discussion. But to me, most of these
compute-intensive tasks are problems that have been being addressed by
smart people ever since Fortran came along. We don't necessarily need
Ruby to solve them.

We do need Ruby to solve a very different set of next-generation
problems, for which C and Java (and even Perl and Python) are very
poorly suited.
 

Ashley Moran

It just looks to me like everyone's chasing VMs. While the nontrivial
problems with Java's VM are in many cases specific to the Java VM (the
Smalltalk VMs have tended to be rather better designed, for instance),
there are still issues inherent in the VM approach as currently
envisioned, and as such it leaves sort of a bad taste in my mouth.

Chad...

Just out of curiosity (since I don't know much about this subject),
what do you think of the approach Microsoft took with the CLR? From
what I read it's very similar to the JVM except it compiles directly
to native code, and makes linking to native libraries easier. I
assume this is closer to JVM behaviour than Perl 5 behaviour. Is
there anything to be learnt from it for Ruby?

Ashley
 
