compiler for ruby

n/a

Hi,

Complete newbie here; is there anything in the works as far as a compiler
for Ruby?

Thanks.
 
Rick DeNatale

n/a said:
I meant under Windows and/or Linux environments.

And source code to machine code.

I had done a search of this group but I guess didn't download enough
headers to see the previous threads.

I see there are various forms of compilers that work at different levels of
code, e.g. XRuby to Java Bytecode, etc.

Yes there are a variety of approaches to compiling Ruby to either Java
bytecodes (e.g. XRuby) or bytecodes more specifically tuned to Ruby
semantics (e.g. YARV which is now in Ruby 1.9). I think most people
these days think that bytecode == Java bytecode, but the idea preceded
Java.
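
For readers who want to see what Ruby-specific bytecode looks like, here is a minimal sketch. It assumes MRI (CRuby) 1.9 or later, where YARV is exposed through `RubyVM::InstructionSequence`; other implementations such as JRuby do not provide this class. The two-line `source` snippet is purely illustrative, and the listing format differs between Ruby versions.

```ruby
# Assumption: running on MRI/CRuby 1.9+, where the RubyVM constant exists.
source = "a = 1 + 2\na * 7"   # hypothetical snippet to compile

# Compile the source text to a YARV instruction sequence (bytecode).
iseq = RubyVM::InstructionSequence.compile(source)

# Human-readable listing of the instructions.
listing = iseq.disasm
puts listing

# Execute the compiled instruction sequence and show its value.
result = iseq.eval
puts result
```

The listing shows stack-oriented instructions rather than machine code, which is the sense of "bytecode" used in this thread.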

As of today YARV seems to be the best performing, at least according
to the benchmarks I've seen.

As for compiling directly to machine code, it could be done I suppose,
but it's not clear that it would be the best approach. Why?

* The dynamic nature of Ruby means that methods can be dynamically
created at run-time and would therefore need to be compiled at
run-time. Additional bookkeeping would be required to make sure that all
the semantic effects on the compiled code are properly implemented.

* Previous experience with compiling dynamic OO languages has shown
that the much smaller code representation of byte codes compared to
machine code can actually lead to better performance on machines with
virtual memory (almost all machines these days) due to the smaller
working set. Digitalk tried direct compilation of Smalltalk to
machine code, because they were sick of getting blasted for being
'interpreted' and found that the byte coded stuff ran significantly
faster. The practice these days is to do two-stage compilation, first
to byte-codes, and then to machine code for selected code when the
run-time detects that that code is frequently executed.
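
To illustrate the first point above (methods created or replaced at run time), here is a small sketch. The class and method names (`Greeter`, `greet`, `shout`) are made up for the example. It only shows the language behaviour a compiler would have to cope with, not how any particular implementation handles it.

```ruby
# Hypothetical class used only for illustration.
class Greeter
  def greet(name)
    "Hello, #{name}"
  end
end

g = Greeter.new
before = g.greet("Ruby")

# Add a brand-new method while the program is running.
Greeter.class_eval do
  define_method(:shout) { |name| greet(name).upcase + "!" }
end

# Reopen the class and replace an existing method. Objects created
# earlier (such as g) pick up the new definition immediately.
class Greeter
  def greet(name)
    "Hi, #{name}"
  end
end

after = g.greet("Ruby")
shouted = g.shout("Ruby")

puts before
puts after
puts shouted
```

Any machine code generated ahead of time for `greet` would be stale after the redefinition, which is the kind of bookkeeping the bullet refers to.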
 
Robert Dober

No problem. I guess your question was very nicely answered by a nice and
competent member; I do not agree with Phil's welcome message.

Welcome to the group
Robert
 
M. Edward (Ed) Borasky

Rick said:
Yes there are a variety of approaches to compiling Ruby to either Java
bytecodes (e.g. XRuby) or bytecodes more specifically tuned to Ruby
semantics (e.g. YARV which is now in Ruby 1.9). I think most people
these days think that bytecode == Java bytecode, but the idea preceded
Java.

I first encountered the idea of a "virtual machine" for reasons of
portability in the early 1960s. However, the idea probably predates that
and goes back to the very early days of computer languages.

Rick said:
As of today YARV seems to be the best performing, at least according
to the benchmarks I've seen.

As for compiling directly to machine code, it could be done I suppose,
but it's not clear that it would be the best approach. Why?

* The dynamic nature of Ruby means that methods can be dynamically
created at run-time and would therefore need to be compiled at
run-time. Additional bookkeeping would be required to make sure that all
the semantic effects on the compiled code are properly implemented.

* Previous experience with compiling dynamic OO languages has shown
that the much smaller code representation of byte codes compared to
machine code can actually lead to better performance on machines with
virtual memory (almost all machines these days) due to the smaller
working set. Digitalk tried direct compilation of Smalltalk to
machine code, because they were sick of getting blasted for being
'interpreted' and found that the byte coded stuff ran significantly
faster. The practice these days is to do two-stage compilation, first
to byte-codes, and then to machine code for selected code when the
run-time detects that that code is frequently executed.

The prototype for a lot of this is (most implementations of) Forth.
There is an "inner interpreter", which was originally indirect threaded
for portability. However, it can be direct threaded, which is faster,
subroutine threaded, which is still faster, or "token" threaded, which
is the most compact. This last corresponds most closely to what we think
of as "byte code".
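
As a rough illustration of the "token" threaded idea, here is a minimal sketch in Ruby. The code is a compact array of small integers, and each token is simply an index into a table of primitive routines. The opcode names, the `run_tokens` method and the sample program are all assumptions made for this example; a real Forth inner interpreter is written in assembler or C and looks quite different.

```ruby
# Hypothetical token numbers; each one indexes the primitives table below.
LIT, ADD, MUL, HALT = 0, 1, 2, 3

def run_tokens(code)
  stack = []
  ip = 0            # instruction pointer into the token array
  running = true

  # Table of primitives: the token value selects the routine to call.
  primitives = [
    lambda { stack.push(code[ip]); ip += 1 },                    # LIT: push inline operand
    lambda { b = stack.pop; a = stack.pop; stack.push(a + b) },  # ADD
    lambda { b = stack.pop; a = stack.pop; stack.push(a * b) },  # MUL
    lambda { running = false }                                   # HALT
  ]

  # The "inner interpreter": fetch a token, advance, dispatch via the table.
  while running
    token = code[ip]
    ip += 1
    primitives[token].call
  end
  stack.last
end

# (2 + 3) * 4 expressed as a compact token stream.
program = [LIT, 2, LIT, 3, ADD, LIT, 4, MUL, HALT]
result = run_tokens(program)
puts result
```

The program occupies nine small integers, which is the compactness the paragraph above is describing.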

Yes, compactness of code is indeed a virtue on "modern machines",
although I suspect it's more an issue of caching than virtual memory. By
the way, in "reality", I don't think Ruby is any more "dynamic" than
languages we normally think of as "static". Almost any decent-sized
program or collection of programs is going to have things that are bound
early and things that aren't bound till run time, regardless of what
languages the implementors used.
 
n/a

Robert,
It's not always easy being brand new to programming AND to Ruby, so
a bit of a friendly attitude such as yours goes a long way, IMHO, and is
very much appreciated.
 
n/a

Rick,
Thanks for the informative reply. Very helpful.
 
Robert Dober

n/a said:
Robert,
It's not always easy being brand new to programming AND to Ruby, so
a bit of a friendly attitude such as yours goes a long way, IMHO, and is
very much appreciated.

Oh, do not mention it. If it had not been me, somebody else would have
said the same; it is the group, really. Glad you feel welcome here.

Cheers
Robert
 
Clifford Heath

Rick said:
* Previous experience with compiling dynamic OO languages has shown
that the much smaller code representation of byte codes compared to
machine code can actually lead to better performance on machines with
virtual memory (almost all machines these days) due to the smaller
working set.

Absolutely correct... but there's even more to this
than meets the eye. The working set that matters most
is the cache, not the RAM. When the RAM cycle time is
more than 50x the instruction cycle time, you
can do a lot of work on something that's already in the
cache while you're waiting for the next cache line to
fill.

Reducing the working set on boxes with GB of RAM typically
has more effect through decreasing cache spills than via
reductions in page faults. The byte-codes also go in
d-cache while the interpreter itself is in I-cache.

Clifford Heath.
 
M. Edward (Ed) Borasky

Clifford said:
Reducing the working set on boxes with GB of RAM typically
has more effect through decreasing cache spills than via
reductions in page faults. The byte-codes also go in
d-cache while the interpreter itself is in I-cache.

You need a very carefully designed inner interpreter for this to be
useful. See
http://dec.bournemouth.ac.uk/forth/euro/ef03/ertl-gregg03.pdf and
http://dec.bournemouth.ac.uk/forth/euro/ef02/ertl02.pdf for some
interesting ways this can be done with the inner interpreter still in C
(although it does exploit some features of GCC that not all C compilers
know about).
 
Clifford Heath

M. Edward (Ed) Borasky said:
You need a very carefully designed inner interpreter for this to be
useful.

Good stuff, Ed, but not really what I meant.
They're modifying direct-threaded code to
aggregate common sequences of functions AIUI,
whereas I wasn't really talking about threaded
code at all, but byte-code. I've used aggressive
inlining to build an interpreter with nearly all
the primitives in one function, leaving normal
C register variables available as registers, and
found that worked quite well (for emulating a
small microprocessor on a 386, rather than for
byte code). The interesting thing is what a good
compiler can do with such a large function if
it's built this way. You can avoid most call
overhead and have a compact switch table if you
have a well-designed byte-code. Even if the byte
code is highly dense, so that each code needs to
be looked at several times to be executed, that
isn't a problem once it's in cache, as the very
next thing you're often going to do is to fetch
more data or byte-code, and you'll have to wait
for that - so using some of those CPU cycles
decoding the byte-code doesn't hurt much.
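
The structure described above (all primitives inside one function, machine state held in local variables, dispatch through a single switch) can be sketched roughly as follows. The post concerns C, where the compiler can keep those locals in registers and turn the switch into a compact jump table. Ruby is used here only to show the shape, so none of the performance points carry over. The opcode values, the `run_bytecode` method and the toy accumulator machine are all invented for this example.

```ruby
# Hypothetical one-byte opcodes for a toy accumulator machine.
OP_LOAD = 0x01  # acc = operand byte
OP_DEC  = 0x03  # acc -= 1
OP_JNZ  = 0x04  # jump to operand address if acc != 0
OP_HALT = 0xFF  # stop and return the machine state

def run_bytecode(bytes)
  # Machine state lives in plain locals (the "register variables").
  pc = 0
  acc = 0
  steps = 0

  while true
    op = bytes[pc]
    pc += 1
    steps += 1

    # Every primitive is handled inline in this one method: no per-opcode calls.
    case op
    when OP_LOAD
      acc = bytes[pc]
      pc += 1
    when OP_DEC
      acc -= 1
    when OP_JNZ
      target = bytes[pc]
      pc += 1
      pc = target unless acc == 0
    when OP_HALT
      return [acc, steps]
    else
      raise ArgumentError, "unknown opcode #{op.inspect} at #{pc - 1}"
    end
  end
end

# Count down from 3: LOAD 3; DEC; JNZ back to the DEC at address 2; HALT.
program = [OP_LOAD, 3, OP_DEC, OP_JNZ, 2, OP_HALT]
acc, steps = run_bytecode(program)
puts "acc=#{acc} steps=#{steps}"
```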

Clifford Heath.
 
