Ruby 1.8 vs 1.9

Peter Pincus

Hi,

How much longer will Ruby 1.8(.7) be maintained? Is it advisable to
dive into 1.9(.2)? What are the immediate advantages of using 1.9
over 1.8?

Thanks,
..
Pete Pincus
 
Chuck Remes

Hi,

how much longer will Ruby 1.8(.7) be maintained ? Is it advisable to
dive into 1.9(.2) ? What are the immediate advantages of using 1.9
over 1.8 ?

I believe the guys at EngineYard are in charge of backporting fixes to the 1.8.7 branch. I also heard there was a 1.8.8 coming at some point to be the final release in the 1.8 series.

I use 1.9.2p0 daily and find it to be extremely stable and fast. I would say the biggest reason to use it is to get a performance boost. Most of your code from 1.8 will "just work." My code sees a 2-5x speedup on 1.9.2 versus 1.8.7.

Why not try it on your code and see for yourself? With tools like rvm (for unix) and pik (for windows) it's a breeze to have multiple rubies installed simultaneously.
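
A minimal sketch of how such a comparison could be timed, assuming you run the same script once under each installed Ruby. The `sample_workload` method is a hypothetical stand-in, not anything from this thread; swap in a call to your own code.

```ruby
require "benchmark"

# Hypothetical stand-in workload; replace the body with a call into your own code.
def sample_workload
  words = Array.new(50_000) { |i| "word#{i % 500}" }
  counts = Hash.new(0)
  words.each { |word| counts[word.upcase] += 1 }
  counts.size
end

# Time a few repetitions and report which interpreter produced the number.
elapsed = Benchmark.realtime { 5.times { sample_workload } }
puts format("ruby %s: %.3f s", RUBY_VERSION, elapsed)
```

Run it under each interpreter and compare the printed times. A toy loop like this says little about real programs, so treat the numbers as a rough indication only.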

cr
 
Ryan Davis

I use 1.9.2p0 daily and find it to be extremely stable and fast. I
would say the biggest reason to use it is to get a performance boost.
Most of your code from 1.8 will "just work." My code sees a 2-5x
speedup on 1.9.2 versus 1.8.7.

That's really variable and depends on what you're doing.

All of my text processing code needed reworking, and text processing is
(was?) noticeably slower in ruby 1.9 than it is in 1.8.
 
Philip Rhoades

That's really variable and depends on what you're doing.

All of my text processing code needed reworking, and text processing
is (was?) noticeably slower in ruby 1.9 than it is in 1.8.


Who do I talk to to get 1.9 RPMs produced for Fedora?

Thanks,

Phil.
--
Philip Rhoades

GPO Box 3411
Sydney NSW 2001
Australia
E-mail: (e-mail address removed)
 
Phillip Gawlowski

Who do I talk to to get 1.9 RPMs produced for Fedora?

Just a guess: The Ruby (or Programming/Script language) maintainers of
the Fedora project.

--
Phillip Gawlowski

Though the folk I have met,
(Ah, how soon!) they forget
When I've moved on to some other place,
There may be one or two,
When I've played and passed through,
Who'll remember my song or my face.
 
Chuck Remes

That's really variable and depends on what you're doing.

All of my text processing code needed reworking, and text processing is (was?) noticeably slower in ruby 1.9 than it is in 1.8.

Definitely true. That's why I was careful to say "My code sees a 2-5x speedup..." because I have seen a few instances where 1.9 is a tad pokier. But clearly 1.9 is the future so sticking with 1.8 seems like a bad long-term bet.

cr
 
Brian Candler

Chuck Remes wrote in post #963430:
I use 1.9.2p0 daily and find it to be extremely stable and fast. I would
say the biggest reason to use it is to get a performance boost. Most of
your code from 1.8 will "just work." My code sees a 2-5x speedup on
1.9.2 versus 1.8.7.

And just to give some balance: the biggest reason not to use 1.9 is
because of the incredible complexity which has been added to the String
class, and the ability it gives you to make programs which crash under
unexpected circumstances.

For example, an expression like

s1 = s2 + s3

where s2 and s3 are both Strings will always work and do the obvious
thing in 1.8, but in 1.9 it may raise an exception. Whether it does
depends not only on the encodings of s2 and s3 at that point, but also
their contents (properties "empty?" and "ascii_only?")
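
A minimal sketch of that data-dependence, assuming a Ruby new enough to have `String.new(..., encoding:)`. The helper name `concat_outcome` and the sample strings are made up for illustration.

```ruby
# Returns the encoding name of left + right, or the error class name if it raises.
def concat_outcome(left, right)
  (left + right).encoding.name
rescue Encoding::CompatibilityError => e
  e.class.name
end

utf8         = "caf\u00e9"                            # UTF-8, has a non-ASCII character
latin1       = "caf\u00e9".encode("ISO-8859-1")       # ISO-8859-1, has a non-ASCII byte
latin1_ascii = "cafe".encode("ISO-8859-1")            # ISO-8859-1, ASCII-only contents
latin1_empty = String.new("", encoding: "ISO-8859-1") # ISO-8859-1, empty

puts concat_outcome(utf8, latin1_ascii) # UTF-8
puts concat_outcome(utf8, latin1_empty) # UTF-8
puts concat_outcome(utf8, latin1)       # Encoding::CompatibilityError
```

The same two encodings are involved in all three calls; only the contents of the right-hand string decide whether the concatenation raises.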

The encodings of strings you read may also be affected by the locale set
from the environment, unless you explicitly code against that. This
means the same program with the same data may work on your machine, but
crash on someone else's.
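
A rough sketch of what "explicitly coding against that" can look like, assuming the file is known to hold UTF-8 and that `Encoding.default_internal` is left unset. The method name `read_encodings` is hypothetical.

```ruby
require "tempfile"

# Writes UTF-8 bytes to a temp file, then reads them back twice:
# once relying on the locale-derived default, once naming the encoding.
def read_encodings
  Tempfile.create("encoding-demo") do |file|
    file.binmode
    file.write("caf\xC3\xA9\n".b)
    file.flush

    implicit = File.read(file.path)                    # tagged from the environment
    explicit = File.read(file.path, encoding: "UTF-8") # tagged as requested

    [implicit.encoding, explicit.encoding]
  end
end

implicit_encoding, explicit_encoding = read_encodings
puts "implicit: #{implicit_encoding} (default_external is #{Encoding.default_external})"
puts "explicit: #{explicit_encoding}"
```

Running this with different `LC_ALL` settings changes the first line of output but not the second.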

https://github.com/candlerb/string19/blob/master/string19.rb
https://github.com/candlerb/string19/blob/master/soapbox.rb
 
Oliver Schad

Brian said:
And just to give some balance: the biggest reason not to use 1.9 is
because of the incredible complexity which has been added to the
String class, and the ability it gives you to make programs which
crash under unexpected circumstances.

Sounds great. ;-) Can somebody else confirm this?

Regards
Oli
 
Michael Fellinger

Chuck Remes wrote in post #963430:

And just to give some balance: the biggest reason not to use 1.9 is
because of the incredible complexity which has been added to the String
class, and the ability it gives you to make programs which crash under
unexpected circumstances.

For example, an expression like

  s1 = s2 + s3

where s2 and s3 are both Strings will always work and do the obvious
thing in 1.8, but in 1.9 it may raise an exception. Whether it does
depends not only on the encodings of s2 and s3 at that point, but also
their contents (properties "empty?" and "ascii_only?")

The encodings of strings you read may also be affected by the locale set
from the environment, unless you explicitly code against that. This
means the same program with the same data may work on your machine, but
crash on someone else's.

And that's why I use and love 1.9.
The obvious thing isn't so obvious if you actually care about
encodings, and if you are mindful about what comes from where, it's
actually helpful to find otherwise hidden issues.
I hear nobody complain that 1 / 0 raises but 1.0 / 0.0 gives Infinity,
which I find pretty counter-intuitive, and makes me check for .nan?
and .infinite? (which also fails if I call it on Fixnum instead of
Float).
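
For reference, a small sketch of the integer versus float behaviour described here. The helper name `describe_division` is made up for illustration.

```ruby
# Returns the inspected quotient, or a description of the error if division raises.
def describe_division(numerator, denominator)
  (numerator / denominator).inspect
rescue ZeroDivisionError => e
  "#{e.class}: #{e.message}"
end

puts describe_division(1, 0)     # ZeroDivisionError: divided by 0
puts describe_division(1.0, 0.0) # Infinity
puts describe_division(0.0, 0.0) # NaN
puts describe_division(1, 0.0)   # Infinity
```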

Many valid complaints there, but nothing that would make me long for
the everything-is-a-string-of-bytes approach of 1.8, which made
working with encodings very brittle.
I can see how this is just annoying to someone who has only dealt with
BINARY/ASCII/UTF-8 all their lives, but please consider that most of
the world actually still uses other encodings as well.
I also want to thank you for writing string19.rb, which is a very
helpful resource for me and others, along with the series from JEG II.

--
Michael Fellinger
CTO, The Rubyists, LLC
 
Michael Fellinger

Sounds great. ;-) Can somebody else confirm this?

iota ~ % echo ʘ | LC_ALL=ja_JP.UTF8 ruby -pe '$_[1,0] = "ʘ"'
ʘʘ
iota ~ % echo ʘ | LC_ALL=C ruby -pe '$_[1,0] = "ʘ"'
-e:1: invalid multibyte char (US-ASCII)
-e:1: invalid multibyte char (US-ASCII)

--
Michael Fellinger
CTO, The Rubyists, LLC
 
Oliver Schad

Michael said:
Sounds great. ;-) Can somebody else confirm this?

iota ~ % echo ʘ | LC_ALL=ja_JP.UTF8 ruby -pe '$_[1,0] = "ʘ"'
ʘʘ
iota ~ % echo ʘ | LC_ALL=C ruby -pe '$_[1,0] = "ʘ"'
-e:1: invalid multibyte char (US-ASCII)
-e:1: invalid multibyte char (US-ASCII)

So working with strings in ruby v1.9 is not supported, right?

Regards
Oli
 
Brian Candler

Michael Fellinger wrote in post #963539:
And that's why I use and love 1.9.
The obvious thing isn't so obvious if you actually care about
encodings, and if you are mindful about what comes from where, it's
actually helpful to find otherwise hidden issues.

Y'know, I wouldn't mind so much if it *always* raised an exception.

For example, say I have s1 tagged UTF-8 and s2 tagged ISO-8859-1. If
"s1+s2" always raised an exception, it would be easy to find, and easy
to fix.

However the 'compatibility' rules mean that this is data-sensitive. In
many cases s1+s2 will work, if either s1 contains non-ASCII characters
but s2 doesn't, or vice-versa. It's really hard to get test coverage of
all the possible cases - rcov won't help you - or you just cross your
fingers and hope.

You also need test coverage for cases where the input data is invalid
for the given encoding. In fact s1+s2 won't raise an exception in that
case, nor will s1, but s1 =~ /./ will.

I hear nobody complain that 1 / 0 raises but 1.0 / 0.0 gives Infinity,

Well, IEEE floating point is a well-established standard that has been
around for donkeys years, so I think it's reasonable to follow it.

And yes, if I see code like "c = a / b", I do think to myself "what if b
is zero?" It's easy to decide if it's expected, and whether I need to do
something other than the default behaviour. Then I move onto the next
line.

For "s3 = s1 + s2" in 1.9 I need to think to myself: "what if s1 has a
different encoding to s2, and s1 is not empty or s2 is not empty and
s1's encoding is not ASCII-compatible or s2's encoding is not
ASCII-compatible or s1 contains non-ASCII characters or s2 contains
non-ASCII characters? And what does that give as the encoding for s3 in
all those possible cases?" And then I have to carry the possible
encodings for s3 forward to the next point where it is used.
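
The earlier point in this post about invalid input can be checked with a sketch like the following. It assumes a Ruby with `String#b` (2.0 or later); the helper name `outcome` and the sample bytes are made up for illustration.

```ruby
# Returns the inspected block result, or the error class name if the block raises.
def outcome
  yield.inspect
rescue ArgumentError, Encoding::CompatibilityError => e
  e.class.name
end

# Bytes that are not valid UTF-8, tagged as UTF-8 anyway.
bad = "abc\xFF".b.force_encoding(Encoding::UTF_8)

puts bad.valid_encoding?                # false
puts outcome { (bad + "def").bytesize } # 7
puts outcome { bad =~ /./ }             # ArgumentError
```

Concatenation passes the broken bytes through silently; the failure only shows up later, at the first operation that has to interpret them as characters.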
 
Phillip Gawlowski

For example, say I have s1 tagged UTF-8 and s2 tagged ISO-8859-1. If
"s1+s2" always raised an exception, it would be easy to find, and easy
to fix.

However the 'compatibility' rules mean that this is data-sensitive. In
many cases s1+s2 will work, if either s1 contains non-ASCII characters
but s2 doesn't, or vice-versa. It's really hard to get test coverage of
all the possible cases - rcov won't help you - or you just cross your
fingers and hope.

Convert your strings to UTF-8 at all times, and you are done. You have
to check for data integrity anyway, so you can do that in one go.

Well, IEEE floating point is a well-established standard that has been
around for donkeys years, so I think it's reasonable to follow it.

Every natural number is an element of the set of rational numbers. For
all intents and purposes, 0 == 0.0 in mathematics (unless you limit
the set of numbers you are working on to natural numbers only, and
let's just ignore irrational numbers for now). And since the 0 is
around for a bit longer than the IEEE, and the rules of math are
taught in elementary school (including "you must not and cannot divide
by zero"), Ruby exhibits inconsistent behavior for pretty much anyone
who has a little education in maths. The IEEE standards deal with
representing floating point numbers in an inherently integer-based
numerical system, but they don't supersede the rules of maths.

Ruby's behavior of returning *infinity* is the proverbial icing on the
cake, since dividing something large by something infinitely small
results in something large (so, x / 0.000000...[ad infinitum]...1 = x
; a trick used in integrals, too).

Thus, you have to exercise due diligence in this area if you want to
keep your results in the sphere of what's possible and sane.

And yes, if I see code like "c = a / b", I do think to myself "what if b
is zero?" It's easy to decide if it's expected, and whether I need to do
something other than the default behaviour. Then I move onto the next
line.

It's easy? Take a look at integrals, and infinitesimal[0] numbers.
Infinitesimals are at the same time zero and *not* zero.

For "s3 = s1 + s2" in 1.9 I need to think to myself: "what if s1 has a
different encoding to s2, and s1 is not empty or s2 is not empty and
s1's encoding is not ASCII-compatible or s2's encoding is not
ASCII-compatible or s1 contains non-ASCII characters or s2 contains
non-ASCII characters? And what does that give as the encoding for s3 in
all those possible cases?" And then I have to carry the possible
encodings for s3 forward to the next point where it is used.

Then, as I suggested above, enforce a standard encoding in your code.
Convert everything into UTF-8, and you are pretty much done.
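
One possible shape for such a boundary conversion, as a sketch only. It assumes the source encoding of each input is known, and the helper name `to_utf8` is made up for illustration.

```ruby
# Tags raw bytes with their known source encoding, checks them, and returns UTF-8.
def to_utf8(bytes, source_encoding)
  text = bytes.dup.force_encoding(source_encoding)
  raise ArgumentError, "invalid #{source_encoding} data" unless text.valid_encoding?
  text.encode(Encoding::UTF_8)
end

latin1_bytes = "caf\xE9".b # "café" as ISO-8859-1 bytes
converted = to_utf8(latin1_bytes, Encoding::ISO_8859_1)

puts converted.encoding       # UTF-8
puts converted == "caf\u00e9" # true
```

The integrity check and the conversion happen in the same place, so everything past this point can assume valid UTF-8.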

[0] http://en.wikipedia.org/wiki/Infinitesimal
--
Phillip Gawlowski

Though the folk I have met,
(Ah, how soon!) they forget
When I've moved on to some other place,
There may be one or two,
When I've played and passed through,
Who'll remember my song or my face.
 
James Edward Gray II

Convert your strings to UTF-8 at all times, and you are done. You have
to check for data integrity anyway, so you can do that in one go.

Thank you for being the voice of reason.

I've fought against Brian enough in the past over this issue, that I try
to stay out of it these days. However, his arguments always strike me
as wanting to unlearn what we have learned about encodings.

We can't go back. Different encodings exist. At least Ruby 1.9 allows
us to work with them.

James Edward Gray II
 
Brian Candler

Phillip Gawlowski wrote in post #963602:
Convert your strings to UTF-8 at all times, and you are done.

But that basically is my point. In order to make your program
comprehensible, you have to add extra incantations so that strings are
tagged as UTF-8 everywhere (e.g. when opening files).

However this in turn adds *nothing* to your program or its logic, apart
from preventing Ruby from raising exceptions.

Every natural number is an element of the set of rational numbers. For
all intents and purposes, 0 == 0.0 in mathematics (unless you limit
the set of numbers you are working on to natural numbers only, and
let's just ignore irrational numbers for now). And since the 0 is
around for a bit longer than the IEEE, and the rules of math are
taught in elementary school (including "you must not and cannot divide
by zero"), Ruby exhibits inconsistent behavior for pretty much anyone
who has a little education in maths.

Maths and computation are not the same thing. Is there anything in the
above which applies only to Ruby and not to floating point computation
in any other mainstream programming language?

Yes, there are gotchas in floating point computation, as explained at
http://docs.sun.com/source/806-3568/ncg_goldberg.html
These are (or should be) well understood by programmers who feel they
need to use floating point numbers.

If you don't like IEEE floating point, Ruby also offers BigDecimal and
Rational.
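
A short sketch of the difference, using only `Rational` from core Ruby. The variable names are made up for illustration.

```ruby
# Binary floating point cannot represent 0.1, 0.2 or 0.3 exactly.
float_sum_exact = (0.1 + 0.2 == 0.3)

# Rational keeps exact fractions, so the same sum compares equal.
rational_sum_exact = (Rational(1, 10) + Rational(2, 10) == Rational(3, 10))

# Rational division by zero raises instead of producing Infinity.
rational_zero_division =
  begin
    (Rational(1, 1) / 0).inspect
  rescue ZeroDivisionError => e
    e.class.name
  end

puts float_sum_exact        # false
puts rational_sum_exact     # true
puts rational_zero_division # ZeroDivisionError
```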

If Ruby were to implement floating point following some different set of
rules other than IEEE, that would be (IMO) horrendous. The point of a
standard is that you only have to learn the gotchas once.
 
Chuck Remes

[snipped lots of arguments about string encodings that may or may not be relevant to the OP]

So... I am wondering if the original poster (Peter Pincus) has tried his code under 1.9 yet.

Peter?

cr
 
David Masover

If you don't like IEEE floating point, Ruby also offers BigDecimal and
Rational.

And if you don't like Ruby's strings, there's nothing stopping you from
rolling your own. There's certainly nothing stopping you from using binary
mode (whether it claims to be ASCII or not) for all strings.
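
A sketch of the binary-everywhere approach, assuming a Ruby with `String#b` (2.0 or later; on 1.9 the equivalent would be `force_encoding("BINARY")` on a copy). The variable names are made up for illustration.

```ruby
utf8   = "caf\u00e9"                      # 5 bytes in UTF-8
latin1 = "caf\u00e9".encode("ISO-8859-1") # 4 bytes in ISO-8859-1

# Both operands are tagged binary, so concatenation never raises,
# but the result is a bag of bytes with mixed encodings inside.
joined = utf8.b + latin1.b

puts joined.encoding == Encoding::BINARY # true
puts joined.bytesize                     # 9
puts joined.length                       # 9
```

This recovers the 1.8 behaviour, including its drawback: `length` counts bytes, not characters.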
 
Phillip Gawlowski

Phillip Gawlowski wrote in post #963602:

But that basically is my point. In order to make your program
comprehensible, you have to add extra incantations so that strings are
tagged as UTF-8 everywhere (e.g. when opening files).

However this in turn adds *nothing* to your program or its logic, apart
from preventing Ruby from raising exceptions.

s/apart from preventing Ruby from raising exceptions/but ensures
correctness of data across different systems/;

Maths and computation are not the same thing. Is there anything in the
above which applies only to Ruby and not to floating point computation
in any other mainstream programming language?

You conveniently left out that Ruby thinks dividing by 0.0 results in infinity.
That's not just wrong, but absurd to the extreme. So, we have to
safeguard against this. Just like having to safeguard against, say,
proper string encoding. If *anyone* is to blame, it's the ANSI and the
IT industry for having a) an extremely US-centric view of the world,
and b) being too damn shortsighted to create an international, capable
standard 30 years ago.

Further, you can't do any computations without proper maths. In Ruby,
you can't do computations since it cannot divide by zero properly, or
at least *consistently*.

Yes, there are gotchas in floating point computation, as explained at
http://docs.sun.com/source/806-3568/ncg_goldberg.html
These are (or should be) well understood by programmers who feel they
need to use floating point numbers.

If you don't like IEEE floating point, Ruby also offers BigDecimal and
Rational.

Works really well with irrational numbers, that are neither large
decimals, nor can they be expressed as a fraction x/x_0.

In a nutshell, Ruby cannot deal with floating points at all, and the
IEEE standard is a means to *represent* floating point numbers in
bits. It does *not* supersede natural laws, much less rules that are
in effect for hundreds of years.

And once the accuracy that the IEEE float represents isn't good enough
anymore (which happens once you have to simulate a particle system),
you move away from scalar CPUs, and move to vector CPUs / APUs (like
the MMX and SSE instruction sets for desktops, or a GPGPU via CUDA).

If Ruby were to implement floating point following some different set of
rules other than IEEE, that would be (IMO) horrendous. The point of a
standard is that you only have to learn the gotchas once.

Um, no. A standard is a means to avoid misunderstandings, and have a
well-defined system dealing with what the standard defines. You know,
like exchange text data in a standard that can cover as many of the
world's glyphs as possible.

And there is always room for improvement, otherwise I wonder why
engineers need Maple and mathematicians Mathematica.

--
Phillip Gawlowski

Though the folk I have met,
(Ah, how soon!) they forget
When I've moved on to some other place,
There may be one or two,
When I've played and passed through,
Who'll remember my song or my face.
 
