Ruby 1.8 vs 1.9

Yuri Tzara · Nov 25, 2010

Phillip Gawlowski wrote in post #963815:

The IEEE standard, however, does *not* define how mathematics work.
Mathematics does that. In math, x_0/0 is *undefined*. It is not
infinity...

What psychological anomaly causes creationists to keep saying that there
are no transitional fossils even after having been shown transitional
fossils? We might pass it off as mere cult indoctrination or
brainwashing, but the problem is a more general one.

We also see it happening here in Mr. Gawlowski who, after being given
mathematical facts about infinity, simply repeats his uninformed
opinion.

"The Dunning-Kruger effect is a cognitive bias in which an unskilled
person makes poor decisions and reaches erroneous conclusions, but
their incompetence denies them the metacognitive ability to realize
their mistakes." (http://en.wikipedia.org/wiki/Dunning-Kruger_effect)

Here is my initial response to Mr. Gawlowski. Let's see if he ignores
it again (as a creationist ignores transitional fossils).

James Edward Gray II · Nov 25, 2010

James,

This is all really interesting but I don't understand what you mean by
"code points" - is what you have said expressed diagrammatically
somewhere?

Do these explanations help?

http://blog.grayproductions.net/articles/what_is_a_character_encoding

http://blog.grayproductions.net/articles/the_unicode_character_set_and_encodings

James Edward Gray II

Manuel Kiessling · Nov 25, 2010

Dear Yuri,

maybe being a bit more friendly and respecting would help this discussion.

On 25.11.10 16:02, Yuri Tzara wrote:

Phillip Gawlowski · Nov 25, 2010

Phillip Gawlowski wrote in post #963815:

What psychological anomaly causes creationists to keep saying that there
are no transitional fossils even after having been shown transitional
fossils? We might pass it off as mere cult indoctrination or
brainwashing, but the problem is a more general one.

We also see it happening here in Mr. Gawlowski who, after being given
mathematical facts about infinity, simply repeats his uninformed
opinion.

"The Dunning-Kruger effect is a cognitive bias in which an unskilled
person makes poor decisions and reaches erroneous conclusions, but
their incompetence denies them the metacognitive ability to realize
their mistakes." (http://en.wikipedia.org/wiki/Dunning-Kruger_effect)

Your insult aside:
http://www.wolframalpha.com/input/?i=1/x

I'm quite aware that IEEE 754 defines the result of x_0/0 as infinity.
That is not, however, correct *in a mathematical sense*.
IEEE 754, for example, also defines the result of the square root of
-1 as an error. However, the result of, say, the square root of -x
(for x > 0) is sqrt(x) * j. That's called complex numbers, BTW.

Anyway:
Thanks, James, for correcting me on UTF-8, etc.

--
Phillip Gawlowski

Though the folk I have met,
(Ah, how soon!) they forget
When I've moved on to some other place,
There may be one or two,
When I've played and passed through,
Who'll remember my song or my face.

Oliver Schad · Nov 25, 2010

Robert said:
I tried to find a more precise statement about this but did not really
succeed. I thought all UTF-x were just different encoding forms of
the same universe of code points.

Yes, this is correct. Many people don't get the difference between a
charset and the corresponding encoding.

Unicode is a charset with not one encoding but many. So
we are talking about the same characters and different mappings of these
characters to bits and bytes. Such a mapping is a simple table which you
could write down on a sheet of paper, and the side with the characters will
always be the same for UTF-8, UTF-16 and UTF-32.

The encodings UTF-8, UTF-16 and UTF-32 were built for different
purposes. The number after UTF says nothing about the maximum length; in
the first place it says something about the shortest length (and often about
the usual length if you use the encoding in the situation it was
built for).

So if people coming from the ASCII world use UTF-8, many encodings
(mappings of a character to a sequence of bits and bytes) of characters
will fit inside one byte.

UTF-32 is a bit different: in this case 32 bits is the fixed size of
each encoded character.
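
To make the point about lengths concrete, here is a small Ruby sketch (Ruby 1.9 or later, relying on the standard String#encode and String#bytesize). The helper name byte_lengths is made up for illustration, and the BE variants are used only so that no byte order mark is counted:

```ruby
# Hypothetical helper: byte length of one character in each Unicode encoding form.
# The code point (char.ord) is the same every time; only the byte mapping differs.
def byte_lengths(char)
  lengths = {}
  %w[UTF-8 UTF-16BE UTF-32BE].each do |name|
    lengths[name] = char.encode(name).bytesize
  end
  lengths
end

# "A", e-acute, the euro sign, and an emoji outside the Basic Multilingual Plane.
["A", "\u00E9", "\u20AC", "\u{1F600}"].each do |char|
  puts format("U+%04X %p", char.ord, byte_lengths(char))
end
```

UTF-8 grows from 1 to 4 bytes, UTF-16 from 2 to 4, and UTF-32 stays at 4 for every character.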

Oh, so then ISO committee actually has a time machine? Wow! ;-)

Read it as: there is so much encoding space left that none of us knows how
you could fill the whole space. But humans tend to be wrong.

Regards
Oli

Oliver Schad · Nov 25, 2010

Phillip said:
I'm quite aware that IEEE 754 defines the result of x_0/0 as infinity.

The point is that you can't guarantee that you have a 0 with floating-point
arithmetic. You have to read every value as "as close as possible
to the intended value for this machine and this number size".

The machines are not perfect; they can't work in a mathematically correct
way in many situations.

And you have to deal with this situation - everyone doing numeric
work knows that. Doing numeric calculations with a computer means
calculating something as closely as possible in the given environment and
under the given requirements.

So in this sense, dividing a number by zero on a real computer means
dividing something which is close to the number I mean by
something which is close to zero.

And in fact it's not a bad idea to define that, if you have something which is
very close to zero (because you don't know if it's exactly zero), you
should treat it as very close to zero and not as zero itself.

In a perfect world with perfect computers which have infinite registers,
infinite storage and infinitely fast CPUs, computers would know that zero
is equal to zero. But in this world a computer knows only that zero is
close to zero.
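
A short Ruby illustration of this "close to" behaviour, assuming the usual 64-bit IEEE 754 doubles behind Ruby's Float. The helper nearly_equal? is a made-up name, and comparing against Float::EPSILON is just one simple tolerance choice, not a general recipe:

```ruby
# Hypothetical helper: compare two floats with a tolerance instead of ==.
def nearly_equal?(a, b, tolerance = Float::EPSILON)
  (a - b).abs <= tolerance
end

sum = 0.1 + 0.2
puts sum == 0.3               # false: both sides are rounded binary approximations
puts nearly_equal?(sum, 0.3)  # true: they differ by less than the tolerance

# The exact product 1e-400 is too small for a double, so it underflows to 0.0,
# and the "zero" we divide by is really "something very close to zero".
tiny = 1e-200 * 1e-200
puts tiny == 0.0              # true
puts 1.0 / tiny               # Infinity
```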

Regards
Oli

Robert Klemme · Nov 25, 2010

Yes, this is correct. Many people don't get the difference between a
charset and the corresponding encoding.

Btw, this happens all the time: for example, people often do not grasp
the difference between "point in time" and "representation of a point
in time in a particular time zone and locale". This becomes an issue
if you want to calculate with timestamps at or near the DST change
time.

Unicode is a charset with not one encoding but many. So
we are talking about the same characters and different mappings of these
characters to bits and bytes. Such a mapping is a simple table which you
could write down on a sheet of paper, and the side with the characters will
always be the same for UTF-8, UTF-16 and UTF-32.

The encodings UTF-8, UTF-16 and UTF-32 were built for different
purposes. The number after UTF says nothing about the maximum length; in
the first place it says something about the shortest length (and often about
the usual length if you use the encoding in the situation it was
built for).

More precisely, the number indicates the "encoding unit" (see my quote
in an earlier posting). One could think up an encoding with an encoding
unit of 1 octet (8 bits, 1 byte) where the shortest length would be 2
octets. Example:

1st octet: number of octets to follow
2nd and subsequent octets: encoded character

The shortest length would be 2 octets, but the length would increase
by 1 octet so the encoding unit is 1 octet.
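
The thought experiment above can be sketched in Ruby. This is purely an illustration of the made-up length-prefixed scheme, not a real encoding; the names encode_char and decode_string are hypothetical, and the payload is assumed to be the code point in big-endian octets:

```ruby
# Encode one code point as: [number of payload octets][payload octets...].
def encode_char(code_point)
  payload = []
  loop do
    payload.unshift(code_point & 0xFF)  # peel off the lowest octet
    code_point >>= 8
    break if code_point.zero?
  end
  [payload.length, *payload].pack("C*")
end

# Decode a byte string produced by encode_char back into code points.
def decode_string(bytes)
  data = bytes.bytes.to_a
  code_points = []
  until data.empty?
    count = data.shift                   # 1st octet: octets to follow
    code_points << data.shift(count).inject(0) { |acc, b| (acc << 8) | b }
  end
  code_points
end

puts encode_char(0x41).bytesize     # 2 octets, the shortest possible length
puts encode_char(0x20AC).bytesize   # 3 octets: the length grows 1 octet at a time
```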

Cheers

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Jörg W Mittag · Nov 25, 2010

James said:
And even UTF-32 would have the complications of "combining characters."

... and zero-width characters and different representations of the same
character and ...

But that is a whole different can of worms.

jwm

Yuri Tzara · Nov 25, 2010

Phillip Gawlowski wrote in post #963922:

I'm quite aware that IEEE 754 defines the result of x_0/0 as
infinity. That is not, however, correct *in a mathematical sense*.

Yes, it is. The creationist analogy continues, I see. Even when given
facts which refute his position a second and third time, Mr. Gawlowski
continues to ignore them while simply repeating his opinion.

He twice ignored my suggestion to look up one-point
compactifications. He twice ignored my explanation of joining oo to a
line to make a circle, and of joining oo to a plane to make a
sphere. He also ignored the same information found in the link
provided by Adam.

It's not coincidence that the IEEE operations on oo match the rules
for the real projective line.

http://en.wikipedia.org/wiki/Real_projective_line#Arithmetic_operations

The only exception is the result of oo + oo, however in computing it
is convenient for that to be oo rather than undefined. Of course
there's the +/- distinction for oo (also convenient in computing)
which is eliminated by identifying +oo with -oo, as one would expect
for a circle. There are perhaps other technicalities but generally
IEEE operations are modeled after the real projective line. This is
useful.

http://en.wikipedia.org/wiki/One-point_compactification
http://en.wikipedia.org/wiki/Riemann_sphere
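
For readers who want to check the infinity rules mentioned above in Ruby itself, here is a minimal sketch. It assumes Ruby 1.9.2 or later (for Float::INFINITY) on a platform with IEEE 754 doubles:

```ruby
inf = Float::INFINITY

checks = {
  "oo + oo"    => inf + inf,   # Infinity rather than undefined
  "oo - oo"    => inf - inf,   # NaN: left undefined
  "1.0 / oo"   => 1.0 / inf,   # 0.0
  "1.0 / 0.0"  => 1.0 / 0.0,   # Infinity
  "1.0 / -0.0" => 1.0 / -0.0,  # -Infinity: the sign of zero picks the direction
  "-1 * oo"    => -1 * inf     # -Infinity: +oo and -oo are kept distinct
}

checks.each { |label, value| puts "#{label} = #{value}" }
```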

Phillip Gawlowski · Nov 25, 2010

The point is that you can't guarantee that you have a 0 with
floating-point arithmetic.

You can. In mathematics. The problem is, as you pointed out, that a 32
bit (or 64 bit, or n bit where n is finite) CPU isn't able to represent
floating point numbers accurately enough.

However, 0 = 0.0 (no matter how much Yuri moves the goal posts).

You run into this issue once you leave the defined space for IEEE
Floats (~10^-44 for negative floats), *then* you enter very wonky
areas.

But a flat 0.0 is only non-zero for computers. But not in maths. Ergo,
from a mathematical standpoint, the IEEE standard is broken.

You have to read every value as "as close as possible
to the intended value for this machine and this number size".

IOW: It's a limit (and an approximate one at that, but it's "good
enough" for pretty much all purposes).

The machines are not perfect; they can't work in a mathematically correct
way in many situations.

Only in floats, and with integers that are larger than the total
address space. But then we have the problem of the required CPU time
to consider.

And you have to deal with this situation - everyone doing numeric
work knows that. Doing numeric calculations with a computer means
calculating something as closely as possible in the given environment and
under the given requirements.

Indeed. The problem arises if the desired accuracy is much more exact than
what the IEEE float defines.

So in this sense, dividing a number by zero on a real computer means
dividing something which is close to the number I mean by
something which is close to zero.

Not quite. Integer division is behaving properly, Float isn't, even
with only one significant digit.

And in fact it's not a bad idea to define that, if you have something which is
very close to zero (because you don't know if it's exactly zero), you
should treat it as very close to zero and not as zero itself.

Well, infinity isn't close to zero, either.
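
The difference between Integer and Float division that is being argued about here can be reproduced with a few lines of Ruby. The wrapper divide_report is a hypothetical helper that only exists to show both outcomes side by side:

```ruby
# Hypothetical helper: report what Ruby does for a given division.
def divide_report(numerator, denominator)
  (numerator / denominator).to_s
rescue ZeroDivisionError => e
  "ZeroDivisionError: #{e.message}"
end

puts divide_report(1, 0)       # Integer division raises ZeroDivisionError
puts divide_report(1.0, 0.0)   # Float division returns Infinity
puts divide_report(-1.0, 0.0)  # ... or -Infinity
puts divide_report(0.0, 0.0)   # ... or NaN for 0.0 / 0.0
```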

--
Phillip Gawlowski

Phillip Gawlowski · Nov 25, 2010

Phillip Gawlowski wrote in post #963922:

Yes, it is. The creationist analogy continues, I see. Even when given
facts which refute his position a second and third time, Mr. Gawlowski
continues to ignore them while simply repeating his opinion.

He twice ignored my suggestion to look up one-point
compactifications. He twice ignored my explanation of joining oo to a
line to make a circle, and of joining oo to a plane to make a
sphere. He also ignored the same information found in the link
provided by Adam.

It's not coincidence that the IEEE operations on oo match the rules
for the real projective line.

http://en.wikipedia.org/wiki/Real_projective_line#Arithmetic_operations

The only exception is the result of oo + oo, however in computing it
is convenient for that to be oo rather than undefined. Of course
there's the +/- distinction for oo (also convenient in computing)
which is eliminated by identifying +oo with -oo, as one would expect
for a circle. There are perhaps other technicalities but generally
IEEE operations are modeled after the real projective line. This is
useful.

Quote Wikipedia:
"Unlike most mathematical models of the intuitive concept of 'number',
this structure allows division by zero [snip formula], for nonzero a.
This structure, however, is not a field, and *division does not retain
its original algebraic meaning in it*."

Emphasis mine. Your argument is also called "moving the goal posts".
But even if we consider it: non-algebraic systems are not something
99% of all non-professional-mathematicians engage in (so, we can toss
in a "no true Scotsman" fallacy into the bargain).

--
Phillip Gawlowski

Yuri Tzara · Nov 25, 2010

Phillip, regarding defining 1/0 you said,

It cannot be infinity. It does, quite literally not compute. There's
no room for interpretation, it's a fact of (mathematical) life that
something divided by nothing has an undefined result. It doesn't
matter if it's 0, 0.0, or -0.0. Undefined is undefined.

Nonsense. These claims are roundly refuted by

http://en.wikipedia.org/wiki/Extended_real_number_line#Arithmetic_operations

IEEE floating point operations on oo and -oo match those *precisely*.
IEEE models the extended real number line.

I'm quite aware that IEEE 754 defines the result of x_0/0 as
infinity. That is not, however, correct *in a mathematical sense*.

Nonsense. Infinity defined this way has solid mathematical meaning and
is established on a firm foundation, described in the link above.

The IEEE standard, however, does *not* define how mathematics work.
Mathematics does that. In math, x_0/0 is *undefined*. It is not
infinity...

Right, IEEE does not define how mathematics works. IEEE took the
mathematical definition and properties of infinity and incorporated it
into the standard. Clearly, you were unaware of this and repeatedly
ignored the information offered to you about it.

Quote Wikipedia:
"Unlike most mathematical models of the intuitive concept of 'number',
this structure allows division by zero [snip formula], for nonzero a.
This structure, however, is not a field, and *division does not retain
its original algebraic meaning in it*."

Emphasis mine.

That sentence. You evidently do not understand what it means. It does
not mean what you think it means.

Your argument is also called "moving the goal posts". But even if
we consider it: non-algebraic systems are not something 99% of all
non-professional-mathematicians engage in (so, we can toss in a "no
true Scotsman" fallacy into the bargain).

Nonsense. Every person who has obtained a result of +oo or -oo from a
floating point calculation has engaged in it. A result of +oo or -oo
is often a meaningful answer and not an error. And even when it is an
error, it gives us information on what went wrong (and which direction
it went wrong in). It's entertainingly ironic that you attribute
"moving the goalposts" and the no true Scotsman fallacy to the wrong
person in this conversation.

Thanks for another great demonstration of the Dunning-Kruger effect.

Phillip Gawlowski · Nov 25, 2010

Phillip, regarding defining 1/0 you said,

Nonsense. These claims are roundly refuted by

Actually, they aren't. I said "for x_0/0 the result is undefined", and
the extended real number line has the caveat that x_0 must be != 0.

Nonsense. Infinity defined this way has solid mathematical meaning and
is established on a firm foundation, described in the link above.

A firm foundation that is not used in algebraic math.

Right, IEEE does not define how mathematics works. IEEE took the
mathematical definition and properties of infinity and incorporated it
into the standard. Clearly, you were unaware of this and repeatedly
ignored the information offered to you about it.

It took *a* definition and *a* set of properties. If we are splitting
hairs, let's do it properly, at least.

Quote Wikipedia:
"Unlike most mathematical models of the intuitive concept of 'number',
this structure allows division by zero [snip formula], for nonzero a.
This structure, however, is not a field, and *division does not retain
its original algebraic meaning in it*."

Emphasis mine.

That sentence. You evidently do not understand what it means. It does
not mean what you think it means.

You do know what algebra is, yes?

Nonsense. Every person who has obtained a result of +oo or -oo from a
floating point calculation has engaged in it. A result of +oo or -oo
is often a meaningful answer and not an error. And even when it is an
error, it gives us information on what went wrong (and which direction
it went wrong in). It's entertainingly ironic that you attribute
"moving the goalposts" and the no true Scotsman fallacy to the wrong
person in this conversation.

Pal, in algebraic maths, division by zero is undefined. End of story.
We are talking about algebraic math here (or we can extend this to
include complex numbers, which IEEE 754 doesn't deal with, either),
and not special areas of maths that aren't used in outside of research
papers. Not to mention that I established the set of Irrational
numbers as the upper bound quite early on.

And your argument "if you use floats on a computer you use a
non-algebraic system, therefore you use a non-algebraic system when
using a computer" is circular.

The result of x_0/0.0 = infinity is as meaningful as "0/0 = NaN". Any
feedback by a computer system is meaningful (by definition), and can
be used to act on this output:

result = "Error: Division by zero" if a / 0.0 == Float::INFINITY

Done.
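
A slightly fuller sketch of acting on such a result, under some assumptions: the method name checked_divide is invented here, and it tests for any non-finite result with the standard Float#finite?, so it also catches -Infinity and NaN. Note that an overflow (not only a zero divisor) can produce Infinity too, which is why the message talks about a non-finite result:

```ruby
# Hypothetical helper: divide as floats and flag any non-finite outcome.
def checked_divide(a, b)
  result = a.to_f / b
  return "Error: non-finite result (#{result})" unless result.finite?
  result
end

puts checked_divide(6, 3)     # 2.0
puts checked_divide(1, 0.0)   # flagged: Infinity
puts checked_divide(-1, 0.0)  # flagged: -Infinity
puts checked_divide(0, 0.0)   # flagged: NaN
```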

Thanks for another great demonstration of the Dunning-Kruger effect.

Ah, the irony.

--
Phillip Gawlowski

Jos Backus · Nov 25, 2010

This discussion has little to do with Ruby at this point. Maybe you folks
could take it offline, please?

Happy Thanksgiving, everybody!

Jos

Clifford Heath · Nov 25, 2010

James said:
UTF-8, UTF-16, and UTF-32 are encodings of Unicode code points. They are all capable of representing all code points. Nothing in this discussion is a subset of anything else.

To add to this, Unicode 3 uses the codespace from 0 to 0x10FFFF (not 0xFFFFFFFF),
so it does cover all the Oriental characters (unlike Unicode 2 as implemented in
earlier Java versions, which only covers 0..0xFFFF). It even has codepoints for
Klingon and Elvish!

UTF-8 requires four bytes to encode a 21-bit number (enough to encode 0x10FFFF)
though if you extend the pattern (as many implementations do) it has a 31-bit gamut.

UTF-16 encodes the additional codespace using surrogate pairs, each of which is a pair of
16-bit numbers each carrying a 10-bit payload. Because it's still a variable length
encoding, it's just as painful to work with as UTF-8, but less space-efficient.
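
The surrogate-pair arithmetic can be sketched in a few lines of Ruby. The method name surrogate_pair is made up; the constants 0x10000, 0xD800 and 0xDC00 are the ones defined for UTF-16, and the result is cross-checked against Ruby's own transcoder:

```ruby
# Split a code point above U+FFFF into a UTF-16 high/low surrogate pair.
# Each surrogate carries 10 bits of the 20-bit offset from 0x10000.
def surrogate_pair(code_point)
  offset = code_point - 0x10000
  high = 0xD800 + (offset >> 10)    # top 10 bits
  low  = 0xDC00 + (offset & 0x3FF)  # bottom 10 bits
  [high, low]
end

high, low = surrogate_pair(0x1F600)
puts format("U+1F600 -> %04X %04X", high, low)

# Cross-check against what String#encode produces for the same character.
puts "\u{1F600}".encode("UTF-16BE").unpack("n*") == [high, low]
```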

Both UTF-8 and UTF-16 encodings allow you to look at any location in a string and step
forward or back to the nearest character boundary - a very important property that
was missing from Shift-JIS and other earlier encodings.
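
For UTF-8, stepping back to a character boundary comes down to skipping continuation bytes, which always have the bit pattern 10xxxxxx. A minimal Ruby sketch, with char_start as a hypothetical helper name and the input assumed to be valid UTF-8:

```ruby
# Move back from an arbitrary byte offset to the first byte of the
# UTF-8 character that contains it.
def char_start(bytes, index)
  index -= 1 while index > 0 && (bytes[index] & 0xC0) == 0x80
  index
end

bytes = "a\u20ACb".bytes.to_a   # [0x61, 0xE2, 0x82, 0xAC, 0x62]
puts char_start(bytes, 3)       # 1: offset 3 is inside the 3-byte euro sign
puts char_start(bytes, 4)       # 4: "b" already starts a character
```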

If you go back to 2003 in the archives, you'll see I engaged in a long and somewhat
heated discussion about this subject with Matz and others back then. I'm glad we
finally have a Ruby version that can at least do this stuff properly, even though
I think it's over-complicated.

Clifford Heath.

Clifford Heath · Nov 25, 2010

Robert said:
Btw, this happens all the time: for example, people often do not grasp
the difference between "point in time" and "representation of a point
in time in a particular time zone and locale".

.... on a particular relativistic trajectory ;-) Seriously though,
time dilation effects are accounted for in every GPS unit, because
the relative motion of the satellites gives each one its own timeline
which affects the respective position fixes.

Clifford Heath

James Edward Gray II · Nov 26, 2010

Both UTF-8 and UTF-16 encodings allow you to look at any location in a
string and step forward or back to the nearest character boundary - a
very important property that was missing from Shift-JIS and other
earlier encodings.

This also provides a kind of simple checksum for validating the
encoding. I love that feature.
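
In Ruby 1.9 this kind of check is exposed as String#valid_encoding?. A small sketch, assuming Ruby 1.9.3 or later for String#byteslice:

```ruby
whole = "\u20AC"                   # euro sign, 3 bytes in UTF-8: E2 82 AC
truncated = whole.byteslice(0, 2)  # cut the character in half

puts whole.valid_encoding?         # true
puts truncated.valid_encoding?     # false: a lead byte is missing its last continuation byte
```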

James Edward Gray II

David Masover · Nov 26, 2010

Actually, it's not.

Whoops, my mistake. I guess now I'm confused as to why they went with UTF-16
-- I always assumed it simply truncated things which can't be represented in
16 bits.

You can produce corrupt strings and slice into a half-character in
Java just as you can in Ruby 1.8.

Wait, how?

I mean, yes, you can deliberately build strings out of corrupt data, but if
you actually work with complete strings and string concatenation, and you
aren't doing crazy JNI stuff, and you aren't digging into the actual bits of
the string, I don't see how you can create a truncated string.

There's also a lot of legacy data, even within the US. On IBM systems,
the standard encoding, even for greenfield systems that are being
written right now, is still pretty much EBCDIC all the way.

I'm really curious why anyone would go with an IBM mainframe for a greenfield
system, let alone pick EBCDIC when ASCII is fully supported.

And now there's a push for a One Encoding To Rule Them All in Ruby 2.
That's *literally* insane! (One definition of insanity is repeating
behavior and expecting a different outcome.)

Wait, what?

I've been out of the loop for a while, so it's likely that I missed this, but
where are these plans?

Phillip Gawlowski · Nov 26, 2010

I'm really curious why anyone would go with an IBM mainframe for a greenfield
system, let alone pick EBCDIC when ASCII is fully supported.

Because that's how the other applications written on the mainframe the
company bought 20, 30, 40 years ago expect their data, and the same
code *still runs*.

Legacy systems like that have so much money invested in them, with
code poorly understood (not necessarily because it's *bad* code, but
because the original author retired 20 years ago), and are so
mission critical, that a replacement in a more current design is out
of the question.

Want perpetual job security? Learn COBOL.

--
Phillip Gawlowski

Yuri Tzara · Nov 26, 2010

The big picture is that IEEE floating point is solidly grounded in
mathematics regarding infinity. Phillip wants to convince us that this
is not the case. He wants us to believe that the design of floating
point regarding infinity is wrong and that he knows better. He is
mistaken. That is all you need to know. Details follow.

The most direct refutation of his claims comes from the actual reason
why infinity was included in floating point:

http://docs.sun.com/source/806-3568/ncg_goldberg.html#918

Infinity prevents wildly incorrect results. It also removes the need
to check certain special cases.
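
The linked paper illustrates this with the function x/(x^2 + 1). The Ruby sketch below follows that example; the method names f_naive and f_robust are invented here, and 64-bit IEEE doubles are assumed:

```ruby
# Straightforward form: x * x overflows to Infinity for very large x,
# so the result collapses to 0.0 instead of roughly 1/x.
def f_naive(x)
  x / (x * x + 1.0)
end

# Rewritten form: no overflow for large x, and thanks to infinity
# arithmetic it needs no special case at x = 0 (1/0.0 is Infinity, 1/Infinity is 0.0).
def f_robust(x)
  1.0 / (x + 1.0 / x)
end

puts f_naive(1e200)    # 0.0, although the true value is about 1e-200
puts f_robust(1e200)   # about 1e-200
puts f_robust(0.0)     # 0.0, with no explicit check for zero
```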

Now it happens that floating point is backed by a mathematical model:
the extended real line. Phillip tells us that the extended real line
is only useful for the 1% of programmers who are mathematicians. He is
wrong. It is used every time infinity prevents an incorrect result or
simplifies a calculation. The mathematics behind floating point design
is slightly more than elementary, but that does not mean every
programmer is required to have full knowledge of it.

What follows is an examination of Phillip's descent into absurdity,
apparently caused by a compelling need to justify the mantras he learned
in high school. If you are interested in the psychological phenomenon
of cognitive dissonance, or if you still think that Phillip is being
coherent, then keep reading.

This conversation began when Phillip said about 1/0,

It cannot be infinity. It does, quite literally not compute. There's
no room for interpretation, it's a fact of (mathematical) life that
something divided by nothing has an undefined result. It doesn't
matter if it's 0, 0.0, or -0.0. Undefined is undefined.

That other languages have the same issue makes matters worse, not
better (but at least it is consistent, so there's that).

It's clear here that Phillip is unaware that IEEE floating point was
designed to approximate the affinely extended real numbers, which have
a rigorous definition of infinity along with operations upon it.

http://mathworld.wolfram.com/AffinelyExtendedRealNumbers.html

Floating point infinity obeys all the rules laid out there. Also
notice the last paragraph in that link.

The IEEE standard, however, does *not* define how mathematics work.
Mathematics does that. In math, x_0/0 is *undefined*. It is not
infinity (David kindly explained the difference between limits and
numbers), it is not negative infinity, it is undefined. Division by
zero *cannot* happen...

So, from a purely mathematical standpoint, the IEEE 754 standard is
wrong by treating the result of division by 0.0 any different than
dividing by 0...

Here Phillip further confirms that he is unaware that IEEE used the
mathematical definition of the extended reals. He thinks infinity was
defined on the whim of the IEEE designers. No, mathematics told them
how it worked.

This conversation only continues because Phillip is trying desperately
to cover up his ignorance rather than simply acknowledging it and moving
on.

I was polite when I corrected him the first time, however when he
ignored this correction along with a similar one by Adam, obstinately
repeating his mistaken belief instead, that's when directness is
required. For whatever reason he is compelled to "fake" expertise in
this area despite being repeatedly exposed for doing so. To wit:

A firm foundation that is not used in algebraic math.

This sentence is not even meaningful. What is "algebraic math"? That
phrase makes no sense, especially to a mathematician. The extended
reals is of course an algebraic structure with algebraic properties,
so whatever "algebraic math" means here must apply to the extended
reals.

It took *a* definition and *a* set of properties. If we are
splitting hairs, let's do it properly, at least.

The affinely extended reals is the two-point compactification of the
real line. The real projective line is the one-point compactification
of the real line. These compactifications are _unique_. In a desperate
display of backpedaling, Phillip only succeeds in confirming his
ignorance of the topic about which he claims expertise.

Pal, in algebraic maths, division by zero is undefined. End of
story. We are talking about algebraic math here...

More nonsensical "algebraic math" terminology. What is this? Do you
mean algebraic numbers? No, you can't mean that, since floating point
is used for approximating transcendentals as well. Again, the extended
reals is an algebraic structure with algebraic properties. Your
introduction of the term "algebraic math" is just more backpedaling
done in a manifestly incompetent way. In trying to move the goalposts,
the goalposts fell on your head. As if that wasn't bad enough, you
absurdly claim that I was moving goalposts.

For more on what happened to Phillip, see the Dunning-Kruger
effect. Don't let it happen to you.

http://en.wikipedia.org/wiki/Dunning-Kruger_effect
