dirty ranges

Dirk van Deun

I'm a new Ruby user (currently at page 68 of Programming Ruby !) and
having found something weird, I wonder if either a) "you have already
found a bug, report it" or b) "yeah, yeah, we all know that this is
a bit weird, but it is not a problem in practice".

It seems that you can do destructive operations on the minimum element
of a range, but not on the maximum element (well, you can, but it
does not have any effect):

irb(main):001:0> rng="a".."z"
=> "a".."z"
irb(main):002:0> rng.min[0]="b"
=> "b"
irb(main):003:0> rng.max[0]="y"
=> "y"
irb(main):004:0> rng
=> "b".."z"

This just doesn't seem right.

Actually this was the second thing I found that doesn't seem right.
The first was that the first element is shared when you convert
a range into an array (again, the last one is different):

irb(main):005:0> arr=rng.to_a
=> ["b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z"]
irb(main):006:0> rng.min[0]="c"
=> "c"
irb(main):007:0> rng.max[0]="x"
=> "x"
irb(main):008:0> arr
=> ["c", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z"]

This is really dirty, but at least this was to be expected from the
specifications ("All you need to be able to make ranges is 'succ' and
'<=>'" -- there is no talk of a deep copy as a requirement.)

Comments ?

Dirk van Deun
 
David Vallner

On Monday 20 February 2006 14:53, Dirk van Deun wrote:
I'm a new Ruby user (currently at page 68 of Programming Ruby !) and
having found something weird, I wonder if either a) "you have already
found a bug, report it" or b) "yeah, yeah, we all know that this is
a bit weird, but it is not a problem in practice".

It seems that you can do destructive operations on the minimum element
of a range, but not on the maximum element (well, you can, but it
does not have any effect):

irb(main):001:0> rng="a".."z"
=> "a".."z"
irb(main):002:0> rng.min[0]="b"
=> "b"
irb(main):003:0> rng.max[0]="y"
=> "y"
irb(main):004:0> rng
=> "b".."z"

This just doesn't seem right.

Actually this was the second thing I found that doesn't seem right.
The first was that the first element is shared when you convert
a range into an array (again, the last one is different):

irb(main):005:0> arr=rng.to_a
=> ["b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z"]
irb(main):006:0> rng.min[0]="c"
=> "c"
irb(main):007:0> rng.max[0]="x"
=> "x"
irb(main):008:0> arr
=> ["c", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z"]

This is really dirty, but at least this was to be expected from the
specifications ("All you need to be able to make ranges is 'succ' and
'<=>'" -- there is no talk of a deep copy as a requirement.)

The Range code seems to enumerate, and to determine the maximum, by
generating all successors less than or equal to, or less than, the
Range endpoint. For these examples to "work", it would also have to
check whether the last generated element is equal to the endpoint, and
return the endpoint object itself if it was.

That said, I think this situation is similar to the one with String
keys in a hash. Where immutable objects would be enforced otherwise,
Ruby gives you the responsibility for if you choose to change them.
Maybe this behaviour should instead be documented as with the Hash
class, if it isn't already; I can't imagine where this could only be
worked around with a noticeable kludge.
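
For comparison, the Hash behaviour referred to above can be checked with a short script. This is a minimal sketch that is not taken from the thread; the variable names are illustrative. It relies on Hash#[]= storing a frozen copy of an unfrozen String key, so a later change to the original string does not disturb the lookup. Other mutable key types get no such copy.

```ruby
key = "name".dup          # an unfrozen String used as a hash key
h = {}
h[key] = 1                # Hash#[]= stores a frozen copy of a String key

key << "!"                # mutate the original string afterwards

stored = h.keys.first
puts stored               # the stored key is unaffected by the mutation
puts stored.frozen?       # the copy held by the hash is frozen
puts stored.equal?(key)   # false: the hash holds its own object
puts h["name"].inspect    # lookup by the old value still succeeds
```

Range does nothing comparable for its endpoints, which is what the rest of this thread turns on.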

David Vallner
 
George Ogata

It seems that you can do destructive operations on the minimum element
of a range, but not on the maximum element (well, you can, but it
does not have any effect):

irb(main):001:0> rng="a".."z"
=> "a".."z"
irb(main):002:0> rng.min[0]="b"
=> "b"
irb(main):003:0> rng.max[0]="y"
=> "y"
irb(main):004:0> rng
=> "b".."z"

You can modify the endpoints if you reference them using #begin and
#end:

irb(main):001:0> r = 'a'..'z'
=> "a".."z"
irb(main):002:0> r.begin << '!'
=> "a!"
irb(main):003:0> r.end << '!'
=> "z!"
irb(main):004:0> r
=> "a!".."z!"

#begin and #end are direct accessors to the endpoint objects, whereas
#max is calculated, taking into account end-closedness. It should not
be surprising, then, that #max and #end return different objects.
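
A small probe makes the distinction visible on whatever interpreter is at hand. This is a sketch with illustrative variable names, not code from the thread. The #begin and #end checks should hold everywhere; whether #min, #max and #to_a return the endpoint objects or fresh ones may depend on the Ruby version, so the script prints those results instead of assuming them.

```ruby
first = "a".dup           # explicit, unfrozen endpoint objects
last  = "z".dup
r = first..last

# #begin and #end return the stored endpoint objects themselves.
puts "begin is the endpoint:      #{r.begin.equal?(first)}"
puts "end is the endpoint:        #{r.end.equal?(last)}"

# These three may differ between interpreters, so report them
# rather than rely on them.
min_shared   = r.min.equal?(first)
max_shared   = r.max.equal?(last)
first_shared = r.to_a.first.equal?(first)

puts "min is the endpoint:        #{min_shared}"
puts "max is the endpoint:        #{max_shared}"
puts "to_a.first is the endpoint: #{first_shared}"
```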

But I suggest not modifying range endpoints at all; Ranges themselves
are immutable, so changing their value indirectly like that is kinda
going against the grain. And of course, if you ever modify your #end
while iterating over the range, it's nasal demons.
 
David Vallner

On Tuesday 21 February 2006 04:38, George Ogata wrote:
But I suggest not modifying range endpoints at all; Ranges themselves
are immutable, so changing their value indirectly like that is kinda
going against the grain. And of course, if you ever modify your #end
while iterating over the range, it's nasal demons.

Pffft. Doesn't even flinch.

ruby <<EOF
rng1 = ("a".."g")
rng1.each { |char|
  if char == "d"
    rng1.end[0] = "j"   # try to lengthen the range mid-iteration
  end
  puts char
}
rng2 = ("a".."j")
rng2.each { |char|
  if char == "g"
    rng2.end[0] = "d"   # try to shorten the range mid-iteration
  end
  puts char
}
EOF

Outputs:

a
b
c
d
e
f
g
a
b
c
d
e
f
g
h
i
j


Of course, I have absolutely NO idea at all why, and don't particularly feel
like reading Ruby core source.

David Vallner
 
Robert Klemme

David Vallner said:
That said, I think this situation is similar to the one with String
keys in a hash. Where immutable objects would be enforced otherwise,
Ruby gives you the responsibility for if you choose to change them.
Maybe this behaviour should instead be documented as with the Hash
class, if it isn't already; I can't imagine where this could only be
worked around with a noticeable kludge.

Having said that I can't see where modifying range members like this is
actually needed. IMHO it's a bad idea to do so.

Kind regards

robert
 
Dirk van Deun

For the people who remarked (quite sensibly, of course) that you just
shouldn't do that, tinker with the endpoints of a range: it works
the other way too, of course:

irb(main):001:0> rng="a".."z"
=> "a".."z"
irb(main):002:0> arr=rng.to_a
=> ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m",
"n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z"]
irb(main):003:0> arr[0][0]="z"
=> "z"
irb(main):004:0> rng
=> "z".."z"

Something like this may be more likely to be a real problem.

The begin/end versus min/max I obviously didn't know, but then again,
if max is an ad-hoc constructed object, shouldn't min be too, if
only for symmetry ?

Hacker-Dirk likes Ruby, but Computer-Scientist-Dirk tends to be wary
of systems with irregularities like these...

Dirk van Deun
 
Tony Mobily

Hi,

I followed the range discussion. Before now, I had honestly thought:
c'mon, it's not _that_ much of a problem!
Then I saw:
irb(main):001:0> rng="a".."z"
=> "a".."z"
irb(main):002:0> arr=rng.to_a
=> ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m",
"n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z"]
irb(main):003:0> arr[0][0]="z"
=> "z"
irb(main):004:0> rng
=> "z".."z"

OUCH!!!
I agree 1000000% with this statement:
Something like this may be more likely to be a real problem.

Oh yes.

Merc.
 
George Ogata

David Vallner said:
On Tuesday 21 February 2006 04:38, George Ogata wrote:
But I suggest not modifying range endpoints at all; Ranges themselves
are immutable, so changing their value indirectly like that is kinda
going against the grain. And of course, if you ever modify your #end
while iterating over the range, it's nasal demons.

Pffft. Doesn't even flinch.

ruby <<EOF
rng1 = ("a".."g")
rng1.each { |char|
  if char == "d"
    rng1.end[0] = "j"   # try to lengthen the range mid-iteration
  end
  puts char
}
rng2 = ("a".."j")
rng2.each { |char|
  if char == "g"
    rng2.end[0] = "d"   # try to shorten the range mid-iteration
  end
  puts char
}
EOF

Outputs:

a
b
c
d
e
f
g
a
b
c
d
e
f
g
h
i
j


Of course, I have absolutely NO idea at all why, and don't particularly feel
like reading Ruby core source.

David Vallner

You're right; modifying #end is okay. Not documented, though. Also
note that modifying #begin can break a loop. Whether or not this is
counterintuitive depends on the person, I guess.

g@crash:~$ irb
irb(main):001:0> r = 'a'..'z'
=> "a".."z"
irb(main):002:0> r.each{|s| s << '!'; puts s}
a!
=> "a!".."z"

This is another face of Dirk/Tony's concern. Perhaps the first
element should be cloned for symmetry and safety. But how: #dup,
#clone, something else...? You'd also be adding a new requirement
that non-immediate range elements must be copyable. Hmmm...
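
Until something like that exists, the caller can sidestep the question by never mutating the yielded element and building a new string instead. A minimal sketch, with illustrative variable names that are not from the thread:

```ruby
r = "a".."e"
shouted = []

r.each do |s|
  shouted << (s + "!")    # String#+ returns a new string; s is left alone
end

puts shouted.inspect
puts r.inspect            # the range endpoints are untouched
```

Because the block only reads s, it does not matter whether the interpreter yields the endpoint object or a copy of it.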
 
Dirk van Deun

: This is another face of Dirk/Tony's concern. Perhaps the first
: element should be cloned for symmetry and safety. But how: #dup,
: #clone, something else...? You'd also be adding a new requirement
: that non-immediate range elements must be copyable. Hmmm...

The requirement could be weakened a bit, because you do not need to
clone range elements immediately. The begin and end could stay
uncloned; and cloning could be delayed until a min is asked for; so
that the min would be an ad hoc calculated value like the max.

Methods like to_a and each would then need to use min and max, not
begin and end, but they probably already do. "Safe" methods,
like ===, could use begin and end in their implementation, so that
ranges of non-copyable elements would still be possible and
useful. (But only to be used in "safe" circumstances.)

Of course, the weakened solution would not prevent the following
from happening, but this is really "asking for it":

irb(main):001:0> a="a"
=> "a"
irb(main):002:0> z="z"
=> "z"
irb(main):003:0> rng=a..z
=> "a".."z"
irb(main):004:0> a[0]="b"
=> "b"
irb(main):005:0> rng
=> "b".."z"

Accidents that happen indirectly via innocuous-looking to_a and each
calls would be prevented.
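
The weakened proposal can be approximated in user code. The sketch below is only an illustration under stated assumptions: CopyingRange is a hypothetical wrapper class, not part of Ruby, and it uses #dup as the copy operation, which is exactly the open choice discussed above. The endpoints are stored uncloned, traversal hands out copies, and a "safe" method such as === works on the stored endpoints directly.

```ruby
# CopyingRange is a hypothetical wrapper, not part of Ruby.
class CopyingRange
  include Enumerable      # provides to_a, min, max, ... on top of #each

  def initialize(first, last)
    @range = first..last  # endpoints are stored uncloned
  end

  # Traversal yields copies, so callers cannot reach the endpoints
  # through each, to_a, min or max.
  def each
    return enum_for(:each) unless block_given?
    @range.each { |elem| yield elem.dup }
    self
  end

  # A "safe" query can use the stored endpoints without copying.
  def ===(value)
    @range === value
  end

  def inspect
    @range.inspect
  end
end

rng = CopyingRange.new("a", "e")
arr = rng.to_a
arr[0][0] = "z"             # changes the copy in the array only
rng.each { |s| s << "!" }   # changes copies only
puts arr.inspect
puts rng.inspect
```

As noted above, this still does nothing for a caller who keeps a reference to the original endpoint object and mutates it directly.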

Dirk van Deun
 
George Ogata

: This is another face of Dirk/Tony's concern. Perhaps the first
: element should be cloned for symmetry and safety. But how: #dup,
: #clone, something else...? You'd also be adding a new requirement
: that non-immediate range elements must be copyable. Hmmm...

The requirement could be weakened a bit, because you do not need to
clone range elements immediately. The begin and end could stay
uncloned; and cloning could be delayed until a min is asked for; so
that the min would be an ad hoc calculated value like the max.

That's how I figured you would do it. One of the most common
operations to do on a Range, though, is to traverse it, so it'd be a
limited-use range if you use non-copyable objects. Perhaps I
should've said "requirement for traversal" -- you could still include?
and friends.
 
