[RCR] New [] Semantics

Bill Atkins · Oct 4, 2004

Currently, the following code

a = [1, 2, 3, 4, 5]
a[0, 3]

returns

[1, 2, 3]

This is somewhat counter-intuitive. Since Ruby has a built-in range
type, [] ought to take advantage of it. I propose that the []
operators be redefined so that this behavior can only be achieved by
explicitly providing a Range, e.g. a[0...3]. The original code would
then work like #values_at and return [1, 4].
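
For reference, here is a small sketch of how the call forms in question behave in current Ruby. The helper name index_forms is made up for illustration; note that values_at(0, 3) picks the elements at indices 0 and 3.

```ruby
# Hypothetical helper: collects the results of the call forms discussed above.
def index_forms(a)
  {
    start_length: a[0, 3],         # start at index 0, take 3 elements
    exclusive_range: a[0...3],     # indices 0, 1, 2
    inclusive_range: a[0..3],      # indices 0, 1, 2, 3
    values_at: a.values_at(0, 3)   # elements at indices 0 and 3 only
  }
end

index_forms([1, 2, 3, 4, 5]).each do |name, result|
  puts "#{name}: #{result.inspect}"
end
```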

Also, I don't know what happened with the earlier mention about the
confusion between .. / ... but I'm a supporter of getting rid of '...'
and just making .. inclusive. Exclusive ranges can be represented
with 0..(n + 1) if necessary. I don't know if this is appropriate for
an RCR.

I think the above [] behavior is more in keeping with POLS and is
slightly more intuitive than the current default.

Obviously this suggestion (and the sub-suggestion about .. and ...)
would break existing code. I don't know if RCR's are allowed to do
that, but I'm just throwing this idea out there.

Bill Atkins

Yukihiro Matsumoto · Oct 4, 2004

Hi,

In message "Re: [RCR] New [] Semantics"

|I think the above [] behavior is more in keeping with POLS and is
|slightly more intuitive than the current default.
|
|Obviously this suggestion (and the sub-suggestion about .. and ...)
|would break existing code. I don't know if RCR's are allowed to do
|that, but I'm just throwing this idea out there.

RCR's are allowed to break existing code, but NOT ALLOWED to mention
POLS in them. Being intuitive is a weak reason to break
compatibility.

matz.

Ara.T.Howard · Oct 4, 2004

Being intuitive is a weak reason to break compatibility.

that's NOT what my wife says! ;-)

-a
--
===============================================================================
| EMAIL :: Ara [dot] T [dot] Howard [at] noaa [dot] gov
| PHONE :: 303.497.6469
| A flower falls, even though we love it;
| and a weed grows, even though we do not love it.
| --Dogen
===============================================================================

Florian Gross · Oct 4, 2004

Bill said:
Currently, the following code

a = [1, 2, 3, 4, 5]
a[0, 3]

returns

[1, 2, 3]

This is somewhat counter-intuitive. Since Ruby has a built-in range
type, [] ought to take advantage of it. I propose that the []
operators be redefined so that this behavior can only be achieved by
explicitly providing a Range, e.g. a[0...3]. The original code would
then work like #values_at and return [1, 4].

I also thought about this about a year and a half ago. (And I
didn't even realize that I've been using Ruby for that long already. I
still feel like I've only barely scratched the surface of what this
language is able to do, and there are so many libraries,
concepts, and mindsets that I haven't explored properly yet...)

That aside, my solution was to use ary[[0, 1..3, 4]] for this.
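
As a point of comparison, the same kind of mixed index list can be passed to Array#values_at in current Ruby without changing Array#[]. This is only a sketch; the helper name pick is hypothetical.

```ruby
# Hypothetical helper: gather elements for a mix of integer and range indexes.
def pick(ary, *indexes)
  ary.values_at(*indexes)
end

ary = ["foo", "bar", "qux", "quz", "quv"]
p pick(ary, 0, 1..3, 4)          # integers and a range mixed together
p pick(ary, 0..1, 0..2, 0..3)    # overlapping ranges repeat elements
```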

I coded up an implementation of it, but please note that this is quite
old code. I didn't know about values_at at the time I wrote it and I'm
not even sure if the code is correct in all cases. (Had I written this
now, I would want to have a handful of test cases, even though I might
hesitate to write them -- I really need to make this a habit.)

I have attached the code, because it is a quick way of testing this
alternate syntax from irb -- it might be especially interesting to see
what happens with other Enumerables in action -- it is hard to think
about such border cases without an implementation.

So, what do you think about using this syntax instead? It is slightly
more complex than the one you proposed of course, but it is also
consistent with ary[1..3] and could in theory be useful in more cases.

Regards,
Florian Gross

# this changes Array#[] so that it will also take arrays
# e. g.:
# ary = ["foo", "bar", "qux", "quz", "quv"]
# ary[[0..1, 0..2, 0..3]]
# => ["foo", "bar", "foo", "bar", "qux", "foo", "bar", "qux", "quz"]

# it also changes Array#[]=
# e. g.:
# ary = (100..110).to_a
# ary[[0..3, 6..9]] = (0..3).to_a + (6..9).to_a
# ary
# => [0, 1, 2, 3, 104, 105, 6, 7, 8, 9, 110]

class Array
  alias :old_get :[]
  def [](index, *more)
    if !more.empty?
      return old_get(index, *more)
    # Ranges are likely handled more efficiently internally, therefore we ignore them
    elsif index.respond_to?(:each) and !index.is_a?(Range)
      result = []
      index.each { |subindex|
        element = self[subindex]
        element = [element] unless element.is_a? Array
        result += element
      }
      return result
    else
      return old_get(index)
    end
  end

  alias :old_set :[]=
  def []=(index, *more)
    args = more.clone
    to = more.pop
    old_to = to.clone
    raise ArgumentError unless to

    if !more.empty?
      return old_set(index, *args)
    # Ranges are likely handled more efficiently internally, therefore we ignore them
    elsif index.respond_to?(:each) and !index.is_a?(Range)
      index.each { |subindex|
        self[subindex] = to.slice!(0 ... subindex.length)
      }
      return old_to - to
    else
      return old_set(index, *args)
    end
  end
end

Yukihiro Matsumoto · Oct 4, 2004

Hi,

In message "Re: [RCR] New [] Semantics"
on Tue, 5 Oct 2004 08:54:53 +0900, (e-mail address removed) writes:

|> Being intuitive is a weak reason to break compatibility.
|
|that's NOT what my wife says! ;-)

Don't tell her. She would notice we are weird people.
Maybe it's too late.

matz.

Charles Comstock · Oct 4, 2004

Bill said:
Currently, the following code

a = [1, 2, 3, 4, 5]
a[0, 3]

returns

[1, 2, 3]

This is somewhat counter-intuitive. Since Ruby has a built-in range
type, [] ought to take advantage of it. I propose that the []
operators be redefined so that this behavior can only be achieved by
explicitly providing a Range, e.g. a[0...3]. The original code would
then work like #values_at and return [1, 4].

Yes but [start,length] is capable of expressing ranges that make no
sense using [range]. For instance:

a = [:a,:b,:c,:d,:e,:f,:g,:h]
a[-2,2] # => [:g,:h]
a[-2..2] # => []
a[-2..-1] # => [:g, :h]
a[0..-1] # => [:a,:b,:c,:d,:e,:f,:g,:h]

Obviously it isn't too hard to convert between the two formats, but many
times it makes far more sense to express it in the [start,length] format
as opposed to the range format of start..end.
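
A sketch of why the conversion needs a little care with negative starts: rewriting [start, length] naively as start...(start + length) gives an empty slice here. The helper start_length_to_range is hypothetical and assumes the start lies within the array.

```ruby
# Hypothetical helper: turn a [start, length] pair into an equivalent
# end-exclusive range, resolving a negative start against the array size.
# Assumes start lies within the array's bounds.
def start_length_to_range(array, start, length)
  first = start < 0 ? start + array.size : start
  first...(first + length)
end

a = [:a, :b, :c, :d, :e, :f, :g, :h]
p a[-2, 2]                             # start/length form
p a[-2...0]                            # naive rewrite as start...(start + length)
p a[start_length_to_range(a, -2, 2)]   # rewrite with the start normalized first
```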

Also, I don't know what happened with the earlier mention about the
confusion between .. / ... but I'm a supporter of getting rid of '...'
and just making .. inclusive. Exclusive ranges can be represented
with 0..(n + 1) if necessary. I don't know if this is appropriate for
an RCR.

This makes perfect sense when the range is over numeric values, but to
me ("a"..("c".succ)) is ugly and not easily read. Or any other range of
objects like that for that matter.

Charles Comstock

Bill Atkins · Oct 4, 2004

This makes perfect sense when the range is over numeric values, but to
me ("a"..("c".succ)) is ugly and not easily read. Or any other range of
objects like that for that matter.

Good point. I hadn't thought about that. I just think the difference
between .. / ... is sort of confusing.

Bill Atkins

David A. Black · Oct 4, 2004

Hi --

Good point. I hadn't thought about that. I just think the difference
between .. / ... is sort of confusing.

There are a couple of mnemonics for remembering which is which.

.. has two dots, and "in" has two letters: the end value is in.
... has three dots, and "out" has three letters: the end value is out.

Or you can think of the three dots as kind of pushing the end value
out of reach, so that the included values of the range can't quite get
there.
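
A quick way to check the mnemonic from irb, as a sketch; the helper name describe is made up for illustration.

```ruby
# Hypothetical helper: describe a range by its elements and its end handling.
def describe(range)
  { elements: range.to_a, end_excluded: range.exclude_end? }
end

p describe(1..3)    # two dots: the end value is included
p describe(1...3)   # three dots: the end value is left out
```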

David

trans. (T. Onoma) · Oct 4, 2004

Yes but [start,length] is capable of expressing ranges that make no
sense using [range]. For instance:

a = [:a,:b,:c,:d,:e,:f,:g,:h]
a[-2,2] # => [:g,:h]
a[-2..2] # => []
a[-2..-1] # => [:g, :h]
a[0..-1] # => [:a,:b,:c,:d,:e,:f,:g,:h]

Obviously it isn't too hard to convert between the two formats, but many
times it makes far more sense to express it in the [start,length] format
as opposed to the range format of start..end.

Hmm... It still signifies a range. So if there were just a notation, then it
might be nice. I'm not sure what that would be though.

Really, in looking over Ruby's Range class, it is a bit limited. You can't
exclude the start element, and it doesn't provide a way to specify an
increment so you can't iterate over floats. A more complete range would have
an initializer something like:

Range.new(start, end, start_exclude=false, end_exclude=false, inc=1)

Again, I'm not sure of the best way to make a nice neat literal notation for all that.
Although one simple suggestion is to have a '+' method to set the increment.

(1.0 .. 3.0 + 0.5).to_a #=> [1.0, 1.5, 2.0, 2.5, 3.0]
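
For comparison, current Ruby can already walk a float range in fixed increments with Range#step, without any new literal syntax. A minimal sketch; the helper name stepped is hypothetical.

```ruby
# Hypothetical helper: list the values from first to last in steps of inc.
def stepped(first, last, inc)
  (first..last).step(inc).to_a
end

p stepped(1.0, 3.0, 0.5)
p stepped(1, 10, 3)
```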

T.

Joe Cheng · Oct 5, 2004

Bill said:
Also, I don't know what happened with the earlier mention about the
confusion between .. / ... but I'm a supporter of getting rid of '...'
and just making .. inclusive. Exclusive ranges can be represented
with 0..(n + 1) if necessary. I don't know if this is appropriate for
an RCR.

I find it surprising that several people have expressed so much aversion
to '...' that they would actually favor removing it from the language.
In all earnestness, why don't you just forget it exists?

While you may find inclusive ranges less confusing than exclusive
ranges, for some people (including me and, apparently[1], Ara),
exclusive ranges are more useful and intuitive.

Using exclusive ranges eliminates error-prone "n+1" calculations. As a
case in point, you said "0..(n + 1)" but I think you actually meant
"0..(n - 1)". This may have just been a typo on your part, but I
personally make a lot of mistakes of this kind when I have to use
inclusive ranges, especially when converting from start/end indexes to
offset/length.

[1]http://groups-beta.google.com/group/comp.lang.ruby/msg/a6cbf0ee261fc9b4
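
A small sketch of the off-by-one point, assuming an n-element array: 0...n and 0..(n - 1) cover the same indexes, while 0..(n + 1) overshoots by two. The helper names are made up for illustration.

```ruby
# Hypothetical helpers: all valid indexes of an n-element array, written
# with an exclusive range and with an inclusive one.
def indexes_exclusive(n)
  (0...n).to_a
end

def indexes_inclusive(n)
  (0..(n - 1)).to_a
end

# Convert a start index and an exclusive end index to [offset, length].
# With an exclusive end, the length is a plain subtraction.
def to_offset_length(first, stop)
  [first, stop - first]
end

letters = %w[a b c d e]
p indexes_exclusive(letters.size) == indexes_inclusive(letters.size)
p (0..(letters.size + 1)).to_a.size     # two more than the array holds
p letters[*to_offset_length(1, 4)] == letters[1...4]
```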

Markus · Oct 5, 2004

Also, I don't know what happened with the earlier mention about the
confusion between .. / ... but I'm a supporter of getting rid of '...'

I think I killed it (admittedly, the thread was on its last legs
in any case) by pointing out an idiom (which everyone had seemed to
approve of) from the cypher-quiz that would be very hard to replicate as
concisely without exclusive ranges:

Here's a simple example of where ... is very nice to have. You want
to cut a deck at card x, so that x is on the top after the cut:

deck = deck.values_at(x..-1,0...x)

*grin* Try doing that as concisely without "..."
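
The idiom wrapped in a method so it can be tried from irb. This is only a sketch; the method name cut is made up, and x is assumed to be a valid index into the deck. In current Ruby, Array#rotate gives the same result.

```ruby
# Hypothetical helper: cut the deck so the card at index x ends up on top.
# x..-1 takes everything from x to the end, 0...x takes everything before x.
def cut(deck, x)
  deck.values_at(x..-1, 0...x)
end

deck = [:ace, :two, :three, :four, :five]
p cut(deck, 2)
p cut(deck, 0)   # cutting at the top card leaves the deck unchanged
```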

-- Markus

Markus · Oct 5, 2004

The ideas I'm (slowly) playing with for ranges:

Extend the Range so that either or both ends can be
inclusive, exclusive or unbounded (i.e., open, closed, or
infinite)

Define construction operators '<..<', '<=..<', '<..<=' and '<=..<='
Likewise '<.._', '<=.._', '_..<=' and '_..<'
(the last two being unary prefixes)
Keep '..' as an alias for '<=..<='
Keep '...' as an alias for '<=..<'
Define construction operator '..+' for the start/length-1 case
Define construction operator '..<+' for the start/length case

Add Range#by(step)

Define a related class for "disordered" ranges like "2..-1" which
are handy but semantically disjoint from pure ranges. I'm
thinking something that water would roll off the back of in
a duck typing world, but that would raise reasonable error
messages in preference to producing unexpected behaviour.
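
To make the open/closed/unbounded idea concrete, here is a rough sketch as an ordinary class instead of new operators. Everything in it (the Interval name, the keyword arguments, nil meaning "unbounded") is an assumption for illustration and is not part of Ruby.

```ruby
# Hypothetical class for illustration only; nothing like it exists in core Ruby.
class Interval
  # A nil bound means "unbounded" on that side.
  def initialize(low, high, exclude_low: false, exclude_high: false)
    @low = low
    @high = high
    @exclude_low = exclude_low
    @exclude_high = exclude_high
  end

  # Membership is decided by comparison only, never by iteration.
  def include?(value)
    above_low?(value) && below_high?(value)
  end

  private

  def above_low?(value)
    return true if @low.nil?
    @exclude_low ? value > @low : value >= @low
  end

  def below_high?(value)
    return true if @high.nil?
    @exclude_high ? value < @high : value <= @high
  end
end

open_both = Interval.new(0, 43, exclude_low: true, exclude_high: true)
p open_both.include?(0)     # start excluded
p open_both.include?(42)
p Interval.new(0, nil).include?(1_000_000)   # no upper bound
```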

Typically, the versions of ruby I produce in these experiments are
killed by angry villagers before they can show their essential
kindheartedness. But I still hope.

-- Markus

P.S.

Yes but [start,length] is capable of expressing ranges that make no
sense using [range]. For instance:

a = [:a,:b,:c,:d,:e,:f,:g,:h]
a[-2,2] # => [:g,:h]
a[-2..2] # => []
a[-2..-1] # => [:g, :h]
a[0..-1] # => [:a,:b,:c,:d,:e,:f,:g,:h]

Obviously it isn't too hard to convert between the two formats, but many
times it makes far more sense to express it in the [start,length] format
as opposed to the range format of start..end.

Hmm... It still signifies a range. So if there were just a notation, then it
might be nice. I'm not sure what that would be though.

Really, in looking over Ruby's Range class, it is a bit limited. You can't
exclude the start element, and it doesn't provide a way to specify an
increment so you can't iterate over floats. A more complete range would have
an initializer something like:

Range.new(start, end, start_exclude=false, end_exclude=false, inc=1)

Again, I'm not sure of the best way to make a nice neat literal notation for all that.
Although one simple suggestion is to have a '+' method to set the increment.

(1.0 .. 3.0 + 0.5).to_a #=> [1.0, 1.5, 2.0, 2.5, 3.0]

trans. (T. Onoma) · Oct 5, 2004

On Tuesday 05 October 2004 01:25 am, Markus wrote:

Hmm... a bit of a touch up (btw unbound can be represented by its own object,
so no special syntax required):

r = 0<..<43
r = 0<..<=42
r = 0<=..<43
r = 0<=..<=42

r = 0<..+<42
r = 0<..+<=42
r = 0<=..+<42
r = 0<=..+<=42

Basically tie-fighter ranges and x-wing ranges. F'ugly!

Although I like your direction. An alternative is just to use standard-like
notation:

0 < r < 43
0 < r <= 42
0 <= r < 43
0 <= r <= 42

0 < r +< 43
0 < r +<= 42
0 <= r +< 43
0 <= r +<= 42

Who said assignment always had to be 'r =' ? Of course it would be nice if we
could just do like:

r = :(0,43)
r = :(0,42]
r = :[0,43)
r = :[0,42]

r = :(0:43)
r = :(0:42]
r = :[0:43)
r = :[0:42]

T.

trans. (T. Onoma) · Oct 5, 2004

Typically, the versions of ruby I produce in these experiments are
killed by angry villagers before they can show their essential
kindheartedness. But I still hope.

It occurs to me that the angry villagers might be confused. The example of the
never ending

(0..(10.0/0)).member?(4)

comes to mind. Why would this be an infinite loop? It must be trying to
generate the list before looking to see if 4 is in it (?) Are these ranges
that stupid? Even so, if it used succ to test this then it would take a while
to find out:

time ruby -e '(0..1000000000).member?(999999999)'

real 7m10.971s
user 6m56.150s
sys 0m0.617s

Yuk. But there is nothing one can do about it as long as one depends on #succ.
I suppose it's awfully clever and OOP and all to have any object supporting
<=> and succ work with ranges, but I wonder how much use they get outside of
numbers and occasional character ranges. In other words perhaps succ isn't
the way to go (or perhaps a fallback) and a simple increment/decrement in the
Range itself would be more usable --then the above 7 minutes would be about 7
milliseconds.
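
For what it's worth, a membership test that only compares against the endpoints never has to walk the range at all. In current Ruby, Range#cover? does exactly that; the wrapper name bounded? below is hypothetical.

```ruby
# Hypothetical wrapper around Range#cover?, which only compares the value
# against the range's endpoints and never iterates with succ.
def bounded?(range, value)
  range.cover?(value)
end

p bounded?(0..1_000_000_000, 999_999_999)
p bounded?(0..Float::INFINITY, 4)
p bounded?(0..Float::INFINITY, -1)
```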

T.

Brian Candler · Oct 5, 2004

It occurs to me that the angry villagers might be confused. The example of the
never ending

(0..(10.0/0)).member?(4)

comes to mind. Why would this be an infinite loop? It must be trying to
generate the list before looking to see if 4 is in it (?)

Yes, see adjacent thread. What it actually does is iterate over all values
from start to end using succ, and set a flag to true when it finds a match
(but it doesn't break out of the loop when a match is found).
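
The difference between the two strategies can be sketched in plain Ruby by counting loop steps. This is not the range.c code, just an illustration of "set a flag and keep going" versus "stop at the first match"; both method names are made up.

```ruby
# Hypothetical illustration: set a flag on a match but keep iterating.
def scan_without_break(range, value)
  found = false
  steps = 0
  range.each do |v|
    steps += 1
    found = true if v == value
  end
  [found, steps]
end

# Hypothetical illustration: stop as soon as a match is found.
# Enumerable#any? stops iterating once the block returns a true value.
def scan_with_break(range, value)
  steps = 0
  found = range.any? do |v|
    steps += 1
    v == value
  end
  [found, steps]
end

p scan_without_break(0..9, 4)   # visits every element
p scan_with_break(0..9, 4)      # stops at the match
```

On an unbounded range the first version can never finish, which matches the behaviour described above.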

Are these ranges
that stupid?

Yes, but what you probably want is 'include?' rather than 'member?'

include? just checks the given value against the start and end values.

Both Range#include? and Range#member? override the methods mixed in from
Enumerable, where those methods are just synonyms for each other.

This is certainly confusing!

I'd say if you want to iterate over the range, then use Enumerable#find or
Enumerable#find_all as appropriate, then get rid of this distinction.

Yuk. But there is nothing one can do about it as long as one depends on #succ.
I suppose it's awfully clever and OOP and all to have any object supporting
<=> and succ work with ranges, but I wonder how much use they get outside of
numbers and occasional character ranges. In other words perhaps succ isn't
the way to go (or perhaps a fallback) and a simple increment/decrement in the
Range itself would be more usable --then the above 7 minutes would be about 7
milliseconds.

If you're going to rely on increment/decrement then I'm pretty sure you can
also rely on the mathematical properties of < and >, i.e. just compare the
boundary values as 'include?' does.

Besides,
a = a.succ
and
a = a + 1
take almost identical amounts of time, since even '+ 1' involves a method
dispatch:
a = a.send(:+,1)

Regards,

Brian.

trans. (T. Onoma) · Oct 5, 2004

Yes, but what you probably want is 'include?' rather than 'member?'

Certainly helps to know the distinction (which is unintuitive, btw).

Both Range#include? and Range#member? override the methods mixed in from
Enumerable, where those methods are just synonyms for each other.

This is certainly confusing!

Amazing how even the simple things get that way!

I'd say if you want to iterate over the range, then use Enumerable#find or
Enumerable#find_all as appropriate, then get rid of this distinction.

Understandable. #member? should just be an alias for #find then, and
#between? can be used for other needs. At least, that seems the most consistent.

If you're going to rely on increment/decrement then I'm pretty sure you can
also rely on the mathematical properties of < and >, i.e. just compare the
boundary values as 'include?' does.

Besides,
a = a.succ
and
a = a + 1
take almost identical amounts of time, since even '+ 1' involves a method
dispatch:
a = a.send(:+,1)

With inc/dec modulo can be used. Something like:

def member?(e)
return ( (((e + self.begin) % @increment) == 0) && self.between?(e) )
end
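
One possible reading of the idea, sketched as a standalone class. The class name SteppedRange and its fields are hypothetical. The sketch subtracts the start from the candidate before taking the remainder and uses the two-argument Comparable#between?; it is only meant for integers, since float remainders are subject to rounding.

```ruby
# Hypothetical class: an inclusive integer range with a fixed increment.
class SteppedRange
  def initialize(first, last, increment)
    @first = first
    @last = last
    @increment = increment
  end

  # A value is a member if it lies within the bounds and sits a whole
  # number of increments away from the start. No iteration is needed.
  def member?(e)
    e.between?(@first, @last) && ((e - @first) % @increment).zero?
  end
end

r = SteppedRange.new(1, 10, 3)   # conceptually 1, 4, 7, 10
p r.member?(7)
p r.member?(8)
p r.member?(13)
```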

T.

Florian Frank · Oct 5, 2004

Yes, see adjacent thread. What it actually does is iterate over all values
from start to end using succ, and set a flag to true when it finds a match
(but it doesn't break out of the loop when a match is found).

This seems the wrong thing to do. The documentation says:

---------------------------------------------------------- Range#member?
rng.member?(val) => true or false
------------------------------------------------------------------------
Return +true+ if _val_ is one of the values in _rng_ (that is if
+Range#each+ would return _val_ at some point).

I would think that it should return true, instead of going round in
circles. I can understand that include? was overridden, because there is a
more efficient way to check for inclusion in ranges and it makes a lot of
sense for ranges of Floats.

The Enumerable version works ok:
=> true

I would prefer getting rid of the overridden member? implementation
in range.c.

Florian Frank

Brian Candler · Oct 5, 2004

This seems the wrong thing to do. The documentation says:

---------------------------------------------------------- Range#member?
rng.member?(val) => true or false
------------------------------------------------------------------------
Return +true+ if _val_ is one of the values in _rng_ (that is if
+Range#each+ would return _val_ at some point).

I would think that it should return true, instead of going round in
circles.

I agree, although it's really only an optimisation, because other cases will
still give an infinite loop: in particular

i = 1.0/0
(2..i).member?(1)

So you have a halting problem instead

The Enumerable version works ok:

=> true

Yes, it breaks out of the loop.

I would prefer getting rid of the overridden member? implementation
in range.c.

Or fixing it, but I'd also be happy to see it fall back to Enumerable, as
there's not much efficiency gain in implementing basically the same thing a
second time within range.c

Regards,

Brian.

Brian Candler · Oct 5, 2004

With inc/dec modulo can be used. Something like:

def member?(e)
return ( (((e + self.begin) % @increment) == 0) && self.between?(e) )
end

Erm, no I don't get that. You're declaring a 'member?' test for a class of
objects which support the methods '+', '%', 'between?' and '==' with some
particular semantics (they form a group?? IANAM), but presumably not
numbers.

Can you give an example of a class of objects for which your 'member?'
function is useful, but this one is not:

def member?(e)
  return e >= self.begin && e <= self.end
end
# or
def member?(e)
  return e.between?(self.begin, self.end)
end

I also can't see what "x.between?(y)" is supposed to do, with a single
argument (y).

Regards,

Brian.

Florian Frank · Oct 5, 2004

I agree, although it's really only an optimisation, because other cases
will still give an infinite loop: in particular

i = 1.0/0
(2..i).member?(1)

So you have a halting problem instead

True. At least it would behave like the documentation describes it.

Or fixing it, but I'd also be happy to see it fall back to Enumerable, as
there's not much efficiency gain in implementing basically the same thing
a second time within range.c

Hmm, I just got another idea for how it could always return true or false
for non-Float ranges:

def member?(x)
  include?(x) and
    Enumerable.instance_method(:member?).bind(self).call(x)
end
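
The same idea as a free-standing helper that can be tried without reopening Range. This is a sketch under assumptions: the name bounded_member? is made up, and Range#cover? stands in for the endpoint-only check that include? is described as doing in this thread.

```ruby
# Hypothetical helper: first check the bounds by comparison, and only then
# fall back to the iterating Enumerable#member?, which stops at the first match.
def bounded_member?(range, x)
  range.cover?(x) &&
    Enumerable.instance_method(:member?).bind(range).call(x)
end

infinity = 1.0 / 0
p bounded_member?(0..infinity, 4)   # inside the bounds, found by iteration
p bounded_member?(2..infinity, 1)   # outside the bounds, no iteration at all
```

A Float candidate such as 2.5 would still pass the bounds check and then never be reached by succ, which is why the idea is limited to non-Float ranges.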

Florian Frank
