negative indices for sequence types

dan

I was recently surprised, and quite shocked in fact, to find that
Python treats negative indices into sequence types as if they were
mod(length-of-sequence), at least up to -len(seq).

This fact is *deeply* buried in the docs, and is not at all intuitive.
One of the big advantages of a high-level language such as Python is
the ability to provide run-time bounds checking on array-type
constructs. To achieve this I will now have to subclass my objects
and add it myself, which seems silly and will add significant
overhead. If you want this behavior, how hard is it to say a = b[x %
len(b)] ??

Can anyone explain why this anomaly exists, and why it should continue
to exist?
 
Peter Otten

dan said:
> I was recently surprised, and quite shocked in fact, to find that
> Python treats negative indices into sequence types as if they were
> mod(length-of-sequence), at least up to -len(seq).
>
> This fact is *deeply* buried in the docs, and is not at all intuitive.
> One of the big advantages of a high-level language such as Python is
> the ability to provide run-time bounds checking on array-type
> constructs. To achieve this I will now have to subclass my objects
> and add it myself, which seems silly and will add significant
> overhead. If you want this behavior, how hard is it to say a = b[x %
> len(b)] ??
>
> Can anyone explain why this anomaly exists, and why it should continue
> to exist?

After you have recovered from the shock, you probably will admit that
(1) the most common "out of bounds" case is caught:
Traceback (most recent call last):
File "<stdin>", line 1, in ?
IndexError: list index out of range

and
(2) that accessing elements from the end of the list is something you will
soon appreciate:

I think that more code enjoys the beauty of accessing the end of a list than
suffers from uncaught <0 index errors. See the possibilities rather than
the danger :)
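
As a small illustration of both points, here is a sketch in present-day Python. The list `items` and the variable names are made up for the example and are not from the original post.

```python
items = ['a', 'b', 'c']

# (1) Indexing past the end is still caught.
try:
    items[3]
except IndexError as exc:
    caught = str(exc)

# (2) Negative indices count back from the end of the list.
last = items[-1]
second_to_last = items[-2]
```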

Peter
 
Martin v. Löwis

> This fact is *deeply* buried in the docs, and is not at all intuitive.

I find it highly intuitive and very convenient.

> If you want this behavior, how hard is it to say a = b[x %
> len(b)] ??

*This* I would call un-intuitive. It is also much slower.

To get the last element, you currently write b[-1]. If that was not
available, you would have to write b[len(b)-1], which is still
significantly slower. Also, you might not have a variable name, so try
rewriting foo()[-1].
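
To make the "no variable name" point concrete, here is a minimal sketch. The function `foo` below is a hypothetical stand-in that just returns a small list.

```python
def foo():
    # Hypothetical function returning a freshly built list.
    return [10, 20, 30]

# With negative indexing, no temporary name is needed.
last = foo()[-1]

# Without it, the result must be bound to a name first
# (or foo() would have to be called twice).
tmp = foo()
last_explicit = tmp[len(tmp) - 1]
```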

Regards,
Martin
 
Bengt Richter

> I was recently surprised, and quite shocked in fact, to find that
> Python treats negative indices into sequence types as if they were
> mod(length-of-sequence), at least up to -len(seq).
>
> This fact is *deeply* buried in the docs, and is not at all intuitive.
> One of the big advantages of a high-level language such as Python is
> the ability to provide run-time bounds checking on array-type
> constructs. To achieve this I will now have to subclass my objects
> and add it myself, which seems silly and will add significant
> overhead. If you want this behavior, how hard is it to say a = b[x %
> len(b)] ??

That isn't really the exact behavior. E.g.,

>>> range(5)
[0, 1, 2, 3, 4]
>>> range(5)[-4]
1
>>> range(5)[-5]
0
>>> range(5)[-6]
Traceback (most recent call last):
>>> range(5)[4]
4
>>> range(5)[5]
Traceback (most recent call last):

> Can anyone explain why this anomaly exists, and why it should continue
> to exist?

It has apparently proven more useful to have it so than not, though I sympathize
with your frustration in your use case.

Perhaps a .no_negative_indexing attribute or something could be added to the C implementation,
so that you could specify your desired checking without a performance hit.

Meanwhile, maybe an assert i>=0 in the index-supplier side of the contract might work too?
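
A minimal sketch of that idea, assuming the code that supplies the index is under your control. The helper name `pick` is hypothetical.

```python
def pick(seq, i):
    # Hypothetical helper: the caller promises a non-negative index,
    # so a negative one is treated as a bug, not as "from the end".
    assert i >= 0, "negative index not allowed here"
    return seq[i]
```

Note that assert statements are skipped when Python runs with -O, so this is a debugging aid rather than a guaranteed check.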

Regards,
Bengt Richter
 
Terry Reedy

dan said:
> I was recently surprised, and quite shocked in fact, to find that
> Python treats negative indices into sequence types as if they were
> mod(length-of-sequence), at least up to -len(seq).

No, it adds len(seq). Changing + to % would be slower and more
obscure.

> This fact is *deeply* buried in the docs,

No more so than everything else in chapter subsections. From the Ref
Man table of contents I went directly to the most obvious place 5.3.2
Subscriptions, and found
'''
If the primary is a sequence, the expression (list) must evaluate to a
plain integer. If this value is negative, the length of the sequence
is added to it (so that, e.g., x[-1] selects the last item of x.) The
resulting value must be a nonnegative integer less than the number of
items in the sequence, and the subscription selects the item whose
index is that value (counting from zero).
'''
Translated to Python, letting idex be the result of the index expression:

if not isinstance(idex, (int,long)): raise TypeError()
if idex < 0: idex += seqlen
if idex < 0 or idex >= seqlen: raise IndexError()
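
For readers on Python 3, where there is no separate long type, the same translation can be wrapped into a runnable function. This is only a sketch of the rule quoted above: the name `resolve_index` is made up, and real sequences also accept objects that implement `__index__`, which this version does not handle.

```python
def resolve_index(idex, seqlen):
    """Return the effective non-negative index, or raise as a list would."""
    if not isinstance(idex, int):      # Python 3: int covers the old long
        raise TypeError("sequence index must be an integer")
    if idex < 0:
        idex += seqlen                 # negative: count from the end
    if idex < 0 or idex >= seqlen:
        raise IndexError("index out of range")
    return idex
```

For a sequence of length 5, resolve_index(-1, 5) gives 4 and resolve_index(-5, 5) gives 0, while -6 and 5 both raise IndexError.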
> and is not at all intuitive.

Phrases like 'third from the end' are idiomatic English ;-)

> One of the big advantages of a high-level language such as Python is
> the ability to provide run-time bounds checking on array-type
> constructs. To achieve this I will now have to subclass my objects
> and add it myself, which seems silly and will add significant
> overhead. If you want this behavior, how hard is it to say a = b[x %
> len(b)] ??

Again, your innovation of using '%' obscures rather than clarifies.

> Can anyone explain why this anomaly exists, and why it should continue
> to exist?

Being able to abbreviate seq[len(seq)-1] as seq[-1] is quite handy and
faster executing, especially if seq is calculated from an
expression. Same for -2, etc. (And, of course, a change now would
break a noticeable fraction of existing programs.)

Terry J. Reedy
 
Erik Max Francis

dan said:
> I was recently surprised, and quite shocked in fact, to find that
> Python treats negative indices into sequence types as if they were
> mod(length-of-sequence), at least up to -len(seq).

That is not the behavior of negative indices. Negative indices mean
index from the end of the sequence. So -1 means the _last_ element in
the list, -2 means the second to last element in the list, and so on.
-n (for n = len(seq)) is the first element in the list.

> This fact is *deeply* buried in the docs, and is not at all intuitive.

It's mentioned prominently (and early) in all the tutorials and books on
Python I've read, and it's a very common and convenient convention, so
I'm not sure how far you could have gotten through learning Python and
never been exposed to it.
> One of the big advantages of a high-level language such as Python is
> the ability to provide run-time bounds checking on array-type
> constructs. To achieve this I will now have to subclass my objects
> and add it myself, which seems silly and will add significant
> overhead. If you want this behavior, how hard is it to say a = b[x %
> len(b)] ??

That's simply not true. Negative indices have similar bounds
requirements. If you have a sequence of length n, then indices 0
through (n - 1) map to the elements of the sequence in order from left
to right, and indices -1 through -n map to the elements in order from
right to left. Indices of n or greater, or less than -n, generate
IndexErrors. Bounds checking is always done, whether on positive or
negative indices.
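
The mapping described above can be checked directly. This is a sketch with a made-up four-element list; `is_valid` is a hypothetical helper that simply reports whether an index raises.

```python
seq = ['w', 'x', 'y', 'z']
n = len(seq)

# Indices 0 .. n-1 walk left to right; -1 .. -n walk right to left.
left_to_right = [seq[i] for i in range(n)]
right_to_left = [seq[-i] for i in range(1, n + 1)]

def is_valid(i):
    # True if seq[i] succeeds, False if it raises IndexError.
    try:
        seq[i]
    except IndexError:
        return False
    return True

# Probe a little beyond both ends to see where the bounds are.
valid = [i for i in range(-n - 2, n + 2) if is_valid(i)]
```

Here `valid` ends up containing exactly the integers from -n through n - 1.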
 
Istvan Albert

dan said:
> This fact is *deeply* buried in the docs, and is not at all intuitive.
> One of the big advantages of a high-level language such as Python is
> the ability to provide run-time bounds checking on array-type
> constructs.

Bounds checking means that the size is tracked for you and
an exception is thrown if you are trying to access an
element *beyond* that size. That's the natural way
of thinking about it, and not "checking whether there
is an index like this in the list".

The Python way of using negative numbers in indices is
extremely handy, as many have pointed out. It would be
silly to forgo all that expressiveness just to save an
if test in some rare cases.
> To achieve this I will now have to subclass my objects
> and add it myself, which seems silly and will add significant
> overhead.

I would guess that instead of paying for this every time,
as you want to (subclassing), you could just as simply
check the index at the time when you generate it and verify
that it is correct. This way, using the same list in different
contexts will not make it less efficient.
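
For comparison, here is a rough sketch of the two options being weighed. Both `NonNegativeList` and `make_index` are hypothetical names invented for this example, not anything from the standard library.

```python
class NonNegativeList(list):
    """Hypothetical list subclass that rejects negative integer indices."""

    def __getitem__(self, index):
        # This check runs on every single access.
        if isinstance(index, int) and index < 0:
            raise IndexError("negative index not allowed")
        return super().__getitem__(index)


def make_index(offset, base=0):
    # Hypothetical index producer: validate once, where the value is made.
    index = base + offset
    if index < 0:
        raise ValueError("computed a negative index: %d" % index)
    return index


data = [1, 2, 3]
strict = NonNegativeList(data)
```

With the second approach, `data` stays a plain list, so other code can still write data[-1] where counting from the end is what it wants.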

Istvan.
 
Michael Peuser

dan said:
> I was recently surprised, and quite shocked in fact, to find that
> Python treats negative indices into sequence types as if they were
> mod(length-of-sequence), at least up to -len(seq).
>
> This fact is *deeply* buried in the docs, and is not at all intuitive.

I think it is addressed even in most tutorials because it is quite handy, as
others already pointed out. There is the same feature in Perl.

Kindly
Michael P
 
Fernando Perez

dan said:
> I was recently surprised, and quite shocked in fact, to find that
> Python treats negative indices into sequence types as if they were
> mod(length-of-sequence), at least up to -len(seq).
>
> This fact is *deeply* buried in the docs, and is not at all intuitive.

Very deeply indeed: section 3.1.4 of the beginner's tutorial:

http://www.python.org/doc/current/tut/node5.html#SECTION005140000000000000000

Of all places, this is the section on lists:

>>> a = ['spam', 'eggs', 100, 1234]

[... snip ...]
>>> a[-2]
100
>>> a[1:-1]
['eggs', 100]

> Can anyone explain why this anomaly exists, and why it should continue
> to exist?

Because this 'anomaly' is incredibly useful in many contexts, as many others
have already pointed out. Rest assured that it will continue to exist,
probably for as long as the language is around. Better get to like it :)

Cheers,

f.
 
dan

As is often the case, I think this comes down to documentation. While
the behavior is mentioned early in the tutorial, I found it difficult
to find it in the reference -- but whatever, we can chalk this up to
RTFM on my part.

My explanation of the behavior is correct however. list[a] always
equals list[a % len(list)]. A negative number mod N = its absolute
value subtracted from N:

a % n == n - abs(a) # where -n <= a < 0

However, if I want to count from the end of the list, I would of course
write list[len(list)-a]. I wasn't really considering that the purpose of
this feature was to count from the end of a list, which I admit could
come in handy.

Thanks for the responses.

 
Chad Netzer

> My explanation of the behavior is correct however. list[a] always
> equals list[a % len(list)].

Many people pointed out to you that this is NOT true. In particular,
your version gives you NO bounds checking at all; every 'a' is a valid
index (for len(list) > 0). The Python behavior DOES give IndexError for
an out of bound a, and that difference is very significant, IMO.
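
A quick sketch of that difference. The helper `mod_index` is a hypothetical name for the modulo version dan describes, and `items` is a made-up list.

```python
def mod_index(seq, a):
    # The proposed equivalent: wrap the index with %.
    return seq[a % len(seq)]

items = ['a', 'b', 'c']

# Inside the range -len .. len-1 the two forms agree.
agree = all(mod_index(items, a) == items[a] for a in range(-3, 3))

# Outside that range, % silently wraps while plain indexing raises.
wrapped = mod_index(items, 7)      # 7 % 3 == 1
try:
    items[7]
    raised = False
except IndexError:
    raised = True
```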
 
bigdog

Heck, I like it simply because I can read lines from files and easily
chop off the newline.

myStr = f.readline()[0:-1]

That alone is worth its weight in gold to me, never mind all the other
things it makes easy.
 
Lukasz Pankowski

> myStr = f.readline()[0:-1]

This may eat the last character in the file (if the last line does not end
with a newline, which happens), but this will not:

myStr = f.readline().rstrip('\n')

but it is 7 characters longer :)
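
Here is a small sketch of the difference, using io.StringIO as a stand-in for a real file whose last line has no trailing newline. The sample text is made up.

```python
import io

# A file-like object whose last line lacks a trailing newline.
f = io.StringIO("first\nlast")

# Slicing drops the final character whether or not it is a newline.
sliced = [line[0:-1] for line in f]

f.seek(0)

# rstrip('\n') only removes trailing newline characters.
stripped = [line.rstrip('\n') for line in f]
```

With this input, the slicing version turns the last line into 'las', while the rstrip version keeps 'last' intact.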
 
Jacek Generowicz

dan hypothesizes:
> My explanation of the behavior is correct however. list[a] always
> equals list[a % len(list)]. A negative number mod N = its absolute
> value subtracted from N:

Proof by counterexample:

Python 2.2.2 (#1, Feb 8 2003, 12:11:31)
[GCC 3.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> s = '0123'
>>> s[-20 % len(s)]
'0'
>>> s[-20]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
IndexError: string index out of range


Your explanation of the behaviour is incorrect.

QED.
 
