Why Python does *SLICING* the way it does??

seberino

Many people I know ask why Python does slicing the way it does.....

Can anyone /please/ give me a good defense/justification???

I'm referring to why mystring[:4] gives me
elements 0, 1, 2 and 3 but *NOT* mystring[4] (5th element).

Many people don't like idea that 5th element is not invited.

(BTW, yes I'm aware of the explanation where slicing
is shown to involve slices _between_ elements. This
doesn't explain why this is *best* way to do it.)

Chris
 
Raymond Hettinger

Many people I know ask why Python does slicing the way it does.....

Half open intervals are just one way of doing things. Each approach has its own
merits and issues.

Python's way has some useful properties:

* s == s[:i] + s[i:]

* len(s[i:j]) == j-i # if s is long enough
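Both properties are easy to check directly. A minimal sketch follows; the sample string is an arbitrary choice for illustration and is not from the post:

```python
s = "slicing"  # arbitrary sample value, chosen only for illustration

# Property 1: splitting at any index i and concatenating restores s.
# range(len(s) + 1) covers every split point, including both ends.
split_ok = all(s == s[:i] + s[i:] for i in range(len(s) + 1))

# Property 2: for 0 <= i <= j <= len(s), the slice length is j - i.
length_ok = all(
    len(s[i:j]) == j - i
    for i in range(len(s) + 1)
    for j in range(i, len(s) + 1)
)

print(split_ok, length_ok)
```

The "if s is long enough" caveat matters: once j runs past len(s), the slice is silently truncated and the length is no longer j - i.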


OTOH, it has some aspects that bite:

* It is awkward with negative strides such as with s[4:2:-1]. This was the
principal reason for introducing the reversed() function.

* It makes some people cringe when they first see it (you're obviously in that
group).
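The negative-stride awkwardness can be seen in a few lines. This is only an illustrative sketch; the sample string and variable names are made up for the example:

```python
s = "abcdefg"  # arbitrary sample value

forward = s[2:5]          # indices 2, 3, 4

# To get the same elements reversed with a negative stride, both
# bounds must be shifted down by one: start at 4, stop *before* 1.
backward = s[4:1:-1]      # indices 4, 3, 2

# Reversing down to index 0 cannot be written with an explicit stop:
# a stop of -1 means "the last element", so this slice is empty.
empty = s[2:-1:-1]
to_start = s[2::-1]       # omit the stop instead: indices 2, 1, 0

# reversed() sidesteps the bound shuffling entirely.
via_reversed = "".join(reversed(s[2:5]))

print(forward, backward, repr(empty), to_start, via_reversed)
```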


I suspect that whether it feels natural depends on your previous background and
whether you're working in an environment with arrays indexed from one or from
zero. For instance, C programmers are used to seeing code like: for(i=0 ;
i<n; i++) a[i]=f(i); In contrast, a BASIC programmer may be used to FOR I = 1
to N: a[I]=f(I); NEXT. Hence, C coders may find Python's a[:n] to be
more natural than BASIC programmers do.

As long as a language is consistent about its approach, you just get used to it
and it stops being an issue after a few days.


Raymond Hettinger
 
John Bokma

Raymond said:
to seeing code like: for(i=0 ; i<n; i++) a[i]=f(i); In contrast, a
BASIC programmer may be used to FOR I = 1 to N: a[I]=f(I); NEXT.


Afaik, at least BBC BASIC uses zero based arrays :) Maybe ZX Spectrum
Basic too (too long ago to remember).
 
Nick Efford

Many people I know ask why Python does slicing the way it does.....
Can anyone /please/ give me a good defense/justification???
I'm referring to why mystring[:4] gives me
elements 0, 1, 2 and 3 but *NOT* mystring[4] (5th element).

mystring[:4] can be read as "the first four characters of mystring".
If it included mystring[4], you'd have to read it as "the first
five characters of mystring", which wouldn't match the appearance
of '4' in the slice.

Given another slice like mystring[2:4], you know instantly by
looking at the slice indices that this contains 4-2 = 2 characters
from the original string. If the last index were included in the
slice, you'd have to remember to add 1 to get the number of
characters in the sliced string.
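The same "no +1 needed" arithmetic shows up when cutting a sequence into consecutive pieces. In this sketch, chunks is a hypothetical helper written for the example, not a standard library function:

```python
mystring = "abcdefgh"  # arbitrary sample value

first_four = mystring[:4]   # the first four characters
middle = mystring[2:4]      # 4 - 2 = 2 characters


def chunks(seq, size):
    """Hypothetical helper: cut seq into consecutive pieces of length size.

    Each piece is seq[k:k + size]; the stop of one piece is the start
    of the next, so no index ever needs a +1 or -1 adjustment.
    """
    return [seq[k:k + size] for k in range(0, len(seq), size)]


pieces = chunks(mystring, 3)
print(first_four, middle, pieces)
```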

It all makes perfect sense when you look at it this way!


Nick
 
Torsten Bronger

Hallöchen!

Many people I know ask why Python does slicing the way it does.....
Can anyone /please/ give me a good defense/justification???
I'm referring to why mystring[:4] gives me elements 0, 1, 2 and 3
but *NOT* mystring[4] (5th element).

mystring[:4] can be read as "the first four characters of
mystring". If it included mystring[4], you'd have to read it as
"the first five characters of mystring", which wouldn't match the
appearance of '4' in the slice.

[...]

It all makes perfect sense when you look at it this way!

Well, also in my experience every variant has its warts. You'll
never avoid the "i+1" or "i-1" expressions in your indices or loops
(or your mind ;).

It's interesting to muse about a language that starts at "1" for all
arrays and strings, as some more or less obsolete languages do. I
think this is more intuitive, since most people (including
mathematicians) start counting at "1". The reason for starting at
"0" is easier memory address calculation, so nothing for really high
level languages.

But most programmers are used to doing it the Python (and most other
languages) way, so this opportunity has been missed for good.

Tschö,
Torsten.
 
Antoon Pardon

Torsten Bronger said:
It's interesting to muse about a language that starts at "1" for all
arrays and strings, as some more or less obsolete languages do. I
think this is more intuitive, since most people (including
mathematicians) start counting at "1". The reason for starting at
"0" is easier memory address calculation, so nothing for really high
level languages.

Personally I would like to have the choice. Sometimes I prefer to
start at 0, sometimes at 1 and other times at -13 or +7.
 
Sion Arrowsmith

Raymond Hettinger said:
Many people I know ask why Python does slicing the way it does.....
Python's way has some useful properties:

* s == s[:i] + s[i:]

* len(s[i:j]) == j-i # if s is long enough

The latter being particularly helpful when i = 0 -- the first n
elements are s[:n] . (Similarly elegantly, although of no
practical significance, s == s[0:len(s)] .)
 
Torsten Bronger

Hallöchen!

Antoon Pardon said:
Torsten Bronger said:
[...]

It's interesting to muse about a language that starts at "1" for
all arrays and strings, as some more or less obsolete languages
do. I think this is more intuitive, since most people (including
mathematicians) start counting at "1". The reason for starting
at "0" is easier memory address calculation, so nothing for
really high level languages.

Personally I would like to have the choice. Sometimes I prefer to
start at 0, sometimes at 1 and other times at -13 or +7.

In HTBasic you have the choice between 0 and 1; there is a global
source code directive for it. However, hardly anybody really wants
to use HTBasic.

Tschö,
Torsten.
 
Terry Hancock

Many people I know ask why Python does slicing the way it does.....
[...]
Python's way has some useful properties: [...]
OTOH, it has some aspects that bite: [...]
I suspect that whether it feels natural depends on your previous background and
whether you're working in an environment with arrays indexed from one or from
zero. For instance, C programmers are used to seeing code like: for(i=0 ;
i<n; i++) a[i]=f(i); In contrast, a BASIC programmer may be used to FOR I = 1
to N: a[I]=f(I); NEXT. Hence, C coders may find Python's a[:n] to be
more natural than BASIC programmers do.


Well, I learned Basic, Fortran, C, Python --- more or less. And I first found
Python's syntax confusing as it didn't follow the same rules as any of the
previous ones.

However, I used to make "off by one" errors all the time in both C and Fortran,
whereas I hardly ever make them in Python.

So I like Python's slicing because it "bites *less*" than intervals in C or Fortran.

Cheers,
Terry
 
Bill Mill

Antoon Pardon said:
Personally I would like to have the choice. Sometimes I prefer to
start at 0, sometimes at 1 and other times at -13 or +7.

-1. You can start arrays at 0 or 1 (and arbitrary bases? I don't
recall) in VB, and it's an unmitigated disaster. It adds needless
complexity. What our slicing system loses in elegance in a few cases,
it more than makes up for in consistency throughout all programs.

Peace
Bill Mill
bill.mill at gmail.com
 
beliavsky

Terry Hancock wrote:

So I like Python's slicing because it "bites *less*" than intervals
in C or Fortran.

I disagree. Programming languages should not needlessly surprise
people, and a newbie to Python probably expects that x[1:3] =
[x[1],x[2],x[3]] . Array-oriented languages, such as Fortran 90/95,
Matlab/Octave/Scilab, and S-Plus/R do not follow the Python convention,
and I don't know of Fortran or R programmers who complain (don't follow
Matlab enough to say). There are Python programmers, such as the OP and
me, who don't like the Python convention. What languages besides Python
use the Python slicing convention?

Along the same lines, I think the REQUIREMENT that x[0] rather than
x[1] be the first element of list x is a mistake. At least the
programmer should have a choice, as in Fortran or VBA. In C starting at
0 may be justified because of the connection between array subscripting
and pointer arithmetic, but Python is a higher-level language where
such considerations are less relevant.
 
Roy Smith

Antoon Pardon said:
Personally I would like to have the choice. Sometimes I prefer to
start at 0, sometimes at 1 and other times at -13 or +7.

Argggh. Having two (or more!) ways to do it, would mean that every time I
read somebody else's code, I would have to figure out which flavor they are
using before I could understand what their code meant. That would be evil.

What would actually be cool is if Python were to support the normal math
notation for open or closed intervals. Any of the following would make
sense:

foo = bar (1, 2)
foo = bar (1, 2]
foo = bar [1, 2)
foo = bar [1, 2]

That would certainly solve this particular problem, but the cost to the
rest of the language syntax would be rather high :)
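Without touching the syntax, the four interval flavours can be approximated with an ordinary function. The sketch below is only an illustration: interval_slice and its closed parameter are hypothetical names invented here, and it assumes non-negative integer bounds.

```python
def interval_slice(seq, lo, hi, closed="left"):
    """Hypothetical helper: slice seq over an interval from lo to hi.

    closed selects which ends are included:
      "left"    -> [lo, hi)   (Python's own convention)
      "right"   -> (lo, hi]
      "both"    -> [lo, hi]
      "neither" -> (lo, hi)
    Assumes lo and hi are non-negative integers.
    """
    if closed not in ("left", "right", "both", "neither"):
        raise ValueError("unknown interval type: %r" % (closed,))
    # Translate the requested interval into a half-open slice.
    start = lo if closed in ("left", "both") else lo + 1
    stop = hi + 1 if closed in ("right", "both") else hi
    return seq[start:stop]


bar = list(range(10))  # arbitrary sample data
results = {kind: interval_slice(bar, 1, 4, kind)
           for kind in ("neither", "right", "left", "both")}
print(results)
```

Every variant reduces to a half-open slice plus at most one +1 per bound, which is the bookkeeping the notation would otherwise hide.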
 
Antoon Pardon

Bill Mill said:
-1. You can start arrays at 0 or 1 (and arbitrary bases? I don't
recall) in VB, and it's an unmitigated disaster. It adds needless
complexity.

Complexity that is now put on the programmer's shoulders.

If I have a table with indexes going from -13 to +7, I have to
add the offset myself if I want to use a list for that.

Bill Mill said:
What our slicing system loses in elegance in a few cases,
it more than makes up for in consistency throughout all programs.

You write this as if other solutions can't be consistent.
 
Antoon Pardon

Roy Smith said:
Argggh. Having two (or more!) ways to do it, would mean that every time I
read somebody else's code, I would have to figure out which flavor they are
using before I could understand what their code meant. That would be evil.

This is nonsense. table[i] = j just associates value j with key i.
That is the same regardless of whether the keys can start from
0 or some other value. Do you also consider it more ways because
the keys can end in different values?
 
Bill Mill

Antoon Pardon said:
Complexity that is now put on the programmer's shoulders.

If I have a table with indexes going from -13 to +7, I have to
add the offset myself if I want to use a list for that.

You write this as if other solutions can't be consistent.

Propose one, and I won't write it off without thinking, but my bias is
way against it from experience. Knowledge gets scattered across the
program, unless you're defining the start index every time you use the
list, which seems no better than adding an offset to me.

Peace
Bill Mill
bill.mill at gmail.com
 
Diez B. Roggisch

Personally I would like to have the choice. Sometimes I prefer to
start at 0, sometimes at 1 and other times at -13 or +7.


Subclass the builtin list and make the necessary adjustments yourself in
an overridden __getitem__.
 
Roy Smith

Antoon Pardon said:
Roy Smith said:
Argggh. Having two (or more!) ways to do it, would mean that every time I
read somebody else's code, I would have to figure out which flavor they are
using before I could understand what their code meant. That would be evil.

This is nonsense. table[i] = j just associates value j with key i.
That is the same regardless of whether the keys can start from
0 or some other value. Do you also consider it more ways because
the keys can end in different values?


There are certainly many examples where the specific value of the
first key makes no difference. A good example would be

    for element in myList:
        print(element)

On the other hand, what output does

    myList = ["spam", "eggs", "bacon"]
    print(myList[1])

produce? In a language where some lists start with 0 and some start
with 1, I don't have enough information just by looking at the above
code.
 
Bernhard Herzog

Torsten Bronger said:
It's interesting to muse about a language that starts at "1" for all
arrays and strings, as some more or less obsolete languages do. I
think this is more intuitive, since most people (including
mathematicians) start counting at "1". The reason for starting at
"0" is easier memory address calculation, so nothing for really high
level languages.

There are very good reasons for half-open intervals and starting at 0
apart from memory organization. Dijkstra explained this quite well in
http://www.cs.utexas.edu/users/EWD/ewd08xx/EWD831.PDF
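A few of the points usually taken from that note can be illustrated with Python slices. This is a sketch with made-up sample values, not code from the note itself:

```python
s = "abcdef"  # arbitrary sample value

# An empty range is written [i, i) with ordinary non-negative bounds.
# With a closed interval the empty prefix would need the bounds [0, -1].
empty_prefix = s[0:0]

# Starting at 0, an element's index equals the number of elements
# that precede it.
preceding = [len(s[:i]) for i in range(len(s))]

# Adjacent half-open ranges share a boundary: the stop of one piece
# is the start of the next, with nothing skipped or repeated.
bounds = [0, 2, 5, len(s)]
parts = [s[a:b] for a, b in zip(bounds, bounds[1:])]

print(repr(empty_prefix), preceding, parts)
```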

Bernhard
 
Terry Hancock

Personally I would like to have the choice. Sometimes I prefer to
start at 0, sometimes at 1 and other times at -13 or +7.

Although I would classify that as a "rare use case". So, it "ought
to be possible to do it, but not necessarily easy". Which
Python obliges you on --- you can easily create a relocatable list
class that indexes and slices anyway you see fit, e.g.:
>>> class reloc(list):
...     def __init__(self, start, contents=()):
...         self.start = start
...         for item in contents:
...             self.append(item)
...     def __getitem__(self, index):
...         if isinstance(index, slice):
...             if (not (self.start <= index.start < self.start + len(self)) or
...                 not (self.start <= index.stop < self.start + len(self))):
...                 raise IndexError
...             return self.__class__.__base__.__getitem__(self, slice(index.start-self.start, index.stop-self.start, index.step))
...         else:
...             if not (self.start <= index < self.start + len(self)):
...                 raise IndexError
...             return self.__class__.__base__.__getitem__(self, index - self.start)
...
>>> r = reloc(-13, ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q'])
>>> r[0]
'n'
>>> r[-12]
'b'
>>> r[-13]
'a'
>>> r[-14]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
[...]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
[...]

Note that I added the IndexError to avoid the bizarre results you
get due to "negative indexing" in the list base class (now they
would change behavior at -13 instead of at 0, which is counter-
intuitive to say the least). Better to cry foul if an index out of
range is called for in this case, because it's gotta be a bug.
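For readers on Python 3, the same idea might be sketched roughly as follows. OffsetList is a hypothetical name chosen for this example, and the sketch makes a few choices of its own that are not in the post: omitted slice bounds default to the ends of the list, a slice may stop exactly at the end, and negative steps are rejected to keep the example short.

```python
class OffsetList(list):
    """Hypothetical list whose first valid index is start instead of 0."""

    def __init__(self, start, contents=()):
        super().__init__(contents)
        self.start = start

    def __getitem__(self, index):
        end = self.start + len(self)  # one past the last valid index
        if isinstance(index, slice):
            if index.step is not None and index.step < 0:
                # Assumption of this sketch, not of the original post.
                raise ValueError("negative steps are not supported here")
            lo = self.start if index.start is None else index.start
            hi = end if index.stop is None else index.stop
            if not (self.start <= lo <= end and self.start <= hi <= end):
                raise IndexError("slice bound out of range")
            shifted = slice(lo - self.start, hi - self.start, index.step)
            return super().__getitem__(shifted)  # returns a plain list
        if not (self.start <= index < end):
            # No "negative indexing from the end": out of range is an error.
            raise IndexError("index out of range")
        return super().__getitem__(index - self.start)


r = OffsetList(-13, "abcdefghijklmnopq")
print(r[0], r[-13], r[3], r[-13:-10])
```

Only __getitem__ is shifted in this sketch; assignment, deletion, and methods such as index() still use the underlying zero-based positions.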

What do you think, should I send it to Useless Python? ;-)

Cheers,
Terry
 
