Possible improvement to slice opperations.

R

Ron Adam

Michael said:
With current slicing and a negative step...

[ 1 2 3 4 5 6 7 8 9 ]
-9 -8 -7 -6 -5 -4 -3 -2 -1 -0

r[-3:] -> [7, 8, 9] # as expected
r[-3::-1] -> [7, 6, 5, 4, 3, 2, 1, 0] # surprise

The seven is include in both cases, so it's not a true inverse
selection either.

Did you read what Magnus said: "a is the index for the first item to
include"? How could r[-3::x] for any x not include the 7?

Cheers,
mwh

Yes, and I agreed this is how it works. The point above was the 7 was
outside the "in-between" index's. Not that it shouldn't be included.
And yes, I wasn't looking at it quite correctly as well, but my point is
still valid in the context it was given.

I originally posted that negative index's were one's based, but then I
got multiple responses that I was looking at it incorrectly (using the
wrong model) and I should look at index's as being the places between
the items. (Or on the left side of the items) This is also mentioned in
the tutorial as follows.


from 2.4.1 Docs, Section 3.1.2, Strings:
----------------------------------------
The best way to remember how slices work is to think of the indices as
pointing between characters, with the left edge of the first character
numbered 0. Then the right edge of the last character of a string of n
characters has index n, for example:

+---+---+---+---+---+
| H | e | l | p | A |
+---+---+---+---+---+
0 1 2 3 4 5
-5 -4 -3 -2 -1

The first row of numbers gives the position of the indices 0...5 in the
string; the second row gives the corresponding negative indices. The
slice from i to j consists of all characters between the edges labeled i
and j, respectively.
-----------------------------------------


But you need a different pair of index's for this to work with negative
step values as you do with positive step values. (or switch to right
side indexing.)

So I tried to find a consistent way for in-between index's to work and
solution I found "and asked if it might be a possibility" wasn't
desirable. (And I was told again I was using the wrong model.) ;-)

So we get back to how it actually works, (base 1 for negative index's),
Or more precisely .. (len(L)+i) indexing.) which is a little more
complex visually, but I think is better to teach it correctly from the
start and avoid the confusions that the in-between index's lead to
later. And also avoid the confusion of having two different way of
looking at it.

Here's the correct way....


From 2.4.1 Docs, section 2.3.6, Sequence Types:
-----------------------------------------------
s i'th item of s, origin 0 (3)
s[i:j] slice of s from i to j (3), (4)
s[i:j:k] slice of s from i to j with step k (3), (5)

(3)
If i or j is negative, the index is relative to the end of the string:
len(s) + i or len(s) + j is substituted. But note that -0 is still 0.

(4)
The slice of s from i to j is defined as the sequence of items with
index k such that i <= k < j. If i or j is greater than len(s), use
len(s). If i is omitted, use 0. If j is omitted, use len(s). If i is
greater than or equal to j, the slice is empty.

(5)
The slice of s from i to j with step k is defined as the sequence of
items with index x = i + n*k such that . In other words, the indices are
i, i+k, i+2*k, i+3*k and so on, stopping when j is reached (but never
including j). If i or j is greater than len(s), use len(s). If i or j
are omitted then they become ``end'' values (which end depends on the
sign of k). Note, k cannot be zero.
-------------------------------------


This could be a bit clearer.

Notice this doesn't say to use index's between items. Looking at the
index's as being either on the left or right creates the 'apparent'
indexing inconsistency I was referring to. It looks good at first, but
breaks down later, so the examples using in-between index's in the docs
and tutorials should probably be rewritten to show how it actually works
right from the start. Something like the following examples with a walk
through would do.

['a', 'b', 'c', 'd', 'e', 'f', 'g']
0 1 2 3 4 5 6 <--- i
-7 -6 -5 -4 -3 -2 -1 <--- len(L)+i

example: i = -7 -> len(L) + -7 -> 7 + -7 -> 0

L[3] = 3
L[:3] = ['a','b','c'] # L[0:3]
L[3:] = ['d','e','f','g'] # L[3:7]

L[-3] -> 'e'
L[:-3] -> ['a','b','c','d'] # L[-7:-3]
L[-3:] -> ['e','f','g'] # L[-3:len(L)] (use len(L), not 0)

L[-3::-1] -> ['e','d','c','b','a'] # L[-3:-8:-1]
L[:-3:-1] -> ['g','f'] # L[-1:-3:-1]

L[3::-1] -> ['d','c','b','a'] # L[3:-Len(L)-1:-1](-1 wont' work here)
L[:3:-1] -> ['g','f','e'] # L[0:3:-1]

L[3:4:1] i-> ['d']
L[3:2:-1] -> ['d']

L[-4:-3: 1] -> ['e']
L[-4:-5:-1] -> ['e']



(*) The problems stituations are the cases where start and stop are on
each side of the -1/0 boundary. This causes difficulty reaching end
points in some situations and also requires having to do extra bounds
checking if you move the stop index around.



So maybe we can improve things in some situations when that occurs?
Or at least try. So here's another (last by me) attempt. The following
will break current code, but it is a stable and predictable method.

It has both advantages and disadvantages over the current method. (as
usual)

+ Bound checking the stop index isn't needed to avoid situations
where it crosses -1/0 . (works in both directions)

+ The order is always preserved from the starting position. No ugly
switches if the stop in incrimented past -1/0.

+ Only need to check start position if you are moving it.

- can't index from both ends using both positive and negative
indexing at the same time.


With the below function, L[1:-1] does not remove an item off each end,
but instead returns [] as the stop is less than the start. So you have
to do it in two slices.

r = L[1:][:-1]

Or use all positive indexing.

r = L[1:len(L)-1]

Or all negative indexing.

r = r[-len(L):-1]

This is because in order to resolve the -1/0 indexing situations you
have to disallow indexing from both sides at the same time. I would
guess that a lot of people wouldn't want to give that up now that
they've gotten used to it. (? shrug)


Anyway, this might give someone else an idea or two. ;-)

Cheers,
Ron




Given: L[i:j:k] as start,stop,step.

And the following diagrams:

|--------i------0-------->
j<=i [i......] j>0


<------+0-----i---------|
j<-1 [......i] j>=i
-1

This function could probably be cleaned up a bit. I think there may be
several possible short cuts that would speed this up and reduce the
number of comparisons.

def Slice(L, start=None, stop=None, step=1):
if step == None:
step = 1
if start != None \
and stop != None \
and (type(start) != int \
or type(stop) != int \
or type(step) != int):
raise TypeError("slice indices must be integers")
if step == 0:
raise ValueError("slice step cannot be zero")
if step >= 1:
if start == None:
start = 0
if stop == None:
stop = len(L)
if start < -len(L):
start = -len(L)
if start > len(L) or stop <= start:
return []
if start < 0 and stop > 0:
start = start+len(L)
stop = stop+len(L)
if stop > len(L):
stop = len(L)
elif step <= -1:
if start == None:
start = -1
if stop == None:
stop = -len(L)-1
if start >= len(L):
start = len(L)-1
if stop >= start or start < -len(L):
return []
if start >= 0 and stop < 0 :
start = start - len(L)
stop = stop - len(L)
if stop < -len(L)-1:
stop = -len(L)-1
new = []
for i in range(start,stop,step):
new.append(L)
return new


L = range(10)
data = [
(0,-0,1),
(None, None, None),
(None, None, 1),
(3, 6, 1),
(3, 100, 1),
(100,6,-1),
(3, -1, -1),
(-3, 0, 1), # this works
(5, -1, -1), # this too
(None, -100, -1),
(-100, 100, 1),
(6, 3, -1),
(-3, None, 1),
(-6, 5, 1),
(-50, -6, 1),
(-3, -100, -1),
(-3, -6, -1),
(-3, -100, 0),
(-3, -6, .5),
(-3, 34.5, 1),
(2.0, -6, -1),
]


import sys
for start,stop,step in data:
print 'L[%r:%r:%r]='%(start,stop,step),
try:
print Slice(L,start,stop,step)
except:
print sys.exc_type, ":", sys.exc_value
print



---------- Output ------------

L[0:0:1]= []
L[None:None:None]= [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
L[None:None:1]= [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
L[3:6:1]= [3, 4, 5]
L[3:100:1]= [3, 4, 5, 6, 7, 8, 9]
L[100:6:-1]= [9, 8, 7]
L[3:-1:-1]= [3, 2, 1, 0]
L[-3:0:1]= [7, 8, 9]
L[5:-1:-1]= [5, 4, 3, 2, 1, 0]
L[None:-100:-1]= [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
L[-100:100:1]= [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
L[6:3:-1]= [6, 5, 4]
L[-3:None:1]= [7, 8, 9]
L[-6:5:1]= [4, 5, 6, 7, 8, 9]
L[-50:-6:1]= [0, 1, 2, 3]
L[-3:-100:-1]= [7, 6, 5, 4, 3, 2, 1, 0]
L[-3:-6:-1]= [7, 6, 5]
L[-3:-100:0]= exceptions.ValueError : slice step cannot be zero
L[-3:-6:0.5]= exceptions.TypeError : slice indices must be integers
L[-3:34.5:1]= exceptions.TypeError : slice indices must be integers
L[2.0:-6:-1]= exceptions.TypeError : slice indices must be integers
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,776
Messages
2,569,602
Members
45,183
Latest member
OrderGlycoEase

Latest Threads

Top