Tuple slices

J

Jeff Shannon

George said:
An iterator is perfectly ok if all you want is to iterate over the
elements of a view, but as you noted, iterators are less flexible than
the underlying sequence. The view should be (or at least appear)
identical in functionality (i.e. public methods) with its underlying
sequence.

So, what problem is it, exactly, that you think you'd solve by making
tuple slices a view rather than a copy?

As I see it, you get the *possibility* of saving a few bytes (which
may go in the other direction) at a cost of complexity and speed. You
have greater dependence of internal objects on each other, you can't
get rid of the original tuple while any slice views of it exist, you
gain nothing in the way of syntax/usage simplicity... so what's the
point?

To my mind, one of the best things about Python is that (for the most
part) I don't have to worry about actual memory layout of objects. I
don't *care* how tuples are implemented, they just work. It seems to
me that you're saying that they work perfectly fine as-is, but that
you have a problem with the implementation that the language tries its
best to let you not care about. Is this purely abstract philosophy?

Jeff Shannon
Technician/Programmer
Credit International
 
G

George Sakkis

Terry Reedy said:
Unless you are GvR or Tim Peters,

Actually I am the OP. I posted my previous mail from Google groups, which for some reason put my
email instead of my name; it should be ok now.
throwing 'pythonic' at me doesn't cut it
with me, especially when you use it so shallowly. The current Pythonic
meaning of 'slice', as defined by GvR and implemented in core Python
sequence objects and documented in the reference manuals and reiterated by
GvR on PyDev in just the last day, is to make an independent in-memory
#copy# of the indicated part of a sequence. Both aspects are intentional
and desired features, not accidents.

Thanks for the info; a citation that supports your claim that the in-memory copy is part of the
*specification* of a tuple slice -- and not a choice that is subject to the implementation -- would
be useful. Note that I'm talking only about slices of *tuples* (or any immutable sequence for that
matter) here, not all slices. As for the "pythonic", I mentioned it as a loosely speaking synonym to
"simpler" or "more intuitive"; I apologize if this term has religious connotations in cl.py.
Yes, slicing, even in the Pythonic sense, may well simplify the OP's
algorithm (of which he gave almost no detail), but the whole point of this
thread is that he does not want to do that (make copy slices). While he
might shortsightedly think that he wants Guido to redefine slice and
replace the current implementation of tuple with a more spacious, possibly
slower one that would allow that definition, that will not solve his
current problem, if indeed he has one.

I fail to understand where does your strongly negative tone come from; certainly not from my posts.
I asked a simple question and I was expecting a simple answer, not defending myself from a
hypothetical shortsighted suggestion to Guido. Thankfully I got one (and only so far) good reason
for the current implementation from Fredrik Lundh, namely the reference to the original object.
As George Sakkis the OP noted, the essential data constituting a contiguous
section view are the underlying sequence and two position markers. Whether
one works with these directly or packages them into a tuple or user class
instance is a matter of relative conveniences. As it turns out, I was

Honestly, I can't imagine a case where supplying these three associated data packaged is *less*
convenient than spelling them out explicitly.
thinking about the design choices involved in a generic sequence view class
just the morning before reading the original post. But I have no idea
whether GS's foo function would justify the added overhead of such a thing.

This is not the point; that function was just the motivation for questioning the current tuple slice
implementation. I wouldn't start this thread in the first place if I didn't have the impression that
tuple views would be beneficial for many (most?) cases.
It partly depends on what he wishes to optimize, which I asked about, but
have not yet seen an answer about.

Are you sure you read the whole thread ? I replied explicitly on this to Jeff Shannon:

"You're probably right about the allocation time, but my main concern is the memory required for
each slice, which can be O(n) wrt the length of the whole tuple. I would like this to be constant
(at least if there was a way around to the problem of deleting the underlying sequence)."
So I suggested the simplest approach that would work. And that, to me, *is* pythonic!

Simplest to whom ? The user or the py-dev guy that implements tuples ? It sounds as if you have the
second in mind.
Terry J. Reedy

George
 
G

George Sakkis

Jeff Shannon said:
So, what problem is it, exactly, that you think you'd solve by making
tuple slices a view rather than a copy?

As I see it, you get the *possibility* of saving a few bytes (which

It all comes down on what you mean by "a few bytes". Since many (most?) slices are linear wrt to the
original sequence's length, it is not hard to think of algorithms that involve the creation of
*many* slices (e.g. most recursive divide-and-conquer algorithms). Implementing these using slices
simply does not scale as the input sequence gets larger. Of course, you can always use the standard
C/C++ approach and pass the original sequence along with the (start,stop,step) indices of the slice,
as Terry Reedy mentioned, but then you lose in expressiveness.
may go in the other direction) at a cost of complexity and speed. You
have greater dependence of internal objects on each other, you can't
get rid of the original tuple while any slice views of it exist, you
gain nothing in the way of syntax/usage simplicity... so what's the
point?

To my mind, one of the best things about Python is that (for the most
part) I don't have to worry about actual memory layout of objects. I
don't *care* how tuples are implemented, they just work. It seems to
me that you're saying that they work perfectly fine as-is, but that
you have a problem with the implementation that the language tries its
best to let you not care about. Is this purely abstract philosophy?

No, it's called "scalability", and it's not purely abstract philosophy AFAIK. I fully agree that for
the most part you don't have to care about it, and I'm grateful that python does all its magic
trasparently. However if you do care about it, and at the same time you are unwilling to sacrifice
the elegance of slices, the current implementation is not ideal.
Jeff Shannon
Technician/Programmer
Credit International

George
 
B

Bengt Richter

It all comes down on what you mean by "a few bytes". Since many (most?) slices are linear wrt to the
original sequence's length, it is not hard to think of algorithms that involve the creation of
*many* slices (e.g. most recursive divide-and-conquer algorithms). Implementing these using slices
simply does not scale as the input sequence gets larger. Of course, you can always use the standard
C/C++ approach and pass the original sequence along with the (start,stop,step) indices of the slice,
as Terry Reedy mentioned, but then you lose in expressiveness.
I didn't see the leadup to this, but what is the problem with just subclassing tuple to
give you the views you want?


Regards,
Bengt Richter
 
B

Bengt Richter

I think views are good for
1) saving memory
2) saving time (as you don't have to copy the elements into the new tuple)

And they are worth it. However, (as in other cases with slicing), it is
very easy and fast to create a view for a slice with the default step
'1', while it's a PITA and totally not worth it to create a view for a
slice with non default step. I think it would be good to:

if slice_step == 1
create_view
else
create_new_tuple

Actually, i think that slices with step, is a bad feature in general
and i think I will write a PEP to suggest their removal in python3k.
What's the big deal with other than 1 steps? It is just adjusting a few numbers
so that you can either index the new virtual slice with an integer and return the
element, in which case the index into the original tuple will be
someoffset+i*somefactor once you get past the limit checks for the virtual
slice. By the same token, transforming a few numbers of one virtual slice
into similar numbers for a a new virtual slice of that shouldn't be rocket science.
And it wouldn't have to be done more than once. Don't have time to do it now,
but there are plenty around here that could, I'm sure.

Regards,
Bengt Richter
 
N

Nick Coghlan

jfj said:
I think views are good for
1) saving memory
2) saving time (as you don't have to copy the elements into the new tuple)

1. Applies only if you are making large slices, or a lot of slices with each
containing at least 3 elements.
A view can also *cost* memory, when it looks at a small piece of a large
item. The view will keep the entire item alive, even though it needs only a
small piece.

2. Hell no. The *elements* aren't copied, pointers to the elements are. If you
*don't* copy the pointers, then every item access through the view involves an
indirection as the index into the original sequence gets calculated.

So views *may* save memory in some applications, but are unlikely to save time
in any application (except any saving resulting from the memory saving).

Cheers,
Nick.
 
N

Nick Coghlan

jfj said:
Actually, i think that slices with step, is a bad feature in general
and i think I will write a PEP to suggest their removal in python3k.

I wouldn't bother. Extended slicing was added to support those doing serious
numerical work in Python, and it won't get removed for all the reasons it was
added in the first place.

Cheers,
Nick.
 
G

George Sakkis

Bengt Richter said:
What's the big deal with other than 1 steps? It is just adjusting a few numbers
so that you can either index the new virtual slice with an integer and return the
element, in which case the index into the original tuple will be
someoffset+i*somefactor once you get past the limit checks for the virtual
slice. By the same token, transforming a few numbers of one virtual slice
into similar numbers for a a new virtual slice of that shouldn't be rocket science.
And it wouldn't have to be done more than once. Don't have time to do it now,
but there are plenty around here that could, I'm sure.

Regards,
Bengt Richter

Here's my (undocumented) version of it: http://rafb.net/paste/results/HkxmHp37.html
and its unit test: http://rafb.net/paste/results/2LIInT68.html

And some useless timing comparisons (I know it's a stupid example, don't flame me for this):

$ python /usr/lib/python2.3/timeit.py \
-s "x=tuple(xrange(10000))" \
"[x[1:-1] for n in xrange(100)]"
10 loops, best of 3: 3.84e+04 usec per loop

$ python /usr/lib/python2.3/timeit.py \
-s "from immutableseq import ImmutableSequence" \
-s "x=ImmutableSequence(xrange(10000))" \
"[x[1:-1] for n in xrange(100)]"
100 loops, best of 3: 5.85e+03 usec per loop

Feel free to comment or suggest improvements.

George
 
J

jfj

Jeff said:
So, what problem is it, exactly, that you think you'd solve by making
tuple slices a view rather than a copy?

I think views are good for
1) saving memory
2) saving time (as you don't have to copy the elements into the new tuple)

And they are worth it. However, (as in other cases with slicing), it is
very easy and fast to create a view for a slice with the default step
'1', while it's a PITA and totally not worth it to create a view for a
slice with non default step. I think it would be good to:

if slice_step == 1
create_view
else
create_new_tuple

Actually, i think that slices with step, is a bad feature in general
and i think I will write a PEP to suggest their removal in python3k.


Gerald
 
J

jfj

Nick said:
1. Applies only if you are making large slices, or a lot of slices with
each containing at least 3 elements.
A view can also *cost* memory, when it looks at a small piece of a
large item. The view will keep the entire item alive, even though it
needs only a small piece.

That is correct.
2. Hell no. The *elements* aren't copied, pointers to the elements are.
If you *don't* copy the pointers, then every item access through the
view involves an indirection as the index into the original sequence
gets calculated.

If you have

x=(1,2,...100001)
y=x[:-1]

then you copy 100000 pointers AND you INCREF them AND you DECREF them
when y dies.

The unfortunate case by (1) would be:

x=(1,2,...100001)
x=x[:1]
So views *may* save memory in some applications, but are unlikely to
save time in any application (except any saving resulting from the
memory saving).

They do. If tp_dealloc of a tuple view doesn't decref the pointers.

We should look for what is the most common case.


Gerald

-PS: the PEP for the removal ought to have a ":)" at the end.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,769
Messages
2,569,580
Members
45,055
Latest member
SlimSparkKetoACVReview

Latest Threads

Top