Tuple slices

Discussion in 'Python' started by George Sakkis, Jan 24, 2005.

  1. Why does slicing a tuple returns a new tuple instead of a view of the existing one, given that
    tuples are immutable ? I ended up writing a custom ImmutableSequence class that does this, but I
    wonder why it is not implemented for tuples.

    George
    George Sakkis, Jan 24, 2005
    #1
    1. Advertising

  2. George Sakkis wrote:

    > Why does slicing a tuple returns a new tuple instead of a view of the existing one, given that
    > tuples are immutable ?


    really?

    >>> a = 1, 2, 3
    >>> b = a[:]
    >>> a is b

    True

    </F>
    Fredrik Lundh, Jan 24, 2005
    #2
    1. Advertising

  3. Fredrik Lundh wrote:
    > George Sakkis wrote:
    >
    >>Why does slicing a tuple returns a new tuple instead of a view of the existing one, given that
    >>tuples are immutable ?

    >
    > really?
    >
    >
    >>>>a = 1, 2, 3
    >>>>b = a[:]
    >>>>a is b

    > True


    My impression was that full tuple copies didn't actually copy, but that
    slicing a subset of a tuple might. Not exactly sure how to test this, but:

    py> a = 1, 2, 3
    py> a[:2] is a[:2]
    False

    So _something_ at least is different between the two slices...

    Steve
    Steven Bethard, Jan 24, 2005
    #3
  4. On Mon, 24 Jan 2005 18:45:46 +0100
    "Fredrik Lundh" <> wrote:

    > George Sakkis wrote:
    >
    > > Why does slicing a tuple returns a new tuple instead of a view of
    > > the existing one, given that tuples are immutable ?

    >
    > really?


    Well... seems like this case is optimized to return the original tuple
    just incrementing its reference count and returning

    tupleobject.c, 330-335

    if (ilow == 0 && ihigh == a->ob_size && PyTuple_CheckExact(a)) {
    Py_INCREF(a);
    return (PyObject *)a;
    }


    >
    > >>> a = 1, 2, 3
    > >>> b = a[:]
    > >>> a is b

    > True
    >
    > </F>
    >
    >
    >
    > --
    > http://mail.python.org/mailman/listinfo/python-list
    Pedro Werneck, Jan 24, 2005
    #4
  5. On Mon, 24 Jan 2005 18:45:46 +0100
    "Fredrik Lundh" <> wrote:

    > George Sakkis wrote:
    >
    > > Why does slicing a tuple returns a new tuple instead of a view of
    > > the existing one, given that tuples are immutable ?

    >
    > really?



    Well... seems like this case (slicing the whole tuple) is optimized to
    return the original tuple just incrementing its reference count and returning


    tupleobject.c, 330-335

    if (ilow == 0 && ihigh == a->ob_size && PyTuple_CheckExact(a)) {
    Py_INCREF(a);
    return (PyObject *)a;
    }


    >
    > >>> a = 1, 2, 3
    > >>> b = a[:]
    > >>> a is b

    > True
    >
    > </F>
    >
    >
    >
    > --
    > http://mail.python.org/mailman/listinfo/python-list
    Pedro Werneck, Jan 24, 2005
    #5
  6. Steven Bethard wrote:

    >>>>>a = 1, 2, 3
    >>>>>b = a[:]
    >>>>>a is b

    >> True

    >
    > My impression was that full tuple copies didn't actually copy, but that slicing a subset of a
    > tuple might. Not exactly sure how to test this, but:
    >
    > py> a = 1, 2, 3
    > py> a[:2] is a[:2]
    > False


    yup. and to figure out why things are done this way, consider this case:

    >>> a = give_me_a_huge_tuple()
    >>> len(a)

    (a rather large number)
    >>> b = a[:2]
    >>> del a


    (IIRC, I proposed to add "substrings" when I implemented the Unicode string
    type, but that idea was rejected, for the very same "and how do you get rid of
    the original object" reason)

    </F>
    Fredrik Lundh, Jan 24, 2005
    #6
  7. Fredrik Lundh wrote:
    > Steven Bethard wrote:
    >>
    >>My impression was that full tuple copies didn't actually copy, but that slicing a subset of a
    >>tuple might. Not exactly sure how to test this, but:
    >>
    >>py> a = 1, 2, 3
    >>py> a[:2] is a[:2]
    >>False

    >
    > yup. and to figure out why things are done this way, consider this case:
    >
    > >>> a = give_me_a_huge_tuple()
    > >>> len(a)

    > (a rather large number)
    > >>> b = a[:2]
    > >>> del a

    >
    > (IIRC, I proposed to add "substrings" when I implemented the Unicode string
    > type, but that idea was rejected, for the very same "and how do you get rid of
    > the original object" reason)


    Ahh. Yeah, that seems sensible. I don't think I've ever written code
    like that, but if I did, I'd almost certainly want it to work as it does
    now...

    Steve
    Steven Bethard, Jan 24, 2005
    #7
  8. George Sakkis

    Jeff Epler Guest

    The cpython implementation stores tuples in memory like this:
    [common fields for all Python objects]
    [common fields for all variable-size python objects, including tuple size]
    [fields specific to tuple objects, if any]
    [array of PyObject*, one for each item in the tuple]
    This way of storing variable-size Python objects was chosen in part
    because it reuqires only one allocation for an object, not two.
    However, there is no way for one tuple to point to a slice of another
    tuple.

    there's no reason that some other python implementation couldn't make a
    different choice.

    Jeff

    -----BEGIN PGP SIGNATURE-----
    Version: GnuPG v1.2.1 (GNU/Linux)

    iD8DBQFB9V1/Jd01MZaTXX0RAnDMAJ0f2v26tba9j46KsYV3SkylB51KlQCfeTtK
    YYuTzz1nulvDc8Ad9p78AGo=
    =+SO0
    -----END PGP SIGNATURE-----
    Jeff Epler, Jan 24, 2005
    #8
  9. "Fredrik Lundh" <> wrote in message
    news:...
    > Steven Bethard wrote:
    >
    > >>>>>a = 1, 2, 3
    > >>>>>b = a[:]
    > >>>>>a is b
    > >> True

    > >
    > > My impression was that full tuple copies didn't actually copy, but that slicing a subset of a
    > > tuple might. Not exactly sure how to test this, but:
    > >
    > > py> a = 1, 2, 3
    > > py> a[:2] is a[:2]
    > > False

    >
    > yup. and to figure out why things are done this way, consider this case:
    >
    > >>> a = give_me_a_huge_tuple()
    > >>> len(a)

    > (a rather large number)
    > >>> b = a[:2]
    > >>> del a

    >
    > (IIRC, I proposed to add "substrings" when I implemented the Unicode string
    > type, but that idea was rejected, for the very same "and how do you get rid of
    > the original object" reason)
    >
    > </F>


    Fair enough. So perhaps the question is whether such cases are more regular than something like:
    a = give_me_a_huge_tuple()
    slices = [a[i:j] for i in xrange(len(a)) for j in xrange(i+1, len(a)+1)]

    George
    George Sakkis, Jan 24, 2005
    #9
  10. George Sakkis

    Peter Hansen Guest

    George Sakkis wrote:
    > Fair enough. So perhaps the question is whether such cases are more regular than something like:
    > a = give_me_a_huge_tuple()
    > slices = [a[i:j] for i in xrange(len(a)) for j in xrange(i+1, len(a)+1)]


    I believe the general usage of tuples tends to mean that
    "give_me_a_huge_tuple()" just doesn't happen except with
    those who insist on using tuples as though they were nothing
    other than read-only lists.

    If you use a tuple the way it was apparently intended, you
    are extraordinarily unlikely to find yourself with a
    huge one requiring slicing in such a way that you care
    whether it is a "view" or a new object.

    -Peter
    Peter Hansen, Jan 24, 2005
    #10
  11. George Sakkis

    Terry Reedy Guest

    "George Sakkis" <> wrote in message
    news:...
    > Why does slicing a tuple returns a new tuple instead of a
    > view of the existing one, given that
    > tuples are immutable ? I ended up writing a custom
    > ImmutableSequence class that does this, but I
    > wonder why it is not implemented for tuples.


    Numpy and Numarray both do this -- generate views into arrays -- because
    they are specifically aimed at large datasets for which the memory savings
    overrides the complications.

    Aside from the problem of not being able to delete the underlying object,
    the view object for a tuple would have to be a new type of object with a
    new set of methods. So there would be one more thing to learn. And so
    'type(o) is tuple' would generally have to be replaced by
    'isinstance(type(o), (tuple,tupleview)).

    For slices that are used within expressions and then discarded,
    a= (1,2,3)
    b = a(1:) + a:)1)
    an internal tuple view might be an interesting optimization.

    Terry J. Reedy
    Terry Reedy, Jan 24, 2005
    #11
  12. "Peter Hansen" <> wrote in message news:...
    > George Sakkis wrote:
    > > Fair enough. So perhaps the question is whether such cases are more regular than something like:
    > > a = give_me_a_huge_tuple()
    > > slices = [a[i:j] for i in xrange(len(a)) for j in xrange(i+1, len(a)+1)]

    >
    > I believe the general usage of tuples tends to mean that
    > "give_me_a_huge_tuple()" just doesn't happen except with
    > those who insist on using tuples as though they were nothing
    > other than read-only lists.
    >
    > If you use a tuple the way it was apparently intended, you
    > are extraordinarily unlikely to find yourself with a
    > huge one requiring slicing in such a way that you care
    > whether it is a "view" or a new object.
    >
    > -Peter


    Actually my initial motivation was not a huge tuple I had to slice many times. It was something much
    less extraordinarily unlikely, a recursive function with a sequence parameter:

    def foo(sequence):
    # base_case
    # do_stuff()
    combine(foo(sequence[:n]),
    foo(sequence[n:]))

    Having each slice be a view of the original sequence instead of a fresh copy would be a Good Thing
    (tm).

    George
    George Sakkis, Jan 24, 2005
    #12
  13. "Terry Reedy" <> wrote in message
    news:...
    >
    > Aside from the problem of not being able to delete the underlying object,
    > the view object for a tuple would have to be a new type of object with a
    > new set of methods.


    It *could*, but it doesn't have to. One can represent a view as essentially an object with a pointer
    to a memory buffer and a (start,stop,step) triple. Then a "real tuple" is just a "view" with the
    triple being (0, len(sequence), 1).

    George
    George Sakkis, Jan 24, 2005
    #13
  14. George Sakkis

    Jeff Shannon Guest

    [My newsreader crapped out on sending this; apologies if it appears
    twice.]

    George Sakkis wrote:

    > "Terry Reedy" <> wrote in message
    > news:...
    >
    >>Aside from the problem of not being able to delete the underlying object,
    >>the view object for a tuple would have to be a new type of object with a
    >>new set of methods.

    >
    > It *could*, but it doesn't have to. One can represent a view as essentially an object with a pointer
    > to a memory buffer and a (start,stop,step) triple. Then a "real tuple" is just a "view" with the
    > triple being (0, len(sequence), 1).


    Except that that's not how Python tuples *are* constructed, and it'd
    be a pretty big deal to redo the architecture of all tuples to support
    this relatively special case.

    Even if this re-architecting were done, you're still constructing a
    new object -- the difference is that you're creating this
    (start,stop,step) triple instead of duplicating a set of PyObject*
    pointers, and then doing math based on those values instead of
    straightforward pointer access. I'm not at all convinced that this
    really saves you a significant amount for tuple slices (really, you're
    still constructing a new tuple anyhow, aren't you?), and it's going to
    cost a bit in both execution time and complexity in the common case
    (accessing tuples without slicing). If there's a big cost in object
    construction, it's probably going to be the memory allocation, and for
    a reasonable tuple the size of the memory required is not going to
    significantly affect the allocation time.

    Jeff Shannon
    Technician/Programmer
    Credit International
    Jeff Shannon, Jan 25, 2005
    #14
  15. "Jeff Shannon" <> wrote in message news:...
    > [My newsreader crapped out on sending this; apologies if it appears
    > twice.]
    >
    > George Sakkis wrote:
    >
    > > "Terry Reedy" <> wrote in message
    > > news:...
    > >
    > >>Aside from the problem of not being able to delete the underlying object,
    > >>the view object for a tuple would have to be a new type of object with a
    > >>new set of methods.

    > >
    > > It *could*, but it doesn't have to. One can represent a view as essentially an object with a

    pointer
    > > to a memory buffer and a (start,stop,step) triple. Then a "real tuple" is just a "view" with the
    > > triple being (0, len(sequence), 1).

    >
    > Except that that's not how Python tuples *are* constructed, and it'd
    > be a pretty big deal to redo the architecture of all tuples to support
    > this relatively special case.
    >
    > Even if this re-architecting were done, you're still constructing a
    > new object -- the difference is that you're creating this
    > (start,stop,step) triple instead of duplicating a set of PyObject*
    > pointers, and then doing math based on those values instead of
    > straightforward pointer access. I'm not at all convinced that this
    > really saves you a significant amount for tuple slices (really, you're
    > still constructing a new tuple anyhow, aren't you?), and it's going to
    > cost a bit in both execution time and complexity in the common case
    > (accessing tuples without slicing). If there's a big cost in object
    > construction, it's probably going to be the memory allocation, and for
    > a reasonable tuple the size of the memory required is not going to
    > significantly affect the allocation time.
    >
    > Jeff Shannon
    > Technician/Programmer
    > Credit International
    >


    You're probably right about the allocation time, but my main concern is the memory required for each
    slice, which can be O(n) wrt the length of the whole tuple. I would like this to be constant (at
    least if there was a way around to the problem of deleting the underlying sequence).

    George
    George Sakkis, Jan 25, 2005
    #15
  16. George Sakkis

    Terry Reedy Guest

    "George Sakkis" <> wrote in message
    news:...
    > Actually my initial motivation was not a huge tuple I had to slice many
    > times. It was something much
    > less extraordinarily unlikely, a recursive function with a sequence
    > parameter:
    >
    > def foo(sequence):
    > # base_case
    > # do_stuff()
    > combine(foo(sequence[:n]),
    > foo(sequence[n:]))
    >
    > Having each slice be a view of the original sequence instead of a fresh
    > copy would be a Good Thing


    Why? To save time? memory? Either would require more that a few bytes per
    slice. If they are, you can probably use virtual slices as follows.

    def foo(sequence):
    def _foo(seq, start, stop)
    # base_case
    # do_stuff()
    combine(_foo(seq, start, n), _foo(seq, n, stop))
    _foo(sequence, 0, len(sequence)

    In other words, if you don't really want slices copied out of the sequence,
    then don't slice! Just use 2 ints to indicate the working region or view.
    Both this and using a nested function with additional params are standard
    techniques. This also works when the seq is mutable and you want changes
    to the 'slice' to change the original, as in quicksort.

    Terry J. Reedy
    Terry Reedy, Jan 25, 2005
    #16
  17. George Sakkis

    Nick Coghlan Guest

    George Sakkis wrote:
    > You're probably right about the allocation time, but my main concern is the memory required for each
    > slice, which can be O(n) wrt the length of the whole tuple. I would like this to be constant (at
    > least if there was a way around to the problem of deleting the underlying sequence).


    If you really want a view into a tuple (or any sequence for that matter):

    from itertools import islice
    islice(iter(seq), start, end, step)

    Then use the various functions in itertools to work with the resulting iterator
    (since the syntactic support for working with iterators is currently quite
    limited - slicing, concatenation and repetition are spelt with
    itertools.islice(), itertools.chain() and itertools.repeat(), rather than with
    their standard syntactic equivalents "[x:y:z]", "+" and "*").

    The above approach does suffer from the problem of holding a reference to the
    original sequence, though.

    Cheers,
    Nick.

    --
    Nick Coghlan | | Brisbane, Australia
    ---------------------------------------------------------------
    http://boredomandlaziness.skystorm.net
    Nick Coghlan, Jan 25, 2005
    #17
  18. George Sakkis

    Guest

    Terry Reedy wrote:
    > "George Sakkis" <> wrote in message
    > news:...
    > > Actually my initial motivation was not a huge tuple I had to slice

    many
    > > times. It was something much
    > > less extraordinarily unlikely, a recursive function with a sequence


    > > parameter:
    > >
    > > def foo(sequence):
    > > # base_case
    > > # do_stuff()
    > > combine(foo(sequence[:n]),
    > > foo(sequence[n:]))
    > >
    > > Having each slice be a view of the original sequence instead of a

    fresh
    > > copy would be a Good Thing

    >
    > Why? To save time? memory? Either would require more that a few

    bytes per
    > slice. If they are, you can probably use virtual slices as follows.
    >
    > def foo(sequence):
    > def _foo(seq, start, stop)
    > # base_case
    > # do_stuff()
    > combine(_foo(seq, start, n), _foo(seq, n, stop))
    > _foo(sequence, 0, len(sequence)
    >
    > In other words, if you don't really want slices copied out of the

    sequence,
    > then don't slice! Just use 2 ints to indicate the working region or

    view.
    > Both this and using a nested function with additional params are

    standard
    > techniques. This also works when the seq is mutable and you want

    changes
    > to the 'slice' to change the original, as in quicksort.
    >
    > Terry J. Reedy



    I would say these are standard *C/C++* techniques; slicing simplifies
    things and thus seems more 'pythonic' to me.

    George
    , Jan 25, 2005
    #18
  19. An iterator is perfectly ok if all you want is to iterate over the
    elements of a view, but as you noted, iterators are less flexible than
    the underlying sequence. The view should be (or at least appear)
    identical in functionality (i.e. public methods) with its underlying
    sequence.

    George
    George Sakkis, Jan 25, 2005
    #19
  20. George Sakkis

    Terry Reedy Guest

    <> wrote in message
    news:...
    >
    > Terry Reedy wrote:
    >> In other words, if you don't really want slices copied out of the
    >> sequence,
    >> then don't slice! Just use 2 ints to indicate the working region or
    >> view.
    >> Both this and using a nested function with additional params are
    >> standard
    >> techniques. This also works when the seq is mutable and you want
    >> changes
    >> to the 'slice' to change the original, as in quicksort.

    >
    > I would say these are standard *C/C++* techniques; slicing simplifies
    > things and thus seems more 'pythonic' to me.


    Unless you are GvR or Tim Peters, throwing 'pythonic' at me doesn't cut it
    with me, especially when you use it so shallowly. The current Pythonic
    meaning of 'slice', as defined by GvR and implemented in core Python
    sequence objects and documented in the reference manuals and reiterated by
    GvR on PyDev in just the last day, is to make an independent in-memory
    #copy# of the indicated part of a sequence. Both aspects are intentional
    and desired features, not accidents.

    Yes, slicing, even in the Pythonic sense, may well simplify the OP's
    algorithm (of which he gave almost no detail), but the whole point of this
    thread is that he does not want to do that (make copy slices). While he
    might shortsightedly think that he wants Guido to redefine slice and
    replace the current implementation of tuple with a more spacious, possibly
    slower one that would allow that definition, that will not solve his
    current problem, if indeed he has one.

    As George Sakkis the OP noted, the essential data constituting a contiguous
    section view are the underlying sequence and two position markers. Whether
    one works with these directly or packages them into a tuple or user class
    instance is a matter of relative conveniences. As it turns out, I was
    thinking about the design choices involved in a generic sequence view class
    just the morning before reading the original post. But I have no idea
    whether GS's foo function would justify the added overhead of such a thing.
    It partly depends on what he wishes to optimize, which I asked about, but
    have not yet seen an answer about. So I suggested the simplest approach
    that would work. And that, to me, *is* pythonic!

    Terry J. Reedy
    Terry Reedy, Jan 25, 2005
    #20
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Michal Mikolajczyk
    Replies:
    1
    Views:
    792
    Larry Bates
    Apr 20, 2004
  2. Jeff Epler
    Replies:
    0
    Views:
    923
    Jeff Epler
    Apr 20, 2004
  3. Bill Scherer
    Replies:
    0
    Views:
    602
    Bill Scherer
    Apr 20, 2004
  4. Gregor Horvath

    Why tuple with one item is no tuple

    Gregor Horvath, Mar 15, 2005, in forum: Python
    Replies:
    37
    Views:
    793
    Antoon Pardon
    Mar 30, 2005
  5. Steve
    Replies:
    1
    Views:
    945
    Fredrik Lundh
    Dec 13, 2005
Loading...

Share This Page