Specify start and length, beside start and end, in slices

Discussion in 'Python' started by Noam Raphael, May 21, 2004.

  1. Noam Raphael

    Noam Raphael Guest

    Hello,
    Many times I find myself asking for a slice of a specific length, and
    writing something like l[12345:12345+10].
    This happens both in interactive use and when writing Python programs,
    where I have to write an expression twice (or use a temporary variable).

    Wouldn't it be nice if the Python grammar had supported this frequent
    use? My idea is that the expression above might be expressed as
    l[12345:>10].
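
    For comparison, current Python can already get close to this without
    new syntax. A minimal sketch, assuming a hypothetical helper named
    span (not part of the proposal or of the standard library), which
    builds a slice object so the start expression is written and
    evaluated only once:

```python
def span(start, length):
    """Return a slice covering `length` items beginning at `start`.

    Hypothetical helper for illustration; assumes a non-negative start.
    """
    return slice(start, start + length)


data = list(range(100))
window = data[span(20, 5)]  # same items as data[20:20 + 5]
print(window)
```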

    This change, as far as I can see, is quite small: it affects only the
    grammar and byte-compiling, and has no side effects.

    The only change in syntax is that short_slice would be changed from
    [lower_bound] ":" [upper_bound]
    to
    ([lower_bound] ":" [upper_bound]) | ([lower_bound] ":>" [slice_length])

    Just to show what will happen to the byte code: l[12345:12345+10] is
    compiled to:
    LOAD_GLOBAL 0 (l)
    LOAD_CONST 1 (12345)
    LOAD_CONST 1 (12345)
    LOAD_CONST 2 (10)
    BINARY_ADD
    SLICE+3

    I suggest that l[12345:>10] would be compiled to:
    LOAD_GLOBAL 0 (l)
    LOAD_CONST 1 (12345)
    DUP_TOP
    LOAD_CONST 2 (10)
    BINARY_ADD
    SLICE+3
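
    The opcode names shown here are those of the interpreter of the
    time; Python 3 releases use different ones. A hedged sketch for
    checking what your own interpreter produces, using the standard dis
    module (seq, start and length are placeholder names):

```python
import dis

# Compile the slice expression without running it; seq, start and length
# are placeholder names, so they do not need to exist.
code = compile("seq[start:start + length]", "<slice-demo>", "eval")

# Collect the opcode names in order. The exact list depends on the
# interpreter version, so no particular output is assumed here.
opnames = [instruction.opname for instruction in dis.get_instructions(code)]
print(opnames)
```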

    Well, what do you think? I would like to hear your comments.

    Have a good day (or night),
    Noam Raphael
     
    Noam Raphael, May 21, 2004
    #1

  2. On 2004-05-21, Noam Raphael <> wrote:

    > Many times I find myself asking for a slice of a specific length, and
    > writing something like l[12345:12345+10].


    [...]

    > Wouldn't it be nice if the Python grammar had supported this frequent
    > use? My idea is that the expression above might be expressed as
    > l[12345:>10].


    It's a bit less efficient, but you can currently spell that as

    l[12345:][:10]

    --
    Grant Edwards (grante at visi.com)
    Yow! We just joined the civil hair patrol!
     
    Grant Edwards, May 21, 2004
    #2

  3. Noam Raphael

    Noam Raphael Guest

    Grant Edwards wrote:
    > On 2004-05-21, Noam Raphael <> wrote:
    >
    >
    >>Many times I find myself asking for a slice of a specific length, and
    >>writing something like l[12345:12345+10].

    >
    >
    > [...]
    >
    >
    >>Wouldn't it be nice if the Python grammar had supported this frequent
    >>use? My idea is that the expression above might be expressed as
    >>l[12345:>10].

    >
    >
    > It's a bit less efficient, but you can currently spell that as
    >
    > l[12345:][:10]
    >

    That is true, but if the list is long, it's *much* less efficient.

    Thanks for your comment,
    Noam
     
    Noam Raphael, May 21, 2004
    #3
  4. Peter Hansen

    Peter Hansen Guest

    Noam Raphael wrote:

    > Grant Edwards wrote:
    >> It's a bit less efficient, but you can currently spell that as
    >>
    >> l[12345:][:10]
    >>

    > That is true, but if the list is long, it's *much* less efficient.


    Considering that the interpreter special-cases some integer math
    including the BINARY_ADD, it likely wouldn't take a very long list
    to pass the point where they're the same.

    I like the idea of the optimization, in a sense, but I don't
    like the syntax and doubt that there is much performance gain to be
    had. There are probably better places for people to hack on the
    interpreter, and which don't need syntax changes.

    -Peter
     
    Peter Hansen, May 21, 2004
    #4
  5. Noam Raphael

    Noam Raphael Guest

    Peter Hansen wrote:
    > Noam Raphael wrote:
    >
    >> Grant Edwards wrote:
    >>
    >>> It's a bit less efficient, but you can currently spell that as
    >>>
    >>> l[12345:][:10]
    >>>

    >> That is true, but if the list is long, it's *much* less efficient.

    >
    >
    > Considering that the interpreter special-cases some integer math
    > including the BINARY_ADD, it likely wouldn't take a very long list
    > to pass the point where they're the same.
    >


    I don't understand: If the list is of length 1000000, wouldn't Grant
    Edwards' suggestion make 1000000-12345 new references, and then take
    only the first ten of them?
     
    Noam Raphael, May 21, 2004
    #5
  6. On 2004-05-21, Noam Raphael <> wrote:

    >>>> It's a bit less efficient, but you can currently spell that as
    >>>>
    >>>> l[12345:][:10]
    >>>>
    >>> That is true, but if the list is long, it's *much* less efficient.

    >>
    >> Considering that the interpreter special-cases some integer math
    >> including the BINARY_ADD, it likely wouldn't take a very long list
    >> to pass the point where they're the same.


    I'm afraid I don't understand either. Where do integer math
    shortcuts enter the picture? It seems to me it's all about
    building a (possibly long) new list which you're going to throw
    away after you build another list from the front of it.

    Unless the compiler is smart enough to figure out what you're
    aiming at and skip the intermediate list entirely.

    > I don't understand: If the list is of length 1000000, wouldn't
    > Grant Edwards' suggestion make 1000000-12345 new references,
    > and then take only the first ten of them?


    Yes, according to my understanding of how things work, that's
    what happens (my spelling is pretty inefficient for pulling
    small chunks from the beginnings of long lists), so if you do
    a lot of that, it may be worth worrying about.
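
    A small sketch of the cost being discussed, assuming a plain list of
    one million integers (the size from the question above). Both
    spellings return the same ten items, but the chained form first
    copies the whole tail:

```python
big = list(range(1_000_000))
start = 12345

direct = big[start:start + 10]   # copies 10 references
tail = big[start:]               # copies every reference from start onward
chained = tail[:10]              # then copies 10 of those again

print(len(tail))                 # size of the throwaway intermediate list
print(direct == chained)
```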

    --
    Grant Edwards (grante at visi.com)
    Yow! Civilization is fun! Anyway, it keeps me busy!!
     
    Grant Edwards, May 21, 2004
    #6
  7. Peter Hansen

    Peter Hansen Guest

    Noam Raphael wrote:

    > Peter Hansen wrote:
    >
    >> Noam Raphael wrote:
    >>
    >>> Grant Edwards wrote:
    >>>
    >>>> It's a bit less efficient, but you can currently spell that as
    >>>>
    >>>> l[12345:][:10]
    >>>>
    >>> That is true, but if the list is long, it's *much* less efficient.

    >>
    >>
    >>
    >> Considering that the interpreter special-cases some integer math
    >> including the BINARY_ADD, it likely wouldn't take a very long list
    >> to pass the point where they're the same.
    >>

    >
    > I don't understand: If the list is of length 1000000, wouldn't Grant
    > Edwards' suggestion make 1000000-12345 new references, and then take
    > only the first ten of them?


    Sorry, it was perhaps unclear that I was agreeing with you. For
    an extremely short list, it's possible that it would be faster
    to do Grant's method, but what I was trying to say is that even
    if that's true, I expect that for a list of more than a few dozen
    elements it would not be faster. Looking at it again, I suspect
    that it would actually never be faster, given that probably
    about as many bytecode instructions are executed, and then there's
    the extra memory allocation for the temporary list, the copying,
    etc.

    -Peter
     
    Peter Hansen, May 21, 2004
    #7
  8. Peter Hansen

    Peter Hansen Guest

    Peter Hansen wrote:

    > For an extremely short list, it's possible that it would be faster
    > to do Grant's method, but what I was trying to say is that even
    > if that's true, I expect that for a list of more than a few dozen
    > elements it would not be faster. Looking at it again, I suspect
    > that it would actually never be faster, given that probably
    > about as many bytecode instructions are executed, and then there's
    > the extra memory allocation for the temporary list, the copying,


    timeit confirms this with variations on this:

    c:\>python -c "import timeit as t; t = t.Timer('x[y:][:10]', 'y=10000; x=range(y)'); print t.timeit()"

    and this:

    c:\>python -c "import timeit as t; t = t.Timer('x[y:y+10]', 'y=10000; x=range(y)'); print t.timeit()"
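
    The commands above use Python 2 syntax. A rough equivalent for a
    current interpreter, as a sketch only: the list size, start index
    and repetition count are arbitrary assumptions, and the timings
    will vary by machine.

```python
import timeit

# Arbitrary sizes chosen for illustration.
setup = "x = list(range(100_000)); y = 1_000"

# Chained form: builds a temporary list of about 99,000 items on every call.
chained = timeit.timeit("x[y:][:10]", setup=setup, number=200)

# Direct form: copies only the ten items that are kept.
direct = timeit.timeit("x[y:y + 10]", setup=setup, number=200)

print(f"chained: {chained:.4f}s  direct: {direct:.4f}s")
```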

    -Peter
     
    Peter Hansen, May 21, 2004
    #8
  9. Terry Reedy

    Terry Reedy Guest

    "Noam Raphael" <> wrote in message
    news:c8l3s3$27o$...

    > Many times I find myself asking for a slice of a specific length, and
    > writing something like l[12345:12345+10].
    > This happens both in interactive use and when writing Python programs,
    > where I have to write an expression twice (or use a temporary variable).


    With an expression, I'd go for the temp var.

    > Wouldn't it be nice if the Python grammar had supported this frequent
    > use?


    I take this as 'directly support' versus the current indirect support via
    start+len.
    My answer: superficially (in isolation) yes, but overall, in the context of
    Python's somewhat minimalistic grammar/syntax, no. Two ways to slice might
    easily be seen as one too many. In addition, the rationale for this, your
    favorite little addition, would admit perhaps 50 others like it.

    > My idea is that the expression above might be expressed as l[12345:>10].


    Sorry, this strikes me as ugly, too much like and easily confused with
    l[12345:-10], and looking too much like a syntax error.

    Given that some other languages slice with (start,len) arguments (but not
    then, that I remember or know of, also with a start,stop option), I am
    *sure* that Guido thought carefully about the issue. A plus with his
    choice is ability to offset (index) from the end *without* calling the len
    function.
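
    One consequence of end-relative indexing is worth spelling out: with
    a negative start, start+length can land on zero and quietly yield an
    empty slice. A minimal sketch, assuming a hypothetical helper take
    (not a built-in), that normalises the start with slice.indices
    before adding the length:

```python
def take(seq, start, length):
    """Return up to `length` items of `seq` beginning at `start`.

    Hypothetical helper; `start` may be negative, `length` must be >= 0.
    """
    if length < 0:
        raise ValueError("length must be non-negative")
    # slice.indices() converts a negative start into a real offset.
    begin, _, _ = slice(start, None).indices(len(seq))
    return seq[begin:begin + length]


data = list(range(7))
naive = data[-3:-3 + 3]      # becomes data[-3:0], which is empty
fixed = take(data, -3, 3)    # the last three items
print(naive, fixed)
```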

    > This change, as far as I can see, is quite small: it affects only the
    > grammar and byte-compiling, and has no side effects.


    Except the cognitive dissonance of two *almost* identical syntaxes and the
    flood of other 'small', 'no side effect' change requests.

    > Well, what do you think? I would like to hear your comments.


    Your wish ...

    Terry J. Reedy
     
    Terry Reedy, May 21, 2004
    #9
  10. Larry Bates

    Larry Bates Guest

    I think it is odd that I have never encountered
    many constructs of this type in
    my code. Perhaps you could share a little more
    of where you see this type of thing popping up
    a lot? I suspect that there is another method
    for solving the problem that might be faster
    and easier to read/program.

    Larry Bates,
    Syscon, Inc.


    "Noam Raphael" <> wrote in message
    news:c8l3s3$27o$...
    > Hello,
    > Many times I find myself asking for a slice of a specific length, and
    > writing something like l[12345:12345+10].
    > This happens both in interactive use and when writing Python programs,
    > where I have to write an expression twice (or use a temporary variable).
    >
    > Wouldn't it be nice if the Python grammar had supported this frequent
    > use? My idea is that the expression above might be expressed as
    > l[12345:>10].
    >
    > This change, as far as I can see, is quite small: it affects only the
    > grammar and byte-compiling, and has no side effects.
    >
    > The only change in syntax is that short_slice would be changed from
    > [lower_bound] ":" [upper_bound]
    > to
    > ([lower_bound] ":" [upper_bound]) | ([lower_bound] ":>" [slice_length])
    >
    > Just to show what will happen to the byte code: l[12345:12345+10] is
    > compiled to:
    > LOAD_GLOBAL 0 (l)
    > LOAD_CONST 1 (12345)
    > LOAD_CONST 1 (12345)
    > LOAD_CONST 2 (10)
    > BINARY_ADD
    > SLICE+3
    >
    > I suggest that l[12345:>10] would be compiled to:
    > LOAD_GLOBAL 0 (l)
    > LOAD_CONST 1 (12345)
    > DUP_TOP
    > LOAD_CONST 2 (10)
    > BINARY_ADD
    > SLICE+3
    >
    > Well, what do you think? I would like to hear your comments.
    >
    > Have a good day (or night),
    > Noam Raphael
     
    Larry Bates, May 22, 2004
    #10
  11. Noam Raphael

    Noam Raphael Guest

    Hello,

    Terry Reedy wrote:
    > "Noam Raphael" <> wrote in message
    > news:c8l3s3$27o$...
    >
    >
    >>Many times I find myself asking for a slice of a specific length, and
    >>writing something like l[12345:12345+10].
    >>This happens both in interactive use and when writing Python programs,
    >>where I have to write an expression twice (or use a temporary variable).

    >
    >
    > With an expression, I'd go for the temp var.
    >
    >
    >>Wouldn't it be nice if the Python grammar had supported this frequent
    >>use?

    >
    >
    > I take this as 'directly support' versus the current indirect support via
    > start+len.
    > My answer: superficially (in isolation) yes, but overall, in the context of
    > Python's somewhat minimalistic grammar/syntax, no. Two ways to slice might
    > easily be seen as one too many.


    I agree that Python should be kept easy to read and understand. However,
    it doesn't mean that there's only one way to do everything. An example
    (it's even from slices): the Numeric people asked for the "..." token
    and got it, even though you can live without it - it simply makes your
    life easier.

    > In addition, the rationale for this, your
    > favorite little addition, would admit perhaps 50 others like it.
    >
    >
    >>My idea is that the expression above might be expressed as l[12345:>10].

    >
    >
    > Sorry, this strikes me as ugly, too much like and easily confused with
    > l[12345:-10], and looking too much like a syntax error.

    Well, of course, it *is* a syntax error right now. As for what it looks
    like - I can't argue with what it looks like to you, but since '>' is
    generally perceived as having something to do with "go in the right
    direction", I think that l[12345:>10] can easily be read as "start from
    12345, and take 10 steps to the right. Take all the items you passed over."
    >
    > Given that some other languages slice with (start,len) arguments (but not
    > then, that I remember or know of, also with a start,stop option), I am
    > *sure* that Guido thought carefully about the issue. A plus with his
    > choice is ability to offset (index) from the end *without* calling the len
    > function.
    >

    I think that the fact that other languages use (start, len) quite
    contradicts your assumption that only 50 other people would like it. I
    don't see what brings you to think that you represent 99.99 percent of
    Python users.
    I like Python's slicing very much, and I agree that given only one
    slicing method, (start, end) should be chosen, but what's wrong with
    adding another?
    >
    >>This change, as far as I can see, is quite small: it affects only the
    >>grammar and byte-compiling, and has no side effects.

    >
    >
    > Except the cognitive dissonance of two *almost* identical syntaxes and the
    > flood of other 'small', 'no side effect' change requests.
    >

    Why not judge each 'small, no side effect' change request for its own
    sake? Do you think that Python should only undergo big and complex changes?
    >
    >>Well, what do you think? I would like to hear your comments.

    >
    >
    > Your wish ...

    Yes, I do like to hear other opinions. Perhaps *you* could have been a
    bit more open to hear them...
    >
    > Terry J. Reedy
    >
    >

    Noam Raphael
     
    Noam Raphael, May 22, 2004
    #11
  12. Noam Raphael

    Noam Raphael Guest

    Hello,

    Larry Bates wrote:
    > I think it is odd that I have never encountered
    > many constructs of this type in
    > my code. Perhaps you could share a little more
    > of where you see this type of thing popping up
    > a lot? I suspect that there is another method
    > for solving the problem that might be faster
    > and easier to read/program.
    >
    > Larry Bates,
    > Syscon, Inc.
    >
    >


    With pleasure. Here are two examples:

    1. Say I have a list with the number of panda bears hunted in each
    month, starting from 1900. Now I want to know how many panda bears were
    hunted in year y. Currently, I have to write something like this:
    sum(huntedPandas[(y-1900)*12:(y-1900)*12+12])
    If my suggestion is accepted, I would be able to write:
    sum(huntedPandas[(y-1900)*12:>12])

    (Yes, I know that it may also be expressed as
    sum(huntedPandas[(y-1900)*12:(y-1899)*12]), but it's less clear what I
    mean, and it's still longer)
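
    The start-plus-twelve form can be checked with a short sketch. The
    function name year_total and the sample data are made up for
    illustration; the list is assumed to hold one entry per month
    starting in January 1900:

```python
def year_total(monthly, year, first_year=1900):
    """Sum the 12 monthly entries belonging to `year`."""
    start = (year - first_year) * 12   # evaluated once, reused below
    return sum(monthly[start:start + 12])


# Made-up data: three years with 1, 2 and 3 per month respectively.
hunted = [1] * 12 + [2] * 12 + [3] * 12
print(year_total(hunted, 1901))
```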

    2. Many data files contain fields of fixed length. Just an example: say
    I want to get the color of the first pixel of a 24-bit color BMP file.
    Say I have a function which gets a 4-byte string and converts it into a
    32-bit integer. The four bytes, from byte no. 10, are the size of the
    header, in bytes. Right now, if I don't want to use temporary variables,
    I have to write:
    picture[s2i(picture[10:14]):s2i(picture[10:14])+4]
    I think this is nicer (and quicker):
    picture[s2i(picture[10:>4]):>4]
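
    For fixed-length binary fields, the standard struct module already
    takes an offset and derives the length from the format string. A
    sketch on a synthetic buffer (not a real BMP file; the 54-byte
    layout and the data values are assumptions for illustration):

```python
import struct

# Synthetic buffer: 54 header bytes with a little-endian 32-bit offset
# stored at byte 10, followed by four data bytes.
header = bytearray(54)
struct.pack_into("<I", header, 10, len(header))
picture = bytes(header) + bytes([10, 20, 30, 0])

# Read the 4-byte offset field, then 4 bytes starting at that offset.
(data_offset,) = struct.unpack_from("<I", picture, 10)
pixel = struct.unpack_from("<4B", picture, data_offset)
print(data_offset, pixel)
```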

    Thanks for your interest,
    Noam Raphael
     
    Noam Raphael, May 22, 2004
    #12
  13. Terry Reedy

    Terry Reedy Guest

    "Noam Raphael" <> wrote in message
    news:c8o9vo$las$...
    > contradicts your assumption that only 50 other people would like it. I
    > don't see what brings you to think that you represent 99.99 percent of
    > Python users.


    Projecting thoughts into my brain that I never had is stupid. I really
    don't like that.

    > Perhaps *you* could have been a bit more open to hear them...


    Making false ad hominem comments is stupid. I don't like that either.

    I am disappointed. Sorry I took your request for comments *on the
    proposal* seriously.

    Terry J. Reedy
     
    Terry Reedy, May 23, 2004
    #13
  14. Op 2004-05-21, Terry Reedy schreef <>:
    >
    > "Noam Raphael" <> wrote in message
    > news:c8l3s3$27o$...
    >
    >> Many times I find myself asking for a slice of a specific length, and
    >> writing something like l[12345:12345+10].
    >> This happens both in interactive use and when writing Python programs,
    >> where I have to write an expression twice (or use a temporary variable).

    >
    > With an expression, I'd go for the temp var.
    >
    >> Wouldn't it be nice if the Python grammar had supported this frequent
    >> use?

    >
    > I take this as 'directly support' versus the current indirect support via
    > start+len.
    > My answer: superficially (in isolation) yes, but overall, in the context of
    > Python's somewhat minimalistic grammar/syntax, no. Two ways to slice might
    > easily be seen as one too many. In addition, the rationale for this, your
    > favorite little addition, would admit perhaps 50 others like it.
    >
    >> My idea is that the expression above might be expressed as l[12345:>10].

    >
    > Sorry, this strike me as ugly, too much like and easily confused with
    > l[12345:-10], and too much looking like a syntax error.
    >
    > Given that some other languages slice with (start,len) arguments (but not
    > then, that I remember or know of, also with a start,stop option), I am
    > *sure* that Guido thought carefully about the issue. A plus with his
    > choice is ability to offset (index) from the end *without* calling the len
    > function.


    Well I hate his choice. It is inconsistent with the fact that generally
    l[a:b] produces the empty list when a > b.

    It is only inconsistent with the Zen of Python, which says there should
    be only one way to do something.

    --
    Antoon Pardon
     
    Antoon Pardon, May 24, 2004
    #14
  15. Noam Raphael wrote:

    > I have to write:
    > picture[s2i(picture[10:14]):s2i(picture[10:14])+4]
    > I think this is nicer (and quicker):
    > picture[s2i(picture[10:>4]):>4]


    that's spelled

    picture = Image.open(file)
    picture.getpixel((0, 0))

    </F>
     
    Fredrik Lundh, May 24, 2004
    #15
  16. Peter Abel

    Peter Abel Guest

    Noam Raphael <> wrote in message news:<c8l3s3$27o$>...
    > Hello,
    > Many times I find myself asking for a slice of a specific length, and
    > writing something like l[12345:12345+10].
    > This happens both in interactive use and when writing Python programs,
    > where I have to write an expression twice (or use a temporary variable).
    >
    > Wouldn't it be nice if the Python grammar had supported this frequent
    > use? My idea is that the expression above might be expressed as
    > l[12345:>10].
    >
    > This change, as far as I can see, is quite small: it affects only the
    > grammar and byte-compiling, and has no side effects.
    >
    > The only change in syntax is that short_slice would be changed from
    > [lower_bound] ":" [upper_bound]
    > to
    > ([lower_bound] ":" [upper_bound]) | ([lower_bound] ":>" [slice_length])
    >
    > Just to show what will happen to the byte code: l[12345:12345+10] is
    > compiled to:
    > LOAD_GLOBAL 0 (l)
    > LOAD_CONST 1 (12345)
    > LOAD_CONST 1 (12345)
    > LOAD_CONST 2 (10)
    > BINARY_ADD
    > SLICE+3
    >
    > I suggest that l[12345:>10] would be compiled to:
    > LOAD_GLOBAL 0 (l)
    > LOAD_CONST 1 (12345)
    > DUP_TOP
    > LOAD_CONST 2 (10)
    > BINARY_ADD
    > SLICE+3
    >
    > Well, what do you think? I would like to hear your comments.
    >
    > Have a good day (or night),
    > Noam Raphael


    Python has a ready workaround for nearly every problem.
    What about the following?

    >>> # iNCREMENTALslICE
    >>> isl=lambda l,start,increment:l.__getslice__(start,start+increment)
    >>> l='zero one two three four five six'.split()
    >>> l

    ['zero', 'one', 'two', 'three', 'four', 'five', 'six']
    >>> isl(l,3,3)

    ['three', 'four', 'five']
    >>>
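
    Note that __getslice__ exists only in Python 2; Python 3 lists do
    not have it. A sketch of the same helper using ordinary slice
    syntax, which works on either version (the name isl is kept from
    the session above):

```python
def isl(seq, start, increment):
    """Incremental slice: `increment` items of `seq` from `start`."""
    return seq[start:start + increment]


words = "zero one two three four five six".split()
print(isl(words, 3, 3))
```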


    Regards
    Peter
     
    Peter Abel, May 24, 2004
    #16
  17. Noam Raphael

    Noam Raphael Guest

    Fredrik Lundh wrote:
    > Noam Raphael wrote:
    >
    >
    >>I have to write:
    >>picture[s2i(picture[10:14]):s2i(picture[10:14])+4]
    >>I think this is nicer (and quicker):
    >>picture[s2i(picture[10:>4]):>4]

    >
    >
    > that's spelled
    >
    > picture = Image.open(file)
    > picture.getpixel((0, 0))
    >
    > </F>
    >
    >
    >
    >

    Hello,
    Thanks for your suggestion, but I meant to give an example for the need
    of those slices when handling files of a format which is not already
    handled by a module someone wrote.
    And what if I want to write a new module for handling images?

    Noam Raphael
     
    Noam Raphael, May 26, 2004
    #17
  18. Noam Raphael

    Noam Raphael Guest

    Terry Reedy wrote:
    > "Noam Raphael" <> wrote in message
    > news:c8o9vo$las$...
    >
    >>contradicts your assumption that only 50 other people would like it. I
    >>don't see what brings you to think that you represent 99.99 percent of
    >>Python users.

    >
    >
    > Projecting thoughts into my brain that I never had is stupid. I really
    > don't like that.
    >

    When you assume that only 50 people would like my suggestion, you assume
    that all the other 99.99 percent of Python users wouldn't like it, just
    because you don't. If I am wrong - correct me.
    >
    >>Perhaps *you* could have been a bit more open to hear them...

    >
    >
    > Making false ad hominem comments is stupid. I don't like that either.
    >
    > I am disappointed. Sorry I took your request for comments *on the
    > proposal* seriously.
    >
    > Terry J. Reedy
    >

    As you may have noticed, I did take your comments seriously, and
    referred to every one of them.
    I'm sorry if my remark offended you. I will try to be more polite in my
    future posts. However, I did sense a tone of impatience in your reply,
    and I think you should try to eliminate it in your future posts.

    Best wishes,
    Noam Raphael
     
    Noam Raphael, May 26, 2004
    #18
