A quirk/gotcha of for i, x in enumerate(seq) when seq is empty

Discussion in 'Python' started by Alex Willmer, Feb 24, 2012.

  1. Alex Willmer

    Alex Willmer Guest

    This week I was slightly surprised by a behaviour that I've not
    considered before. I've long used

    for i, x in enumerate(seq):
        # do stuff

    as a standard looping-with-index construct. In Python for loops don't
    create a scope, so the loop variables are available afterward. I've
    sometimes used this to print or return a record count e.g.

    for i, x in enumerate(seq):
        # do stuff
    print 'Processed %i records' % (i + 1)

    However, as I found out, if seq is empty then i and x are never
    created, and the above code will raise NameError. So if a record count
    is needed and the loop is not guaranteed to execute, the following seems
    more correct:

    i = 0
    for x in seq:
        # do stuff
        i += 1
    print 'Processed %i records' % i

    Just thought it worth mentioning, curious to hear other options/
    improvements/corrections.
    Alex Willmer, Feb 24, 2012
    #1
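For anyone who wants to reproduce the pitfall from post #1, here is a minimal sketch in Python 3 syntax (the thread itself uses Python 2). The function names are made up for illustration. Inside a function the unbound loop variable surfaces as UnboundLocalError, which is a subclass of NameError.

```python
def count_with_loop_variable(seq):
    """Fragile pattern: relies on the loop variable existing after the loop."""
    for i, x in enumerate(seq):
        pass  # do stuff with x
    return i + 1  # fails when seq is empty: i was never bound


def count_with_counter(seq):
    """Safer pattern: the counter is bound before the loop starts."""
    count = 0
    for x in seq:
        count += 1  # do stuff with x, then count it
    return count


try:
    count_with_loop_variable([])
except NameError as exc:
    # UnboundLocalError is caught here because it subclasses NameError.
    error_name = type(exc).__name__

print(error_name)
print("Processed %i records" % count_with_loop_variable(["a", "b", "c"]))
print("Processed %i records" % count_with_counter([]))
```

The manual counter gives the right answer for an empty sequence, at the cost of doing the increment by hand.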

  2. On Thu, 23 Feb 2012 16:30:09 -0800, Alex Willmer wrote:

    > This week I was slightly surprised by a behaviour that I've not
    > considered before. I've long used
    >
    > for i, x in enumerate(seq):
    >     # do stuff
    >
    > as a standard looping-with-index construct. In Python for loops don't
    > create a scope, so the loop variables are available afterward. I've
    > sometimes used this to print or return a record count e.g.
    >
    > for i, x in enumerate(seq):
    >     # do stuff
    > print 'Processed %i records' % (i + 1)
    >
    > However as I found out, if seq is empty then i and x are never created.


    This has nothing to do with enumerate. It applies to for loops in
    general: the loop variable is not initialised if the loop never runs.
    What value should it take? Zero? Minus one? The empty string? None?
    Whatever answer Python chose would almost always be wrong, so it refuses
    to guess.


    > The above code will raise NameError. So if a record count is needed, and
    > the loop is not guaranteed to execute the following seems more correct:
    >
    > i = 0
    > for x in seq:
    >     # do stuff
    >     i += 1
    > print 'Processed %i records' % i


    What fixes the problem is not avoiding enumerate, or performing the
    increments in slow Python instead of fast C, but that you initialise the
    loop variable you care about before the loop in case it doesn't run.

    i = 0
    for i, x in enumerate(seq):
        # do stuff

    is all you need: the addition of one extra line, to initialise the loop
    variable i (and, if you need it, x) beforehand.




    --
    Steven
    Steven D'Aprano, Feb 24, 2012
    #2
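A small sketch of the pre-initialisation idea from post #2, again in Python 3 syntax. The helper names and the choice of None as the "loop never ran" value are assumptions for illustration; as the post says, the right initial value depends on what the caller needs. The first helper uses a plain for loop to show that enumerate is not involved.

```python
def last_item(seq, default=None):
    """Return the last item the loop saw, or `default` if it never ran."""
    item = default  # bind the loop variable before the loop
    for item in seq:
        pass
    return item


def last_index(seq):
    """Return the index of the last item, or None for an empty sequence."""
    i = None  # explicit marker meaning "the loop body never executed"
    for i, _ in enumerate(seq):
        pass
    return i


print(last_item([]))
print(last_item([10, 20, 30]))
print(last_index([]))
print(last_index("abc"))
```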

  3. Paul Rubin

    Paul Rubin Guest

    Alex Willmer writes:
    > i = 0
    > for x in seq:
    >     # do stuff
    >     i += 1
    > print 'Processed %i records' % i
    >
    > Just thought it worth mentioning, curious to hear other options/
    > improvements/corrections.


    Steven gave an alternate patch, but you are right, it is a pitfall that
    can be easy to miss in simple testing.

    A more "functional programming" approach might be:

    from itertools import imap

    def do_stuff(x): ...

    n_records = sum(1 for _ in imap(do_stuff, seq))
    Paul Rubin, Feb 24, 2012
    #3
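In Python 3 there is no itertools.imap; the built-in map is already lazy, so the same idea might look like the sketch below. Here do_stuff is a stand-in that only records its argument, and process_all is a hypothetical wrapper name.

```python
processed = []


def do_stuff(record):
    """Stand-in for the real per-record work: just remember the record."""
    processed.append(record)


def process_all(seq):
    # map() is lazy in Python 3 (as itertools.imap was in Python 2), so
    # do_stuff runs once per record while sum() consumes the iterator.
    return sum(1 for _ in map(do_stuff, seq))


n_records = process_all(["a", "b", "c"])
print("Processed %i records" % n_records)
```

Because the count comes from sum() rather than from a loop variable, an empty sequence simply yields 0.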
  4. Ethan Furman

    Ethan Furman Guest

    Steven D'Aprano wrote:
    > i = 0
    > for i, x in enumerate(seq):
    >     # do stuff
    >
    > is all you need: the addition of one extra line, to initialise the loop
    > variable i (and, if you need it, x) beforehand.


    Actually,

    i = -1

    or his reporting will be wrong.

    ~Ethan~
    Ethan Furman, Feb 24, 2012
    #4
  5. On 24/02/2012 03:49, Ethan Furman wrote:
    > Actually,
    >
    > i = -1
    >
    > or his reporting will be wrong.
    >
    > ~Ethan~


    Methinks an off by one error :)

    --
    Cheers.

    Mark Lawrence.
    Mark Lawrence, Feb 24, 2012
    #5
  6. Peter Otten

    Peter Otten Guest

    Ethan Furman wrote:

    > Actually,
    >
    > i = -1
    >
    > or his reporting will be wrong.


    Yes, either

    i = -1
    for i, x in enumerate(seq):
        ...
    print "%d records" % (i+1)

    or

    i = 0
    for i, x in enumerate(seq, 1):
        ...
    print "%d records" % i
    Peter Otten, Feb 24, 2012
    #6
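Both variants from post #6 can be checked side by side. This is a sketch in Python 3 syntax; the function names are invented for illustration, and the loop bodies are left as placeholders.

```python
def report_from_zero(seq):
    i = -1  # so that i + 1 == 0 when the loop never runs
    for i, x in enumerate(seq):
        pass  # do stuff with x
    return "%d records" % (i + 1)


def report_from_one(seq):
    i = 0  # already the correct count for an empty sequence
    for i, x in enumerate(seq, 1):  # second argument: start counting at 1
        pass  # do stuff with x
    return "%d records" % i


for sample in ([], ["a"], ["a", "b", "c"]):
    print(report_from_zero(sample), "/", report_from_one(sample))
```

The second form avoids the +1 at the reporting site, because the loop variable already holds the number of items seen so far.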
  7. Rick Johnson

    Rick Johnson Guest

    On Feb 23, 6:30 pm, Alex Willmer wrote:
    > [...]
    > as a standard looping-with-index construct. In Python for loops don't
    > create a scope, so the loop variables are available afterward. I've
    > sometimes used this to print or return a record count e.g.
    >
    > for i, x in enumerate(seq):
    >    # do stuff
    > print 'Processed %i records' % (i + 1)


    You could employ the "else clause" of "for loops" to your advantage;
    (psst: which coincidentally are working pro-bono in this down
    economy!)

    >>> for x in []:
    ...     print x
    ... else:
    ...     print 'Empty Iterable'
    Empty Iterable

    >>> for i,o in enumerate([]):
    ...     print i, o
    ... else:
    ...     print 'Empty Iterable'
    Empty Iterable
    Rick Johnson, Feb 24, 2012
    #7
  8. Peter Otten

    Peter Otten Guest

    Rick Johnson wrote:

    > On Feb 23, 6:30 pm, Alex Willmer wrote:
    >> [...]
    >> as a standard looping-with-index construct. In Python for loops don't
    >> create a scope, so the loop variables are available afterward. I've
    >> sometimes used this to print or return a record count e.g.
    >>
    >> for i, x in enumerate(seq):
    >>     # do stuff
    >> print 'Processed %i records' % (i + 1)

    >
    > You could employ the "else clause" of "for loops" to your advantage;


    > >>> for x in []:
    > ...     print x
    > ... else:
    > ...     print 'Empty Iterable'
    > Empty Iterable
    >
    > >>> for i,o in enumerate([]):
    > ...     print i, o
    > ... else:
    > ...     print 'Empty Iterable'
    > Empty Iterable


    No:


    >>> for i in []:
    ...     pass
    ... else:
    ...     print "else"
    ...
    else
    >>> for i in [42]:
    ...     pass
    ... else:
    ...     print "else"
    ...
    else
    >>> for i in [42]:
    ...     break
    ... else:
    ...     print "else"
    ...
    >>>


    The code in the else suite executes only when the for loop is left via
    break. A non-empty iterable is required but not sufficient.
    Peter Otten, Feb 24, 2012
    #8
  9. Peter Otten

    Peter Otten Guest

    Peter Otten wrote:

    > The code in the else suite executes only when the for loop is left via
    > break.


    Oops, the following statement is nonsense:

    > A non-empty iterable is required but not sufficient.


    Let me try again:

    A non-empty iterable is required but not sufficient to *skip* the else-suite
    of a for loop.
    Peter Otten, Feb 24, 2012
    #9
  10. On Fri, 24 Feb 2012 13:44:15 +0100, Peter Otten wrote:

    > >>> for i in []:
    > ...     pass
    > ... else:
    > ...     print "else"
    > ...
    > else
    > >>> for i in [42]:
    > ...     pass
    > ... else:
    > ...     print "else"
    > ...
    > else
    > >>> for i in [42]:
    > ...     break
    > ... else:
    > ...     print "else"
    > ...
    > >>>

    > The code in the else suite executes only when the for loop is left via
    > break. A non-empty iterable is required but not sufficient.


    You have a typo there. As your examples show, the code in the else suite
    executes only when the for loop is NOT left via break (or return, or an
    exception). The else suite executes regardless of whether the iterable is
    empty or not.


    for...else is a very useful construct, but the name is misleading. It
    took me a long time to stop thinking that the else clause executes when
    the for loop was empty.

    In Python 4000, I think for loops should be spelled:

    for name in iterable:
        # for block
    then:
        # only if not exited with break
    else:
        # only if iterable is empty

    and likewise for while loops.

    Unfortunately we can't do the same now, due to the backward-incompatible
    change in behaviour for "else".



    --
    Steven
    Steven D'Aprano, Feb 24, 2012
    #10
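The for/else behaviour described in post #10 can be checked with a short sketch (Python 3 syntax, function names invented for illustration). The second helper is one possible way to get the "only if iterable is empty" branch in today's Python with a flag; it is an assumption about how one might emulate the proposed syntax, not something from the thread.

```python
def scan(numbers):
    """Report how a search for a negative number ended."""
    for n in numbers:
        if n < 0:
            break  # leaving via break skips the else suite
    else:
        # Reached whenever the loop finishes without break,
        # which includes the case where numbers is empty.
        return "else suite ran"
    return "left via break at %d" % n


def is_empty(iterable):
    """Emulate an 'only if iterable is empty' branch with a flag."""
    ran = False
    for _ in iterable:
        ran = True
        break  # one item is enough to know it is not empty
    return not ran


print(scan([]))
print(scan([1, 2, 3]))
print(scan([1, -2, 3]))
print(is_empty([]), is_empty([0]))
```

Note that is_empty consumes one item when given a one-shot iterator, so it is only a safe check for re-iterable containers.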
  11. On 24 February 2012 14:54, Steven D'Aprano wrote:

    > for...else is a very useful construct, but the name is misleading. It
    > took me a long time to stop thinking that the else clause executes when
    > the for loop was empty.


    This is why I think we should call this construct "for / break / else"
    rather than "for / else".

    --
    Arnaud
    Arnaud Delobelle, Feb 24, 2012
    #11
  12. Peter Otten

    Peter Otten Guest

    Steven D'Aprano wrote:

    >> The code in the else suite executes only when the for loop is left via
    >> break. A non-empty iterable is required but not sufficient.

    >
    > You have a typo there. As your examples show, the code in the else suite
    > executes only when the for loop is NOT left via break (or return, or an
    > exception). The else suite executes regardless of whether the iterable is
    > empty or not.


    Yup, sorry for the confusion.
    Peter Otten, Feb 24, 2012
    #12
  13. Rick Johnson

    Rick Johnson Guest

    On Feb 24, 8:54 am, Steven D'Aprano wrote:

    > for...else is a very useful construct, but the name is misleading. It
    > took me a long time to stop thinking that the else clause executes when
    > the for loop was empty.


    Agreed. This is a major stumbling block for neophytes.

    > In Python 4000, I think for loops should be spelled:
    >
    > for name in iterable:
    >     # for block
    > then:
    >     # only if not exited with break
    > else:
    >     # only if iterable is empty
    >
    > and likewise for while loops.


    I like this syntax better than the current syntax, however, it is
    STILL far too confusing!

    > for name in iterable:
    >     # for block


    this part is okay

    > then:
    >     # only if not exited with break


    I only know how the "then" clause works if you include that comment
    each and every time!

    > else:
    >     # only if iterable is empty


    Again. I need more info before this code becomes intuitive. Too much
    guessing is required. Not to mention that the try/except/else suite
    treats "else" differently.

    try:
        do_this()
    except EXCEPTION:
        recover()
    else NO_EXCEPTION:
        okay_do_this_also()

    for x in iterable:
        do_this()
    except EXCEPTION:
        recover()
    else NO_EXCEPTION:
        do_this_also()

    while LOOPING:
        do_this()
    except EXCEPTION:
        break or recover()
    else NO_EXCEPTION:
        do_this_also()

    In this manner "else" will behave consistently between exception
    handling and looping.

    But this whole idea of using an else clause is ridiculous anyway
    because all you've done is to "break up" the code visually. Not to
    mention; breaking the cognitive flow of a reader!

    try:
        do_this()
        okay_do_this_also()
        what_the_heck.do_this_too()
    except EXCEPTION:
        recover()
    finally:
        always_do_this()

    Loop syntax can drop the "else" and adopt "then/finally" -- if you
    think we even need a finally!?!?

    for x in iterable:
        do_this()
    except EXCEPTION:
        recover()
    then:
        okay_do_this_also()
        what_the_heck.do_this_too()
    finally:
        always_do_this()

    while LOOPING:
        do_this()
    except EXCEPTION:
        recover()
    then:
        okay_do_this_also()
        what_the_heck.do_this_too()
    finally:
        always_do_this()
    Rick Johnson, Feb 28, 2012
    #13
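For comparison with the proposals in post #13 (the "for ... except" and "else NO_EXCEPTION" forms there are suggested syntax, not valid Python), this is a sketch of how the existing try/except/else/finally statement behaves. The function name and the events list are made up for illustration.

```python
def parse_int(text):
    """Convert text to int and record which suites of the try statement ran."""
    events = []
    try:
        value = int(text)
    except ValueError:
        events.append("except")  # only when int() raised ValueError
        value = None
    else:
        events.append("else")  # only when the try suite raised nothing
    finally:
        events.append("finally")  # always
    return value, events


print(parse_int("42"))
print(parse_int("forty-two"))
```

One practical difference from moving everything into the try suite is that exceptions raised in the else suite are not caught by the preceding except clauses.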
  14. On Wed, Feb 29, 2012 at 9:56 AM, Rick Johnson wrote:
    > On Feb 24, 8:54 am, Steven D'Aprano wrote:
    >
    >> In Python 4000, I think for loops should be spelled:
    >>
    >> for name in iterable:
    >>     # for block
    >> then:
    >>     # only if not exited with break
    >> else:
    >>     # only if iterable is empty
    >>
    >> and likewise for while loops.

    >
    > I like this syntax better than the current syntax, however, it is
    > STILL far too confusing!


    Absolutely, it's FAR too confusing. Every syntactic structure should
    have the addition of a "foo:" suite, which will run when the
    programmer expects it to and no other time. This would solve a LOT of
    problems.

    ChrisA
    Chris Angelico, Feb 28, 2012
    #14
  15. On Wed, 29 Feb 2012 10:24:18 +1100, Chris Angelico wrote:

    > Every syntactic structure should
    > have the addition of a "foo:" suite, which will run when the programmer
    > expects it to and no other time. This would solve a LOT of problems.


    Indeed, when I design my killer language, the identifiers "foo"
    and "bar" will be reserved words, never used, and not even
    mentioned in the reference manual. Any program using one will
    simply dump core without comment. Multitudes will rejoice.
    -- Tim Peters, 29 Apr 1998



    --
    Steven
    Steven D'Aprano, Feb 29, 2012
    #15
