"Canonical" way of deleting elements from lists

Discussion in 'Python' started by Robert Latest, Jan 9, 2008.

  1. Hello,

    From a list of strings I want to delete all empty ones. This works:

    while '' in keywords: keywords.remove('')

    However, to a long-term C programmer this looks like an awkward way of
    accomplishing a simple goal, because the list will have to be re-evaluated
    in each iteration. Is there a way to just walk the list once and throw out
    unwanted elements as one goes along?

    I started programming back when such little things were real performance
    issues, so I have some sort of cringe reflex when something looks
    inefficient.

    robert
     
    Robert Latest, Jan 9, 2008
    #1
    1. Advertising

  2. Robert Latest wrote:
    >>From a list of strings I want to delete all empty ones. This works:

    >
    > while '' in keywords: keywords.remove('')
    >
    > However, to a long-term C programmer this looks like an awkward way of
    > accomplishing a simple goal, because the list will have to be re-evaluated
    > in each iteration.


    you're using a quadratic algorihm ("in" is a linear search, and remove
    has to move everything around for each call), and you're worried about
    the time it takes Python to fetch a variable?

    > Is there a way to just walk the list once and throw out unwanted
    > elements as one goes along?


    creating a new list is always almost the right way to do things like
    this. in this specific case, filter() or list comprehensions are good
    choices:

    keywords = filter(None, keywords) # get "true" items only

    keywords = [k for k in keywords if k]

    also see:

    http://effbot.org/zone/python-list.htm#modifying

    </F>
     
    Fredrik Lundh, Jan 9, 2008
    #2
    1. Advertising

  3. Robert Latest <> writes:

    > From a list of strings I want to delete all empty ones. This works:
    >
    > while '' in keywords: keywords.remove('')


    If you're looking for a quick (no quadratic behavior) and convenient
    way to do it, you can do it like this:

    keywords = [s for s in keywords if s != '']

    But that creates a new list, which might not be wanted for long lists
    with few empty elements (or for shared lists). It also iterates over
    every list element in a Python loop, which can take some time for long
    lists.

    > Is there a way to just walk the list once and throw out unwanted
    > elements as one goes along?


    I'd do it like this:

    i = 0
    while 1:
    try:
    i = keywords.index('', i)
    except ValueError:
    break
    del keywords

    Or even:

    try:
    i = 0
    while 1:
    i = keywords.index('', i) # throws ValueError and stops the loop
    del keywords
    except ValueError:
    pass

    In both cases the search for the empty string is done in efficient C
    code, and you only loop in Python over the actual matches.
     
    Hrvoje Niksic, Jan 9, 2008
    #3
  4. Fredrik Lundh wrote:

    > creating a new list is always almost the right way to do things like


    message = message.replace("always almost", "almost always")
     
    Fredrik Lundh, Jan 9, 2008
    #4
  5. Hrvoje Niksic <> writes:

    > If you're looking for a quick (no quadratic behavior) and convenient
    > way to do it, you can do it like this:
    >
    > keywords = [s for s in keywords if s != '']


    It now occurred to me that a good compromise between convenience and
    efficiency that retains the same list is:

    keywords[:] = (s for s in keywords if s)
     
    Hrvoje Niksic, Jan 9, 2008
    #5
  6. Fredrik Lundh wrote:

    > keywords = filter(None, keywords) # get "true" items only


    Makes seinse. BTW, where can I find all methods of the built-in types?
    Section 3.6 only talks about strings and mentions the list append() method
    only in an example. Am I too stupid to read the manual, or is this an
    omission?

    robert
     
    Robert Latest, Jan 9, 2008
    #6
  7. Hrvoje Niksic wrote:

    > keywords[:] = (s for s in keywords if s)


    Looks good but is so far beyond my own comprehension that I don't dare
    include it in my code ;-)

    robert
     
    Robert Latest, Jan 9, 2008
    #7
  8. Robert Latest

    Guest

    Robert Latest:
    > Fredrik Lundh wrote:
    > > keywords = filter(None, keywords) # get "true" items only

    >
    > Makes seinse. BTW, where can I find all methods of the built-in types?
    > Section 3.6 only talks about strings and mentions the list append() method
    > only in an example. Am I too stupid to read the manual, or is this an
    > omission?


    filter isn't a method of list, it's a free function.

    For most situations that solution with filter(None,keywords) is good
    enough (but it filters away all false items, so it filters away None,
    0, empty things, etc, too).
    The following one is an in-place version (it may be faster with
    Psyco), but you usually don't need it:


    def inplacefilter(pred, alist):
    """inplacefilter(pred, alist): filters the given list like
    filter(),
    but works inplace, minimizing the used memory. It returns None.

    >>> pr = lambda x: x > 2
    >>> l = []
    >>> inplacefilter(pr, l)
    >>> l

    []
    >>> l = [1,2,2]
    >>> inplacefilter(pr, l)
    >>> l

    []
    >>> l = [3]
    >>> inplacefilter(pr, l)
    >>> l

    [3]
    >>> l = [1,2,3,1,5,1,6,0]
    >>> r = filter(pr, l) # normal filter
    >>> r

    [3, 5, 6]
    >>> inplacefilter(pr, l)
    >>> r == l

    True
    """
    slow = 0
    for fast, item in enumerate(alist):
    if pred(item):
    if slow != fast:
    alist[slow] = alist[fast]
    slow += 1
    del alist[slow:]


    You may want to avoid the enumerate() too if you use Psyco and you
    need max speed.

    Bye,
    bearophile
     
    , Jan 9, 2008
    #8
  9. Robert Latest <> wrote:
    > Hrvoje Niksic wrote:
    >
    > > keywords[:] = (s for s in keywords if s)

    >
    > Looks good but is so far beyond my own comprehension that I don't dare
    > include it in my code ;-)


    :) Worth understanding thought I think - here are some hints

    keywords[:] = (s for s in keywords if s)

    is equivalent to this (but without creating a temporary list)

    keywords[:] = list(s for s in keywords if s)

    which is equivalent to

    keywords[:] = [s for s in keywords if s]

    This

    keywords[:] = ....

    Replaces the contents of the keywords list rather than making a new
    list.

    Here is a demonstration of the fundamental difference

    >>> a=[1,2,3,4]
    >>> b=a
    >>> a=[5,6,7]
    >>> print a, b

    [5, 6, 7] [1, 2, 3, 4]

    >>> a=[1,2,3,4]
    >>> b=a
    >>> a[:]=[5,6,7]
    >>> print a, b

    [5, 6, 7] [5, 6, 7]

    Using keywords[:] stops the creation of another temporary list. The
    other behaviour may or may not be what you want!

    --
    Nick Craig-Wood <> -- http://www.craig-wood.com/nick
     
    Nick Craig-Wood, Jan 9, 2008
    #9
  10. Nick Craig-Wood wrote:

    > Using keywords[:] stops the creation of another temporary list.


    in CPython, "list[:] = iter" actually creates a temporary list object on
    the inside, in case "iter" isn't already a list or a tuple.

    (see the implementation of PySequence_Fast() for details).

    </F>
     
    Fredrik Lundh, Jan 9, 2008
    #10
  11. Robert Latest <> wrote:
    > BTW, where can I find all methods of the built-in types?
    >Section 3.6 only talks about strings and mentions the list append() method
    >only in an example. Am I too stupid to read the manual, or is this an
    >omission?


    3.6 talks about features common to all "sequence" types. Strings
    are discussed specifically in 3.6.1 ("String Methods"). Lists are
    similarly discussed in 3.6.4 ("Mutable Sequence Types"). They are
    certainly not omitted, although maybe the title of 3.6.4 could be
    take a leaf from the Zen and be more explicit.

    --
    \S -- -- http://www.chaos.org.uk/~sion/
    "Frankly I have no feelings towards penguins one way or the other"
    -- Arthur C. Clarke
    her nu becomeþ se bera eadward ofdun hlæddre heafdes bæce bump bump bump
     
    Sion Arrowsmith, Jan 9, 2008
    #11
  12. Fredrik Lundh <> wrote:
    > Nick Craig-Wood wrote:
    >
    > > Using keywords[:] stops the creation of another temporary list.

    >
    > in CPython, "list[:] = iter" actually creates a temporary list object on
    > the inside, in case "iter" isn't already a list or a tuple.
    >
    > (see the implementation of PySequence_Fast() for details).


    Ah, OK, thanks for the correction!

    --
    Nick Craig-Wood <> -- http://www.craig-wood.com/nick
     
    Nick Craig-Wood, Jan 9, 2008
    #12
  13. Sion Arrowsmith wrote:
    > Robert Latest <> wrote:
    >> BTW, where can I find all methods of the built-in types?
    >>Section 3.6 only talks about strings and mentions the list append() method
    >>only in an example. Am I too stupid to read the manual, or is this an
    >>omission?

    >
    > 3.6 talks about features common to all "sequence" types. Strings
    > are discussed specifically in 3.6.1 ("String Methods"). Lists are
    > similarly discussed in 3.6.4 ("Mutable Sequence Types").


    OK, the latter then. Too stupid. Thanks ;-)

    robert
     
    Robert Latest, Jan 9, 2008
    #13
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Alex Polite
    Replies:
    17
    Views:
    765
    lyallex
    Jun 8, 2004
  2. foo
    Replies:
    4
    Views:
    3,735
  3. Matthias Czapla

    canonical way for handling raw data

    Matthias Czapla, Aug 24, 2003, in forum: C++
    Replies:
    7
    Views:
    401
    Matthias Czapla
    Aug 27, 2003
  4. Douglas Alan
    Replies:
    17
    Views:
    517
    Douglas Alan
    Mar 2, 2005
  5. Frederick Gotham

    Canonical way to copy an array

    Frederick Gotham, Aug 18, 2006, in forum: C++
    Replies:
    15
    Views:
    565
Loading...

Share This Page