Filter versus comprehension (was Re: something about split()???)

Discussion in 'Python' started by Terry Reedy, Aug 22, 2012.

  1. Terry Reedy

    Terry Reedy Guest

    On 8/22/2012 3:30 AM, Mark Lawrence wrote:
    > On 22/08/2012 06:46, Terry Reedy wrote:
    >> On 8/21/2012 11:43 PM, mingqiang hu wrote:
    >>> why filter is bad when use lambda ?
    >>
    >> Inefficient, not 'bad'. Because the equivalent comprehension or
    >> generator expression does not require a function call.

    for each item in the iterable.

    > A case of premature optimisation? :)


    No, as regards my post. I simply made a factual statement without
    advocating a particular action.

    filter(lambda x: <expr>, iterable)
    (x for x in iterable if <expr>)

    both create iterators that produce the items in iterable such that
    bool(<expr>) is true. The following, with output rounded, shows
    something of the effect of the extra function call.

    >>> import timeit
    >>> timeit.timeit("list(i for i in ranger if False)", "ranger=range(0)")
    0.91
    >>> timeit.timeit("list(i for i in ranger if False)", "ranger=range(20)")
    1.28
    >>> timeit.timeit("list(filter(lambda i: False, ranger))",
    ...               "ranger=range(0)")
    0.83
    >>> timeit.timeit("list(filter(lambda i: False, ranger))",
    ...               "ranger=range(20)")
    2.60
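
    The session above can be rerun as a script. This is a minimal sketch,
    not the exact commands used for the figures quoted: the helper name
    time_variants and the reduced repeat count are assumptions (the
    original calls used timeit's default of 1,000,000 runs), and absolute
    numbers will differ by machine and Python version.

```python
import timeit


def time_variants(n_items, number=100_000):
    """Time a generator expression and filter(lambda) over range(n_items).

    Returns (genexp_seconds, filter_seconds) for `number` repetitions.
    """
    setup = f"ranger = range({n_items})"
    genexp = timeit.timeit("list(i for i in ranger if False)",
                           setup, number=number)
    filt = timeit.timeit("list(filter(lambda i: False, ranger))",
                         setup, number=number)
    return genexp, filt


# Both spellings select the same items; only the per-item cost differs.
data = range(20)
assert (list(filter(lambda x: x % 3 == 0, data))
        == list(x for x in data if x % 3 == 0))

if __name__ == "__main__":
    for n in (0, 20):
        genexp, filt = time_variants(n)
        print(f"range({n}): genexp {genexp:.3f}s, filter+lambda {filt:.3f}s")
```

    Comparing the range(0) and range(20) rows separates the fixed setup
    cost from the cost paid per item.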

    Simply keeping true items is faster with filter -- at least on my
    particular machine with 3.3.0b2.

    >>> timeit.timeit("list(filter(None, ranger))", "ranger=range(20)")
    1.03
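
    For reference, filter(None, iterable) keeps exactly the items whose
    truth value is true, the same selection as "x for x in iterable if x".
    A small sketch with made-up sample data:

```python
# Sample values chosen only for illustration.
items = [0, 1, "", "a", None, [], [0], 0.0, 2]

kept_by_filter = list(filter(None, items))
kept_by_genexp = [x for x in items if x]

# Falsy values (0, "", None, [], 0.0) are dropped by both spellings.
print(kept_by_filter)  # → [1, 'a', [0], 2]
```

    Note that range(20) contains one falsy item, 0, so the timed call
    above yields the 19 values 1 through 19.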

    Filter is also faster if the expression is a function call.

    >>> timeit.timeit("list(filter(f, ranger))",
    ...               "ranger=range(20); f=lambda i: False")
    2.5033614114454394
    >>> timeit.timeit("list(i for i in ranger if f(i))",
    ...               "ranger=range(20); f=lambda i: False")
    3.2394095327040304
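
    The same comparison with a named predicate, as a hedged sketch: is_even
    and compare are hypothetical names, and the globals= argument to
    timeit.timeit requires Python 3.5 or later. When the test is already a
    function, the generator expression still pays for the call and adds its
    own per-item bytecode on top, which is one plausible reading of the
    numbers above.

```python
import timeit


def is_even(n):
    """Hypothetical predicate standing in for any existing function."""
    return n % 2 == 0


numbers = range(20)

via_filter = list(filter(is_even, numbers))
via_genexp = [n for n in numbers if is_even(n)]
assert via_filter == via_genexp  # same items either way


def compare(number=50_000):
    """Return (filter_seconds, genexp_seconds) for `number` repetitions."""
    ns = {"is_even": is_even, "numbers": numbers}
    t_filter = timeit.timeit("list(filter(is_even, numbers))",
                             globals=ns, number=number)
    t_genexp = timeit.timeit("list(n for n in numbers if is_even(n))",
                             globals=ns, number=number)
    return t_filter, t_genexp
```

    Calling compare() returns two timings in seconds; which one is smaller
    should be measured rather than assumed on a given interpreter.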

    ---
    Perhaps or even yes as regards the so-called rule 'always use
    comprehension'. If one prefers filter as more readable, if one only
    wants to keep true items, if the expression is a function call, if
    evaluating the expression takes much more time than the extra function
    call so the latter does not matter, if the number of items is few enough
    that the extra time does not matter, then the rule is not needed or even
    wrong.

    So I think PyLint should be changed to stop its filter fud.

    --
    Terry Jan Reedy
    Terry Reedy, Aug 22, 2012
    #1

  2. On Wednesday, 22 August 2012 22:13:04 UTC+5:30, Terry Reedy wrote:
    > On 8/22/2012 3:30 AM, Mark Lawrence wrote:
    >
    > > On 22/08/2012 06:46, Terry Reedy wrote:

    >
    > >> On 8/21/2012 11:43 PM, mingqiang hu wrote:

    >
    > >>> why filter is bad when use lambda ?

    >
    > >>

    >
    > >> Inefficient, not 'bad'. Because the equivalent comprehension or

    >
    > >> generator expression does not require a function call.

    >
    >
    >
    > for each item in the iterable.
    >
    >
    >
    > > A case of premature optimisation? :)

    >
    >
    >
    > No, as regards my post. I simply made a factual statement without
    >
    > advocating a particular action.
    >
    >
    >
    > filter(lambda x: <expr>, iterable)
    >
    > (x for x in iterable if <expr>)
    >
    >
    >
    > both create iterators that produce the items in iterable such that
    >
    > bool(<expr>) is true. The following, with output rounded, shows
    >
    > something of the effect of the extra function call.
    >
    >
    >
    > >>> timeit.timeit("list(i for i in ranger if False)", "ranger=range(0)")

    >
    > 0.91
    >
    > >>> timeit.timeit("list(i for i in ranger if False)", "ranger=range(20)")

    >
    > 1.28
    >
    > >>> timeit.timeit("list(filter(lambda i: False, ranger))",

    >
    > "ranger=range(0)")
    >
    > 0.83
    >
    > >>> timeit.timeit("list(filter(lambda i: False, ranger))",

    >
    > "ranger=range(20)")
    >
    > 2.60
    >
    >
    >
    > Simply keeping true items is faster with filter -- at least on my
    >
    > particular machine with 3.3.0b2.
    >
    >
    >
    > >>> timeit.timeit("list(filter(None, ranger))", "ranger=range(20)")

    >
    > 1.03
    >
    >
    >
    > Filter is also faster if the expression is a function call.
    >
    >
    >
    > >>> timeit.timeit("list(filter(f, ranger))", "ranger=range(20);

    >
    > f=lambda i: False")
    >
    > 2.5033614114454394
    >
    > >>> timeit.timeit("list(i for i in ranger if f(i))", "ranger=range(20);

    >
    > f=lambda i: False")
    >
    > 3.2394095327040304
    >
    >
    >
    > ---
    >
    > Perhaps or even yes as regards the so-called rule 'always use
    >
    > comprehension'. If one prefers filter as more readable, if one only
    >
    > wants to keep true items, if the expression is a function call, if
    >
    > evaluating the expression takes much more time than the extra function
    >
    > call so the latter does not matter, if the number of items is few enough
    >
    > that the extra time does not matter, then the rule is not needed or even
    >
    > wrong.
    >
    >
    >
    > So I think PyLint should be changed to stop its filter fud.
    >
    >
    >
    > --
    >
    > Terry Jan Reedy


    When filtering for true values, filter(None, xxx) can be used.
    Your examples with lambda i: False are unrealistic: you are comparing `if False` with <lambda function>(xx), that is, a boolean check with a function call.
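
    A sketch of the same measurement with a predicate that does some work
    per item. The test i % 3 == 0, the name measure, and the repeat count
    are assumptions chosen for illustration; they are not from the
    measurements quoted above.

```python
import timeit

SETUP = "ranger = range(20)"


def measure(number=50_000):
    """Return (inline_seconds, lambda_seconds) for a non-trivial test."""
    # Inline condition: evaluated directly inside the generator's frame.
    inline = timeit.timeit("list(i for i in ranger if i % 3 == 0)",
                           SETUP, number=number)
    # Same condition wrapped in a lambda: one extra call per item.
    wrapped = timeit.timeit("list(filter(lambda i: i % 3 == 0, ranger))",
                            SETUP, number=number)
    return inline, wrapped


# Both forms keep the same items.
assert ([i for i in range(20) if i % 3 == 0]
        == list(filter(lambda i: i % 3 == 0, range(20))))
```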
    Ramchandra Apte, Aug 24, 2012
    #2

  4. Terry Reedy

    Terry Reedy Guest

    On 8/24/2012 10:44 AM, Ramchandra Apte wrote:
    > On Wednesday, 22 August 2012 22:13:04 UTC+5:30, Terry Reedy wrote:


    >> >>> timeit.timeit("list(i for i in ranger if False)", "ranger=range(0)")

    >>
    >> 0.91
    >>
    >> >>> timeit.timeit("list(i for i in ranger if False)", "ranger=range(20)")

    >>
    >> 1.28
    >>
    >> >>> timeit.timeit("list(filter(lambda i: False, ranger))",

    >>
    >> "ranger=range(0)")
    >>
    >> 0.83
    >>
    >> >>> timeit.timeit("list(filter(lambda i: False, ranger))",

    >>
    >> "ranger=range(20)")
    >>
    >> 2.60


    Your mail agent is inserting blank lines in quotes -- google?
    See if you can turn that off.

    > Your examples with lambda i:False are unrealistic - you are comparing
    > `if False` vs <lambda function>(xx) - function call vs boolean check


    That is exactly the comparison I wanted to make. The iteration + boolean
    check takes .37 for 20 items, the iteration + call 1.77.
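
    The arithmetic behind those two figures, as a short sketch. The
    per-item estimate assumes timeit's default of 1,000,000 repetitions,
    which applies because the quoted calls pass no number argument.

```python
# Rounded timings from the quoted session, in seconds.
genexp_0, genexp_20 = 0.91, 1.28
filter_0, filter_20 = 0.83, 2.60

# Subtracting the empty-range run removes the fixed setup cost.
genexp_cost = round(genexp_20 - genexp_0, 2)   # iteration + boolean check
filter_cost = round(filter_20 - filter_0, 2)   # iteration + function call

runs, items = 1_000_000, 20
genexp_ns_per_item = genexp_cost / (runs * items) * 1e9
filter_ns_per_item = filter_cost / (runs * items) * 1e9

print(genexp_cost, filter_cost)  # → 0.37 1.77
print(f"{genexp_ns_per_item:.1f} ns vs {filter_ns_per_item:.1f} ns per item")
```

    On these numbers the per-item cost with the lambda call is a little
    under five times that of the bare check.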

    --
    Terry Jan Reedy
    Terry Reedy, Aug 24, 2012
    #4
  5. On Fri, 24 Aug 2012 12:04:54 -0400, Terry Reedy <>
    declaimed the following in gmane.comp.python.general:


    >
    > Your mail agent is inserting blank lines in quotes -- google?
    > See if you can turn that off.
    >

    It appears to be a change Google made in the last month or two... My
    hypothesis is that they are replacing hard EOL found in inbound NNTP
    with an HTML <p>, and then on outgoing replacing the <p> with a pair of
    NNTP line endings. In contrast, text composed on Google is coming in as
    long single lines (since quoting said text in a response produces only a
    ">" at the start of the paragraph).
    --
    Wulfraed Dennis Lee Bieber AF6VN
    HTTP://wlfraed.home.netcom.com/
    Dennis Lee Bieber, Aug 24, 2012
    #5
  6. Walter Hurry

    Walter Hurry Guest

    Re: Filter versus comprehension (was Re: something about split()???)

    On Fri, 24 Aug 2012 14:29:00 -0400, Dennis Lee Bieber wrote:

    > It appears to be a change Google made in the last month or two... My
    > hypothesis is that they are replacing hard EOL found in inbound NNTP
    > with an HTML <p>, and then on outgoing replacing the <p> with a pair of
    > NNTP line endings. In contrast, text composed on Google is coming in as
    > long single lines (since quoting said text in a response produces only a
    > ">" at the start of the paragraph).


    Google Groups sucks. These are computer literate people here. Why don't
    they just use a proper newsreader?
    Walter Hurry, Aug 24, 2012
    #6
  7. On Fri, 24 Aug 2012 19:03:51 +0000 (UTC), Walter Hurry
    <> declaimed the following in
    gmane.comp.python.general:

    >
    > Google Groups sucks. These are computer literate people here. Why don't
    > they just use a proper newsreader?


    Probably because their ISP doesn't offer a free server <G>

    --
    Wulfraed Dennis Lee Bieber AF6VN
    HTTP://wlfraed.home.netcom.com/
    Dennis Lee Bieber, Aug 24, 2012
    #7
  8. Terry Reedy

    Terry Reedy Guest

    On 8/24/2012 5:56 PM, Dennis Lee Bieber wrote:
    > On Fri, 24 Aug 2012 19:03:51 +0000 (UTC), Walter Hurry
    > <> declaimed the following in
    > gmane.comp.python.general:
    >
    >>
    >> Google Groups sucks. These are computer literate people here. Why don't
    >> they just use a proper newsreader?

    >
    > Probably because their ISP doesn't offer a free server <G>


    Python lists are available on the free gmane mail-to-news server.



    --
    Terry Jan Reedy
    Terry Reedy, Aug 24, 2012
    #8
  9. On 8/24/2012 3:03 PM Terry Reedy said...
    > On 8/24/2012 5:56 PM, Dennis Lee Bieber wrote:
    >> On Fri, 24 Aug 2012 19:03:51 +0000 (UTC), Walter Hurry
    >> <> declaimed the following in
    >> gmane.comp.python.general:
    >>
    >>>
    >>> Google Groups sucks. These are computer literate people here. Why don't
    >>> they just use a proper newsreader?

    >>
    >> Probably because their ISP doesn't offer a free server <G>

    >
    > Python lists are available on the free gmane mail-to-news server.


    I'm getting high load related denials with the gmane connections a lot
    recently so I'm open to alternatives.

    Suggestions or recommendations?


    Emile
    Emile van Sebille, Aug 24, 2012
    #9
  10. On 24/08/2012 23:03, Terry Reedy wrote:
    > On 8/24/2012 5:56 PM, Dennis Lee Bieber wrote:
    >> On Fri, 24 Aug 2012 19:03:51 +0000 (UTC), Walter Hurry
    >> <> declaimed the following in
    >> gmane.comp.python.general:
    >>
    >>>
    >>> Google Groups sucks. These are computer literate people here. Why don't
    >>> they just use a proper newsreader?

    >>
    >> Probably because their ISP doesn't offer a free server <G>

    >
    > Python lists are available on the free gmane mail-to-news server.
    >


    I don't think the core-mentorship list is available on gmane. Have I
    missed it, has nobody asked for it to go on there or what?


    --
    Cheers.

    Mark Lawrence.
    Mark Lawrence, Aug 24, 2012
    #10
  11. Ned Deily

    Ned Deily Guest

    In article <k18uat$9ns$>,
    Emile van Sebille <> wrote:
    > On 8/24/2012 3:03 PM Terry Reedy said...
    > > Python lists are available on the free gmane mail-to-news server.

    > I'm getting high load related denials with the gmane connections a lot
    > recently so I'm open to alternatives.


    The high load denials should be a thing of the past as the gmane NNTP
    server was very recently upgraded to use SSDs instead of standard disks.

    --
    Ned Deily,
    Ned Deily, Aug 24, 2012
    #11
  12. Ned Deily

    Ned Deily Guest

    In article <k18v53$hgs$>,
    Mark Lawrence <> wrote:
    > I don't think the core-mentorship list is available on gmane. Have I
    > missed it, has nobody asked for it to go on there or what?


    core-mentorship is a closed list so it would not be appropriate for it
    to be mirrored anywhere.

    http://mail.python.org/mailman/listinfo/core-mentorship

    --
    Ned Deily,
    Ned Deily, Aug 24, 2012
    #12
  13. Walter Hurry

    Walter Hurry Guest

    Re: Filter versus comprehension (was Re: something about split()???)

    On Fri, 24 Aug 2012 17:56:47 -0400, Dennis Lee Bieber wrote:

    > On Fri, 24 Aug 2012 19:03:51 +0000 (UTC), Walter Hurry
    > <> declaimed the following in
    > gmane.comp.python.general:
    >
    >
    >> Google Groups sucks. These are computer literate people here. Why don't
    >> they just use a proper newsreader?

    >
    > Probably because their ISP doesn't offer a free server <G>


    There are plenty of free Usenet providers.
    Walter Hurry, Aug 24, 2012
    #13
  14. On Fri, Aug 24, 2012 at 3:03 PM, Walter Hurry <> wrote:
    > On Fri, 24 Aug 2012 14:29:00 -0400, Dennis Lee Bieber wrote:
    >
    >> It appears to be a change Google made in the last month or two... My
    >> hypothesis is that they are replacing hard EOL found in inbound NNTP
    >> with an HTML <p>, and then on outgoing replacing the <p> with a pair of
    >> NNTP line endings. In contrast, text composed on Google is coming in as
    >> long single lines (since quoting said text in a response produces only a
    >> ">" at the start of the paragraph).

    >
    > Google Groups sucks. These are computer literate people here. Why don't
    > they just use a proper newsreader?

    I haven't used a newsreader in over a decade. I'm quite happy with a
    mailing list. Am I missing something?
    David Robinow, Aug 25, 2012
    #14
  15. Tim Golden

    Tim Golden Guest

    On 25/08/2012 13:57, David Robinow wrote:
    > On Fri, Aug 24, 2012 at 3:03 PM, Walter Hurry <> wrote:
    >> On Fri, 24 Aug 2012 14:29:00 -0400, Dennis Lee Bieber wrote:
    >>
    >>> It appears to be a change Google made in the last month or two... My
    >>> hypothesis is that they are replacing hard EOL found in inbound NNTP
    >>> with an HTML <p>, and then on outgoing replacing the <p> with a pair of
    >>> NNTP line endings. In contrast, text composed on Google is coming in as
    >>> long single lines (since quoting said text in a response produces only a
    >>> ">" at the start of the paragraph).

    >>
    >> Google Groups sucks. These are computer literate people here. Why don't
    >> they just use a proper newsreader?

    > I haven't used a newsreader in over a decade. I'm quite happy with a
    > mailing list. Am I missing something?


    Not really. I'm the same; it just means you can skip over the occasional
    ggroups-newsreader discussion threads which pop up
    about 3 times a year on average.

    :)

    TJG
    Tim Golden, Aug 25, 2012
    #15
