Re: restriction on sum: intentional bug?

Discussion in 'Python' started by Terry Reedy, Oct 17, 2009.

  1. Terry Reedy

    Terry Reedy Guest

    Alan G Isaac wrote:
    >
    > As Tim explained in detail, and as Peter
    > explained with brevity, whether it will
    > happen or not, it should happen. This
    > conversation has confirmed that current
    > behavior is a wart: an error is raised
    > despite correct semantics. Ugly!


    The fact that two or three people who agree on something agree on the
    thing that they agree on confirms nothing. One could just as well argue
    that summing anything but numbers is semantically incoherent, not
    correct. Certainly, my dictionary points in that direction.

    tjr
    Terry Reedy, Oct 17, 2009
    #1

  2. Jon Clements

    Jon Clements Guest

    On Oct 17, 1:16 am, Terry Reedy <> wrote:
    > Alan G Isaac wrote:
    >
    > > As Tim explained in detail, and as Peter
    > > explained with brevity, whether it will
    > > happen or not, it should happen.  This
    > > conversation has confirmed that current
    > > behavior is a wart: an error is raised
    > > despite correct semantics. Ugly!

    >
    > The fact that two or three people who agree on something agree on the
    > thing that they agree on confirms nothing. One could just as well argue
    > that summing anything but numbers is semantically incoherent, not
    > correct. Certainly, my dictionary points in that direction.
    >
    > tjr


    I agree here. I don't think it's a case of "warning about
    inefficiency" that Python doesn't sum strings, but rather that
    'summing' strings doesn't make sense.

    An OTT example could be sum(['010111010', '372']) # Binary and decimal

    Sum should return a *numeric* result, it has no way to do anything
    sensible with strings -- that's up to the coder and I think it'd be an
    error in Python to not raise an error.
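
    For reference, a minimal sketch of the behaviour being argued about,
    assuming a Python 3 interpreter (the thread itself predates it): sum()
    rejects a string start value with a TypeError, and str.join is the
    supported way to concatenate. The names words, error and joined are
    only illustrative.

```python
words = ["ab", "cd", "ef"]

# sum() refuses a string start value outright.
try:
    sum(words, "")
except TypeError as exc:
    error = exc
else:
    error = None

# The supported spelling for concatenating strings.
joined = "".join(words)

print(type(error).__name__)
print(joined)
```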

    Jon.
    Jon Clements, Oct 17, 2009
    #2

  3. On Fri, 16 Oct 2009 18:01:41 -0700, Jon Clements wrote:

    > Sum should return a *numeric* result, it has no way to do anything
    > sensible with strings -- that's up to the coder and I think it'd be an
    > error in Python to not raise an error.



    That's obviously wrong in Python.

    Mathematically, sum() is defined as the repeated application of the +
    operator. In Python, the + operator is well-defined for strings and lists
    as well as numbers. Since you can say "ab" + "cd" + "ef" and get a
    sensible result, then sum() should be able to do the same thing.

    And indeed, if you pass a list-of-lists to sum(), it does:

    >>> sum([[1,2], ['a',None], [1,'b']], [])
    [1, 2, 'a', None, 1, 'b']

    (For the record, summing lists is O(N**2), and unlike strings, there's no
    optimization in CPython to avoid the slow behaviour.)

    Likewise if you defeat sum()'s feeble attempt to stop you from running
    with scissors, it also gives a sensible result for strings:

    >>> class S:
    ...     def __add__(self, other):
    ...         return other
    ...
    >>> sum(['a', 'b', 'c', 'd'], S())
    'abcd'


    In languages where + is *not* used for string or list concatenation, then
    it makes sense to argue that sum(strings or lists) is meaningless. But
    Python is not one of those languages.



    --
    Steven
    Steven D'Aprano, Oct 17, 2009
    #3
  4. Aahz

    Aahz Guest

    In article <0062f568$0$26941$>,
    Steven D'Aprano <> wrote:
    >
    >Mathematically, sum() is defined as the repeated application of the +
    >operator. In Python, the + operator is well-defined for strings and lists
    >as well as numbers. Since you can say "ab" + "cd" + "ef" and get a
    >sensible result, then sum() should be able to do the same thing.
    >
    >And indeed, if you pass a list-of-lists to sum(), it does:
    >
    >>>> sum([[1,2], ['a',None], [1,'b']], [])

    >[1, 2, 'a', None, 1, 'b']
    >
    >(For the record, summing lists is O(N**2), and unlike strings, there's no
    >optimization in CPython to avoid the slow behaviour.)


    Are you sure?
    --
    Aahz () <*> http://www.pythoncraft.com/

    "To me vi is Zen. To use vi is to practice zen. Every command is a
    koan. Profound to the user, unintelligible to the uninitiated. You
    discover truth everytime you use it."
    Aahz, Oct 17, 2009
    #4
  5. On Oct 17, 3:27 pm, (Aahz) wrote:
    > In article <0062f568$0$26941$>,
    > Steven D'Aprano  <> wrote:
    > > (For the record, summing lists is O(N**2), and unlike strings, there's no
    > > optimization in CPython to avoid the slow behaviour.)

    >
    > Are you sure?


    The O(N**2) claim surprised me too, but it certainly looks that
    way. Here's a script to produce some timings:

    def concat1(list_of_lists):
        # Built-in sum() with an empty list as the start value.
        return sum(list_of_lists, [])

    def concat2(list_of_lists):
        # acc + l builds a brand-new list on every iteration.
        acc = []
        for l in list_of_lists:
            acc = acc + l
        return acc

    def concat3(list_of_lists):
        # += extends the existing list in place.
        acc = []
        for l in list_of_lists:
            acc += l
        return acc

    def concat4(list_of_lists):
        # Explicit in-place extension.
        acc = []
        for l in list_of_lists:
            acc.extend(l)
        return acc

    test_list = [[i] for i in xrange(100000)]

    from timeit import Timer

    for fn in ["concat1", "concat2", "concat3", "concat4"]:
        t = Timer(fn + "(test_list)", "from __main__ import test_list, " + fn)
        print fn, t.timeit(number=3)/3.0


    On my machine (OS X 10.5/Core 2 Duo), under Python 2.7 svn I get:

    newton:trunk dickinsm$ ./python.exe ~/time_list_sum.py
    concat1 48.1459733645
    concat2 48.4200883706
    concat3 0.0146766503652
    concat4 0.0184679826101
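
    The script above is Python 2. A rough Python 3 sketch of the same
    comparison follows, assuming one-element lists as input and a much
    smaller test list so the slow case stays quick. The names concat_sum
    and concat_extend are illustrative stand-ins for concat1 and concat4,
    and no particular timings are implied.

```python
from timeit import timeit


def concat_sum(list_of_lists):
    # Like concat1: sum() creates a new list at every step.
    return sum(list_of_lists, [])


def concat_extend(list_of_lists):
    # Like concat4: one accumulator list is grown in place.
    acc = []
    for item in list_of_lists:
        acc.extend(item)
    return acc


# Far smaller than the 100000 used above, to keep the run short.
test_list = [[i] for i in range(2000)]

for fn in (concat_sum, concat_extend):
    seconds = timeit(lambda: fn(test_list), number=3) / 3.0
    print(fn.__name__, seconds)
```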


    For some reason that I don't really understand, the CPython source does
    the equivalent of concat2 instead of concat3. See the builtin_sum
    function in

    http://svn.python.org/view/python/trunk/Python/bltinmodule.c?view=markup

    and scroll past the special cases for ints and floats. After a one-line
    source change, replacing the PyNumber_Add call with PyNumber_InPlaceAdd,
    I get the following results:

    newton:trunk dickinsm$ ./python.exe ~/time_list_sum.py
    concat1 0.0106019973755
    concat2 48.0212899844
    concat3 0.0138022899628
    concat4 0.0179653167725

    --
    Mark
    Mark Dickinson, Oct 17, 2009
    #5
  6. Aahz

    Aahz Guest

    In article <>,
    Mark Dickinson <> wrote:
    >
    >For some reason that I don't really understand, the CPython source does
    >the equivalent of concat2 instead of concat3. See the builtin_sum
    >function in
    >
    >http://svn.python.org/view/python/trunk/Python/bltinmodule.c?view=markup
    >
    >and scroll past the special cases for ints and floats. After a
    >one-line source change, replacing the PyNumber_Add call with
    >PyNumber_InPlaceAdd,


    Ahhh, I vaguely remember there being some discussion of this when sum()
    was introduced -- I think that using InPlaceAdd would have caused bad
    behavior when the initial list was referred to by multiple names.
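
    The aliasing problem can be sketched in pure Python. This is only an
    emulation of the two C-level loops, not the CPython source; the
    functions sum_with_add and sum_with_iadd are hypothetical names made
    up for the illustration.

```python
def sum_with_add(iterable, start):
    # Emulates the PyNumber_Add loop: acc is rebound to a new object.
    acc = start
    for item in iterable:
        acc = acc + item
    return acc


def sum_with_iadd(iterable, start):
    # Emulates the PyNumber_InPlaceAdd variant: a list start is mutated.
    acc = start
    for item in iterable:
        acc += item
    return acc


shared = []

result_add = sum_with_add([[1], [2]], shared)
after_add = list(shared)    # snapshot: the caller's list is untouched

result_iadd = sum_with_iadd([[1], [2]], shared)
after_iadd = list(shared)   # snapshot: the caller's list was extended

print(after_add, after_iadd)
```

    With the in-place variant, every other name bound to the start list
    sees the change, which is the surprising behaviour described above.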
    --
    Aahz () <*> http://www.pythoncraft.com/

    "To me vi is Zen. To use vi is to practice zen. Every command is a
    koan. Profound to the user, unintelligible to the uninitiated. You
    discover truth everytime you use it."
    Aahz, Oct 17, 2009
    #6
  7. On Oct 17, 9:49 pm, (Aahz) wrote:
    > Ahhh, I vaguely remember there being some discussion of this when sum()
    > was introduced -- I think that using InPlaceAdd would have caused bad
    > behavior when the initial list was referred to by multiple names.


    Ah yes. Good point. With my one-line modification, I get:

    Python 2.7a0 (trunk:75468M, Oct 17 2009, 21:57:02)
    [GCC 4.0.1 (Apple Inc. build 5493)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> acc = []
    >>> sum(([i] for i in range(10)), acc)
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    >>> acc
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

    There should probably be a test for this somewhere, to make
    sure that no-one else is tempted to make this change.

    --
    Mark
    Mark Dickinson, Oct 17, 2009
    #7
  8. On Oct 17, 9:49 pm, (Aahz) wrote:
    > Ahhh, I vaguely remember there being some discussion of this when sum()
    > was introduced -- I think that using InPlaceAdd would have caused bad
    > behavior when the initial list was referred to by multiple names.


    Thanks for the pointer: I've now found the thread. It looks like
    Alex Martelli made this exact change 6 years ago, and then had to
    revert it because it changed behaviour:

    http://mail.python.org/pipermail/python-dev/2003-October/039511.html

    I've just checked in an extra test to test_builtin.py to make sure
    this bright idea isn't repeated. :)
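
    A check along these lines would catch the regression. This is a
    hypothetical sketch, not the test that went into test_builtin.py, and
    it assumes only that the built-in sum() must leave its start argument
    unmodified.

```python
start = []
total = sum(([i] for i in range(10)), start)

# The result is the concatenation of all the one-element lists...
assert total == list(range(10))
# ...and the start value passed in has not been mutated or returned.
assert start == []
assert total is not start
```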

    --
    Mark
    Mark Dickinson, Oct 17, 2009
    #8
  9. Aahz

    Aahz Guest

    In article <>,
    Mark Dickinson <> wrote:
    >On Oct 17, 9:49 pm, (Aahz) wrote:
    >>
    >> Ahhh, I vaguely remember there being some discussion of this when sum()
    >> was introduced -- I think that using InPlaceAdd would have caused bad
    >> behavior when the initial list was referred to by multiple names.

    >
    >Thanks for the pointer: I've now found the thread. It looks like
    >Alex Martelli made this exact change 6 years ago, and then had to
    >revert it because it changed behaviour:
    >
    >http://mail.python.org/pipermail/python-dev/2003-October/039511.html
    >
    >I've just checked in an extra test to test_builtin.py to make sure
    >this bright idea isn't repeated. :)


    Thanks! I hope you added a comment to the code, too. ;-)

    And wow, sometimes my memory amazes me...
    --
    Aahz () <*> http://www.pythoncraft.com/

    "To me vi is Zen. To use vi is to practice zen. Every command is a
    koan. Profound to the user, unintelligible to the uninitiated. You
    discover truth everytime you use it."
    Aahz, Oct 17, 2009
    #9
  10. Terry Reedy

    Terry Reedy Guest

    Alan G Isaac wrote:
    > On 10/16/2009 8:16 PM, Terry Reedy wrote:
    >> The fact that two or three people who agree on something agree on the
    >> thing that they agree on confirms nothing.


    If you disagree with this, I think *you* are being silly.

    >> One could just as well argue
    >> that summing anything but numbers is semantically incoherent, not
    >> correct. Certainly, my dictionary points in that direction.


    Ditto.

    > Come on now, that is just a silly argument.


    And I think this is a silly response ;-).

    > And dictionaries are obviously irrelevant;


    Not when talking about the semantics of English words.

    > The only serious reason that has been offered
    > for the current behavior is that people who do
    > not know better will sum strings instead of
    > joining them, which is more efficient. That is
    > a pretty weak argument for breaking expectations
    > and so refusing to do duck typing that an error
    > is raised. Especially in a language like Python.
    > (As Tim and Peter make clear.)


    The absence of other responses is not the same as the absence of other
    possible responses. Some people have better things to do than rehash all
    the details of a past discussion.

    Nothing I have said bears on whether I would have voted for or against
    the current behavior. I have only addressed your to-me silly claim to
    have 'confirmed' the correctness of one position.

    Terry Jan Reedy
    Terry Reedy, Oct 18, 2009
    #10
  11. On Sat, 17 Oct 2009 07:27:44 -0700, Aahz wrote:

    >>(For the record, summing lists is O(N**2), and unlike strings, there's
    >>no optimization in CPython to avoid the slow behaviour.)

    >
    > Are you sure?


    Not 100% -- I haven't read the CPython source code.

    But I have done timing tests for repeated concatenation of strings, and
    demonstrated to my own satisfaction that CPython can avoid O(N**2)
    behaviour under certain circumstances (but not all). I've repeated those
    same tests using lists instead of strings, and seen O(N**2) behaviour
    under all circumstances.
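
    One way to run this kind of experiment is to double the input size
    and watch how the time grows: roughly fourfold growth per doubling
    suggests O(N**2), roughly twofold suggests linear. The sketch below
    assumes Python 3 and one-element inputs; time_list_sum and
    time_str_concat are illustrative names, and whether the string case
    avoids quadratic behaviour depends on the interpreter and on how the
    string is referenced.

```python
from timeit import timeit


def time_list_sum(n):
    """Seconds taken to concatenate n one-element lists with sum()."""
    data = [[i] for i in range(n)]
    return timeit(lambda: sum(data, []), number=1)


def time_str_concat(n):
    """Seconds taken to concatenate n one-character strings with +=."""
    data = ["x"] * n

    def run():
        acc = ""
        for s in data:
            acc += s
        return acc

    return timeit(run, number=1)


# Compare how each column grows as n doubles.
for n in (2000, 4000, 8000):
    print(n, time_list_sum(n), time_str_concat(n))
```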

    This was under Python 2.5 or 2.6, not 3.1. The situation may have changed.



    --
    Steven
    Steven D'Aprano, Oct 18, 2009
    #11
