Re: restriction on sum: intentional bug?

Discussion in 'Python' started by Carl Banks, Oct 17, 2009.

  1. Carl Banks

    Carl Banks Guest

    On Oct 16, 12:40 pm, Tim Chase wrote:
    > Then I'm fine with sum() being smart enough to recognize this
    > horrid case and do the "right" thing by returning ''.join()
    > instead.


    You don't want Python to get into this business. Trust me. Just
    don't go there.

    If you want sum to call ''.join transparently, then "".join would have
    to produce identical results to what sum() would have produced in all
    cases. That does not happen. If an object within the list defines
    both __str__ and __add__ methods, then "".join will call __str__,
    whereas sum would call __add__, leading to potentially different
    results. Therefore, transparently substituting a call to "".join is
    not an option.

    It'd be better to just remove the special case.


    Carl Banks
     
    Carl Banks, Oct 17, 2009
    #1

  2. Carl Banks

    Tim Chase Guest

    Re: restriction on sum: intentional misfeature?

    Carl Banks wrote:
    > On Oct 16, 12:40 pm, Tim Chase wrote:
    >> Then I'm fine with sum() being smart enough to recognize this
    >> horrid case and do the "right" thing by returning ''.join()
    >> instead.
    >
    > You don't want Python to get into this business. Trust me. Just
    > don't go there.


    Well, Python is already in this business of special cases -- it
    tries to be smart about a dumb operation by raising an error.
    Just call __add__ ... if it's slow, that's my problem as a
    programmer. Python doesn't complain about lists, which Steven
    points out

    Steven D'Aprano wrote:
    >And indeed, if you pass a list-of-lists to sum(), it
    >does:
    >
    >>>> sum([[1,2], ['a',None], [1,'b']], [])
    >[1, 2, 'a', None, 1, 'b']
    >
    >(For the record, summing lists is O(N**2), and unlike
    >strings, there's no optimization in CPython to avoid the
    >slow behaviour.)


    which is also slow. By your own words (from a subsequent email)

    > If, say, you were to accept that Python is going to guard against a
    > small number of especially bad cases, this has got to be one of the
    > top candidates.


    In guarding, you can do the intended thing (for strings, that's
    concatenation as the "+" operator does, which can be optimized
    with ''.join()), or you can raise an error. I don't see how
    using ''.join() is much different from being smart enough to
    raise an error, except that it doesn't break user expectations.
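The proposal above can be sketched as follows. This is only an illustration of the idea being debated, not how CPython's sum() works; the name smart_sum is hypothetical, and the sketch assumes that a str start value is the signal to switch to ''.join().

```python
def smart_sum(items, start=0):
    """Hypothetical sum() variant: join when start is a str, otherwise add."""
    if isinstance(start, str):
        # The "fast" path under discussion: delegate to str.join.
        return start + "".join(items)
    # The general path: repeated +, as the built-in sum() does.
    total = start
    for item in items:
        total = total + item
    return total


print(smart_sum(["a", "b", "c"], ""))
print(smart_sum([1, 2, 3]))
print(smart_sum([[1], [2]], []))
```

Whether such a switch is desirable is exactly what the rest of the thread argues about.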

    > If you want sum to call ''.join transparently, then "".join would have
    > to produce identical results to what sum() would have produced in all
    > cases. That does not happen. If an object within the list defines
    > both __str__ and __add__ methods, then "".join will call __str__,
    > whereas sum would call __add__, leading to potentially different
    > results. Therefore, transparently substituting a call to "".join is
    > not an option.


    AFAICT, "".join() does not call __str__ on its elements:

    >>> ''.join(['hello', 42, 'world'])
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: sequence item 1: expected string, int found
    >>> '__str__' in dir(42)
    True

    which is exactly what I'd expect from

    >>> 'hello' + 42 + "world"
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: cannot concatenate 'str' and 'int' objects

    This is under 2.x (I don't have 3.x on hand to see if that
    changed unexpectedly)
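For readers checking this on 3.x, here is a minimal sketch that repeats the experiment, assuming a CPython 3 interpreter. The helper error_of is hypothetical and exists only to capture the exception message. The exact wording of the messages differs between versions, so none is shown here.

```python
def error_of(func):
    """Return the TypeError message raised by func(), or None if it succeeds."""
    try:
        func()
    except TypeError as exc:
        return str(exc)
    return None


# ''.join() rejects a non-string element instead of calling __str__ on it.
join_error = error_of(lambda: "".join(["hello", 42, "world"]))

# The + operator rejects the same mix of str and int.
plus_error = error_of(lambda: "hello" + 42 + "world")

# sum() refuses a str start value outright, which is the special case at issue.
sum_error = error_of(lambda: sum(["hello", "world"], ""))

for message in (join_error, plus_error, sum_error):
    print(message)
```

All three calls raise TypeError, so in this respect join and + agree with each other on 3.x as well.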

    > It'd be better to just remove the special case.


    I'd be happy with either solution for summing strings. Be slow
    or be fast, but don't be erroneous.

    -tkc
     
    Tim Chase, Oct 17, 2009
    #2

  3. Carl Banks

    Carl Banks Guest

    Re: restriction on sum: intentional misfeature?

    On Oct 17, 2:15 am, Tim Chase wrote:
    > Carl Banks wrote:
    > > On Oct 16, 12:40 pm, Tim Chase wrote:
    > >> Then I'm fine with sum() being smart enough to recognize this
    > >> horrid case and do the "right" thing by returning ''.join()
    > >> instead.
    >
    > > You don't want Python to get into this business.  Trust me.  Just
    > > don't go there.
    >
    > Well, Python is already in this business of special cases -- it
    > tries to be smart about a dumb operation by raising an error.


    Which is irrelevant to what I was saying.

    I followed up to the suggestion to transparently replace sum() with
    "".join, but it's not the special-casing, per se, that's the issue.
    It's micro-optimizing special cases at run time that I have a problem
    with and where Python should never, ever go.

    If Python starts doing stuff like that, everyone and their mother is
    going to want Python to do the same thing for their own pet
    performance bottlenecks. People will post here whining that sum()
    optimizes strings, so why doesn't it optimize lists? Let Python
    optimize the general case, not the specific case.
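The list case mentioned here can be illustrated with a small sketch. It assumes the nested list quoted earlier in the thread, and uses itertools.chain.from_iterable as one common linear-time alternative; that choice is an addition for illustration, not something proposed in the thread.

```python
from itertools import chain

nested = [[1, 2], ["a", None], [1, "b"]]

# sum() accepts lists, but builds a new list at every step (quadratic overall).
by_sum = sum(nested, [])

# chain.from_iterable walks each inner list once (linear overall).
by_chain = list(chain.from_iterable(nested))

print(by_sum)
print(by_chain)
```

Both give the same flattened list; only the amount of copying differs.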

    I am fine with other kinds of special-case behaviors, like throwing an
    exception for a particularly bad argument. That is merely
    something you want to avoid in general.

    Let's sum up:

    Special-casing to raise an error: avoid in general.
    Special-casing to micro-optimize stuff: don't go there.

    Now, if you want to argue that lots of special-case optimizations do
    occur in Python, or that they don't but should, be my guest. At least
    you'll be answering my actual objection.


    > AFAICT, "".join() does not call __str__ on its elements:


    You are correct. Withdrawn. Perhaps "".join does do the same thing
    in all non-pathological cases as sum. In which case I'd upgrade this
    proposal from "not an option" to "a very, very bad option".
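One example of a pathological case, as a hedged sketch: a str subclass that overrides __add__. The class Shout is hypothetical. Because sum() rejects string start values, functools.reduce with operator.add stands in here for what repeated addition would produce if the special case were removed.

```python
from functools import reduce
import operator


class Shout(str):
    """Hypothetical str subclass whose + upper-cases the combined text."""

    def __add__(self, other):
        return Shout(str.__add__(self, other).upper())


items = [Shout("a"), Shout("b"), Shout("c")]

# Repeated + goes through Shout.__add__ at every step.
added = reduce(operator.add, items)

# ''.join() copies the underlying characters and never calls __add__.
joined = "".join(items)

print(added)
print(joined)
```

The two results differ in case, so a transparent substitution would change behavior for classes like this one.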


    > > It'd be better to just remove the special case.
    >
    > I'd be happy with either solution for summing strings.  Be slow
    > or be fast, but don't be erroneous.


    It's not erroneous. It does exactly what it's documented to do.

    The options should be "slow", "fast", and (in your opinion) "poor
    design choice".


    Carl Banks
     
    Carl Banks, Oct 17, 2009
    #3
