sum works in sequences (Python 3)

Franck Ditter

Hello,
I wonder why sum does not work on the string sequence in Python 3 :
sum((8,5,9,3))
25
sum([5,8,3,9,2])
27
sum('rtarze')
TypeError: unsupported operand type(s) for +: 'int' and 'str'

I naively thought that sum('abc') would expand to 'a'+'b'+'c'
And the error message is somewhat cryptic...

franck
 
Neil Cerutti

You got that error message because the default value for the
second 'start' argument is 0. The function tried to add 'r' to 0.
That said, passing a string as the start value gives a more explicit message:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: sum() can't sum strings [use ''.join(seq) instead]
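
To make the role of the default start value concrete, here is a rough pure-Python sketch of the accumulation. The name naive_sum is hypothetical; the real builtin is written in C and does extra checks, so this is only an illustration of where the 'int' and 'str' in the message come from:

```python
def naive_sum(iterable, start=0):
    """Illustrative stand-in for the builtin sum (hypothetical helper)."""
    total = start
    for item in iterable:
        # For 'rtarze' the very first step is 0 + 'r', which fails.
        total = total + item
    return total


print(naive_sum((8, 5, 9, 3)))

try:
    naive_sum('rtarze')
except TypeError as exc:
    message = str(exc)
    print(message)
```

The first addition already mixes the integer start value with a one-character string, which is why the message talks about 'int' and 'str'.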
 
Ian Kelly

It notes in the doc string that it does not work on strings:

sum(...)
sum(sequence[, start]) -> value

Returns the sum of a sequence of numbers (NOT strings) plus the value
of parameter 'start' (which defaults to 0). When the sequence is
empty, returns start.

I think this restriction is mainly for efficiency. sum(['a', 'b',
'c', 'd', 'e']) would be the equivalent of 'a' + 'b' + 'c' + 'd' +
'e', which is an inefficient way to add together strings. You should
use ''.join instead:
'abc'
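
A minimal sketch of the ''.join suggestion, using made-up sample data (the variable names are only for illustration):

```python
letters = ['a', 'b', 'c']

# The separator string's join method concatenates every item in one pass.
joined = ''.join(letters)
print(joined)

# A string is itself an iterable of one-character strings,
# so joining it with an empty separator gives the same text back.
same = ''.join('rtarze')
print(same)
```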
 
Neil Cerutti

While the docstring is still useful, it has diverged from the
documentation a little bit.

sum(iterable[, start])

Sums start and the items of an iterable from left to right and
returns the total. start defaults to 0. The iterable's items
are normally numbers, and the start value is not allowed to be
a string.

For some use cases, there are good alternatives to sum(). The
preferred, fast way to concatenate a sequence of strings is by
calling ''.join(sequence). To add floating point values with
extended precision, see math.fsum(). To concatenate a series of
iterables, consider using itertools.chain().

Are iterables and sequences different enough to warrant posting a
bug report?
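
For reference, the three alternatives named in the documentation excerpt can be tried as follows. This is only a sketch with invented sample data; the variable names are assumptions, not anything from the docs:

```python
import itertools
import math

# Strings: ''.join instead of sum
words = ["Monty ", "Python"]
joined = ''.join(words)
print(joined)

# Floats: math.fsum tracks the lost low-order bits while adding
values = [0.1] * 10
precise = math.fsum(values)
print(precise)

# Iterables: itertools.chain walks them one after another without
# building intermediate lists
lists = [[2, 3], [5, 8], [13, 21]]
flat = list(itertools.chain.from_iterable(lists))
print(flat)
```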
 
Steve Howell

Sequences are iterables, so I'd say the docs are technically correct,
but maybe I'm misunderstanding what you would be trying to clarify.
 
Steven D'Aprano

I think this restriction is mainly for efficiency. sum(['a', 'b', 'c',
'd', 'e']) would be the equivalent of 'a' + 'b' + 'c' + 'd' + 'e', which
is an inefficient way to add together strings.

It might not be obvious to some people why repeated addition is so
inefficient, and in fact if people try it with modern Python (version 2.3
or better), they may not notice any inefficiency.

But the example given, 'a' + 'b' + 'c' + 'd' + 'e', potentially ends up
creating four strings, only to immediately throw away three of them:

* first it concats 'a' to 'b', giving the new string 'ab'
* then 'ab' + 'c', creating a new string 'abc'
* then 'abc' + 'd', creating a new string 'abcd'
* then 'abcd' + 'e', creating a new string 'abcde'

Each new string requires a block of memory to be allocated, potentially
requiring other blocks of memory to be moved out of the way (at least for
large blocks).

With only five characters in total, you won't really notice any slowdown,
but with large enough numbers of strings, Python could potentially spend
a lot of time building, and throwing away, intermediate strings. Pure
wasted effort.

For another look at this, see:
http://www.joelonsoftware.com/articles/fog0000000319.html

I say "could" because starting in about Python 2.3, there is a nifty
optimization in Python (CPython only, not Jython or IronPython) that can
*sometimes* recognise repeated string concatenation and make it less
inefficient. It depends on the details of the specific strings used, and
the operating system's memory management. When it works, it can make
string concatenation almost as efficient as ''.join(). When it doesn't
work, repeated concatenation is PAINFULLY slow, hundreds or thousands of
times slower than join.
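
The cost described above can be sketched with a simple counting model. It assumes every + builds a brand-new string (that is, it ignores the CPython optimization just mentioned), and the helper name chars_copied is hypothetical:

```python
def chars_copied(pieces):
    """Return (result, characters written into intermediate strings),
    assuming each + allocates and fills a completely new string."""
    it = iter(pieces)
    result = next(it, '')
    copied = 0
    for piece in it:
        result = result + piece
        copied += len(result)  # the whole new string gets written out
    return result, copied


small, small_cost = chars_copied(['a', 'b', 'c', 'd', 'e'])
print(small, small_cost)

# With many pieces the copying grows roughly with the square of the count,
# while ''.join only has to write each character once.
big, big_cost = chars_copied(['x'] * 1000)
print(len(big), big_cost)
```

For the five-letter example the model counts 2 + 3 + 4 + 5 = 14 characters written to produce a 5-character result; for 1000 one-character pieces it is about half a million.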
 
Steven D'Aprano

Summation is a mathematical function that works on numbers
Concatenation is the process of appending 1 string to another

although they are not related to each other they do share the same
operator(+) which is the cause of confusion. attempting to duck type
this function would cause ambiguity for example what would you expect
from

sum ('a','b',3,4)

'ab34' or 'ab7' ?

Neither. I would expect sum to do exactly what the + operator does if
given two incompatible arguments: raise an exception.

And in fact, that's exactly what it does.

py> sum ([1, 2, 'a'])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'int' and 'str'
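
A small sketch comparing the two directly, with the values held in variables so that the additions happen at run time (the variable names are just for illustration):

```python
a, b, c = 1, 2, 'a'

try:
    a + b + c
except TypeError as exc:
    plus_message = str(exc)

try:
    sum([a, b, c])
except TypeError as exc:
    sum_message = str(exc)

# Both fail at the same step, 3 + 'a', with the same message.
print(plus_message)
print(plus_message == sum_message)
```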
 
Ian Kelly

Sequences are iterables, so I'd say the docs are technically correct,
but maybe I'm misunderstanding what you would be trying to clarify.

The doc string suggests that the argument to sum() must be a sequence,
when in fact any iterable will do. The restriction in the docs should
be relaxed to match the reality.
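
As a quick illustration that non-sequence iterables are accepted, here is a sketch using a generator expression and a set (sample data invented for the example):

```python
# A generator has no length and no indexing, so it is not a sequence,
# but sum only needs to iterate over it.
squares = (n * n for n in range(4))
has_len = hasattr(squares, '__len__')
total = sum(squares)
print(has_len, total)

# A set is also iterable without being a sequence.
set_total = sum({1, 2, 3})
print(set_total)
```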
 
Steve Howell

The doc string suggests that the argument to sum() must be a sequence,
when in fact any iterable will do.  The restriction in the docs should
be relaxed to match the reality.

Ah. The docstring looks to be fixed in 3.1.3, but not in Python 2.


Python 3.1.3 (r313:86834, Mar 13 2011, 00:40:38)
[GCC 4.4.5] on linux2
Type "help", "copyright", "credits" or "license" for more information.
"sum(iterable[, start]) -> value\n\nReturns the sum of an iterable of
numbers (NOT strings) plus the value\nof parameter 'start' (which
defaults to 0). When the iterable is\nempty, returns start."


Python 2.6.6 (r266:84292, Mar 13 2011, 00:35:19)
[GCC 4.4.5] on linux2
Type "help", "copyright", "credits" or "license" for more information.
"sum(sequence[, start]) -> value\n\nReturns the sum of a sequence of
numbers (NOT strings) plus the value\nof parameter 'start' (which
defaults to 0). When the sequence is\nempty, returns start."
 
Terry Reedy

Summation is a mathematical function that works on numbers
Concatenation is the process of appending 1 string to another

although they are not related to each other they do share the same
operator(+) which is the cause of confusion.

If one represents counts in unary, as a sequence or tally of 1s (or
other markers indicating 'successor' or 'increment'), then count
addition is sequence concatenation. I think Guido got it right.

It happens that when the members of all sequences are identical, there
is a much more compact exponential place value notation that enables
more efficient addition and other operations. When not, other tricks are
needed to avoid so much copying that an inherently O(N) operation
balloons into an O(N*N) operation.
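
The unary picture can be sketched in a few lines. The helper to_unary is hypothetical and simply writes a count as a tally of '1' marks:

```python
def to_unary(n):
    """Represent a non-negative count as a tally of '1' marks."""
    return '1' * n


three, four = to_unary(3), to_unary(4)

# Concatenating the tallies is the same as adding the counts.
combined = three + four
print(combined, len(combined))
```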
 
Hans Mulder

Summation is a mathematical function that works on numbers
Concatenation is the process of appending 1 string to another

Actually, the 'sum' builtin function is quite capable of
concatenating objects, for example lists:
sum(([2,3], [5,8], [13,21]), [])
[2, 3, 5, 8, 13, 21]

But if you pass a string as a starting value, you get an error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: sum() can't sum strings [use ''.join(seq) instead]

In fact, you can bamboozle 'sum' into concatenating strings
by tricking it with a non-string starting value:
>>> class not_a_string:
...     def __add__(self, other):
...         return other
...
>>> sum("rtarze", not_a_string())
'rtarze'
>>> sum(["Monty ", "Python", "'s Fly", "ing Ci", "rcus"],
...     not_a_string())
"Monty Python's Flying Circus"

Hope this helps,

-- HansM
 
