Python3: on removing map, reduce, filter

  • Thread starter Andrey Tatarinov
  • Start date
A

Andrey Tatarinov

Paul said:
I think that article was written before list comprehensions were added
to Python.

anyway list comprehensions are just syntaxic sugar for

(is that correct?)

so there will be no speed gain, while map etc. are C-implemented
 
R

Robert Kern

Andrey said:
anyway list comprehensions are just syntaxic sugar for


(is that correct?)

so there will be no speed gain, while map etc. are C-implemented

It depends.

Try

def square(x):
return x*x
map(square, range(1000))

versus

[x*x for x in range(1000)]

Hint: function calls are expensive.

--
Robert Kern
(e-mail address removed)

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter
 
S

Steve Holden

Andrey said:
Hi.

How does GvR suggestions on removing map(), reduce(), filter() correlate
with the following that he wrote himself (afaik):

http://www.python.org/doc/essays/list2str.html

?
It promotes the sensible realization that when optimization is the goal
code may well tend to get ugly, sometimes uglier than necessary. Note
that the first version is completely straightforward and comprehensible.

And note that the summary in the conclusiogn BEGINS with "Rule number
one: only optimize when there is a proven speed bottleneck", which seems
to adequately imply that straightforward code is to be preferred unless
speed requirements override that.

regards
Steve
 
A

Andrey Tatarinov

Steve said:
>
And note that the summary in the conclusiogn BEGINS with "Rule number
one: only optimize when there is a proven speed bottleneck", which seems
to adequately imply that straightforward code is to be preferred unless
speed requirements override that.

My main question was: "how could be this advices applied, when map,
reduce and filter would be removed?"

but earlier I got answers about speed of list comprehensions, though
they need to be proved.
 
R

Robert Kern

Andrey said:
My main question was: "how could be this advices applied, when map,
reduce and filter would be removed?"

Since Python 3.0 is currently mythical and will involve a complete
rewrite of the language and interpreter, I don't think that you should
expect any optimization advice to carry over.

--
Robert Kern
(e-mail address removed)

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter
 
S

Steven Bethard

Robert said:
Andrey said:
anyway list comprehensions are just syntaxic sugar for


(is that correct?)

so there will be no speed gain, while map etc. are C-implemented


It depends.

Try

def square(x):
return x*x
map(square, range(1000))

versus

[x*x for x in range(1000)]

Hint: function calls are expensive.

Some timings to verify this:

$ python -m timeit -s "def square(x): return x*x" "map(square, range(1000))"
1000 loops, best of 3: 693 usec per loop

$ python -m timeit -s "[x*x for x in range(1000)]"
10000000 loops, best of 3: 0.0505 usec per loop

Note that list comprehensions are also C-implemented, AFAIK.

Steve
 
J

John Roth

Andrey Tatarinov said:
Hi.

How does GvR suggestions on removing map(), reduce(), filter() correlate
with the following that he wrote himself (afaik):

http://www.python.org/doc/essays/list2str.html

This is fairly old. Note that the fastest version
uses the array module, and is quite comprehensible -
if you know the array module and how it works.
It doesn't use the map function.

John Roth
 
J

John Machin

Steven said:
Note that list comprehensions are also C-implemented, AFAIK.

Rather strange meaning attached to "C-implemented". The implementation
generates the code that would have been generated had you written out
the loop yourself, with a speed boost (compared with the fastest DIY
approach) from using a special-purpose opcode LIST_APPEND. See below.
.... return [x*x for x in xrange(n)]
....
afunc(3) [0, 1, 4]
import dis
dis.dis(afunc)
2 0 BUILD_LIST 0
3 DUP_TOP
4 STORE_FAST 1 (_[1])
7 LOAD_GLOBAL 1 (xrange)
10 LOAD_FAST 0 (n)
13 CALL_FUNCTION 1
16 GET_ITER20 STORE_FAST 2 (x)
23 LOAD_FAST 1 (_[1])
26 LOAD_FAST 2 (x)
29 LOAD_FAST 2 (x)
32 BINARY_MULTIPLY
33 LIST_APPEND
34 JUMP_ABSOLUTE 17
37 DELETE_FAST 1 (_[1]) 40 RETURN_VALUE
def bfunc(n):
.... blist=[]; blapp=blist.append
.... for x in xrange(n):
.... blapp(x*x)
.... return blist
....
bfunc(3) [0, 1, 4]
dis.dis(bfunc)
2 0 BUILD_LIST 0
3 STORE_FAST 3 (blist)
6 LOAD_FAST 3 (blist)
9 LOAD_ATTR 1 (append)
12 STORE_FAST 2 (blapp)

3 15 SETUP_LOOP 34 (to 52)
18 LOAD_GLOBAL 3 (xrange)
21 LOAD_FAST 0 (n)
24 CALL_FUNCTION 1
27 GET_ITER31 STORE_FAST 1 (x)

4 34 LOAD_FAST 2 (blapp)
37 LOAD_FAST 1 (x)
40 LOAD_FAST 1 (x)
43 BINARY_MULTIPLY
44 CALL_FUNCTION 1
47 POP_TOP
48 JUMP_ABSOLUTE 28
5 >> 52 LOAD_FAST 3 (blist)
55 RETURN_VALUE
 
B

beliavsky

Steve said:
def square(x):
return x*x
map(square, range(1000))
versus
[x*x for x in range(1000)]
Hint: function calls are expensive.
$ python -m timeit -s "def square(x): return x*x" "map(square, range(1000))"
1000 loops, best of 3: 693 usec per loop
$ python -m timeit -s "[x*x for x in range(1000)]"
10000000 loops, best of 3: 0.0505 usec per loop

Functions will often be complicated enought that inlining them is not
feasible. In Fortran 95 one can define an elemental function that takes
an argument of any shape (scalar, vector, matrix etc.) but of a
specified type, returning a result of the same shape, so that one could
write

elemental function square(i) result(isq)
integer, intent(in) :: i
integer :: isq
isq = i*i
end function square

and then

print*,square((/i,i=0,999/))

The Numeric and Numarray Python modules have predefined ufunc's that
are essentially elemental functions with one or two arguments. It would
be nice if one could define new ufunc's using only Python code -- I
presume new ones can currently be added by writing C code.
 
S

Steve Holden

Steve Holden wrote:
No he didn't. I think you will find you are attributing Steven Bethard's
comments to me.

[...]
The Numeric and Numarray Python modules have predefined ufunc's that
are essentially elemental functions with one or two arguments. It would
be nice if one could define new ufunc's using only Python code -- I
presume new ones can currently be added by writing C code.

regards
Steve
 
T

Terry Reedy

Andrey Tatarinov said:
How does GvR suggestions on removing map(), reduce(), filter()

While GvR *might* prefer removing them completely on any given day, I think
moving them to a functional module, as others have suggested and requested,
is currently more likely. I believe that GvR has indicated at times that
this would be an acceptible compromise.

I am one of those who think the list of builtins is currently too long to
be easily grasped and should be shrunk.

Terry J. Reedy
 
S

Steven Bethard

John said:
Rather strange meaning attached to "C-implemented". The implementation
generates the code that would have been generated had you written out
the loop yourself, with a speed boost (compared with the fastest DIY
approach) from using a special-purpose opcode LIST_APPEND. See below.

Fair enough. ;)

So you basically replace the SETUP_LOOP, CALL_FUNCTION, POP_TOP and
POP_BLOCK with a DUP_TOP and a LIST_APPEND.

Steve
 
S

Steven Bethard

Steve said:
Robert said:
def square(x):
return x*x
map(square, range(1000))

versus
[x*x for x in range(1000)]

Hint: function calls are expensive.

$ python -m timeit -s "def square(x): return x*x" "map(square, range(1000))"
1000 loops, best of 3: 693 usec per loop

$ python -m timeit -s "[x*x for x in range(1000)]"
10000000 loops, best of 3: 0.0505 usec per loop

Functions will often be complicated enought that inlining them is not
feasible.

True, true. However, list comprehensions still seem to be comparable in
speed (at least in Python 2.4):

$ python -m timeit -s "def f(x): return x*x" "[f(x) for x in xrange(1000)]"
1000 loops, best of 3: 686 usec per loop

$ python -m timeit -s "def f(x): return x*x" "map(f, xrange(1000))"
1000 loops, best of 3: 690 usec per loop

Presumably this is because the C code for the byte codes generated by a
list comprehension isn't too far off of the C code in map. I looked at
bltinmodule.c for a bit, but I'm not ambitious enough to try verify this
hypothesis. ;)

Steve
 
N

Nick Coghlan

Terry said:
While GvR *might* prefer removing them completely on any given day, I think
moving them to a functional module, as others have suggested and requested,
is currently more likely. I believe that GvR has indicated at times that
this would be an acceptible compromise.

I am one of those who think the list of builtins is currently too long to
be easily grasped and should be shrunk.

Heh. When PEP 309 hits CVS (with functional.partial), maybe it can grow aliases
for the three of them so people can get used to the idea.

It might keep partial from getting too lonely. . . :)

Cheers,
Nick.
 
D

David M. Cooke

Steven Bethard said:
Some timings to verify this:

$ python -m timeit -s "def square(x): return x*x" "map(square, range(1000))"
1000 loops, best of 3: 693 usec per loop

$ python -m timeit -s "[x*x for x in range(1000)]"
10000000 loops, best of 3: 0.0505 usec per loop

Maybe you should compare apples with apples, instead of oranges :)
You're only running the list comprehension in the setup code...

$ python2.4 -m timeit -s "def square(x): return x*x" "map(square, range(1000))"
1000 loops, best of 3: 464 usec per loop
$ python2.4 -m timeit "[x*x for x in range(1000)]"
1000 loops, best of 3: 216 usec per loop

So factor of 2, instead of 13700 ...
 
S

Steven Bethard

David said:
Steven Bethard said:
Some timings to verify this:

$ python -m timeit -s "def square(x): return x*x" "map(square, range(1000))"
1000 loops, best of 3: 693 usec per loop

$ python -m timeit -s "[x*x for x in range(1000)]"
10000000 loops, best of 3: 0.0505 usec per loop


Maybe you should compare apples with apples, instead of oranges :)
You're only running the list comprehension in the setup code...

$ python2.4 -m timeit -s "def square(x): return x*x" "map(square, range(1000))"
1000 loops, best of 3: 464 usec per loop
$ python2.4 -m timeit "[x*x for x in range(1000)]"
1000 loops, best of 3: 216 usec per loop

So factor of 2, instead of 13700 ...

Heh heh. Yeah, that'd be better. Sorry about that!

Steve
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,744
Messages
2,569,482
Members
44,900
Latest member
Nell636132

Latest Threads

Top