Need an identity operator because lambda is too slow

D

Deron Meranda

This is an optimization problem, one that occurs deep inside a loop
that gets executed thousands of times. I'm looking for something in
Python which would act like an identity operator, or I-combinator: a
do-nothing function. The best I know of is: (lambda x: x), but it is
not ideal.

Consider a much-simplified example of such an iteration where
sometimes you need to transform an item by some function, but most of
the time you don't:

if some_rare_condition:
func = some_transform_function
else:
func = lambda x: x
for item in some_sequence:
item2 = func(item)
..... # more stuff

Now the "lambda x:x" acts suitably like an identity operator. But it
is very slow, when compared to using more complex-looking code:

do_transform = some_rare_condition
for item in some_sequence:
if do_transform:
item2 = transform_function(item)
else:
item2 = item
..... # more stuff

What python needs is something like a built-in "operator.identity"
function, which acts like "lambda x:x", but which the byte compiler
could recognize and completely optimize away so there is no function
call overhead.

Does this exist someplace in the corners of the language, or are there
any ideas about how best to do something like this without having to
push a bunch of ugly if-then-else constructs inside all the loops
rather than a more elegant function object? (The need for this is
more obvious in my real-world code versus the simplified examples
above).

Thanks
 
P

Paul Rubin

Deron Meranda said:
do_transform = some_rare_condition
for item in some_sequence:
if do_transform:
item2 = transform_function(item)
else:
item2 = item
..... # more stuff

This might be a little more direct:

from itertools import imap
if some_rare_condition:
mapped_sequence = imap(transform_function, some_sequence)
else:
mapped_sequence = some_sequence

for item in mapped_sequence:
# more stuff

I agree that an identity function would be useful. I think I
suggested it a long time ago and didn't get much interest.
 
R

Raymond Hettinger

[Deron Meranda
] I'm looking for something in
Python which would act like an identity operator, or I-combinator: a
do-nothing function. The best I know of is: (lambda x: x), but it is
not ideal.

File a feature request on SourceForge and assign to me. This has come
up a couple of times and would be trivial to implement in the operator
or functools modules.

Raymond
 
S

Steven D'Aprano

Consider a much-simplified example of such an iteration where
sometimes you need to transform an item by some function, but most of
the time you don't:

if some_rare_condition:
func = some_transform_function
else:
func = lambda x: x
for item in some_sequence:
item2 = func(item)
..... # more stuff

This does the test once, but does a function look-up every loop.

Now the "lambda x:x" acts suitably like an identity operator. But it
is very slow, when compared to using more complex-looking code:

do_transform = some_rare_condition
for item in some_sequence:
if do_transform:
item2 = transform_function(item)
else:
item2 = item
..... # more stuff

Despite doing the if..else comparison every loop, this is a little faster
for the case where some_rare_condition is false.

But I have to question whether this is a worthwhile optimization,
especially if some_rare_condition really is rare. Calling the identity
function "lambda x: x" one million times takes about half a second; or to
put it another way, each call to the identity function costs you about
half a microsecond. How much time does the rest of the loop processing
take? Are you sure you're optimizing something which needs to be optimized?

(E.g. if your main loop takes fifteen minutes to run, and you're trying to
shave off half a second, just don't bother.)

Compare the following, where I use an arbitrary small function as a
placeholder for whatever work you really want to do:

setup = """import time
identity = lambda x: x
do_work = lambda x: time.time() + x
x = 1
"""

Now use timeit to compare doing the work on its own with doing the work
together with an identity function:
timeit.Timer("do_work(x)", setup).repeat() [3.1834621429443359, 3.1083459854125977, 3.1382210254669189]
timeit.Timer("do_work(identity(x))", setup).repeat()
[3.5951459407806396, 3.6067559719085693, 3.5801000595092773]

Is your "do_work" function really so fast that one second per two million
calls to identity() is a significant load?



If some_rare_condition really is rare, then a possible solution might
be:

if some_rare_condition:
some_sequence = [transform(item) for item in some_sequence]
# or modify in place if needed... see enumerate
for item in some_sequence:
do_something_with(item)

This should run as fast as possible in the common case, and slow down only
in the rare condition.

Another solution would be:

if some_rare_condition:
for item in some_sequence:
item = transform(item)
do_something_with(item)
else:
for item in some_sequence:
do_something_with(item)

This is probably going to be the fastest of all, but has the disadvantage
that you are writing the same code twice.
 
S

Steven D'Aprano

[Deron Meranda
] I'm looking for something in
Python which would act like an identity operator, or I-combinator: a
do-nothing function. The best I know of is: (lambda x: x), but it is
not ideal.

File a feature request on SourceForge and assign to me. This has come
up a couple of times and would be trivial to implement in the operator
or functools modules.

I'm surprised. The Original Poster specified

What python needs is something like a built-in "operator.identity"
function, which acts like "lambda x:x", but which the byte compiler
could recognize and completely optimize away so there is no function
call overhead.
[end quote]

I would have guessed that optimizing the call away would require support
from the compiler.
 
D

David Wahler

Consider a much-simplified example of such an iteration where
sometimes you need to transform an item by some function, but most of
the time you don't:

if some_rare_condition:
func = some_transform_function
else:
func = lambda x: x
for item in some_sequence:
item2 = func(item)
..... # more stuff

Now the "lambda x:x" acts suitably like an identity operator. But it
is very slow, when compared to using more complex-looking code:

do_transform = some_rare_condition
for item in some_sequence:
if do_transform:
item2 = transform_function(item)
else:
item2 = item
..... # more stuff

What python needs is something like a built-in "operator.identity"
function, which acts like "lambda x:x", but which the byte compiler
could recognize and completely optimize away so there is no function
call overhead.

Unless I'm misunderstanding you, you seem to be proposing that the
compiler should avoid generating the call to func2() if it has a
certain value. How could that information possibly be available at
compile time?

-- David
 
S

Sean McIlroy

[snip]

this may be really dense, but i'm curious what's wrong with the
"multiplexer" idiom:

for item in some_sequence:
item2 = (not some_rare_condition and item) or \
(some_rare_condition and
some_transform_function(item))
..... # more stuff

peace
stm
 
G

Gabriel Genellina

[snip]

this may be really dense, but i'm curious what's wrong with the
"multiplexer" idiom:

for item in some_sequence:
item2 = (not some_rare_condition and item) or \
(some_rare_condition and
some_transform_function(item))
..... # more stuff

The main concern of the OP was that lambda x:x is slow, so it's better to
move the condition out of the loop.
Also, this and/or trick does not work when the Boolean value of item is
false (like 0, (), []...)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,766
Messages
2,569,569
Members
45,042
Latest member
icassiem

Latest Threads

Top