Python code written in 1998, how to improve/change it?


Petr Jakes

Sorry for the typo in my previous posting. Of course it has to be: ....
simple lift/push ON/OFF button....
 

Carl Cerecke

Petr said:
Sorry, I can't get it. Can you please show me how to use your approach
on the simple push/push ON/OFF button, for example?

PS: seriously, it is not homework :) and I feel it is a shame to be
asking such simple questions :(

States: ON, OFF
Transition events: "push", "lift"

transition diagram:
=========================

        lift
        .--.
        |  v
      +------+
 ,--->|  ON  |---.
 |    +------+   |
 | lift     push |
 |    +------+   |
 '----| OFF  |<--'
      +------+
        ^  |
        '--'
        push

As a starting point, how about:

l = 'lift'
p = 'push'

action_sequence = [l,p,p,l,l,p,l,p,None]
next_action = iter(action_sequence).next

s_on = compile('''
print 'on'
action = next_action()
if action == 'lift':
    state = s_on
elif action == 'push':
    state = s_off
else:
    state = None
''','','exec')

s_off = compile('''
print 'off'
action = next_action()
if action == 'lift':
    state = s_on
elif action == 'push':
    state = s_off
else:
    state = None
''','','exec')


state = s_on # start state
while state:
    eval(state)
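Traced by hand, and assuming the snippet is run at module level (where eval of the exec-compiled code shares the module namespace; as noted further down the thread, it does not work inside a function), the action sequence above should print: on, on, off, off, on, on, off, on, off.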



Cheers,
Carl
 

Bengt Richter

How about something like
>>> actions = dict(
...     a=compile('print "A"; state="b"','','exec'),
...     b=compile('print "B"; state="c"','','exec'),
...     c=compile('print "C"; state=None','','exec')
...     )
>>> state = 'a'
>>> while state: eval(actions[state])
...
A
B
C

Good idea. But we can eliminate the dictionary lookup:

a1 = compile('print "A"; state=b1','','exec')
b1 = compile('print "B"; state=c1','','exec')
c1 = compile('print "C"; state=None','','exec')

state = a1
while state:
    eval(state)
Cool. But unfortunately, neither version works inside a function's local namespace.
Using exec instead of eval seems to do it in either context though.
Now, how can we get optimized code (i.e., LOAD_FAST vs LOAD_NAME etc in a1 etc)
without byte code hackery?
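For illustration, a minimal sketch (not from the thread) using the stdlib dis module shows the difference: names assigned inside a def compile to STORE_FAST/LOAD_FAST, while the same statements compiled as a standalone 'exec' code object use STORE_NAME/LOAD_NAME.

import dis

def f():
    x = 1
    return x

code = compile('x = 1; y = x', '', 'exec')

dis.dis(f)      # function locals: STORE_FAST / LOAD_FAST
dis.dis(code)   # module-style namespace: STORE_NAME / LOAD_NAME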

Regards,
Bengt Richter
 

Wolfgang Keller

I believe the more modern approach to this is to use generators in some
way, yielding each other as the next state. This way you avoid almost
all the function call overhead (the part that takes significant time,
which is setting up the stack frame) and don't have to resort to
bytecode hacks for better performance.

Dumb question from an endless newbie scripting dilettante:

Do you have a reference to a cookbook example for this method?

Side question: as I understand it, as a general rule generators are more
efficient than functions in any case where the function is called several
times, right? So basically if I want to write a long-running program in
Python, it would make sense to code all functions that are likely to be
called more than once as generators...
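One way to check that premise empirically is a quick measurement; a minimal sketch using the stdlib timeit module, with made-up toy functions f and g (whether the generator actually wins depends on the Python version and on how much work the body does):

import timeit

setup = """
def f():
    return 1

def g():
    while 1:
        yield 1

gnext = g().next
"""

print timeit.Timer('f()', setup).timeit()      # calling a plain function
print timeit.Timer('gnext()', setup).timeit()  # resuming a generator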

TIA,

Sincerely,

Wolfgang Keller
 

Carl Cerecke

Wolfgang said:
Dumb question from an endless newbie scripting dilettante:

Do you have a reference to a cookbook example for this method?

It turns out that generators are more efficient than the eval function
executing bits of compiled code. About 20-25% faster.

However, the generators exhibit some weird behaviour.

The included code shows an example using a compile/eval FSM method, and
a generator method. Note, in particular, the pattern of ids of the
generator objects. I had expected some complications in the ids, but not
a repeatable sequence of length 3 after the first generator object.

What is going on? Anybody?
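A likely explanation (sketched below, as an assumption about CPython rather than something stated in the thread): id() is just the object's address, and once a generator object becomes garbage, the next object of the same kind is often allocated at the same address, so a create-and-discard pattern cycles through a small set of ids.

class G(object):
    pass

# Each G() instance becomes garbage before the next one is created, so
# CPython tends to hand back the same address, and therefore the same id.
print [id(G()) for i in range(5)]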

#!/usr/bin/env python

import time

s_on = compile('''
print 'on'
action = next_action()
if action == 'lift':
    state = s_on
elif action == 'push':
    state = s_off
else:
    state = None
''','','exec')

s_off = compile('''
print 'off'
action = next_action()
if action == 'lift':
    state = s_on
elif action == 'push':
    state = s_off
else:
    state = None
''','','exec')


def g_on():
    print "on"
    action = next_action()
    if action == 'lift':
        yield g_on()
    elif action == 'push':
        yield g_off()
    else:
        yield None

def g_off():
    print "off"
    action = next_action()
    if action == 'lift':
        yield g_on()
    elif action == 'push':
        yield g_off()
    else:
        yield None


def actions(n):
    import random
    for i in range(n-1):
        yield random.choice(['lift','push'])
    yield None

#r = 1000000
r = 10
next_action = actions(r).next

#a = time.clock()
while next_action():
    pass
#z = time.clock()
#print "action generator:",z-a
next_action = actions(r).next
#print "---"
state = s_on # start state
#a = time.clock()
while state:
    eval(state)
#z = time.clock()
#print z-a
print "---"

next_action = actions(r).next
s_g_on = g_on()
s_g_off = g_off()
state = s_g_on
#a = time.clock()
while state:
    print id(state)
    state = state.next()
#z = time.clock()
#print z-a


#Cheers,
#Carl.
 

skip

Wolfgang> So basically if I want to write a long-running program in
Wolfgang> Python, it would make sense to code all functions that are
Wolfgang> likely to be called more than once as generators...

If they need to resume their calculations from where they left off after the
last yield.
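A minimal illustration of that distinction (a made-up example, not from the thread): the generator's locals survive between calls, which a plain function can only imitate with globals or a class.

def counter():
    n = 0
    while True:
        n += 1
        yield n        # execution resumes here, with n preserved, on the next call

c = counter().next
print c(), c(), c()    # -> 1 2 3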

Skip
 

Wolfgang Keller

def g_on():
    print "on"
    action = next_action()
    if action == 'lift':
        yield g_on()
    elif action == 'push':
        yield g_off()
    else:
        yield None

def g_off():
    print "off"
    action = next_action()
    if action == 'lift':
        yield g_on()
    elif action == 'push':
        yield g_off()
    else:
        yield None

Amazing.

Executable pseudo-code, really. :)

And that's even (run-time) efficient?

Thanks a lot,

Sincerely,

Wolfgang Keller
 

Fredrik Lundh

Carl said:
It turns out that generators are more efficient than the eval function
executing bits of compiled code. About 20-25% faster.

why are you using generators to return things from a function, when
you can just return the things ?

def f_on():
    print "on"
    action = next_action()
    if action == 'lift':
        return f_on
    elif action == 'push':
        return f_off
    # any other action (e.g. None) falls off the end and returns None,
    # which stops the driver loop below

def f_off():
    print "off"
    action = next_action()
    if action == 'lift':
        return f_on
    elif action == 'push':
        return f_off

state = f_on
while state:
    state = state()

</F>
 

Carl Cerecke

Fredrik said:
Carl Cerecke wrote:




why are you using generators to return things from a function, when
you can just return the things ?

Trying to find the fastest way to implement a finite state machine.
The included file times 4 different ways, with functions the fastest.

The reason, I think, for the unusual sequence of id()s of the generators
- see grandparent post - (and the reason for their poor performance
compared to functions) is that the reference to the generator is
being lost, then another generator is being created with the same id.
Properly done, I would expect generators to outperform functions.

These are the numbers I get on a P3 600:
$ ./foo.py
action generator overhead: 13.25
---
exec: 8.7
---
eval: 10.09
---
generators: 6.68
---
functions: 3.37


#!/usr/bin/env python

import time

s_on = compile('''
#print 'on'
action = next_action()
if action == 'lift':
    state = s_on
elif action == 'push':
    state = s_off
else:
    state = None
''','','exec')

s_off = compile('''
#print 'off'
action = next_action()
if action == 'lift':
    state = s_on
elif action == 'push':
    state = s_off
else:
    state = None
''','','exec')

def f_on():
    action = next_action()
    if action == 'lift':
        return f_on
    elif action == 'push':
        return f_off
    else:
        return None

def f_off():
    action = next_action()
    if action == 'lift':
        return f_on
    elif action == 'push':
        return f_off
    else:
        return None


def g_on():
    #print "on"
    action = next_action()
    if action == 'lift':
        yield g_on()
    elif action == 'push':
        yield g_off()
    else:
        yield None

def g_off():
    #print "off"
    action = next_action()
    if action == 'lift':
        yield g_on()
    elif action == 'push':
        yield g_off()
    else:
        yield None


def actions(n):
    import random
    for i in range(n-1):
        yield random.choice(['lift','push'])
    yield None

r = 1000000
#r = 10
next_action = actions(r).next

a = time.clock()
while next_action():
    pass
z = time.clock()
print "action generator overhead:",z-a
common = z-a

print "---"
next_action = actions(r).next
state = s_on # start state
a = time.clock()
while state:
    exec state
z = time.clock()
print "exec:",z-a-common

print "---"
next_action = actions(r).next
state = s_on # start state
a = time.clock()
while state:
    eval(state)
z = time.clock()
print "eval:",z-a-common

print "---"
next_action = actions(r).next
s_g_on = g_on()
s_g_off = g_off()
state = s_g_on
a = time.clock()
while state:
    #print id(state)
    state = state.next()
z = time.clock()
print "generators:",z-a-common

print "---"
next_action = actions(r).next
state = f_on
a = time.clock()
while state:
    state = state()
z = time.clock()
print "functions:",z-a-common
 

Carl Cerecke

Carl said:
Trying to find the fastest way to implement a finite state machine.
The included file times 4 different ways, with functions the fastest.

The reason, I think, for the unusual sequence of id()s of the generators
- see grandparent post - (and the reason for their poor performance
compared to functions) is that the reference to the generator is
being lost, then another generator is being created with the same id.
Properly done, I would expect generators to outperform functions.

Generator FSM done properly (well, better anyway). They are still almost
twice as slow as plain function calls, though.

import time

def g_on():
    while 1:
        action = next_action()
        if action == 'lift':
            yield s_g_on
        elif action == 'push':
            yield s_g_off
        else:
            break
    yield None

def g_off():
    while 1:
        action = next_action()
        if action == 'lift':
            yield s_g_on
        elif action == 'push':
            yield s_g_off
        else:
            break
    yield None

def actions(n):
    import random
    for i in range(n-1):
        yield random.choice(['lift','push'])
    yield None

r = 1000000
#r = 10
next_action = actions(r).next
s_g_on = g_on()
s_g_off = g_off()
state = s_g_on

a = time.clock()
while state:
    state = state.next()
z = time.clock()
print "generators:", z - a
 

Carl Cerecke

Adding a continue statement after the yield statements yields :) a
speed increase. Still not as good as functions though (about 30% slower).
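A sketch of the change being described, assuming the g_on generator from the previous post: the continue jumps straight back to the top of the while loop after the yield, instead of first falling through to the end of the loop body.

def g_on():
    while 1:
        action = next_action()
        if action == 'lift':
            yield s_g_on
            continue            # straight back to the top of the loop
        elif action == 'push':
            yield s_g_off
            continue
        else:
            break
    yield None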

Cheers,
Carl
 

Bengt Richter

Wolfgang> So basically if I want to write a long-running program in
Wolfgang> Python, it would make sense to code all functions that are
Wolfgang> likely to be called more than once as generators...

If they need to resume their calculations from where they left off after the
last yield.
Hm, I wonder how (all untested)

def foo(x,y):
    return x**2+y**2

for pair in pairs: print foo(*pair)

would compare to

def bar(arglist):
    while True:
        x,y = arglist
        yield x**2 + y**2

barargs = []
ibar = bar(barargs).next
for barargs[:] in pairs: print ibar()

Of course, for simple stuff there's always manual inlining ;-)

for x, y in pairs: print x**2, y**2

Hm, <bf warning> it might be interesting if one could bind arg list items of
a generator from the outside, e.g.,

def gfoo(x, y):
    while True:
        yield x**2 + y**2

ifoo = gfoo('dummy','dummy') # or first pair
for ifoo.x, ifoo.y in pairs: print ifoo.next()
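Generator objects don't actually expose their arguments as attributes, so the ifoo.x/ifoo.y binding above won't work as written; a working approximation of the idea (a sketch with a made-up Args holder class) is to pass in a small mutable object and rebind its attributes from the outside:

class Args(object):
    pass

def gfoo(a):
    while True:
        yield a.x**2 + a.y**2

pairs = [(1, 2), (3, 4), (5, 6)]
a = Args()
ifoo = gfoo(a).next
for a.x, a.y in pairs:      # attribute references are valid assignment targets
    print ifoo()            # -> 5, 25, 61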

Regards,
Bengt Richter
 

Wolfgang Keller

So basically if I want to write a long-running program in
This was meant as a question.
If they need to resume their calculations from where they left off after the
last yield.

Well, no, independently from that.

Just to avoid the initial overhead of the function call.

?

Sincerely,

Wolfgang Keller
 

Fredrik Lundh

Wolfgang said:
This was meant as a question.


Well, no, independently from that.

Just to avoid the initial overhead of the function call.

?

what makes you think that resuming a generator won't involve function
calls ?

</F>
 

Wolfgang Keller

what makes you think that resuming a generator won't involve function

That was not what I wrote.

I believe the more modern approach to this is to use generators in some
way, yielding each other as the next state. This way you avoid almost
all the function call overhead (the part that takes significant time,
which is setting up the stack frame)

The way I understand this, resuming a generator causes less overhead than the
initial overhead of a function call.

Again, I'm just a poor scripting dilettante who's asking questions.

Sincerely,

Wolfgang Keller
 

Bengt Richter

Adding a continue statement after the yield statements yields :) a
speed increase. Still not as good as functions though (about 30% slower).

Cheers,
Carl
<snip>
I think I would use a generator to do state transitions, but make the state
external, or at least the part that's interesting to the outside world.

In this peculiar example the transition rules are the same for both states,
so you only need one implementation of the logic. So the example is not so nice.

>>> class State(object): pass
...
>>> def fsm(state, events):
...     for action in events:
...         if action == 'lift': state.name = 'ON'
...         elif action == 'push': state.name = 'OFF'
...         else:
...             state.name = 'END'
...             break
...         yield state
...
>>> def actions(n):
...     import random
...     return iter([random.choice(['lift','push']) for i in range(n-1)] + [None])
...
>>> def test(r):
...     state = State()
...     state.name = 'ON'
...     from time import clock
...     t0 = clock()
...     for state in fsm(state, actions(r)): pass
...     t1 = clock()
...     print '%12.6f'%((t1-t0)/r)
...
>>> test(1000)
    0.000058
>>> test(1000)
    0.000032
>>> test(1000)
    0.000032
>>> test(100000)
    0.000032
>>> a = list(actions(10))
>>> a
['lift', 'push', 'push', 'lift', 'push', 'lift', 'push', 'lift', 'lift', None]
>>> state = State()
>>> state.name = 'START'
>>> f = fsm(state, a)
>>> for state in f: print state.name,
...
ON OFF OFF ON OFF ON OFF ON ON
Obviously you can keep state in the fsm generator by placing yields in different places and
looping in different ways, but always storing the externally interesting state as attributes
of the state parameter and yielding that to tell the world the latest. Since only attributes
are being modified, the original state binding could be used and the generator's yielded
value could be ignored, but it could be handy if the generator is passed around IWT.

The world could also feed info in as attributes of state. And other generators could share
the same external state variable and all kinds of weird things could be built ;-)

Regards,
Bengt Richter
 

Magnus Lycka

Wolfgang said:
The way I understand this, resuming a generator causes less overhead than the
initial overhead of a function call.

I don't have Python 2.4 handy, but it doesn't seem to be true in 2.3.
I'm not very proficient with generators though, so maybe I'm doing
something stupid here...
>>> def f():
...     return 1
...
>>> def g():
...     while 1:
...         yield 1
...
...     start = time.time()
...     for i in xrange(n):
...         c()
...     print time.time()-start
...
0.293347120285

For reference:

...     start = time.time()
...     for i in xrange(n):
...         pass
...     print time.time()-start
...
0.0523891448975

Maybe the ratio is completely different in a newer Python than
2.3.4 (RH EL3 standard install). Or maybe it's very different if
there are plenty of local variables etc in f / g.
 

Peter Hansen

Wolfgang said:
That was not what I wrote.



The way I understand this, resuming a generator causes less overhead than the
initial overhead of a function call.

I think in the spirit of avoiding "premature optimization" and all
things related, now is the time to re-inject the other part of my
above-quoted posting, where I also said:

"""
Of course, if you have a state machine with many small states each doing
a tiny bit of processing and you're still concerned over performance,
you probably should be looking into Psyco or Pyrex and avoid making your
code really unreadable.
"""

-Peter
 

skip

Wolfgang> So basically if I want to write a long-running program in
Wolfgang> Python, it would make sense to code all functions that are
Wolfgang> likely to be called more than once as generators...

Skip> If they need to resume their calculations from where they left off
Skip> after the last yield.

Bengt> Hm, I wonder how (all untested)
...
Bengt> would compare to
...

I was thinking about things like complex tokenizers. Take a look, for
example, at the SpamBayes tokenizer and think about how to implement it
without yield. Clearly it can be done (and not all that difficult). Still,
I think the generator version would be easier to understand and modify.
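A minimal sketch of the kind of thing meant here (a made-up toy tokenizer, not the SpamBayes one): the scan position and the partially built token simply live in local variables across yields.

def tokens(text):
    word = ''
    for ch in text:
        if ch.isalnum():
            word += ch          # keep building the current token
        elif word:
            yield word          # emit it and resume scanning where we left off
            word = ''
    if word:
        yield word

print list(tokens("Subject: buy cheap meds now!"))
# -> ['Subject', 'buy', 'cheap', 'meds', 'now']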

Skip
 

skip

Wolfgang> Well, no, independently from that.

Wolfgang> Just to avoid to inital overhead of the function call.

How do you pass in parameters? Consider:

def square(x):
    return x*x

vs

def square(x):
    while True:
        yield x*x

How do you get another value of x into the generator?
>>> def square(x):
...     while True:
...         yield x*x
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: expected 0 arguments, got 1

Skip
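For what it's worth, later Pythons (2.5 and up, via PEP 342) added one answer to this question: generator.send() pushes a value back into the paused yield expression.

def square():
    x = yield              # wait for the first value
    while True:
        x = yield x*x      # yield the square, then receive the next x

sq = square()
sq.next()                  # advance to the first yield
print sq.send(3)           # -> 9
print sq.send(5)           # -> 25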
 
