Python code written in 1998, how to improve/change it?


Petr Jakes

Sorry for the typo in my previous posting. Of course it has to be: ....
simple lift/push ON/OFF button....
 

Carl Cerecke

Petr said:
Sorry, I can't get it. Can you please show me how to use your approach
on the simple push/push ON/OFF button, for example?

PS: seriously, it is not homework :) and I feel it is a shame to be
asking such simple questions :(

States: ON, OFF
Transition events: "push", "lift"

transition diagram:
=========================

        lift
        .--.
        |  v
      +------+
 ,--->|  ON  |---.
 |    +------+   |
 | lift     push |
 |    +------+   |
 '----| OFF  |<--'
      +------+
        ^  |
        '--'
        push

As a starting point, how about:

l = 'lift'
p = 'push'

action_sequence = [l,p,p,l,l,p,l,p,None]
next_action = iter(action_sequence).next

s_on = compile('''
print 'on'
action = next_action()
if action == 'lift':
    state = s_on
elif action == 'push':
    state = s_off
else:
    state = None
''','','exec')

s_off = compile('''
print 'off'
action = next_action()
if action == 'lift':
    state = s_on
elif action == 'push':
    state = s_off
else:
    state = None
''','','exec')


state = s_on # start state
while state:
    eval(state)
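Traced by hand, and assuming the snippet is run at module level (where eval of the exec-compiled code shares the module namespace; as noted further down the thread, it does not work inside a function), the action sequence above should print: on, on, off, off, on, on, off, on, off.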



Cheers,
Carl
 

Bengt Richter

How about something like
>>> actions = dict(
...     a=compile('print "A"; state="b"','','exec'),
...     b=compile('print "B"; state="c"','','exec'),
...     c=compile('print "C"; state=None','','exec')
...     )
>>> state = 'a'
>>> while state: eval(actions[state])
...
A
B
C

Good idea. But we can eliminate the dictionary lookup:

a1 = compile('print "A"; state=b1','','exec')
b1 = compile('print "B"; state=c1','','exec')
c1 = compile('print "C"; state=None','','exec')

state = a1
while state:
    eval(state)
Cool. But unfortunately, neither version works inside a function's local namespace.
Using exec instead of eval seems to do it in either context though.
Now, how can we get optimized code (i.e., LOAD_FAST vs LOAD_NAME etc in a1 etc)
without byte code hackery?
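For illustration, a minimal sketch (not from the thread) using the stdlib dis module shows the difference: names assigned inside a def compile to STORE_FAST/LOAD_FAST, while the same statements compiled as a standalone 'exec' code object use STORE_NAME/LOAD_NAME.

import dis

def f():
    x = 1
    return x

code = compile('x = 1; y = x', '', 'exec')

dis.dis(f)      # function locals: STORE_FAST / LOAD_FAST
dis.dis(code)   # module-style namespace: STORE_NAME / LOAD_NAME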

Regards,
Bengt Richter
 

Wolfgang Keller

I believe the more modern approach to this is to use generators in some
way, yielding each other as the next state. This way you avoid almost
all the function call overhead (the part that takes significant time,
which is setting up the stack frame) and don't have to resort to
bytecode hacks for better performance.

Dumb question from an endless newbie scripting dilettante:

Do you have a reference to a cookbook example for this method?

Side question: as I understand it, as a general rule generators are more
efficient than functions in any case where the function is called several
times, right? So basically if I want to write a long-running program in
Python, it would make sense to code all functions that are likely to be
called more than once as generators...
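One way to check that premise empirically is a quick measurement; a minimal sketch using the stdlib timeit module, with made-up toy functions f and g (whether the generator actually wins depends on the Python version and on how much work the body does):

import timeit

setup = """
def f():
    return 1

def g():
    while 1:
        yield 1

gnext = g().next
"""

print timeit.Timer('f()', setup).timeit()      # calling a plain function
print timeit.Timer('gnext()', setup).timeit()  # resuming a generator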

TIA,

Sincerely,

Wolfgang Keller
 

Carl Cerecke

Wolfgang said:
Dumb question from an endless newbie scripting dilettante:

Do you have a reference to a cookbook example for this method?

It turns out that generators are more efficient than the eval function
executing bits of compiled code. About 20-25% faster.

However, the generators exhibit some weird behaviour.

The included code shows an example using a compile/eval FSM method, and
a generator method. Note, in particular, the pattern of ids of the
generator objects. I had expected some complications in the ids, but not
a repeatable sequence of length 3 after the first generator object.

What is going on? Anybody?
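A likely explanation (sketched below, as an assumption about CPython rather than something stated in the thread): id() is just the object's address, and once a generator object becomes garbage, the next object of the same kind is often allocated at the same address, so a create-and-discard pattern cycles through a small set of ids.

class G(object):
    pass

# Each G() instance becomes garbage before the next one is created, so
# CPython tends to hand back the same address, and therefore the same id.
print [id(G()) for i in range(5)]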

#!/usr/bin/env python

import time

s_on = compile('''
print 'on'
action = next_action()
if action == 'lift':
    state = s_on
elif action == 'push':
    state = s_off
else:
    state = None
''','','exec')

s_off = compile('''
print 'off'
action = next_action()
if action == 'lift':
    state = s_on
elif action == 'push':
    state = s_off
else:
    state = None
''','','exec')


def g_on():
    print "on"
    action = next_action()
    if action == 'lift':
        yield g_on()
    elif action == 'push':
        yield g_off()
    else:
        yield None

def g_off():
    print "off"
    action = next_action()
    if action == 'lift':
        yield g_on()
    elif action == 'push':
        yield g_off()
    else:
        yield None


def actions(n):
    import random
    for i in range(n-1):
        yield random.choice(['lift','push'])
    yield None

#r = 1000000
r = 10
next_action = actions(r).next

#a = time.clock()
while next_action():
    pass
#z = time.clock()
#print "action generator:",z-a
next_action = actions(r).next
#print "---"
state = s_on # start state
#a = time.clock()
while state:
    eval(state)
#z = time.clock()
#print z-a
print "---"

next_action = actions(r).next
s_g_on = g_on()
s_g_off = g_off()
state = s_g_on
#a = time.clock()
while state:
    print id(state)
    state = state.next()
#z = time.clock()
#print z-a


#Cheers,
#Carl.
 

skip

Wolfgang> So basically if I want to write a long-running program in
Wolfgang> Python, it would make sense to code all functions that are
Wolfgang> likely to be called more than once as generators...

If they need to resume their calculations from where they left off after the
last yield.
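A minimal illustration of that distinction (a made-up example, not from the thread): the generator's locals survive between calls, which a plain function can only imitate with globals or a class.

def counter():
    n = 0
    while True:
        n += 1
        yield n        # execution resumes here, with n preserved, on the next call

c = counter().next
print c(), c(), c()    # -> 1 2 3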

Skip
 

Wolfgang Keller

def g_on():
    print "on"
    action = next_action()
    if action == 'lift':
        yield g_on()
    elif action == 'push':
        yield g_off()
    else:
        yield None

def g_off():
    print "off"
    action = next_action()
    if action == 'lift':
        yield g_on()
    elif action == 'push':
        yield g_off()
    else:
        yield None

Amazing.

Executable pseudo-code, really. :)

And that's even (run-time) efficient?

Thanks a lot,

Sincerely,

Wolfgang Keller
 

Fredrik Lundh

Carl said:
It turns out that generators are more efficient than the eval function
executing bits of compiled code. About 20-25% faster.

why are you using generators to return things from a function, when
you can just return the things ?

def f_on():
    print "on"
    action = next_action()
    if action == 'lift':
        return f_on
    elif action == 'push':
        return f_off
    # any other action (e.g. None) falls off the end and returns None,
    # which stops the driver loop below

def f_off():
    print "off"
    action = next_action()
    if action == 'lift':
        return f_on
    elif action == 'push':
        return f_off

state = f_on
while state:
    state = state()

</F>
 

Carl Cerecke

Fredrik said:
Carl Cerecke wrote:




why are you using generators to return things from a function, when
you can just return the things ?

Trying to find the fastest way to implement a finite state machine.
The included file times 4 different ways, with functions the fastest.

The reason, I think, for the unusual sequence of id()s of the generators
- see grandparent post - (and the reason for their poor performance
compared to functions) is that the reference to the generator is
being lost, then another generator is being created with the same id.
Properly done, I would expect generators to outperform functions.

These are the numbers I get on a P3 600:
$ ./foo.py
action generator overhead: 13.25
---
exec: 8.7
---
eval: 10.09
---
generators: 6.68
---
functions: 3.37


#!/usr/bin/env python

import time

s_on = compile('''
#print 'on'
action = next_action()
if action == 'lift':
    state = s_on
elif action == 'push':
    state = s_off
else:
    state = None
''','','exec')

s_off = compile('''
#print 'off'
action = next_action()
if action == 'lift':
    state = s_on
elif action == 'push':
    state = s_off
else:
    state = None
''','','exec')

def f_on():
    action = next_action()
    if action == 'lift':
        return f_on
    elif action == 'push':
        return f_off
    else:
        return None

def f_off():
    action = next_action()
    if action == 'lift':
        return f_on
    elif action == 'push':
        return f_off
    else:
        return None


def g_on():
    #print "on"
    action = next_action()
    if action == 'lift':
        yield g_on()
    elif action == 'push':
        yield g_off()
    else:
        yield None

def g_off():
    #print "off"
    action = next_action()
    if action == 'lift':
        yield g_on()
    elif action == 'push':
        yield g_off()
    else:
        yield None


def actions(n):
    import random
    for i in range(n-1):
        yield random.choice(['lift','push'])
    yield None

r = 1000000
#r = 10
next_action = actions(r).next

a = time.clock()
while next_action():
    pass
z = time.clock()
print "action generator overhead:",z-a
common = z-a

print "---"
next_action = actions(r).next
state = s_on # start state
a = time.clock()
while state:
    exec state
z = time.clock()
print "exec:",z-a-common

print "---"
next_action = actions(r).next
state = s_on # start state
a = time.clock()
while state:
    eval(state)
z = time.clock()
print "eval:",z-a-common

print "---"
next_action = actions(r).next
s_g_on = g_on()
s_g_off = g_off()
state = s_g_on
a = time.clock()
while state:
    #print id(state)
    state = state.next()
z = time.clock()
print "generators:",z-a-common

print "---"
next_action = actions(r).next
state = f_on
a = time.clock()
while state:
    state = state()
z = time.clock()
print "functions:",z-a-common
 

Carl Cerecke

Carl said:
Trying to find the fastest way to implement a finite state machine.
The included file times 4 different ways, with functions the fastest.

The reason, I think, for the unusual sequence of id()s of the generators
- see grandparent post - (and the reason for their poor performance
compared to functions) is that the reference to the generator is
being lost, then another generator is being created with the same id.
Properly done, I would expect generators to outperform functions.

Generator FSM done properly (well, better anyway). They are still almost
twice as slow as plain function calls, though.

import time

def g_on():
    while 1:
        action = next_action()
        if action == 'lift':
            yield s_g_on
        elif action == 'push':
            yield s_g_off
        else:
            break
    yield None

def g_off():
    while 1:
        action = next_action()
        if action == 'lift':
            yield s_g_on
        elif action == 'push':
            yield s_g_off
        else:
            break
    yield None

def actions(n):
    import random
    for i in range(n-1):
        yield random.choice(['lift','push'])
    yield None

r = 1000000
#r = 10
next_action = actions(r).next
s_g_on = g_on()
s_g_off = g_off()
state = s_g_on

a = time.clock()
while state:
    state = state.next()
z = time.clock()
print "generators:", z - a
 

Carl Cerecke

Adding a continue statement after the yield statements yields :) a
speed increase. Still not as good as functions though (about 30% slower).
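A sketch of the change being described, assuming the g_on generator from the previous post: the continue jumps straight back to the top of the while loop after the yield, instead of first falling through to the end of the loop body.

def g_on():
    while 1:
        action = next_action()
        if action == 'lift':
            yield s_g_on
            continue            # straight back to the top of the loop
        elif action == 'push':
            yield s_g_off
            continue
        else:
            break
    yield None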

Cheers,
Carl
 

Bengt Richter

Wolfgang> So basically if I want to write a long-running program in
Wolfgang> Python, it would make sense to code all functions that are
Wolfgang> likely to be called more than once as generators...

If they need to resume their calculations from where they left off after the
last yield.
Hm, I wonder how (all untested)

def foo(x,y):
    return x**2+y**2

for pair in pairs: print foo(*pair)

would compare to

def bar(arglist):
    while True:
        x,y = arglist
        yield x**2 + y**2

barargs = []
ibar = bar(barargs).next
for barargs[:] in pairs: print ibar()

Of course, for simple stuff there's always manual inlining ;-)

for x, y in pairs: print x**2, y**2

Hm, <bf warning> it might be interesting if one could bind arg list items of
a generator from the outside, e.g.,

def gfoo(x, y):
    while True:
        yield x**2 + y**2

ifoo = gfoo('dummy','dummy') # or first pair
for ifoo.x, ifoo.y in pairs: print ifoo.next()
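Generator objects don't actually expose their arguments as attributes, so the ifoo.x/ifoo.y binding above won't work as written; a working approximation of the idea (a sketch with a made-up Args holder class) is to pass in a small mutable object and rebind its attributes from the outside:

class Args(object):
    pass

def gfoo(a):
    while True:
        yield a.x**2 + a.y**2

pairs = [(1, 2), (3, 4), (5, 6)]
a = Args()
ifoo = gfoo(a).next
for a.x, a.y in pairs:      # attribute references are valid assignment targets
    print ifoo()            # -> 5, 25, 61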

Regards,
Bengt Richter
 

Wolfgang Keller

So basically if I want to write a long-running program in
This was meant as a question.
If they need to resume their calculations from where they left off after the
last yield.

Well, no, independently from that.

Just to avoid the initial overhead of the function call.

?

Sincerely,

Wolfgang Keller
 

Fredrik Lundh

Wolfgang said:
This was meant as a question.


Well, no, independently from that.

Just to avoid the initial overhead of the function call.

?

what makes you think that resuming a generator won't involve function
calls ?

</F>
 

Wolfgang Keller

what makes you think that resuming a generator won't involve function

That was not what I wrote.

I believe the more modern approach to this is to use generators in some
way, yielding each other as the next state. This way you avoid almost
all the function call overhead (the part that takes significant time,
which is setting up the stack frame)

The way I understand this, resuming a generator causes less overhead than the
initial overhead of a function call.

Again, I'm just a poor scripting dilettante who's asking questions.

Sincerely,

Wolfgang Keller
 

Bengt Richter

Adding a continue statement after the yield statements yields :) a
speed increase. Still not as good as functions though (about 30% slower).

Cheers,
Carl
<snip>
I think I would use a generator to do state transitions, but make the state
external, or at least the part that's interesting to the outside world.

In this peculiar example the transition rules are the same for both states,
so you only need one implementation of the logic. So the example is not so nice.

>>> class State(object): pass
...
>>> def fsm(state, events):
...     for action in events:
...         if action == 'lift': state.name = 'ON'
...         elif action == 'push': state.name = 'OFF'
...         else:
...             state.name = 'END'
...             break
...         yield state
...
>>> def actions(n):
...     import random
...     return iter([random.choice(['lift','push']) for i in range(n-1)] + [None])
...
>>> def test(r):
...     state = State()
...     state.name = 'ON'
...     from time import clock
...     t0 = clock()
...     for state in fsm(state, actions(r)): pass
...     t1 = clock()
...     print '%12.6f'%((t1-t0)/r)
...
>>> test(1000)
    0.000058
>>> test(1000)
    0.000032
>>> test(1000)
    0.000032
>>> test(100000)
    0.000032
>>> a = list(actions(10))
>>> a
['lift', 'push', 'push', 'lift', 'push', 'lift', 'push', 'lift', 'lift', None]
>>> state = State()
>>> state.name = 'START'
>>> f = fsm(state, a)
>>> for state in f: print state.name,
...
ON OFF OFF ON OFF ON OFF ON ON
Obviously you can keep state in the fsm generator by placing yields in different places and
looping in different ways, but always storing the externally interesting state as attributes
of the state parameter and yielding that to tell the world the latest. Since only attributes
are being modified, the original state binding could be used and the generator's yielded
value could be ignored, but it could be handy if the generator is passed around IWT.

The world could also feed info in as attributes of state. And other generators could share
the same external state variable and all kinds of weird things could be built ;-)

Regards,
Bengt Richter
 

Magnus Lycka

Wolfgang said:
The way I understand this, resuming a generator causes less overhead than the
initial overhead of a function call.

I don't have Python 2.4 handy, but it doesn't seem to be true in 2.3.
I'm not very proficient with generators though, so maybe I'm doing
something stupid here...
>>> def f():
...     return 1
...
>>> def g():
...     while 1:
...         yield 1
...
...     start = time.time()
...     for i in xrange(n):
...         c()
...     print time.time()-start
...
0.293347120285

For reference:

...     start = time.time()
...     for i in xrange(n):
...         pass
...     print time.time()-start
...
0.0523891448975

Maybe the ratio is completely different in a newer Python than
2.3.4 (RH EL3 standard install). Or maybe it's very different if
there are plenty of local variables etc in f / g.
 

Peter Hansen

Wolfgang said:
That was not what I wrote.



The way I understand this, resuming a generator causes less overhead than the
initial overhead of a function call.

I think in the spirit of avoiding "premature optimization" and all
things related, now is the time to re-inject the other part of my
above-quoted posting, where I also said:

"""
Of course, if you have a state machine with many small states each doing
a tiny bit of processing and you're still concerned over performance,
you probably should be looking into Psyco or Pyrex and avoid making your
code really unreadable.
"""

-Peter
 

skip

Wolfgang> So basically if I want to write a long-running program in
Wolfgang> Python, it would make sense to code all functions that are
Wolfgang> likely to be called more than once as generators...

Skip> If they need to resume their calculations from where they left off
Skip> after the last yield.

Bengt> Hm, I wonder how (all untested)
...
Bengt> would compare to
...

I was thinking about things like complex tokenizers. Take a look, for
example, at the SpamBayes tokenizer and think about how to implement it
without yield. Clearly it can be done (and not all that difficult). Still,
I think the generator version would be easier to understand and modify.
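A minimal sketch of the kind of thing meant here (a made-up toy tokenizer, not the SpamBayes one): the scan position and the partially built token simply live in local variables across yields.

def tokens(text):
    word = ''
    for ch in text:
        if ch.isalnum():
            word += ch          # keep building the current token
        elif word:
            yield word          # emit it and resume scanning where we left off
            word = ''
    if word:
        yield word

print list(tokens("Subject: buy cheap meds now!"))
# -> ['Subject', 'buy', 'cheap', 'meds', 'now']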

Skip
 

skip

Wolfgang> Well, no, independently from that.

Wolfgang> Just to avoid to inital overhead of the function call.

How do you pass in parameters? Consider:

def square(x):
    return x*x

vs

def square(x):
    while True:
        yield x*x

How do you get another value of x into the generator?
>>> def square(x):
...     while True:
...         yield x*x
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: expected 0 arguments, got 1

Skip
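For what it's worth, later Pythons (2.5 and up, via PEP 342) added one answer to this question: generator.send() pushes a value back into the paused yield expression.

def square():
    x = yield              # wait for the first value
    while True:
        x = yield x*x      # yield the square, then receive the next x

sq = square()
sq.next()                  # advance to the first yield
print sq.send(3)           # -> 9
print sq.send(5)           # -> 25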
 
