Programming D. E. Knuth in Python with the Deterministic Finite Automaton construct

Antti J Ylikoski

In his legendary book series The Art of Computer Programming,
Professor Donald E. Knuth presents many of his algorithms as a
sequence of numbered phases, with instructions to GOTO another phase
interspersed in the text of the individual phases.

That is, they look like the following, purely invented, example (Knuth
is clearer than I am below):



A1. (Do the work of Phase A1.) If <zap> then go to Phase A5,
otherwise continue.

A2. (Do some work.) If <zorp> go to Phase A4.

A3. (Some more work.)

A4. (Do something.) If <condition ZZZ> go to Phase A1.

A5. (Something more). If <foobar> then go to Phase A2, otherwise
end.



I ran into the question of what would be the clearest way to program
such algorithms in a language such as Python, which has no GOTO
statement. It struck me that the above construction is actually a
modified Deterministic Finite Automaton with states A1 -- A5 + [END],
which transfers to different states not on reading input, but
according to conditions in the running program.

So one very clear way to program Knuth in Python is the following
kind of construct.



continueLoop = 1
nextState = "A1"

while continueLoop:
    if nextState == "A1":
        # (Do the work of Phase A1.)
        if zap:
            nextState = "A5"
        else:
            nextState = "A2"  # "otherwise continue" to the next phase
    elif nextState == "A2":
        # (Do some work.)
        if zorp:
            nextState = "A4"
        else:
            nextState = "A3"
    elif nextState == "A3":
        # (Some more work.)
        nextState = "A4"
    elif nextState == "A4":
        # (Do something.)
        if ZZZ:
            nextState = "A1"
        else:
            nextState = "A5"
    elif nextState == "A5":
        # (Something more).
        if foobar:
            nextState = "A2"
        else:
            continueLoop = 0
    else:
        raise RuntimeError("Impossible -- I quit!")



The following is a working Python function that iteratively computes
the permutations of the integers [1, 2, 3, 4, ..., n] in lexicographic
order, where n is an arbitrary positive integer. The function was
written after D. E. Knuth with the above-mentioned DFA construct.




def iterAllPerm(n):

    # iteratively generate all permutations of n integers 1-n
    # After Donald Knuth, The Art of Computer Programming, Vol4,
    # Fascicle 2,
    # ISBN 0-201-85393-0. See pp. 39--40.

    listofPerm = []  # list of lists to collect permutations
    continueLoop = 1  # indicates whether to continue the iteration
    nextStat = "L1"  # next phase in Knuth's text
    a = list(range(0, n+1))  # [0, 1, 2, 3, 4, ..., n] -- see Knuth

    while continueLoop:
        if nextStat == "L1":
            # visit: store a copy of the current permutation
            listofPerm.append(a[1:n+1])
            nextStat = "L2"
            continueLoop = 1
        elif nextStat == "L2":
            # find the largest j with a[j] < a[j+1]; a[0] is a sentinel
            j = n - 1
            while a[j] >= a[j+1]:
                j -= 1
            if j == 0:
                continueLoop = 0
                nextStat = "Finis Algorithm"
            else:
                continueLoop = 1
                nextStat = "L3"
        elif nextStat == "L3":
            # find the largest l with a[j] < a[l] and swap the two
            l = n
            while a[j] >= a[l]:
                l -= 1
            a[j], a[l] = a[l], a[j]
            nextStat = "L4"
            continueLoop = 1
        elif nextStat == "L4":
            # reverse a[j+1..n]
            k = j + 1
            l = n
            while k < l:
                a[k], a[l] = a[l], a[k]
                k += 1
                l -= 1
            nextStat = "L1"
            continueLoop = 1
        else:
            raise RuntimeError("Impossible -- I quit!")

    return listofPerm




kind regards, Antti J Ylikoski
Helsinki, Finland, the EU
 
Mel Wilson

Antti said:
I ran into the question of what would be the clearest way to program
such algorithms in a language such as Python, which has no GOTO
statement. It struck me that the above construction is actually a
modified Deterministic Finite Automaton with states A1 -- A5 + [END],
which transfers to different states not on reading input, but
according to conditions in the running program.

So one very clear way to program Knuth in Python is the following
kind of construct.

Yeah. This is an idea that came up during the '70s after Dijkstra published
his "GOTO Considered Harmful". Model the execution pointer as a state, and
then explicit changes to the execution pointer (prohibited in GOTO-less
languages) get replaced by assignments to the state. It preserves the
objectionable part of GOTO: that there's no easy way to predict the
conditions that any statement might execute under. You can't understand any
of the program until you understand all of the program.

I think Knuth kept the assembly-language model for his algorithms because it
promotes his real goal, which is mathematical analysis of the performance of
the algorithms. It helps that his algorithms are very short.

As "the quickest way to get a Knuth algorithm running in Python", this is a
pretty good idea. My own preference is to get the algo "really" coded in
Python, but that usually takes a fair bit of time and effort.
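
For comparison, here is a minimal sketch of what "really coded in Python" might look like for the permutation function posted above. It follows the same four steps (L1 to L4), but the jumps are folded into ordinary loops. The name `lex_permutations` and the generator form are illustrative choices, not taken from the original post or from Knuth, and the sketch assumes n >= 1.

```python
def lex_permutations(n):
    """Yield the permutations of 1..n in lexicographic order (assumes n >= 1)."""
    a = list(range(n + 1))  # a[0] = 0 acts as a sentinel, as in the post above
    while True:
        # L1: visit the current permutation (a copy, without the sentinel)
        yield a[1:]
        # L2: find the largest j with a[j] < a[j+1]
        j = n - 1
        while a[j] >= a[j + 1]:
            j -= 1
        if j == 0:
            return  # a[1..n] is in decreasing order: nothing left to visit
        # L3: find the largest l with a[j] < a[l], then swap the two
        l = n
        while a[j] >= a[l]:
            l -= 1
        a[j], a[l] = a[l], a[j]
        # L4: reverse the tail a[j+1..n]
        a[j + 1:] = reversed(a[j + 1:])


print(list(lex_permutations(3)))
```

The only "goto" that survives is the outer `while True`, which plays the role of "return to L1"; the state variable disappears because each step simply falls through to the next.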

Mel.
 
Roy Smith

Antti J Ylikoski said:
I ran into the question of what would be the clearest way to program
such algorithms in a language such as Python, which has no GOTO
statement. It struck me that the above construction is actually a
modified Deterministic Finite Automaton with states A1 -- A5 + [END],
which transfers to different states not on reading input, but
according to conditions in the running program.

So one very clear way to program Knuth in Python is the following
kind of construct.




Oh, my, I can't even begin to get my head around all the nested
conditionals. And that for a nearly trivial machine with only 5 states.
Down this path lies madness. Keep in mind that Knuth wrote The Art of
Computer Programming in the 1960s. The algorithms may still be valid,
but we've learned a lot about how to write readable programs since then.
Most people today are walking around with phones that have far more
compute power than the biggest supercomputers of Knuth's day. We're no
longer worried about bumming every cycle by writing in assembler.

When I've done FSMs in Python, I've found the cleanest way is to make
each state a function. Do something like:

def a1(input):
    # (Do the work of Phase A1.)
    if zap:
        return a5  # goto state a5
    else:
        return a1  # stay in the same state

# and so on for the other states.

next_state = a1
for input in whatever():
    next_state = next_state(input)
    if next_state is None:
        break

You can adjust that for your needs. Sometimes I have the states return
a (next_state, output) tuple. You could use a distinguished done()
state, or just use None for that. I wrote the example above as global
functions, but more commonly these would be methods of some StateMachine
class.
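
As a hedged illustration of the "methods of some StateMachine class" variant, the sketch below recasts the permutation function from the first post with one method per step, each returning the next step (or None to stop). The class name `PermutationMachine` and its attributes are hypothetical. Unlike the input-driven loop above, the states here take no input, because the transitions depend only on the program's own variables.

```python
class PermutationMachine:
    """Steps L1..L4 as methods; each method returns the next step to run."""

    def __init__(self, n):
        self.n = n                   # assumes n >= 1
        self.a = list(range(n + 1))  # a[0] = 0 is a sentinel
        self.j = 0                   # set in l2, used by l3 and l4
        self.visited = []

    def l1(self):
        self.visited.append(self.a[1:])  # visit a copy of the permutation
        return self.l2

    def l2(self):
        j = self.n - 1
        while self.a[j] >= self.a[j + 1]:
            j -= 1
        self.j = j
        return self.l3 if j > 0 else None  # None means "terminate"

    def l3(self):
        a, j, l = self.a, self.j, self.n
        while a[j] >= a[l]:
            l -= 1
        a[j], a[l] = a[l], a[j]
        return self.l4

    def l4(self):
        j = self.j
        self.a[j + 1:] = self.a[j + 1:][::-1]  # reverse the tail
        return self.l1

    def run(self):
        state = self.l1
        while state is not None:
            state = state()
        return self.visited


print(PermutationMachine(3).run())
```

The driver loop in `run` never changes when steps are added or removed; variables that several steps share (here `a` and `j`) have to live on the instance, which makes the coupling between steps explicit.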
 
Michael Torrie

The resulting code is inefficient, difficult to comprehend and to maintain.

One should rewrite the code. There is a reason why Python doesn't have
gotos.

We appear to have a language barrier here. How should one rewrite the
code? Everyone knows Python doesn't have gotos and that state machines
have to be created using other mechanisms like loops, state variables,
and such. Your suggestion to "rewrite the code" is unhelpful to the OP if
you're not willing to suggest the best method for doing so. Saying, "be
like a decompiler" doesn't say anything.
 
Antti J Ylikoski

Roy Smith said:
When I've done FSMs in Python, I've found the cleanest way is to make
each state a function.

Thank you, in my opinion that is a very good idea.

Antti "Andy"
 
John Nagle

Right. Few programs should be written as state machines.
As a means of rewriting Knuth's algorithms, it's inappropriate.

Some should. LALR(1) parsers, such as what YACC and Bison
generate, are state machines. They're huge collections of nested
switch statements.

Python doesn't have a "switch" or "case" statement. Which is
surprising, for a language that loves dictionary lookups.
You can create a dict full of function names and lambdas, but
it's clunky looking.

John Nagle
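
To make the dict-based dispatch concrete, here is a small sketch that uses Euclid's algorithm as the example (a three-step algorithm picked here only because it is short; it is not part of the thread). The function name `euclid_gcd`, the step names, and the shared dictionary `v` are all illustrative assumptions. Each step is a small function that returns the name of the next state, and a dict maps state names to functions.

```python
def euclid_gcd(m, n):
    """Greatest common divisor of positive integers m and n, via a dispatch table."""
    v = {"m": m, "n": n, "r": 0}  # working variables shared by the steps

    def e1():  # E1: find the remainder
        v["r"] = v["m"] % v["n"]
        return "E2"

    def e2():  # E2: is it zero? If so, stop; n is the answer
        return "END" if v["r"] == 0 else "E3"

    def e3():  # E3: reduce, then go back to E1
        v["m"], v["n"] = v["n"], v["r"]
        return "E1"

    steps = {"E1": e1, "E2": e2, "E3": e3}  # state name -> step function

    state = "E1"
    while state != "END":
        state = steps[state]()  # an unknown state name raises KeyError
    return v["n"]


print(euclid_gcd(119, 544))
```

Compared with the if/elif chain, the "impossible state" branch comes for free as a KeyError, at the cost of having to share the working variables through some container.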
 
Michael Torrie

Why should I write a treatise on decompilation techniques on this ng?

You were the one who said simply, "you're doing it wrong," followed by a
terse statement, "do it like a decompiler." I am familiar with how one
might implement a decompiler, as well as a compiler (having written a
simple one in the past), but even now I don't see a connection between a
decompiler and the process of converting a Knuth algorithm into a Python
implementation. I was hoping you would shed some light on that.
But alas, I'm not really as much of an "interested reader" as you would
like me to be.

That looks like a glaring contradiction to me...

True, if you wish to be pedantic. I should have said, "meaningless," or
at least, "not a useful response."

Here's an example of rewriting:
<snip>

Thank you. Your example makes your assertion about "labels" clearer, and
how A1 and A5 were really the only real labels in the example.
Though I still do not really see why "states" is not a good equivalent
for labels in this case. As well, Roy's idea for doing the state
machines, which works as well as the nested if statements, is more
Pythonic, which is generally preferred in Python.
 
Steven D'Aprano

Thank you. Your example makes more clear your assertion about "labels"
and how really A1 and A5 were the only real labels in the example.
Though I still do not really see why "states" is not a good equivalence
for labels in this case.

Program labels are states.

You can treat every line of code as being invisibly labelled with the
line number. (Or visibly, if you are using BASIC back in 1975.) Clearly
"the interpreter is executing at line 42" is a state distinct from "the
interpreter is executing line 23", but that state *alone* is not
sufficient to know the overall state of the program.

Adding an explicit GOTO label does not change this.

But this refers to the state of the interpreter, not the state of the
program being executed, and either way, is not a state in the sense of a
finite state machine.
 
Evan Driscoll

In his legendary book series The Art of Computer Programming,
Professor Donald E. Knuth presents many of his algorithms in the form
that they have been divided in several individual phases, with
instructions to GOTO to another phase interspersed in the text of the
individual phases.


A1. (Do the work of Phase A1.) If <zap> then go to Phase A5,
otherwise continue.

A2. (Do some work.) If <zorp> go to Phase A4.

A3. (Some more work.)

A4. (Do something.) If <condition ZZZ> go to Phase A1.

A5. (Something more). If <foobar> then go to Phase A2, otherwise
end.

Clearly you just need the goto module (http://entrian.com/goto/):

from goto import goto, label

label .A1
# do work of phase A1
if <zap>: goto .A5

label .A2
# do some work
if <zorp>: goto .A4

# do some more work

label .A4
# do something
if <condition zzz>: goto .A1

label .A5
# something more
if <foobar>: goto .A2

Clearly the best solution of all.

(Note: do not actually do this.)

Evan


 
Albert van der Horst

Program labels are states.

You can treat every line of code as being invisibly labelled with the
line number. (Or visibly, if you are using BASIC back in 1975.) Clearly
"the interpreter is executing at line 42" is a state distinct from "the
interpreter is executing line 23", but that state *alone* is not
sufficient to know the overall state of the program.

This is the idea of the original (not universal, hard coded) Turing
machine with cards. Of course you then still need the infinite tape
to store calculation input and output.

Adding an explicit GOTO label does not change this.

But this refers to the state of the interpreter, not the state of the
program being executed, and either way, is not a state in the sense of a
finite state machine.

I hope the reference to the Turing machine makes this clearer.
Turing Machines and Finite State Machines are different constructions
in automaton theory.

Remember those definitions are like

A Turing machine is a set <S, T, F, G, Q >

S the set of symbols <blank, 0, 1>
T a mapping of S onto IZ (natural numbers)
...
F is a mapping from SxT into G
..

Some such.

(A FSM is just different <A,B,C..Z> with different mappings )

The memory of the Turing machine is T, the tape, which is time-dependent.
The program of the Turing machine is e.g. F, to be thought of as hard wiring.

A Turing machine is *not* a stored program computer!
The universal Turing machine is, it contains a hardwired program
to execute a stored program on the tape.
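
The distinction between the hard-wired program and the tape can be sketched in a few lines of Python. This is only an informal illustration, not the formal definition alluded to above: the table `INVERT`, the state names, and the function `run_turing_machine` are all made up for the example. The transition table plays the role of the fixed wiring, and the tape is the only memory that changes over time.

```python
from collections import defaultdict

BLANK = "_"

# Hard-wired program: (state, symbol read) -> (symbol to write, head move, next state)
INVERT = {
    ("scan", "0"): ("1", +1, "scan"),
    ("scan", "1"): ("0", +1, "scan"),
    ("scan", BLANK): (BLANK, 0, "halt"),
}


def run_turing_machine(program, tape_input, start="scan", halt="halt"):
    """Run a fixed transition table on a tape; return the non-blank tape contents."""
    tape = defaultdict(lambda: BLANK)  # unbounded tape, blank by default
    for i, symbol in enumerate(tape_input):
        tape[i] = symbol
    state, head = start, 0
    while state != halt:
        write, move, state = program[(state, tape[head])]
        tape[head] = write
        head += move
    cells = [tape[i] for i in sorted(tape)]
    return "".join(cells).strip(BLANK)


print(run_turing_machine(INVERT, "10110"))
```

Here `run_turing_machine` with a fixed table such as `INVERT` corresponds to one particular machine; a universal machine would instead be a single fixed table that reads the description of another machine from the tape.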

Groetjes Albert
 
Albert van der Horst

In his legendary book series The Art of Computer Programming,
Professor Donald E. Knuth presents many of his algorithms in the form
that they have been divided in several individual phases, with
instructions to GOTO to another phase interspersed in the text of the
individual phases.



I. e. they look like the following, purely invented, example: (Knuth is
being clearer than me below.....)



A1. (Do the work of Phase A1.) If <zap> then go to Phase A5,
otherwise continue.

A2. (Do some work.) If <zorp> go to Phase A4.

A3. (Some more work.)

A4. (Do something.) If <condition ZZZ> go to Phase A1.

A5. (Something more). If <foobar> then go to Phase A2, otherwise
end.

I can rewrite this into Python in my sleep, without resorting
to formal techniques.
Instead, try one of the harder algorithms like T (Toom-Cook),
which must be translated to recursive functions that pass
data down. That took me quite a while.

The correct answer is, it is just labour. Deal with it.
Note that if you want to translate it to assembler, it is
relatively easy.

Groetjes Albert
 
Dennis Lee Bieber

Many ASM languages don't have structured control flow statements but
only jmps, which are roughly equivalent to gotos. A good decompiler will
need to analyze the net of jmps and try to rewrite the code using
structured control flow statements.
The idea is to maximize readability, of course.

Never met Sigma's Meta-Symbol <G>

Okay, the machine level code was limited to basic condition/jump...
But a master of Meta-Symbol (I wasn't such -- not in a trimester college
course) could create macros that would make it structured.

In a way, Meta-Symbol wasn't an assembly language so much as a
language for defining assembly languages (I once wasted a few hours at
work writing out the Meta-Symbol definition file needed to produce
absolute-address 8080 output. Even the native Sigma instruction set had
to be loaded into Meta-Symbol before it could process a file).
 
