Python's BNF

M

MartinRinehart

I spent too long Googling for Python's BNF. Eventually found it at
Python.org, but only by accident.

I've put Python's BNF here: http://www.martinrinehart.com/articles/python-parse-bnf.html

Extensively cross-referenced.

Addenda:

No, Google, I didn't want the Botswana daily news with its article on
the Botswana
National Front and another on a fellow arrested for having contraband
python skins.

Did this with a Python program. Used to use Perl for this sort of
thing. Doubt I'll ever write another line of Perl.

There's a link to the program, top-right of page. If anyone wants to
look at this no-longer-quite-newbie's code and give a constructive
critique, I'd be grateful.
 
B

Ben Finney

No, Google, I didn't want the Botswana daily news with its article
on the Botswana National Front and another on a fellow arrested for
having contraband python skins.

+1 QOTW
 
G

Gabriel Genellina

I spent too long Googling for Python's BNF. Eventually found it at
Python.org, but only by accident.

There's a link to the program, top-right of page. If anyone wants to
look at this no-longer-quite-newbie's code and give a constructive
critique, I'd be grateful.

- Try not to use so many globals. By example:

ofile = open( '\decaf\python-parse-bnf.html', 'w' )
writeHTML()
ofile.close()

I'd pass ofile as an argument to writeHTML - along with *what* to write.

- The "\" is an escape character. Use either
'\\decaf\\python-parse-bnf.html'
or
r'\decaf\python-parse-bnf.html'
or
'/decaf/python-parse-bnf.html'
(the later works more-or-less with all internal Windows functions, but
usually not from the command line)

- Instead of:

while i < len( bnf ):
s = bnf

use

for s in bnf:

That is, iterate over the list elements (what you want), not over the
indexes. In case you want also the index, use: for i,item in
enumerate(items):

- The fileRead function could be written simply as: return
open(...).readlines(); but you don't have to keep all the contents in
memory at once, so forget about fileRead and make prepareDict iterate over
the file, line by line; that's easy:

for s in open(...):
if startDef:...

- This whole block:

plist = []
for pname in productions:
plist.append( pname )
plist.sort()

is the same as:

plist = sorted(productions)
 
M

MartinRinehart

Implemented all your suggestions, with two exceptions.

Changed file read to readlines(), but not open(...).readlines(). I
love to say file.close(). Gives me a feeling of security. (We could
discuss RAM waste v. I/O speed but this input file is just 10KB, so
neither matters.)

Removed one of the three globals, but left the other two. Couldn't see
any real advantage to passing poor 'ofile' from hand to hand
(writeHTML() to writeBody() to writeEntries() ...) as opposed to
letting him rest easy in his chair, doing a little writing when
called.

Also changed the opening doc comment to give you appropriate credit.
Thanks again.
 
G

Gabriel Genellina

Implemented all your suggestions, with two exceptions.

Changed file read to readlines(), but not open(...).readlines(). I
love to say file.close(). Gives me a feeling of security. (We could
discuss RAM waste v. I/O speed but this input file is just 10KB, so
neither matters.)

Ok, what about this?

with open(...) as f:
for line in f:
...do something with each line...

The with statement guarantees that close (implicit) is always called.
Removed one of the three globals, but left the other two. Couldn't see
any real advantage to passing poor 'ofile' from hand to hand
(writeHTML() to writeBody() to writeEntries() ...) as opposed to
letting him rest easy in his chair, doing a little writing when
called.

Perhaps not in this small script, but as a design principle, having a
global `ofile` object doesn't look good...
Also changed the opening doc comment to give you appropriate credit.

No need for that.
 
S

Steve Holden

Implemented all your suggestions, with two exceptions.

Changed file read to readlines(), but not open(...).readlines(). I
love to say file.close(). Gives me a feeling of security. (We could
discuss RAM waste v. I/O speed but this input file is just 10KB, so
neither matters.)

Removed one of the three globals, but left the other two. Couldn't see
any real advantage to passing poor 'ofile' from hand to hand
(writeHTML() to writeBody() to writeEntries() ...) as opposed to
letting him rest easy in his chair, doing a little writing when
called.
Note that if you call the parameter to the function "ofile" as well, you
don't need to change *any* other code to get correct behavior.

But in general terms it's a practice you should adopt as early as
possible, because it gets very tedious very quickly to have to trace the
use of global variables in programs of any size.

Avoiding the use of globals is just good program hygiene.

regards
Steve
 
M

MartinRinehart

Gabriel and Steve,

Poor globals! They take such a beating and they really don't deserve
it.

The use of globals was deprecated, if memory serves, during the
structured design craze. Using globals is now considered bad practice,
but it's considered bad practice for reasons that don't stand close
scrutiny, this being a perfect example.

ofile = ...

# global
writeHTML()
def writeHTML():
ofile.write( .. )
writeBody()
def writeBody():
ofile.write( ... )
writeEntries()
def writeEntries()
ofile.write( ... )
writeEntry()
def writeEntry():
ofile.write( ... )
....

# "fixed" to eliminate the evil global
writeHTML(ofile)
def writeHTML(ofile):
ofile.write( .. )
writeBody(ofile)
def writeBody(ofile):
ofile.write( ... )
writeEntries(ofile)
def writeEntries(ofile)
ofile.write( ... )
writeEntry(ofile)
def writeEntry(ofile):
ofile.write( ... )
....
# repeat above for another half dozen subs that also use ofile

The code's simpler before the fix.

So, as a nod to the anti-global school of thought, I changed 'ofile'
to 'OFILE' so that it would at least look like a global constant. Then
I changed to '_OFILE' as a reminder that this is a modular, not
global, constant. Ditto for '_PRODUCTIONS'. Modular constants share
exactly none of the coupling problems that globals can have. You'll
let me use modular constants, right?
 
S

Steve Holden

Gabriel and Steve,

Poor globals! They take such a beating and they really don't deserve
it.

The use of globals was deprecated, if memory serves, during the
structured design craze. Using globals is now considered bad practice,
but it's considered bad practice for reasons that don't stand close
scrutiny, this being a perfect example.
Times move on, but some ideas remain bad ideas, and the *uncontrolled*
used of globals is exactly one such bad idea. You seem to believe that
the tenets of structured programming have been deprecated, but in fact
they have been incorporated into the mainstream.
ofile = ...

# global
writeHTML()
def writeHTML():
ofile.write( .. )
writeBody()
def writeBody():
ofile.write( ... )
writeEntries()
def writeEntries()
ofile.write( ... )
writeEntry()
def writeEntry():
ofile.write( ... )
...

# "fixed" to eliminate the evil global
writeHTML(ofile)
def writeHTML(ofile):
ofile.write( .. )
writeBody(ofile)
def writeBody(ofile):
ofile.write( ... )
writeEntries(ofile)
def writeEntries(ofile)
ofile.write( ... )
writeEntry(ofile)
def writeEntry(ofile):
ofile.write( ... )
...
# repeat above for another half dozen subs that also use ofile

The code's simpler before the fix.
It's also more error-prone and more difficult to read. But since you
should probably be encapsulating all this into an HTMLWriter object
class anyway I suppose there's no point trying to convince you any
further. See below for a point that *might* (just possibly) change your
mind, though.
So, as a nod to the anti-global school of thought, I changed 'ofile'
to 'OFILE' so that it would at least look like a global constant. Then
I changed to '_OFILE' as a reminder that this is a modular, not
global, constant. Ditto for '_PRODUCTIONS'. Modular constants share
exactly none of the coupling problems that globals can have. You'll
let me use modular constants, right?

Nope. You are changing the form without changing the substance, so my
objections still stand. Your code makes the assumption that nobody else
will ever want to reuse it, and by doing so almost guarantees that
nobody will. The principles of modularity are there for a reason, and
while I can admit I have written similar programs myself as quick
throwaways I would never dream of promoting them as examples of sound
practice. I can't believe I am going to have to mention "coupling" and
"coherence" as useful design principles yet again.

One final issue: in cases where your functions and methods do
significant work, the ofile argument is passed into the local namespace
of the function or method. This means that no late-binding name lookup
need be performed, which represents a substantial improvement in
execution speed in the event there are many uses of the same global.

Or perhaps you would rather "optimize" that by writing

ofile = ...
writeHTML()
def writeHTML():
global ofile
my_ofile = ofile
my_ofile.write( .. )
writeBody()

I wish you'd stop trying to defend this code and simply admit that it's
just a throwaway program to which no real significance should be
attached. *Then* I'll leave you alone ;-)

regards
Steve
 
P

Paul McGuire

So, as a nod to the anti-global school of thought, I changed 'ofile'
to 'OFILE' so that it would at least look like a global constant.

Unfortunately, it's not constant at all. Actually, what you have done
is worse. Now you have taken a variable that has mutable state, that
changes in various places throughout the module, and made it look like
a nice safe dependable constant.


_OFILE = None
still_reading = False

def open_file(name):
global _OFILE, still_reading
_OFILE = open(name)
still_reading = True

def process_line(function):
function( _OFILE.readline() )

def extract_data(dataline):
global _OFILE, still_reading
if dataline.startswith("END"):
close_file()

def close_file():
global _OFILE, still_reading
OFILE.close()
OFILE = None
still_reading = False

# main code
open_file()
while still_reading:
process_line(extract_data)
close_file()


Of course, this is a semi-contrived example, but the scattered access
of OFILE and still_reading make sorting out this mess a, well, a
mess. And I wouldn't really say that just passing around ofile as an
argument to every function is necessarily a sufficient solution. I've
seen this approach run amok, with a global data structure
(representing the current runtime environment plus recent history,
plus database connection info, plus sundry other parts of the kitchen
sink) that was passed BY PROJECT CODING STANDARDS to EVERY FUNCTION IN
EVERY MODULE! Supposedly, this was done to cure access problems to a
global data structure. In fact, it solved none of the problems, AND
polluted every function signature to boot!

The main point is that access to ofile should be limited to those
functions that actually update it. If ofile is a global, then there
is no control within a module over it, and if not named with a leading
'_', it is even visible externally to the module. As Steve H said,
ideally this would be a hidden attribute within a class, accessed as
self._ofile, so that we would at least know that there is no access
beyond the boundaries of the class to this variable. It would also
enhance the flexibility of the code - now by instantiating a second
instance of this class, I might be able to work with *two* files at
the same time, without any global array or list legerdemain.

-- Paul

class Horse:
def lead_to_water(self):
self.at_water = True
def make_drink(self):
raise AttributeException("can't do this")
 
M

MartinRinehart

Steve said:
I wish you'd stop trying to defend this code and simply admit that it's
just a throwaway program to which no real significance should be
attached. *Then* I'll leave you alone ;-)

You're hurting my program's feelings!

Actually, I intend to keep this program as the nice HTML version of
the BNF that it produces is a big advance over the source from
GvR. I'll rerun this from time to time to keep my BNF up to date.
A second intent was to have the next sorry coder who searched for
'Python BNF' get a better answer than I got, which is already the
case.

The discussion re globals is entirely academic, of course, with this
little program being merely illustrative.
 
M

MartinRinehart

Paul McGuire wrote:
< plus sundry other parts of the kitchen
sink) that was passed BY PROJECT CODING STANDARDS to EVERY FUNCTION IN
EVERY MODULE! Supposedly, this was done to cure access problems to a
global data structure.

Beautiful example of how totally stupid actions can be taken in the
name of
avoiding globals.

Wikipedia and PEP 8 both agree with me, re globals. OK way to
eliminate chained
argument passing, but keep them within one module.

_OFILE, by the way, is the output file. It's contract with the rest of
the module was
to be available for writing to anyone with data to write. I use the
Java convention
of ALLCAPS for naming things that I would declare as CONSTANT if
Python had
such a declaration.
 
G

Gabriel Genellina

_OFILE, by the way, is the output file. It's contract with the rest of
the module was
to be available for writing to anyone with data to write. I use the
Java convention
of ALLCAPS for naming things that I would declare as CONSTANT if
Python had
such a declaration.

(Uhm... I'd consider _OFILE a variable, not a constant!)
What if you want to generate many files, maybe one per declaration? You
might want to reuse some of those functions, but you can't do that with a
global _OFILE.

About the generated page: I think it would be more useful if each symbol
links to its definition, instead of showing an alert(). This way it's
easier to navigate the tree, specially with complex declarations. You can
place the current text into a "title" attribute and most browsers will
show it as a tooltip.
Also it would be nice if at least it were valid HTML, and semantically
correct - using DL, DT and DD tags instead of tables, by example.
 
M

MartinRinehart

Gabriel said:
About the generated page: I think it would be more useful if each symbol
links to its definition, instead of showing an alert(). This way it's
easier to navigate the tree, specially with complex declarations.

That was my first shot. It didn't work. (Every line is its own table
because
you can't put named anchors inside a table, something I really
regret.)
If the production is already viewable, there's no jump when you click
the
link.. If the production is on the last page (many are)
you jump to the last page and then have to hunt down the production.
Remind me to figure out Ajax so you get what we really want: click on
it
and see its definition highlighted.
You can place the current text into a "title" attribute and most browsers
will show it as a tooltip.

Great! Consider it done. Konqueror, Firefox, Opera and even MSIE.

How would <dl>, <dt> and <dd> make our lives easier?

I also tried putting the definitions into the status bar. If you ever
feel
tempted to do something similar, let the urge pass. (This now works
in Firefox 2 if you use the correct settings to allow javascript to
write
to the status bar. It doesn't work in Konqueror or Opera and I can't
even find the settings for MSIE.) Trying to get this to work DID get me
to considerably improve the readability of the HTML, so it wasn't a
total waste.

The tooltips are a big step forward. Thanks again.

Improved version at
http://www.MartinRinehart.com/articles/python-parse-bnf.html
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,769
Messages
2,569,579
Members
45,053
Latest member
BrodieSola

Latest Threads

Top