pyparsing Catch-22

7stud

To the developer:

1) I went to the pyparsing wiki to download the pyparsing module and
try it.
2) At the wiki, there was no index entry in the table of contents for
Downloads. After searching around a bit, I finally discovered a tiny
link buried in some text at the top of the home page.
3) The link goes to SourceForge. At SourceForge, there was a nice, green
'download' button that stood out from the page.
4) I clicked on the download button and got the warning:
-----
You have selected to download the pyparsing-1.4.6 release.

Below is a list of files contained in this release.
Before downloading, you may want to read the release notes.
-----

5) Can't find any release notes, nor any button to click to download
the package.
6) Give up in frustration.
7) A few minutes later, I decide: I will not give up.
8) I go back to SourceForge and start clicking every link on the page.
(Hello, porn sites! Just kidding.) Still no luck.
9) Finally, I click on something and a download begins. I cancel it.
10) Now I know what to click on, and I download the docs and
pyparsing-1.4.6.tar.
11) Now what? I'm new to Mac OS X, and I have no idea what to do.
The wiki is devoid of any installation instructions.
12) I give up again.

For as hard as you push pyparsing on this forum, I would think you
would make it easier to download and install your module. In my
opinion, the wiki should provide detailed installation instructions
for all supported OSes, and the SourceForge downloading process is too
complicated.
 
Steven Bethard

7stud said:
For as hard as you push pyparsing on this forum, I would think you
would make it easier to download and install your module. In my
opinion, the wiki should provide detailed installation instructions
for all supported os's, and the sourceforge downloading process is too
complicated.

FWIW, here's what works for me::

* Go to http://pyparsing.wikispaces.com/
* Click the link at the top that says "Download from SourceForge"
* Click the big green "Download Python parsing module" button
* Click the big green "Download" button next to "pyparsing-1.4.6"
* Click pyparsing-1.4.6.tar.gz
* Extract the downloaded .tar.gz
* Use the standard python installation idiom "python setup.py install"

If you're not familiar with the standard Python installation idiom, take
a few minutes to read:

"Installing Python Modules"
http://docs.python.org/inst/inst.html

In particular it starts with "The new standard: Distutils", which tells
you to try::

    python setup.py install

HTH,

STeVe
 
7stud

Thanks!

I thought I would write down what I did in case someone else looks
this up:

1) Even though the download at sourceforge said the file name was:

pyparsing-1.4.6.tar.gz

it was downloaded to my Desktop as:

pyparsing-1.4.6.tar

Did OS X 10.4.7 automatically unzip it for me? The .gz extension means the
file was compressed with gzip, but I didn't have to do any unzipping.

2) Apparently, a .tar file is not a compressed file--it just organizes
a bunch of files into one big file or "archive". You still need to do
something to extract all the files from the archive. Here is the
command:

$ tar -xvf /Users/me/Desktop/pyparsing-1.4.6.tar

That command extracts the contents into the current directory (i.e. the
directory the prompt is pointing to), which can overwrite files with
the same names. So, before running it, I created a directory called
tar_temp:

/Users/me/tar_temp

and used the cd command to change into that directory.
(Note: ~ in the prompt is shorthand for /Users/YourHomeDirName.)

3) To run: python setup.py install, you need to get to the top level
directory of the download. So, I cd'ed to the directory:

~/tar_temp/pyparsing-1.4.6

and then ran the setup command:

~/tar_temp/pyparsing-1.4.6$ python setup.py install

I tested the setup by running the hello world program described in the
document:

~/tar_temp/pyparsing-1.4.6/HowToUsePyparsing.html

and it worked.
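
For anyone who would rather script the extraction step than type the tar command, here is a minimal sketch using only Python's standard tarfile and tempfile modules. The helper name extract_to_fresh_dir and the demo-1.0 archive are made up for illustration; the demo builds its own tiny archive so it runs anywhere. Extracting into a newly created directory avoids the overwriting problem described above.

```python
import os
import tarfile
import tempfile


def extract_to_fresh_dir(archive_path, parent_dir):
    """Extract archive_path into a brand-new directory under parent_dir.

    Hypothetical helper: because the target directory is freshly created,
    nothing that already exists can be overwritten by the archive.
    """
    target = tempfile.mkdtemp(prefix="tar_temp_", dir=parent_dir)
    # The default mode "r" detects plain .tar and .tar.gz transparently.
    with tarfile.open(archive_path) as archive:
        archive.extractall(target)
    return target


# Build a small stand-in archive (demo-1.0.tar is an invented name).
work = tempfile.mkdtemp()
src = os.path.join(work, "demo-1.0")
os.mkdir(src)
with open(os.path.join(src, "setup.py"), "w") as f:
    f.write("# placeholder setup script\n")

archive_path = os.path.join(work, "demo-1.0.tar")
with tarfile.open(archive_path, "w") as archive:
    archive.add(src, arcname="demo-1.0")

extracted = extract_to_fresh_dir(archive_path, work)
print(os.listdir(extracted))
```

After this, one would cd into the extracted top-level directory and run python setup.py install as in step 3.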
 
Alex Martelli

7stud said:
1) Even though the download at sourceforge said the file name was:

pyparsing-1.4.6.tar.gz

it was downloaded to my Desktop as:

pyparsing-1.4.6.tar

Did os x 10.4.7 automatically unzip it for me? .gz means the file was
compressed with gzip, but I didn't have to do any unzipping.

You probably have Safari, the browser, set up to do that upon download
(but not set up to also untar the tarfile) -- I believe that those are
its default settings. Still, Safari's just an application, even though
Apple reasonably chooses to bundle it into the OS; you might use
Firefox, Opera, Camino, or any other browser, and the issue of what that
given browser does upon download (and how to change those settings) will
be different each time. It's not really a question at OS level, rather
it depends on each specific browser.


Alex
 
7stud

Thanks.
 
Paul McGuire

7stud said:
For as hard as you push pyparsing on this forum, I would think you
would make it easier to download and install your module. In my
opinion, the wiki should provide detailed installation instructions
for all supported os's, and the sourceforge downloading process is too
complicated.

Me? Push? Boy, a guy posts a couple of examples, tries to help some
people that are stuck with a problem, and what does he get? Called
"pushy"? Sheesh! Fortunately, I get enough positive feedback from
these posts that my feelings are pretty resilient these days.

Anyway, thanks and point taken for the alert on this subject from the
newbie's perspective. When I first wrote these installations and
started the pyparsing project on SF, I was fairly newb myself - I had
to ask Dave Kuhlman to write setup.py for me! So I assumed the target
audience already knew the stuff I was having to learn. I assumed that
setup.py was just common knowledge among the Python world.

I think your suggestion of a Wiki page on this subject should fill
this gap neatly, especially since pyparsing is somewhat targeted at
the newb and near-newb user, one that is struggling with regexp's or
some other parsing technology, and just wants to get some basic code
working. The other posts in this thread contain plenty of material to
start from. Also, thanks for the Mac OS X point of view; most of my
work is on Windows, and a little bit on Linux, but absolutely none on
Mac. And I see that I should not assume knowledge of tar, either, so
I'll be sure to mention its destructive streak, in overwriting
existing files with the same name as those in the archive. Once
untar'ed, there *is* a file named README, with an introduction and
instructions to invoke setup.py properly. But there is little harm in
repeating some of this on the Wiki as well.

I'm glad to see you persevered and got pyparsing installed. You can
also run pyparsing.py itself, which will run a simple SQL parser
test. If you have not yet found the docs or examples, *please* look
over the sample code in the examples directory, and the class-level
documentation in the htmldocs directory. The docs directory should
also include the materials from my PyCon'06 presentations.

Please post back, either here or on the Pyparsing wiki discussion
pages, and let me know how your pyparsing work is progressing.

-- Paul (the developer, but you can call me "Paul")
 
7stud

Paul said:
Me? Push? Boy, a guy posts a couple of examples, tries to help some
people that are stuck with a problem, and what does he get? Called
"pushy"? Sheesh!

Hey, I never called you pushy! Ok, maybe I sounded a little harsh--I
was pretty frustrated after all. I guess I should have said something
along the lines of, "If you are going to promote pyparsing, it would
be nice to be able to see what it is all about."

Paul said:
Once untar'ed, there *is* a file named README, with an introduction and
instructions to invoke setup.py properly.

lol. I read it:

---------------------
Installation
============

Do the usual:

python setup.py install

(pyparsing requires Python 2.3.2 or later.)
------------------------

Not much to go on--not even a mention of what directory you should be
in when you run that command. Plus, you need to extract the files
from the .tar file first.

Paul said:
Please post back, either here or on the Pyparsing wiki discussion
pages, and let me know how your pyparsing work is progressing.

I'm pretty facile with regex's, and after looking at some pyparsing
threads over the last week or so, I was interested in trying it.
However, all of the beginning examples use a Word() in the parse
expression, but I couldn't find an adequate explanation of what the
arguments to Word() are and what they mean. I finally found the
information buried in one of the many documents--the one called
"Using the Pyparsing Module". If that seems like an obvious place to
look, I did start there, but I didn't find it at first. I also
scoured the wiki, and I looked in the file
pycon06-IntroToPyparsing-notes.pdf, which has this:

Basic Pyparsing
Words and Literals
 
7stud

Hmmm. My post got cut off. Here's the rest of it:


Word("ABC", "def") matches "C", "Added", "Beef"
but not "BB", "ACE", "ADD"

That is just baffling. There's no explanation that the characters
specified in the first string are used to match the first char of a
word and that the characters specified in the second string are used
to match the rest of the word. It would also help to know that if
only one string is specified, then the specified characters will be
used to match all the chars in a word. I think you should add a
simple example to your wiki that explains all that.
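
To make the two-string rule concrete, here is a rough model of it written with the standard re module. This is only a sketch of the matching rule as described above, not pyparsing's own implementation; the helper name word_pattern is invented, and the optional min/max/exact length arguments are ignored.

```python
import re


def word_pattern(init_chars, body_chars=None):
    """Regex model of the rule: first char from init_chars, rest from body_chars.

    Hypothetical helper. If body_chars is omitted, init_chars is used for
    every character of the word.
    """
    if body_chars is None:
        body_chars = init_chars
    return re.compile(
        "[%s][%s]*" % (re.escape(init_chars), re.escape(body_chars))
    )


abc_def = word_pattern("ABC", "def")

# The words from the PDF example, checked as whole-string matches.
matching = [w for w in ["C", "Added", "Beef"] if abc_def.fullmatch(w)]
rejected = [w for w in ["BB", "ACE", "ADD"] if not abc_def.fullmatch(w)]
print(matching, rejected)

# With a single string, the same characters are allowed everywhere.
single = word_pattern("abc")
print(bool(single.fullmatch("cab")))
```

"BB" fails because the second "B" is not in the body set "def", even though the first "B" is a legal initial character.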

Also, I think you should state right up front that alphas is a string
made up of the chars a-zA-Z and that nums is a string made up of the
chars 0-9. That way when someone sees Word(alphas), they will
understand exactly what that means. Also since matching any char is a
pretty common thing, I think you should mention what printables is as
well.
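
For reference, equivalent character sets can be built from the standard string module. The three names below mirror the pyparsing ones, but they are defined locally here; in particular, treating printables as "every printable ASCII character except whitespace" is my assumption about what that constant contains.

```python
import string

# Local stand-ins named after the pyparsing constants (not imported from it).
alphas = string.ascii_letters        # a-z and A-Z
nums = string.digits                 # 0-9
# Assumption: printables means the printable characters minus whitespace.
printables = "".join(
    c for c in string.printable if c not in string.whitespace
)

print(len(alphas), nums, len(printables))
```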

In any case this is the example I applied pyparsing to:

Given a .txt file (all on one line). Requirement: construct a list from
the text:
-------------
mara = [
'U2FsdGVkX185IX5PnFbzUYSKg+wMyYg9',
'U2FsdGVkX1+BCxltXVTQ2+mo83Si9oAV0sasmIGHVyk=',
'U2FsdGVkX18iUS8hYBXgyWctqpWPypVz6Fj49KYsB8s='
]
-----------

and this is what I came up with:

----------
from pyparsing import Word, alphas, commaSeparatedList

name = Word(alphas)
lookFor = name + "=" + "[" + commaSeparatedList + "]"

my_file = open("aaa.txt")
for line in my_file:
    # tokens come back as: name, "=", "[", item, item, item, "]"
    alist = lookFor.parseString(line)

    # bind the parsed name to a list of the unquoted items
    globals()[alist[0]] = [alist[3].strip("'"), alist[4].strip("'"),
                           alist[5].strip("'")]

print mara[2]
 
Marc 'BlackJack' Rintsch

7stud said:
However, all of the beginning examples use a Word() in the parse
expression, but I couldn't find an adequate explanation of what the
arguments to Word() are and what they mean. I finally found the
information buried in one of the many documents--the one called
"Using the Pyparsing Module". If that seems like an obvious place to
look, I did start there, but I didn't find it at first.

An obvious place should be the docstring of the `Word` class which says:

Token for matching words composed of allowed character sets.
Defined with string containing all allowed initial characters,
an optional string containing allowed body characters (if omitted,
defaults to the initial character set), and an optional minimum,
maximum, and/or exact length.

Ciao,
Marc 'BlackJack' Rintsch
 
Paul McGuire

<sample problem snipped>
Any tips?

7stud -

Here is the modified code, followed by my comments.

Oh, one general comment - you mention that you are quite facile with
regexp's. pyparsing has a slightly different philosophy from that of
regular expressions, especially in the areas of whitespace skipping
and backtracking. pyparsing will automatically skip whitespace
between parsing expressions, whereas regexp's require explicit
'\s*' (unless you specify the magic "whitespace between elements
allowed" attribute, whose flag character I don't remember at the
moment, and which I rarely see regexp examples use).
And pyparsing is purely a left-to-right recursive descent parser
generator. It won't look ahead to the next element past a repetition
operation to see when to stop repeating. There's an FAQ on this on
the wiki.
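
The regexp side of that whitespace comparison can be sketched with the standard re module. The sample text and both patterns below are invented for illustration. If the flag alluded to above is re.VERBOSE (an assumption on my part), note that it only ignores whitespace written inside the pattern itself; it does not make a pattern skip whitespace in the input, so the explicit \s* is still needed.

```python
import re

text = "mara   =   [ 'abc' ]"

# No allowance for whitespace: fails as soon as it meets the spaces.
strict = re.compile(r"(\w+)=\[(.*)\]")

# Explicit \s* between every pair of elements, as regexps require.
tolerant = re.compile(r"(\w+)\s*=\s*\[\s*(.*?)\s*\]")

strict_result = strict.match(text)
tolerant_result = tolerant.match(text)

print(strict_result)
print(tolerant_result.group(1), tolerant_result.group(2))
```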

------------------
from pyparsing import (Word, alphas, commaSeparatedList, delimitedList,
                       sglQuotedString, removeQuotes)

name = Word(alphas)
lookFor = name + "=" + "[" + commaSeparatedList + "]"

# comment #0
my_file = """\
mara = [
'U2FsdGVkX185IX5PnFbzUYSKg+wMyYg9',
'U2FsdGVkX1+BCxltXVTQ2+mo83Si9oAV0sasmIGHVyk=',
'U2FsdGVkX18iUS8hYBXgyWctqpWPypVz6Fj49KYsB8s='
]"""
my_file = "".join(my_file.splitlines())
# uncomment next line once debugging of grammar is finished
# my_file = open("aaa.txt").read()


# comment #1
#~ my_file = open("aaa.txt")
#~ for line in my_file:
for line in [my_file,]:
    alist = lookFor.parseString(line)

    globals()[alist[0] ] = [ alist[3].strip("'"), alist[4].strip("'"),
                             alist[5].strip("'") ]


# comment #2
def stripSingleQuotes(s):
    return s.strip("'")
globals()[alist[0] ] = map(stripSingleQuotes, alist[3:-1] )

print mara[2]
mara = None


# comment #3
lookFor = name.setResultsName("var") + "=" + "[" + \
    commaSeparatedList.setResultsName("listValues") + "]"
alist = lookFor.parseString(my_file)

# evaluate parsed assignment
globals()[ alist.var ] = map(stripSingleQuotes, alist.listValues )
print len(mara), mara[1]


# comment #4
lookFor = name.setResultsName("var") + "=" + "[" + \
    delimitedList( sglQuotedString.setParseAction(removeQuotes) )\
    .setResultsName("listValues") + "]"

alist = lookFor.parseString(my_file)
globals()[ alist.var ] = list( alist.listValues )
print len(mara), mara[1]

------------------
Comment #0:
When I am debugging a pyparsing application, I find it easier to embed
the input text, or a subset of it, into the program itself using a
triple-quoted string. Then later, I'll go back and change to reading
data from an input file. Purely a matter of taste, but it simplifies
posting to mailing lists and newsgroups.

Comment #1:
Since you are going line by line in reading the input file, be *sure*
you have the complete assignment expression on each line. Since
pyparsing will read past line breaks for you, and since your input
file contains only this one assignment, you might be better off
calling parseString with:

alist = lookFor.parseString( my_file.read() )

Comment #2:
Your assignment of the "mara" global is a bit clunky on two fronts:
- the explicit accessing of elements 3,4, and 5
- the repeated calls to strip("'")
You can access the pyparsing returned tokens (passed as a ParseResults
object) using slices. In your case, you want the elements 3 through
n-1, so alist[3:-1] will give you this. It's nice to avoid hard-
coding things like list lengths and numbers of list elements. Note
that you can also use len to find out the length of the list.

As for calling strip("'") for each of these elements, have you learned
to use Python's map built-in yet? Define a function or lambda that
takes a single element, return from the function what you want done
with that element, and then call map with that function, and the list
you want to process. This modified version of your call is more
resilient than the original.
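
The slice-plus-map idea in Comment #2 can be tried without pyparsing at all. In this sketch a plain Python list stands in for the returned tokens (an assumption made so the snippet is self-contained), and the token values are invented.

```python
# A plain list standing in for the parsed tokens: name, "=", "[", items, "]".
tokens = ["mara", "=", "[", "'aaa'", "'bbb'", "'ccc'", "]"]


def strip_single_quotes(s):
    """Remove leading and trailing single quotes from one element."""
    return s.strip("'")


# tokens[3:-1] selects everything between "[" and "]" regardless of how
# many items the list holds, so no element count is hard-coded.
values = list(map(strip_single_quotes, tokens[3:-1]))
print(values)

# The same slice keeps working when the input list grows.
longer = ["mara", "=", "[", "'a'", "'b'", "'c'", "'d'", "'e'", "]"]
more_values = list(map(strip_single_quotes, longer[3:-1]))
print(len(more_values))
```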

Comment #3:
Personally, I am not keen on using too much explicit indexing into the
returned results. This is another area where pyparsing goes beyond
typical lexing and tokenizing. Just as you can assign names to fields
in regexp's, pyparsing allows you to give names to elements within the
parsed results. You can then access these by name, using either dict
or object attribute syntax. This gets rid of most if not all of the
magic numbers from your code, and makes it yet again more resilient in
light of changes in the future. (Say for example you decided to
suppress the "=", "[", and "]" punctuation from the parsed results.
The parsing logic would remain the same, but the returned tokens would
contain only the significant content, the variable name and list
contents. Using explicit list indexing would force you to renumber
the list elements you are extracting, but with results names, no
change would be required.)
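
Since Comment #3 leans on the analogy with named fields in regexp's, here is what that regexp side looks like with the standard re module. The pattern and sample text are invented for illustration; the group names var and listValues simply echo the results names used above.

```python
import re

# Named groups play the role that results names play in the parsed results.
assignment = re.compile(
    r"(?P<var>\w+)\s*=\s*\[(?P<listValues>[^\]]*)\]"
)

m = assignment.match("mara = [ 'aaa', 'bbb' ]")

# Access by name instead of by position, so adding or removing other
# groups in the pattern would not force any renumbering here.
var_name = m.group("var")
values = [
    item.strip().strip("'")
    for item in m.group("listValues").split(",")
]
print(var_name, values)
```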

Comment #4:
I thought I'd show you an alternative to commaSeparatedList, called
delimitedList. delimitedList is a method that gives you more control
over the elements you expect to find within the list, and what to do
with them when you find them. delimitedList is a constructor method
that takes a pyparsing expression 'expr' and expands it to 'expr +
ZeroOrMore(Suppress(",") + expr)'. You can also change the delimiter
from ',' to some other character, or even to a pyparsing expression.
Pyparsing includes predefined expressions for some common text
patterns, such as single and double quoted strings, and comments of
various forms. Look for a directory called htmldoc in your pyparsing
directory tree, and open the index.html file there to look through the
classes and methods defined for you in pyparsing. Or just type
"help(pyparsing)" in the Python interpreter (after typing "import
pyparsing" first, of course).

Now that we have access to the expression defined to be matched within
the list, we can attach a parse action. A parse action will get run
against the matched tokens at parse time, and can be used to modify
the matched data before continuing. In this example, we'd like to
remove those annoying opening and closing quotation marks. Again,
this is such a common task that pyparsing includes a built-in for
this, called removeQuotes. It is equivalent to the following:

removeQuotes = lambda tokens : tokens[0][1:-1]

What?! No verifying that the first and last characters are in fact
quotation marks? Nope. Another part of the pyparsing philosophy is
that parse actions *know* that they will only be called with text that
matches their associated input pattern. removeQuotes is a parse
action that *knows* that the string passed to it will have opening and
closing "'" or '"' characters. You'll also see this quite often when
parsing integers:

integer = Word(nums).setParseAction(lambda toks: int(toks[0]))

No testing for "are the characters all numeric?" or trapping for
ValueError. We *know* that the only time this lambda will be invoked
is after having matched a word group composed only of numeric digits.
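
The "parse actions *know* their input" idea can be mimicked in plain Python. This is only an analogy built on the re module, not how pyparsing dispatches parse actions; the helper apply_action and the sample strings are made up. The point is that the action runs only after the pattern has matched, so it needs no validation of its own.

```python
import re


def apply_action(pattern, action, text):
    """Match pattern at the start of text, then hand the match to action.

    Hypothetical helper: the action is never called unless the pattern
    matched, which is why the actions below do no checking themselves.
    """
    match = re.match(pattern, text)
    if match is None:
        raise ValueError("no match for %r" % pattern)
    return action([match.group()])


# Same shape as the two lambdas shown above.
to_int = lambda toks: int(toks[0])
remove_quotes = lambda tokens: tokens[0][1:-1]

number = apply_action(r"\d+", to_int, "1234 and more")
unquoted = apply_action(r"'[^']*'", remove_quotes, "'U2Fsd' trailing")
print(number, unquoted)
```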

Anyway, to wrap up this comment, now that we have attached a parse
action to remove the "'" characters as we parse, the listValues field
is ready to use as is from the parseString method, without having to
clutter our code up with maps, or lambdas, or other post-processing
junk.

Enjoy!

-- Paul
 
Paul McGuire

long-windedness snipped

Oh, P.S., There is a list parser example included in the pyparsing
examples directory, called parsePythonValue.py. It will parse nested
lists, dicts, and tuples.

-- Paul
 
