RegExp question

Michael McGarry

Hi,

I would like to form a regular expression to find a few different
tokens (and, or, xor) followed by some variable number of whitespace
(i.e., tabs and spaces) followed by a hash mark (i.e., #). What would
be the regular expression for this?

Thanks for any help,

Michael
 
Tim Chase

Michael McGarry said:
I would like to form a regular expression to find a few
different tokens (and, or, xor) followed by some variable
number of whitespace (i.e., tabs and spaces) followed by
a hash mark (i.e., #). What would be the regular
expression for this?


(and|or|xor)\s*#

Unless "variable number of whitespace" means "at least *some*
whitespace", in which case you'd want to use

(and|or|xor)\s+#

Both are beautiful and precise.
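
A minimal sketch of how the two patterns differ, using Python's re module. The sample strings are made up for illustration and are not from the original question.

```python
import re

# \s* allows zero or more whitespace characters before the hash mark;
# \s+ requires at least one.
zero_or_more = re.compile(r"(and|or|xor)\s*#")
one_or_more = re.compile(r"(and|or|xor)\s+#")

# Illustrative inputs (assumptions, not taken from a real file).
samples = ["a and #x", "a and#x", "a or \t #x", "a and b"]

for text in samples:
    print(repr(text),
          bool(zero_or_more.search(text)),
          bool(one_or_more.search(text)))
```

Only "a and#x" separates the two: it has no whitespace before the hash mark, so the \s* form finds it and the \s+ form does not.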

-tim
 
Michael McGarry

Tim,

for some reason that does not seem to do the trick.

I am testing it with grep. (i.e., grep -e '(and|or|xor)\s*#' myfile)

Michael
 
Paul McGuire

Michael McGarry said:
Hi,

I would like to form a regular expression to find a few different
tokens (and, or, xor) followed by some variable number of whitespace
(i.e., tabs and spaces) followed by a hash mark (i.e., #). What would
be the regular expression for this?

Thanks for any help,

Michael

Using pyparsing, whitespace between tokens is skipped implicitly. Your
expression would look like:

oneOf("and or xor") + Literal("#")


Here's a complete example:


from pyparsing import *

pattern = oneOf("and or xor") + Literal("#")

testString = """
z = (a and b) and #XVAL;
q = z xor #YVAL;
"""


# use scanString to locate matches
for tokens, start, end in pattern.scanString(testString):
    print(tokens[0], tokens.asList())
    print(line(start, testString))
    # put a caret under the column where the match starts
    print(" " * (col(start, testString) - 1) + "^")
    print()
    print()


# use transformString to locate matches and substitute values
subs = {
    'XVAL': 0,
    'YVAL': True,
}

def replaceSubs(st, loc, toks):
    # toks is [keyword, '#', name]; returning None leaves the match unchanged
    try:
        return toks[0] + " " + str(subs[toks[2]])
    except KeyError:
        pass

pattern2 = (pattern + Word(alphanums)).setParseAction(replaceSubs)
print(pattern2.transformString(testString))

-----------------
Prints:
and ['and', '#']
z = (a and b) and #XVAL;
              ^

xor ['xor', '#']
q = z xor #YVAL;
      ^


z = (a and b) and 0;
q = z xor True;
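
For comparison, here is a rough sketch of the same substitution done with the standard library's re.sub instead of pyparsing. This is a swapped-in technique, not part of the pyparsing example; the names pattern, replace_subs and result are illustrative, and \w+ is used for the name after the hash mark, which also accepts underscores (unlike Word(alphanums)).

```python
import re

# Illustrative values, mirroring the subs dictionary in the example above.
subs = {"XVAL": 0, "YVAL": True}

test_string = """
z = (a and b) and #XVAL;
q = z xor #YVAL;
"""

# Group 1 is the keyword, group 2 the name after the hash mark.
pattern = re.compile(r"\b(and|or|xor)\s*#(\w+)")

def replace_subs(match):
    keyword, name = match.group(1), match.group(2)
    if name not in subs:
        return match.group(0)  # leave unknown names untouched
    return "%s %s" % (keyword, subs[name])

result = pattern.sub(replace_subs, test_string)
print(result)
```

Unlike pyparsing, the regular expression has to spell out the optional whitespace itself with \s*.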


Download pyparsing at http://pyparsing.sourceforge.net.

-- Paul
 
Heiko Wundram

On Tuesday, 11 April 2006 21:16, Michael McGarry wrote:
I am testing it with grep. (i.e., grep -e '(and|or|xor)\s*#' myfile)

Test it with Python's re module, then. The pattern is written in Python
(Perl-style) syntax, and grep's default basic regular expressions treat an
unescaped ( or | as a literal character. And as you've asked in a Python
newsgroup, you'll get Python answers here.

--- Heiko.
 
Ben C

Michael McGarry said:
Hi,

I would like to form a regular expression to find a few different
tokens (and, or, xor) followed by some variable number of whitespace
(i.e., tabs and spaces) followed by a hash mark (i.e., #). What would
be the regular expression for this?

re.compile(r'(?:and|or|xor)\s*#')
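
A short sketch of what the non-capturing group (?:...) changes compared with a plain (...) group, shown with re.findall. The sample text is made up for illustration.

```python
import re

text = "x = a and # one\ny = b xor# two"  # made-up sample

capturing = re.compile(r"(and|or|xor)\s*#")
non_capturing = re.compile(r"(?:and|or|xor)\s*#")

# With a capturing group, findall returns only the group contents.
print(capturing.findall(text))      # ['and', 'xor']

# With a non-capturing group, findall returns the whole match.
print(non_capturing.findall(text))  # ['and #', 'xor#']
```

Both patterns match the same places; the difference is only in what is reported back.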
 
RunLevelZero

In my opinion you would do best to use a tool like Kiki.
http://project5.freezope.org/kiki/index.html/#

This will allow you to paste in the actual text you want to search and
then play with different REs and set flags with a simple mouse click
so you can find just what you want. Remember what re.DOTALL does: it
makes "." match newline characters too, so a pattern that uses "." can
run across line breaks; without the flag, "." stops at the end of the
line. It's a good idea to have a grasp of regular expressions, or when
you come back to your code weeks or months later you will be just as
lost. And always comment them very well :).
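
A small sketch of the re.DOTALL behaviour and of one way to comment a pattern with re.VERBOSE. The sample string and the name commented are illustrative assumptions.

```python
import re

text = "a and\n# continued"  # made-up sample with a line break

# By default '.' matches any character except a newline.
print(re.search(r"and.#", text))                         # None

# With re.DOTALL, '.' also matches the newline.
print(re.search(r"and.#", text, re.DOTALL) is not None)  # True

# \s matches a newline with or without re.DOTALL.
print(re.search(r"and\s*#", text) is not None)           # True

# re.VERBOSE lets the pattern carry its own comments. A bare hash mark
# would start a comment in this mode, so the literal one goes in a class.
commented = re.compile(r"""
    \b(?:and|or|xor)   # one of the keyword tokens
    \s*                # optional whitespace, including line breaks
    [#]                # a literal hash mark
""", re.VERBOSE)

print(commented.search(text) is not None)                # True
```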

Just my 2¢
 
Ben C

Michael McGarry said:
Tim,

for some reason that does not seem to do the trick.

I am testing it with grep. (i.e., grep -e '(and|or|xor)\s*#' myfile)

Try with grep -P, which means use Perl-compatible regexes as opposed to
POSIX ones. I only know for sure that -P exists for GNU grep.

I assumed it was a Python question! Unless you're testing your Python
regex with grep, not realizing they're different.

Perl and Python regexes are (mostly?) the same.

I usually grep -P because I know Python regexes better than any other
ones.
 
Tim Chase

Michael McGarry said:
I am testing it with grep. (i.e., grep -e '(and|or|xor)\s*#' myfile)

Well, you asked for the Python regexp. Different
environments use different regexp parsing engines. Your
response is akin to saying "the example snippet of Python
code you gave me doesn't work in my Pascal program".

For grep:

grep '\(and\|or\|xor\)[[:space:]]*#' myfile

For Vim:

:g/\(and\|or\|xor\)\s*#/

The one I gave originally is a Python regexp, and thus
should be tested within Python, not grep or vim or emacs or
sed or whatever.

It's always best to test in the real
environment. Otherwise, you'll get flaky results.
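
One way to test the Python regexp in Python while still getting grep-like output is a small line filter. This is only a sketch: grep_lines is a hypothetical helper, and the sample list stands in for the contents of myfile.

```python
import re

def grep_lines(pattern, lines):
    """Return the lines in which the pattern matches somewhere."""
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]

# Made-up sample input standing in for the contents of myfile.
sample = [
    "if a and #continued",
    "if a and b",
    "x = y xor\t# note",
]

for line in grep_lines(r"(and|or|xor)\s*#", sample):
    print(line)
```

To run it against a real file, pass the file's lines instead of the sample list, for example the result of open(...).read().splitlines().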

-tkc
 
John Machin

(-:
Sorry about Tim. He's not very imaginative. He presumed that because
you asked on comp.lang.python that you would be testing it with Python.
You should have either (a) asked your question on
comp.toolswithfunnynames.grep or (b) not presumed that grep's re syntax
is the same as Python's.
:)

My grep appears to need something fugly like this:

grep -e "\(and\|or\|xor\)[ \t]*#" grepre.txt

but my grep is a Windows port which identifies itself as "grep (GNU
grep) 2.5.1" so it's definitely not The One True Grep ...

Now that you're here, why don't you try Python? It's not hard, e.g.

#>>> import re
#>>> rs = re.compile(r"(and|or|xor)\s*#").search
#>>> rs("if foo and #continued")
#<_sre.SRE_Match object at 0x00AE66E0>
#>>> rs("if foo and#continued")
#<_sre.SRE_Match object at 0x00AE6620>
#>>> rs("if foo and bar #continued")
#>>> rs("if foo xor # continued")
#<_sre.SRE_Match object at 0x00AE66E0>
#>>>

HTH,
John
 
John Machin

Precise? The OP asked for "tokens".

#>>> re.search(r"(and|or|xor)\s*#", "a = the_operand # gotcha!")
#<_sre.SRE_Match object at 0x00AE6620>

Try this:

#>>> re.search(r"\b(and|or|xor)\s*#", "a = the_operand # should fail")
#>>> re.search(r"\b(and|or|xor)\s*#", "and # OK")
#<_sre.SRE_Match object at 0x00AE6E60>
#>>> re.search(r"\b(and|or|xor)\s*#", "blah blah and # OK")
#<_sre.SRE_Match object at 0x00AE66E0>
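
The same checks can be written as a small script. This is a sketch with made-up test strings; token_re and cases are illustrative names. Only the leading \b is needed, because the keyword must be followed directly by optional whitespace and the hash mark.

```python
import re

token_re = re.compile(r"\b(and|or|xor)\s*#")

# Each made-up input is paired with whether a keyword token should be found.
cases = {
    "a = the_operand # gotcha": False,  # "and" is only the tail of a word
    "a = floor # gotcha": False,        # "or" is only the tail of a word
    "a = b and # keyword": True,
    "a = b xor# keyword": True,
}

for text, expected in cases.items():
    found = token_re.search(text) is not None
    print(repr(text), found)
    assert found == expected
```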
 
