how to use pyparsing for identifiers that start with a constant string

Guest

I am scanning text that has identifiers with a constant prefix string
followed by alphanumerics and underscores. I can't figure out how to
match these with pyparsing. The example expression below seems to
allow whitespace between the 'atod' and the rest of the
identifier.

identifier_atod = 'atod' + pp.Word('_' + pp.alphanums)

How can I get pyparsing to match 'atodkj45k' and 'atod_asdfaw', but not
'atgdkasdjfhlksj' and 'atod asdf4er', where the first four characters
must be 'atod', and not followed by whitespace?

Thanks!
 
Kent Johnson

Here is one way using pyparsing.Combine:
>>> from pyparsing import Combine, Literal, ParseException, Word, alphanums
>>> tests = ['atodkj45k', 'atod_asdfaw', 'atgdkasdjfhlksj', 'atod asdf4er']
>>> # Combine requires its parts to be adjacent (no whitespace is skipped)
>>> ident = Combine(Literal('atod') + Word('_' + alphanums))
>>> for t in tests:
...     try:
...         print(ident.parseString(t))
...     except ParseException:
...         print('No match', t)
...
['atodkj45k']
['atod_asdfaw']
No match atgdkasdjfhlksj
No match atod asdf4er
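
To make the difference concrete, here is a small sketch comparing the original expression with the Combine version. The names loose, strict, and matches are only illustrative helpers, not part of pyparsing. It assumes pyparsing's default behaviour of skipping whitespace between the parts of a plain `+` expression, which is why the original accepted 'atod asdf4er'.

```python
from pyparsing import Combine, Literal, ParseException, Word, alphanums

# Plain concatenation: leading whitespace is skipped before each element,
# so a space between the prefix and the rest is tolerated.
loose = Literal("atod") + Word("_" + alphanums)

# Combine: the parts must be adjacent, and the result is one joined token.
strict = Combine(Literal("atod") + Word("_" + alphanums))


def matches(expr, text):
    """Return True if expr parses the start of text (hypothetical helper)."""
    try:
        expr.parseString(text)
        return True
    except ParseException:
        return False


loose_result = matches(loose, "atod asdf4er")    # space is skipped
strict_result = matches(strict, "atod asdf4er")  # space is not allowed
```

Besides rejecting the embedded space, Combine also returns the identifier as a single string rather than as two separate tokens, which is usually what you want for an identifier.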
Kent
 
