Pyparsing...

Raoul · Sep 21, 2004

I am futzing with pyparsing and for the most part enjoying it.
However, I'm running into trouble with whitespace delimited lists. I
get data in blocks like this:

[QC1]
Type=15
NumberCells=1925
CellHeader=X Y PROBE PLEN ATOM INDEX
Cell1=132 0 N 25 0 132
Cell2=652 0 N 25 0 652
Cell3=648 0 N 25 0 648
....

I'd like to be able to parse this structure.

Ideally, I'd like a QC node to have a dictionary like
{'number':1,
'type' : 15,
'NumberCells' : 1925,
'Table' : [{'cell':1,'x':132,'y':0,'probe':25,'plen':0,'atom':132,
'index':None}, {'cell':2 ....

I'm running into the following problems:

1. I can't seem to use delimitedList() to define a rule that parses the
right-hand side of the table into
['x','y','probe','plen','atom','index']. I think it's because my lists
are whitespace-delimited.

2. I can't seem to convert a value into an integer. For example, I can
parse each row in the table to:
['Cell','2','=', '652 0 N 25 0 652']
but am unable to get the setParseAction (see below) to convert and
substitute in the right value.

Any hints will help a great deal. Thanks...

Raoul-Sam

I have some ugly, non-functional code below:

def cdffile_BNF():
    global cdfbnf

    if not cdfbnf:
        makeint = Word(nums).setParseAction( lambda s,l,t:[int(t[0])])
        equals = Literal("=").suppress()
        nonequals = "".join( [ c for c in printables if c != "=" ] ) + " \t"

        key = Word(nonequals)
        value = Word(nonequals)
        kvp = Group(key + equals + restOfLine)
        kvpBlk = OneOrMore(kvp)

        headerCell = delimitedList(Word(alphanums)," ")
        rowHeader = Combine( Literal("CellHeader") + equals +
                             headerCell)
        row = Combine(Literal("Cell").suppress() + restOfLine)
        rows = OneOrMore(row)

        CDF = Literal("[CDF]")
        CDFBlk = Group(CDF + kvpBlk)

        CHIP = Literal("[CHIP]")
        CHIPBlk = Group(CHIP + kvpBlk)
        CHIPBlk.setResultsName("chip")

        QC = Combine( Literal("[QC").suppress() + Word(nums) +
                      Literal("]").suppress())
        QCBlk = QC + kvp + kvp + rowHeader + rows

        cdfbnf = CDFBlk + CHIPBlk + QCBlk

    return cdfbnf

Larry Bates · Sep 21, 2004

Note: The Cellheader layout and the cell data
layout don't appear to match properly (when
compared to the data you show in your sample
dictionary). My solution follows the cellheader
layout.

The format of your data file is perfect to be
parsed with ConfigParser with [QC1] as section
name and Type, NumberCells, etc. as options.

import ConfigParser

inputfilename='data.ini' # Insert input filename
INI=ConfigParser.ConfigParser()
INI.read(inputfilename)
data={'type': None, 'numbercells': None, 'table':{}}

section='QC1'
option='type'
try: data['type']=INI.getint(section, option)
except:
    #
    # Insert code to handle missing type option
    #
    pass

option='numbercells'
try: data['numbercells']=INI.getint(section, option)
except:
    #
    # Insert code to handle missing numbercells option
    #
    pass

option='cellheader'
try: data['cellheader']=INI.get(section, option)
except:
    #
    # Insert code to handle missing cellheader option
    #
    pass

# ConfigParser lower-cases option names, so Cell1 is read back as 'cell1'
CELLS=[x for x in INI.options(section)
       if x.startswith('cell')]

#
# Must get rid of 'cellheader' or maybe change the key name?
#
CELLS=[x for x in CELLS if x != 'cellheader']

celldatalist=[]
for CELL in CELLS:
    celldata={}
    # Column order follows the CellHeader line: X Y PROBE PLEN ATOM INDEX
    x, y, probe, plen, atom, index=INI.get(section, CELL).split(' ')
    celldata['cell']=int(CELL[4:])
    celldata['x']=int(x)
    celldata['y']=int(y)
    celldata['probe']=probe
    celldata['plen']=int(plen)
    celldata['atom']=int(atom)
    celldata['index']=int(index)
    celldatalist.append(celldata)

data['table']=celldatalist

This is tested so it should be close (I do this quite a lot in
my code).

You could wrap a loop around the outside of this if you have
multiple QC instances.

Hope it helps.
Larry Bates
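
The loop over multiple QC instances mentioned above could look roughly like the following. This is only a sketch, written for Python 3, where the module is named configparser. The function names, the SAMPLE text, and the second section [QC2] are made up for illustration, and the column names are taken from the CellHeader line rather than hard-coded.

```python
import configparser

# Illustrative input only: [QC2] and its contents are invented for this sketch.
SAMPLE = """\
[QC1]
Type=15
NumberCells=1925
CellHeader=X Y PROBE PLEN ATOM INDEX
Cell1=132 0 N 25 0 132
Cell2=652 0 N 25 0 652

[QC2]
Type=15
NumberCells=1
CellHeader=X Y PROBE PLEN ATOM INDEX
Cell1=648 0 N 25 0 648
"""


def to_int_if_numeric(text):
    """Return int(text) for digit-only strings, otherwise the text unchanged."""
    return int(text) if text.isdigit() else text


def read_qc_sections(text):
    """Return one dictionary per [QCn] section found in the text."""
    ini = configparser.ConfigParser()
    ini.read_string(text)
    nodes = []
    for section in ini.sections():
        if not section.startswith("QC"):
            continue
        # Use the CellHeader line as the list of column names.
        columns = ini.get(section, "cellheader").lower().split()
        table = []
        # Option names come back lower-cased, e.g. 'cell1'.
        for option in ini.options(section):
            if option.startswith("cell") and option != "cellheader":
                values = [to_int_if_numeric(v)
                          for v in ini.get(section, option).split()]
                row = {"cell": int(option[len("cell"):])}
                row.update(zip(columns, values))
                table.append(row)
        nodes.append({
            "number": int(section[len("QC"):]),
            "type": ini.getint(section, "type"),
            "numbercells": ini.getint(section, "numbercells"),
            "table": table,
        })
    return nodes


qc_nodes = read_qc_sections(SAMPLE)
```

Each entry of qc_nodes is a dictionary with 'number', 'type', 'numbercells' and 'table' keys. A literal "...." line, as in the sample data in the question, is not valid INI syntax and would need to be removed before parsing.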


Raoul · Sep 21, 2004

Larry,

This is an elegant little python! Thanks for your help. I am actually
trying to use pyparsing, not just in my project but as a learning
experience. I think pyparsing has a lot of potential.

I'm making progress but I am running into bugs with my pyparsing
grammar.

R-S
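
For the two pyparsing problems listed in the original post, one possible approach is sketched below. It assumes a reasonably recent pyparsing, where a parse action may take a single tokens argument, and it parses one line at a time: pyparsing skips whitespace between tokens by default, and that includes newlines, so a OneOrMore over a whole block could run on into the next line. In the posted grammar, makeint is defined but never referenced, and restOfLine returns the remainder of the line as a single string, which may be why no integers appear. The names header_line, cell_line, parse_cells and sample_lines are illustrative only.

```python
from pyparsing import Group, OneOrMore, Suppress, Word, alphanums, alphas, nums

# Convert a run of digits to an int as soon as it is matched.
integer = Word(nums).setParseAction(lambda tokens: int(tokens[0]))
equals = Suppress("=")

# Whitespace between tokens is skipped by default, so a whitespace-delimited
# list is just OneOrMore(...); delimitedList is not needed here.
header_line = Suppress("CellHeader") + equals + OneOrMore(Word(alphas))

# A field is an integer where possible, otherwise a plain word such as "N".
field = integer | Word(alphanums)
cell_line = (Suppress("Cell") + integer("cell") + equals
             + Group(OneOrMore(field))("fields"))


def parse_cells(lines):
    """Build one dictionary per CellN line, keyed by the CellHeader names."""
    columns, table = [], []
    for line in lines:
        if line.startswith("CellHeader"):
            parsed = header_line.parseString(line, parseAll=True)
            columns = [name.lower() for name in parsed]
        elif line.startswith("Cell"):
            parsed = cell_line.parseString(line, parseAll=True)
            row = {"cell": parsed["cell"]}
            row.update(zip(columns, parsed["fields"]))
            table.append(row)
    return table


sample_lines = [
    "CellHeader=X Y PROBE PLEN ATOM INDEX",
    "Cell1=132 0 N 25 0 132",
    "Cell2=652 0 N 25 0 652",
]
table = parse_cells(sample_lines)
```

The resulting rows follow the CellHeader column order, so 'probe' holds the letter and 'plen' the number after it. Parsing the section name and the Type and NumberCells lines could be added in the same style.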

