Accessing "sub elements" with xml.sax?

erikcw

Hi,

I'm trying to use xml.sax (from xml.sax.handler import ContentHandler)
to process the following data:

<?xml version="1.0" encoding="UTF-8"?>
<report name="yahoo" masterAccountID="666831"
masterAccountName="CraftShowSuccess.Com-US"
dateStart="2008-02-24-0600" dateEnd="2008-02-24-0600"
booksClosedTimestamp="2008-02-25T01:15:00.000-0600" booksClosed="true"
createDate="2008-02-25T02:00:27.041-0600" sortColumn="cost"
sortOrder="desc">
<totals><analytics numImpr="951" ctr="0.0" numClick="0" cost="0.0"
averagePosition="9.305993690851736"/></totals>
<row adName="Craftshows" adGrpName="Craftshows" cmpgnName="craftshows"
tacticName="Paid Placement" qualityScore="2"><analytics numImpr="951"
ctr="0.0" numClick="0" cost="0.0" averagePosition="9.305993690851736"/>
</row>
</report>

I've figured out how to access the attributes in "row" - but I want to
also access the "analytics" child element.

I've tried:
class YahooHandler(ContentHandler):
    ccountNum)

    def startElement(self, name, attrs):
        if name == 'row' or name == 'analytics':
            self.campaign = attrs.get('cmpgnName', "")
            self.adgroup = attrs.get('adGrpName', "")
            self.headline = attrs.get('adName', "")
            self.imps = attrs.get('numImpr', None)
            self.clicks = attrs.get('numClick', None)
            self.cost = attrs.get('cost', "")

    def endElement(self, name):
        if name == 'row':
            if self.campaign not in self.data:
                self.data[self.campaign] = {}
            if self.adgroup not in self.data[self.campaign]:
                self.data[self.campaign][self.adgroup] = []
            self.data[self.campaign][self.adgroup].append({
                'campaign': self.campaign,
                'adgroup': self.adgroup,
                'headline': self.headline,
                'imps': self.imps,
                'clicks': self.clicks,
                'ctr': self.ctr,
                'cost': self.cost,
            })
            print self.data

But the data comes out as separate dictionaries - I want the
analytics and the row elements in one dictionary.

What am I doing wrong?

Thanks!
 
Diez B. Roggisch

erikcw said:
What am I doing wrong?

With SAX, you can't access a child element directly. The parser just
reports events such as element start and element end, so you need to
build up the hierarchy yourself, using a stack of elements.
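
A minimal sketch of that stack approach, written for Python 3 and using a trimmed copy of the report above. The names RowHandler, parse_rows and SAMPLE are made up for illustration, and skipping the "analytics" element under "totals" is an assumption about what is wanted:

```python
import xml.sax
from xml.sax.handler import ContentHandler

# Trimmed copy of the report from the question (illustrative only).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<report name="yahoo">
<totals><analytics numImpr="951" ctr="0.0" numClick="0" cost="0.0"/></totals>
<row adName="Craftshows" adGrpName="Craftshows" cmpgnName="craftshows">
<analytics numImpr="951" ctr="0.0" numClick="0" cost="0.0"/>
</row>
</report>"""


class RowHandler(ContentHandler):
    """Merge each <row> and its <analytics> child into one dictionary."""

    def __init__(self):
        super().__init__()
        self.stack = []      # names of the elements currently open
        self.current = None  # record for the <row> being parsed, if any
        self.rows = []       # finished records

    def startElement(self, name, attrs):
        parent = self.stack[-1] if self.stack else None
        self.stack.append(name)
        if name == 'row':
            # Start a new record from the row's own attributes.
            self.current = {
                'campaign': attrs.get('cmpgnName', ''),
                'adgroup': attrs.get('adGrpName', ''),
                'headline': attrs.get('adName', ''),
            }
        elif name == 'analytics' and parent == 'row':
            # Only the <analytics> directly under <row> is merged in;
            # the one under <totals> has a different parent on the stack.
            self.current.update({
                'imps': attrs.get('numImpr'),
                'clicks': attrs.get('numClick'),
                'ctr': attrs.get('ctr'),
                'cost': attrs.get('cost'),
            })

    def endElement(self, name):
        self.stack.pop()
        if name == 'row':
            # The row is complete, so its record now holds both parts.
            self.rows.append(self.current)
            self.current = None


def parse_rows(xml_bytes):
    """Parse the report and return one dictionary per <row>."""
    handler = RowHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.rows
```

Calling parse_rows(SAMPLE) returns a list with one dictionary per "row", holding the row attributes and the analytics attributes together. The record is only stored in endElement, once the child has been seen.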

You would be better off with DOM or, better still, ElementTree. These
do that work for you, so you can easily access child elements.
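
For comparison, a sketch of the same extraction with ElementTree (xml.etree.ElementTree in the standard library), again for Python 3. parse_rows_etree and SAMPLE are hypothetical names, and the sample is a trimmed copy of the report:

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the report from the question (illustrative only).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<report name="yahoo">
<totals><analytics numImpr="951" ctr="0.0" numClick="0" cost="0.0"/></totals>
<row adName="Craftshows" adGrpName="Craftshows" cmpgnName="craftshows">
<analytics numImpr="951" ctr="0.0" numClick="0" cost="0.0"/>
</row>
</report>"""


def parse_rows_etree(xml_bytes):
    """Return one dictionary per <row>, merged with its <analytics> child."""
    root = ET.fromstring(xml_bytes)
    rows = []
    # findall('row') matches only direct children of <report>,
    # so the <analytics> element under <totals> is never visited.
    for row in root.findall('row'):
        analytics = row.find('analytics')
        stats = analytics.attrib if analytics is not None else {}
        rows.append({
            'campaign': row.get('cmpgnName', ''),
            'adgroup': row.get('adGrpName', ''),
            'headline': row.get('adName', ''),
            'imps': stats.get('numImpr'),
            'clicks': stats.get('numClick'),
            'ctr': stats.get('ctr'),
            'cost': stats.get('cost'),
        })
    return rows
```

Here the tree is built for you, so the child is reached with row.find('analytics') and no bookkeeping is needed. The trade-off is that the whole document is held in memory, which may matter for very large reports.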

Diez
 
