Help with XML-SAX program ... it's driving me nuts ...

mitsura

Hi,

I need to read a simple XML file, and I am using the SAX parser for this.
So far so good.
The XML file consists of a number of "Service" objects, each with
a set of attributes.

I read through the XML file and create a new service object for each
"<Service>" entry.
When I am inside a "<Service>" element and encounter an
"<Attribute>" element, I store that attribute in a Python dictionary.
At the "</Service>" end tag I create the actual object and pass
the attributes dictionary to it.
The strange thing is that, for some reason, the attributes of all the
objects are being updated. I don't understand why this happens.

I have included the code and an example of the XML file (kd.xml).
The output should be:
Obj: name2 attribs: {}
Obj: name attribs: {attrib1, val1, attrib1.1, val1.1}
Obj: name1 attribs: {attrib2, val 2}

but it is:
Obj: name2 attribs: {}
Obj: name attribs: {}
Obj: name1 attribs: {}

It's driving me nuts. I have spent hours going through this very simple
code, but I can't find what's wrong.
Any help is very much appreciated.

With kind regards,

Kris
XML File (kd.xml):
"
<?xml version='1.0' ?>
<Services>
  <Service>
    <Name>name</Name>
    <Label>label</Label>
    <Icon>/opt/OV/www/htdocs/ito_op/images/service.32.gif</Icon>
    <Status>
      <Normal/>
    </Status>
    <Depth>1</Depth>
    <Attribute>
      <Name>Attrib1</Name>
      <Value>Val1</Value>
    </Attribute>
    <Attribute>
      <Name>Attrib1.1</Name>
      <Value>Val1.1</Value>
    </Attribute>
  </Service>
  <Service>
    <Name>name1</Name>
    <Label>label1</Label>
    <Icon>/opt/OV/www/htdocs/ito_op/images/service.32.gif</Icon>
    <Status>
      <Normal/>
    </Status>
    <Depth>1</Depth>
    <Attribute>
      <Name>Attrib2</Name>
      <Value>val 2</Value>
    </Attribute>
  </Service>
  <Service>
    <Name>name2</Name>
    <Label>label2</Label>
    <Icon>/opt/OV/www/htdocs/ito_op/images/service.32.gif</Icon>
    <Status>
      <Normal/>
    </Status>
    <Depth>1</Depth>
  </Service>
</Services>
"
Program:
"
import sys
import string
import images
import os
import cStringIO
import xml.sax


from wxPython.wx import *
from Main import opj
from xml.sax.handler import *
from xml.sax import make_parser

class ServiceObj:

    def __init__(self,name):

        # Service object properties
        self.name = name
        self.attributes = {}

class OpenSNXML(ContentHandler):

    # Service Object Tags
    inService = 0
    inServiceAttribute = 0
    inServiceName = 0
    inAttributeName = 0
    inAttributeValue = 0


    objName = ""
    objLabel = ""

    attribs = {}
    attribName = ""
    attribValue = ""

    def startElement(self, name, attrs):

        if name == "Service":
            self.inService = 1

        if self.inService == 1:
            if name == "Attribute":
                print "Attribute start"
                self.inServiceAttribute = 1

            if name == "Name" and ( self.inServiceAttribute == 0 ):
                self.inServiceName = 1

            if name == "Name" and (self.inServiceAttribute == 1):
                self.inAttributeName = 1

            if name == "Value" and (self.inServiceAttribute == 1):
                self.inAttributeValue = 1

    def characters(self, characters):
        if self.inServiceName == 1:
            self.objName = self.objName + characters

        if self.inAttributeName == 1:
            #print "Attribute Name: ", characters
            self.attribName = self.attribName + characters

        if self.inAttributeValue == 1:
            #print "Attribute Value: ", characters
            self.attribValue = self.attribValue + characters

    def endElement(self, name):
        mD = 1

        if name == "Service":

            self.inService = 0

            # If the object already exists, update the existing object
            if AllServiceObjectsFromXMLFile.has_key(self.objName):
                #print "Object: ", self.objName, " already defined in exists in dir"
                obj = AllServiceObjectsFromXMLFile[self.objName]
            else:
                obj = ServiceObj(self.objName)

            obj.attributes = self.attribs

            AllServiceObjectsFromXMLFile[self.objName] = obj

            del obj

            self.objName = ""
            self.attribs.clear()

        if name == "Attribute":
            self.inServiceAttribute = 0
            print "Attrib name: ", self.attribName
            print "Attrib value: ", self.attribValue

            self.attribs[self.attribName] = self.attribValue
            self.attribName = ""
            self.attribValue = ""

            print "Attrib dir: ", self.attribs
            print "Attribute stop"

        if name == "Name":
            self.inServiceName = 0
            self.inAttributeName = 0

        if name == "Value":
            self.inAttributeValue = 0


# Main
AllServiceObjectsFromXMLFile = {}

GetAllObjectsFromXMLfilePS = xml.sax.make_parser()
GetAllObjectsFromXMLfileHD = OpenSNXML()
GetAllObjectsFromXMLfilePS.setContentHandler(GetAllObjectsFromXMLfileHD)
GetAllObjectsFromXMLfilePS.parse("kd.xml")

keys = AllServiceObjectsFromXMLFile.keys()

for key in keys:
    obj = AllServiceObjectsFromXMLFile[key]
    print "Obj: ", obj.name, "attribs: ", obj.attributes
"
 
Fredrik Lundh

I need to read a simple XML file. For this I use the SAX parser. So far
so good. The XML file consists of a number of "Service" objects, each
with a set of attributes.
The strange thing is that, for some reason, the attributes of all the
objects are being updated. I don't understand why this happens.

you're using the same dictionary for all Service elements:

obj.attributes = self.attribs

adds a reference to the attribs dictionary; it doesn't make a copy (if it
did, your code wouldn't work anyway).
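
The reference behaviour described above can be seen without any XML at all. Here is a minimal sketch in current Python 3 syntax; the variable names (shared, alias, current, kept) are made up for illustration and do not appear in the posted program:

```python
# Assignment binds a second name to the same dict object; nothing is copied.
shared = {"Attrib1": "Val1"}
alias = shared
print(alias is shared)   # both names refer to one object

# clear() empties that one object, so every name that refers to it sees
# the change. This is what happens to obj.attributes in the posted handler.
shared.clear()
print(alias)

# Rebinding the name to a new dict leaves the old object untouched.
current = {"Attrib2": "val 2"}
kept = current
current = {}             # 'current' now names a brand-new, empty dict
print(kept)
```

After clear() the dict seen through alias is empty, while kept still holds its entry after current is rebound.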

changing

self.attribs.clear()

to

self.attribs = {} # use a new dict for the next round

fixes this.
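
For reference, here is a rough sketch of a SAX handler with that fix applied, written for current Python 3. It is a reworked version, not the posted code: the class name ServiceHandler, the stack of open element names used in place of the in* flags, and the trimmed-down inline XML sample are all assumptions made to keep the example short and self-contained.

```python
import xml.sax

# Trimmed-down copy of kd.xml (only the elements the handler looks at).
XML = b"""<Services>
  <Service>
    <Name>name</Name>
    <Attribute><Name>Attrib1</Name><Value>Val1</Value></Attribute>
    <Attribute><Name>Attrib1.1</Name><Value>Val1.1</Value></Attribute>
  </Service>
  <Service>
    <Name>name1</Name>
    <Attribute><Name>Attrib2</Name><Value>val 2</Value></Attribute>
  </Service>
  <Service>
    <Name>name2</Name>
  </Service>
</Services>"""


class ServiceHandler(xml.sax.ContentHandler):
    """Collects {service name: {attribute name: value}}."""

    def __init__(self):
        super().__init__()
        self.services = {}
        self.path = []          # names of the currently open elements
        self.text = ""          # text seen since the last start tag
        self.obj_name = ""
        self.attribs = {}
        self.attrib_name = ""

    def startElement(self, name, attrs):
        self.path.append(name)
        self.text = ""

    def characters(self, content):
        # May be called several times per text node, so accumulate.
        self.text += content

    def endElement(self, name):
        self.path.pop()
        parent = self.path[-1] if self.path else None
        if name == "Name" and parent == "Service":
            self.obj_name = self.text.strip()
        elif name == "Name" and parent == "Attribute":
            self.attrib_name = self.text.strip()
        elif name == "Value" and parent == "Attribute":
            self.attribs[self.attrib_name] = self.text.strip()
        elif name == "Service":
            self.services[self.obj_name] = self.attribs
            self.attribs = {}   # use a new dict for the next round
            self.obj_name = ""


handler = ServiceHandler()
xml.sax.parseString(XML, handler)
for name, attribs in handler.services.items():
    print("Obj:", name, "attribs:", attribs)
```

Because self.attribs is rebound rather than cleared, each stored dictionary keeps its own contents.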
It's driving me nuts. I have spent hours going through this very simple
code, but I can't find what's wrong.

simple? fwiw, here's the corresponding ElementTree solution:

import elementtree.ElementTree as ET

for event, elem in ET.iterparse("kd.xml"):
    if elem.tag == "Service":
        d = {}
        for e in elem.findall("Attribute"):
            d[e.findtext("Name")] = e.findtext("Value")
        print elem.findtext("Name"), d

(tweak as necessary)
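
The import above refers to the standalone elementtree package; in current Python the same API ships in the standard library as xml.etree.ElementTree. A minimal Python 3 sketch of the same loop follows. The inline XML sample, the result dictionary, and the elem.clear() call are additions for illustration, not part of the snippet above:

```python
import io
import xml.etree.ElementTree as ET

# Trimmed-down copy of kd.xml so the example runs without a file on disk.
XML = b"""<Services>
  <Service>
    <Name>name</Name>
    <Attribute><Name>Attrib1</Name><Value>Val1</Value></Attribute>
    <Attribute><Name>Attrib1.1</Name><Value>Val1.1</Value></Attribute>
  </Service>
  <Service>
    <Name>name1</Name>
    <Attribute><Name>Attrib2</Name><Value>val 2</Value></Attribute>
  </Service>
  <Service>
    <Name>name2</Name>
  </Service>
</Services>"""

result = {}
# iterparse reports "end" events by default, so each Service element is
# complete (all children parsed) by the time the loop body sees it.
for event, elem in ET.iterparse(io.BytesIO(XML)):
    if elem.tag == "Service":
        d = {}   # a fresh dict per Service, so nothing is shared
        for e in elem.findall("Attribute"):
            d[e.findtext("Name")] = e.findtext("Value")
        result[elem.findtext("Name")] = d
        elem.clear()   # optional: drop the children once they have been read

for name, attribs in result.items():
    print(name, attribs)
```

To read the real file, pass "kd.xml" to ET.iterparse in place of the io.BytesIO object.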

</F>
 
mitsura

Thanks for the feedback!
I will certainly look at the elementtree stuff (I am new to Python so I
still need to find my way around)
 
