Problem with "&" character in XML

Kirt

I have walked a directory tree and written the following XML document.
One of the folder names contained an "&" character, so I replaced it with "&amp;".
#------------------test1.xml
<Directory>
<dirname>C:\Documents and Settings\Administrator\Desktop\1\bye
w&amp;y </dirname>
<file>
<name>def.txt</name>
<time>200607130417</time>
</file>
</Directory>
<Directory>
<dirname>C:\Documents and Settings\Administrator\Desktop\1\hii
wx</dirname>
<file>
<name>abc.txt</name>
<time>200607130415</time>
</file>
</Directory>

Now, in my Python code, I want to parse this document and print each
directory name.
###----------handler------------filename---handler.py
from xml.sax.handler import ContentHandler

class oldHandler(ContentHandler):
    def __init__(self):
        self.dn = 0

    def startElement(self, name, attrs):
        if name == 'dirname':
            self.dn = 1

    def characters(self, str):
        if self.dn:
            print str

    def endElement(self, name):
        if name == 'dirname':
            self.dn = 0


#---------------------------------------------------------------------
#main code--- fname----art.py
import sys
from xml.sax import make_parser
from handler import oldHandler

ch = oldHandler()
saxparser = make_parser()

saxparser.setContentHandler(ch)
saxparser.parse(sys.argv[1])
#-----------------------------------------------------------------------------
I run the code as: $ python art.py test1.xml

I am getting this output:

C:\Documents and Settings\Administrator\Desktop\1\bye w
&
y
C:\Documents and Settings\Administrator\Desktop\1\hii wx

whereas I need output that looks like this:
C:\Documents and Settings\Administrator\Desktop\1\bye w&y

C:\Documents and Settings\Administrator\Desktop\1\hii wx

Can someone tell me the solution for this?
 
Bjoern Hoehrmann

* Kirt wrote in comp.text.xml:
> i am getting output as:
>
> C:\Documents and Settings\Administrator\Desktop\1\bye w
> &
> y
> C:\Documents and Settings\Administrator\Desktop\1\hii wx
>
> where as i need an output which should look like this.
> C:\Documents and Settings\Administrator\Desktop\1\bye w&y
>
> C:\Documents and Settings\Administrator\Desktop\1\hii wx
>
> Can someone tell me the solution for this.

SAX allows a parser to split text into several characters events, as you
are seeing here. If there is no switch to force the SAX parser to accumulate
the text before calling the handler, you have to do that yourself.
 
George Bina

A SAX parser may report a single text node through any number of calls to
the characters method, so you need to accumulate everything you receive in
characters and output the text when you get a notification other than
characters.

Best Regards,
George
 
Joe Kesselman

Note that any good SAX tutorial will demonstrate how to buffer the
characters() events, if you don't feel like reinventing the solution
yourself.
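
The buffering described in the replies above could be sketched roughly as follows. This is only an illustration, not code from the thread: it is written for Python 3, and the names DirnameHandler, read_dirnames and SAMPLE, as well as the <listing> root element wrapped around the sample data, are made up for the example. It also assumes that the line break inside <dirname> comes from wrapping and may be collapsed to a single space.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class DirnameHandler(ContentHandler):
    """Buffers characters() chunks and keeps the full text of each <dirname>."""

    def __init__(self):
        super().__init__()
        self.in_dirname = False
        self.chunks = []      # pieces of the current <dirname> text
        self.dirnames = []    # completed directory names

    def startElement(self, name, attrs):
        if name == 'dirname':
            self.in_dirname = True
            self.chunks = []

    def characters(self, content):
        # May be called several times for one text node, e.g. around "&amp;".
        if self.in_dirname:
            self.chunks.append(content)

    def endElement(self, name):
        if name == 'dirname':
            text = ''.join(self.chunks)
            # Assumption: runs of whitespace (including the wrapped line
            # break) can be collapsed to one space.
            self.dirnames.append(' '.join(text.split()))
            self.in_dirname = False


def read_dirnames(xml_bytes):
    """Parse the XML given as bytes and return the list of directory names."""
    handler = DirnameHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.dirnames


# Hypothetical sample: the data from the question inside an assumed root element.
SAMPLE = rb"""<listing>
<Directory>
<dirname>C:\Documents and Settings\Administrator\Desktop\1\bye
w&amp;y </dirname>
</Directory>
<Directory>
<dirname>C:\Documents and Settings\Administrator\Desktop\1\hii
wx</dirname>
</Directory>
</listing>"""

if __name__ == '__main__':
    for dirname in read_dirnames(SAMPLE):
        print(dirname)
```

The point of the design is that nothing is printed from characters(); the text is joined only in endElement, once the parser has signalled that the element is complete. If folder names can contain consecutive spaces, the whitespace collapsing step would need to be replaced with something more careful.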
 
