xml.sax removing newlines from attribute value?

Grant Edwards

I'm using xml.sax to parse the "datebook" xml file generated by
QTopiaDesktop. When I look at the xml file, some of the
attribute strings have newlines in them (as they are supposed
to).

However, when xml.sax passes the attributes to my
startElement() method the newlines seem to have been deleted.

How do I get the un-munged element attribute values?
 
Fredrik Lundh

Grant said:
I'm using xml.sax to parse the "datebook" xml file generated by
QTopiaDesktop. When I look at the xml file, some of the
attribute strings have newlines in them (as they are supposed
to).

However, when xml.sax passes the attributes to my
startElement() method the newlines seem to have been deleted.

How do I get the un-munged element attribute values?

newlines as in chr(10) rather than &#10;?

if so, the only way is to avoid XML:

http://www.w3.org/TR/REC-xml/#AVNormalize
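
To make the normalization rule concrete, here is a minimal sketch, assuming Python 3 and the default expat-based xml.sax parser. The `<event note="...">` element and the `AttrCollector` and `parse_attrs` names are made up for illustration and are not taken from the QTopiaDesktop file. It parses the same attribute twice, once with a literal chr(10) and once with the `&#10;` character reference:

```python
import xml.sax


class AttrCollector(xml.sax.ContentHandler):
    """Hypothetical handler that records every attribute value it sees."""

    def __init__(self):
        super().__init__()
        self.values = {}

    def startElement(self, name, attrs):
        # attrs holds the values *after* the parser has normalized them.
        for key in attrs.getNames():
            self.values[key] = attrs.getValue(key)


def parse_attrs(document: bytes) -> dict:
    """Parse a small XML document and return its attribute values."""
    handler = AttrCollector()
    xml.sax.parseString(document, handler)
    return handler.values


# A literal newline inside the attribute value (what the file on disk shows).
literal = parse_attrs(b'<event note="line one\nline two"/>')

# The same newline written as a character reference.
escaped = parse_attrs(b'<event note="line one&#10;line two"/>')

print(repr(literal["note"]))   # 'line one line two'
print(repr(escaped["note"]))   # 'line one\nline two'
```

The literal newline reaches startElement() as a space, while the character reference comes through as a real newline, which is the behaviour the spec section above describes.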

if the "yes, I know, but I have good reasons" approach is okay with you,
and you're big enough to defend yourself against the XML-Is-The-Law
crowd, you can use a "sloppy" XML parser such as sgmlop to deal with
your files:

http://effbot.org/zone/sgmlop-index.htm

</F>
 
Grant Edwards

Fredrik said:
newlines as in chr(10) rather than &#10;?

Yup, looks that way.

Fredrik said:
if so, the only way is to avoid XML:

http://www.w3.org/TR/REC-xml/#AVNormalize

I can't quite find it in the BNF, but I take it that chr(10)
isn't really allowed in XML attribute strings. IOW, the file
generated by Trolltech's app is broken.

Fredrik said:
if the "yes, I know, but I have good reasons" approach is okay
with you,

I didn't define the file or write the program that generated
it. It's claimed to be "xml", and I'm just trying to parse it.

Fredrik said:
and you're big enough to defend yourself against the
XML-Is-The-Law crowd, you can use a "sloppy" XML parser such
as sgmlop to deal with your files:

http://effbot.org/zone/sgmlop-index.htm

Good to know for future reference. For now, I think I'll just
live with the way it works. Everything basically works, except
some strings don't display quite "right". My current app
treats the file as read-only. If I ever get around to
modifying data and writing it back, I'll probably have to deal
with the newline issue at that point.
 
Fredrik Lundh

Grant said:
I can't quite find it in the BNF, but I take it that chr(10)
isn't really allowed in XML attribute strings. IOW, the file
generated by Trolltech's app is broken.

it's allowed, but the parser must not pass it on to the application.

(in other words, whitespace in attributes doesn't, in general, survive
roundtripping)
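
For the writing side, a rough sketch of how a newline can be made to survive, assuming Python 3 and that you control the code that writes the file (which is not the case for the QTopiaDesktop output discussed here). The `NoteReader` and `read_note` names and the `<event note="...">` element are hypothetical. `xml.sax.saxutils.quoteattr` writes a newline as `&#10;`, which the parser hands back unchanged:

```python
import xml.sax
from xml.sax.saxutils import quoteattr


class NoteReader(xml.sax.ContentHandler):
    """Hypothetical handler that remembers the 'note' attribute."""

    def __init__(self):
        super().__init__()
        self.note = None

    def startElement(self, name, attrs):
        # Keep the first 'note' attribute we come across.
        if self.note is None:
            self.note = attrs.get("note")


def read_note(document: str):
    """Parse the document and return the value of its 'note' attribute."""
    handler = NoteReader()
    xml.sax.parseString(document.encode("utf-8"), handler)
    return handler.note


original = "line one\nline two"

# Naive writer: pastes the raw string between quotes, newline and all.
naive = '<event note="%s"/>' % original

# Careful writer: quoteattr() adds the quotes and escapes the newline.
careful = "<event note=%s/>" % quoteattr(original)

print(repr(read_note(naive)))    # 'line one line two'
print(repr(read_note(careful)))  # 'line one\nline two'
```

Only the second form round-trips. The first is still legal XML, but the newline is gone by the time the application sees the value.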

</F>
 
Grant Edwards

Fredrik said:
it's allowed, but the parser must not pass it on to the application.

(in other words, whitespace in attributes doesn't, in general, survive
roundtripping)

Ah, I see. That's good to know.

[This is my first attempt at anything XMLish.]
 
