Idempotent XML processing

Michael Ekstrand

Hello all,

In my current project, I am working with XML data in a protocol that has
checksum/signature verification of a portion of the document. There is
an envelope with a header element, containing signature data; following
the header is a body. The signatures are computed as cryptographic
checksums of the entire Body element, including start and end tags,
exactly as it appears in the data transmission.

Therefore, I need to extract the entire text of an element of an XML
document. I have a function that scans an XML string and does this, but
it seems like a rather clumsy way to accomplish this task. I've been
playing with xml.dom.minidom and its toxml() method, but to no avail -
the server sends me XML with empty elements as full open/close tags,
but toxml() serializes them to the XML empty element (<Element/>), so
the checksum winds up not matching.
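
The mismatch described above can be reproduced with a short sketch. The sample document and its element names (Envelope, Header, Body, Item) are made up for illustration, and SHA-256 is only a stand-in, since the post does not say which checksum the protocol uses.

```python
import hashlib
from xml.dom import minidom

# Hypothetical message: the sender writes the empty element as <Item></Item>.
raw = '<Envelope><Header/><Body><Item></Item></Body></Envelope>'
raw_body = '<Body><Item></Item></Body>'  # the exact text the sender hashed

doc = minidom.parseString(raw)
body = doc.getElementsByTagName('Body')[0]
reserialized = body.toxml()  # minidom writes childless elements as <Item/>

# Assumed digest algorithm; substitute whatever the protocol specifies.
sent = hashlib.sha256(raw_body.encode('utf-8')).hexdigest()
rebuilt = hashlib.sha256(reserialized.encode('utf-8')).hexdigest()

print(reserialized)
print(sent == rebuilt)
```

The two digests differ because the DOM keeps the element structure but not the original spelling of each tag, so a parse and reserialize round trip cannot be relied on to return the transmitted bytes.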

Is there some parsing mechanism (using PyXML or any other freely usable
3rd party library is an option) that will allow me to accomplish this?
Or am I best off sticking with my little string scanning function?
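
One possible middle ground between a DOM round trip and hand-rolled scanning is to let a parser find the element and then slice the original bytes. Below is a minimal sketch using the standard library's expat binding, whose CurrentByteIndex attribute reports the offset of the token that triggered the current callback. It assumes the message is held as bytes in an ASCII-compatible encoding such as UTF-8, that the element is named Body without a namespace prefix, and that SHA-256 is the digest; extract_raw_element is a hypothetical helper name, not part of any library.

```python
import hashlib
from xml.parsers import expat


def extract_raw_element(data: bytes, name: str) -> bytes:
    """Return the first element called `name`, exactly as it appears in `data`."""
    parser = expat.ParserCreate()
    span = {}
    depth = 0  # nesting level of elements that share the target name

    def on_start(tag, attrs):
        nonlocal depth
        if tag == name:
            if depth == 0 and 'start' not in span:
                # Offset of the '<' that opens the start tag.
                span['start'] = parser.CurrentByteIndex
            depth += 1

    def on_end(tag):
        nonlocal depth
        if tag == name:
            depth -= 1
            if depth == 0 and 'end' not in span:
                # The offset points at the start of the end tag; the first
                # '>' after it closes that tag.
                close = data.index(b'>', parser.CurrentByteIndex)
                span['end'] = close + 1

    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.Parse(data, True)

    if 'start' not in span or 'end' not in span:
        raise ValueError('element %r not found' % name)
    return data[span['start']:span['end']]


# Made-up sample message for illustration.
message = (b'<Envelope><Header><Sig>abc</Sig></Header>'
           b'<Body id="1"><Item></Item></Body></Envelope>')
raw_body = extract_raw_element(message, 'Body')
digest = hashlib.sha256(raw_body).hexdigest()  # assumed algorithm
print(raw_body)
```

Because the result is a slice of the received bytes, empty elements, attribute order and whitespace inside Body come through untouched. The search for '>' is a simplification: it would misfire on a self-closing target element that has a '>' inside an attribute value, and on encodings such as UTF-16.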

TIA,
Michael
 
W

Will McCutchen

> In my current project, I am working with XML data in a protocol that has
> checksum/signature verification of a portion of the document.
> ...
> the server sends me XML with empty elements as full open/close tags,
> but toxml() serializes them to the XML empty element (<Element/>), so
> the checksum winds up not matching.

(This is a horrible response to your question, so I apologize in
advance.)

Does it even make sense to use a checksum to verify XML, since there
are basically[1] infinite ways to validly write equivalent XML data?

I would think the only way that a checksum would be a viable way to
verify a document is if you had some sort of standard normalized
format, and made sure the data was normalized both before you computed
and before you verified the checksum. That way, you would be sure
that, for instance, all insignificant whitespace was removed and all
empty elements were represented uniformly.
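
As a rough illustration of that idea, Python 3.8 and later ship xml.etree.ElementTree.canonicalize(), which implements the W3C Canonical XML 2.0 rules. The sketch below assumes both ends agree to hash the canonical form; the sample fragments and the choice of SHA-256 are made up for the example. It would not help against a server that hashes the raw transmitted bytes.

```python
import hashlib
from xml.etree.ElementTree import canonicalize

# Two spellings of the same data: different empty-element syntax and
# different insignificant whitespace.
variant_a = '<Body><Item></Item></Body>'
variant_b = '<Body>\n  <Item />\n</Body>'


def canonical_digest(xml_text: str) -> str:
    # strip_text=True drops leading and trailing whitespace in text content,
    # which is only safe if that whitespace really is insignificant.
    canonical = canonicalize(xml_text, strip_text=True)
    return hashlib.sha256(canonical.encode('utf-8')).hexdigest()


print(canonicalize(variant_b, strip_text=True))
print(canonical_digest(variant_a) == canonical_digest(variant_b))
```

Canonical form always writes empty elements as a start and end tag pair, so both variants reduce to the same text and the same digest.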

Again, I'm sorry because I didn't provide any real useful information,
I just tried to poke holes in your current project.


Will.

[1] Unless all of your whitespace is significant
 
