Convert raw data to XML

elrondrules

Hi,

I am running an HTTP server which receives POST requests from another process.
In my do_POST method I receive the raw request data.

I know that this raw data contains valid XML, and I need to
write it out as an XML file.

Are there any routines to do this? If not, how would I write one?

For example the raw data is as follows

<?xml version="1.0" ?><Blah><ABC><Id id="1"/><Description>SomeText </Description><Result>PassorFail</Result></ABC></Blah>

without spaces or new lines. I need this to be written into an XML
file as

<?xml version="1.0" ?>
<Blah>
<ABC>
<Id id="1"/>
<Description>
SomeText
</Description>
<Result>
PassorFail
</Result>
</ABC>
</Blah>

The tags in the raw data are not always the same, so I need a
generic routine that does the trick.

Any pointers on this would help.

Thanks
 
Justin Ezequiel

>>> raw = r'<?xml version="1.0" ?><Blah><ABC><Id id="1"/><Description>SomeText </Description><Result>PassorFail</Result></ABC></Blah>'
>>> import xml.dom.ext  # PrettyPrint is provided by the PyXML package
>>> import xml.dom.minidom
>>> doc = xml.dom.minidom.parseString(raw)
>>> xml.dom.ext.PrettyPrint(doc)
<?xml version='1.0' encoding='UTF-8'?>
<Blah>
  <ABC>
    <Id id='1'/>
    <Description>SomeText </Description>
    <Result>PassorFail</Result>
  </ABC>
</Blah>
 
Gabriel Genellina

Is the file supposed to be processed by humans? If not, just write it as
you receive it.
Spaces, newlines and indentation are mostly irrelevant in an XML file.
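
A rough sketch of "write it as you receive it", in Python 3 syntax (the thread itself is older than that). The handler reads exactly Content-Length bytes from the request body and writes them to a file unchanged. Only do_POST comes from the original question; the names PostHandler, save_post_body, run and the file name received.xml are made up for illustration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

OUTPUT_PATH = "received.xml"  # hypothetical file name, not from the thread


def save_post_body(rfile, content_length, path):
    """Copy content_length bytes from rfile to path; return bytes written."""
    body = rfile.read(content_length)
    with open(path, "wb") as out:
        out.write(body)  # written byte for byte, no reformatting
    return len(body)


class PostHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # The body length is announced by the client in Content-Length.
        length = int(self.headers.get("Content-Length", 0))
        save_post_body(self.rfile, length, OUTPUT_PATH)
        self.send_response(200)
        self.end_headers()


def run(port=8000):
    """Serve forever on the given port (8000 is an arbitrary choice)."""
    HTTPServer(("", port), PostHandler).serve_forever()
```

Opening the file in binary mode avoids any guess about the encoding of the incoming XML.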
 
John Nagle

Actually, that's not "raw data" coming in, that's valid XML.
Why do you need to indent it? Just write it to a file.

If you really need to indent XML, get BeautifulSoup, read the
XML in with BeautifulStoneSoup, and write it back out with
"prettify()". But if the next thing to see that XML is a program,
not a human, why bother?

John Nagle
 
elrondrules

The reason I wanted to write it to a file was to parse the file, look
for a specific attribute and execute a set of commands based on the
value of that attribute. I also need to display the output of the
HTTP POST in a more readable format.
 
John Nagle

That's straightforward. You confused people by asking the
wrong question. You wrote "Convert raw data to XML", but what
you want to do is parse XML and extract data from it.

This will do what you want:

http://www.crummy.com/software/BeautifulSoup/

For starters, try

from BeautifulSoup import BeautifulStoneSoup
xmlstring = somexml ## get your XML into here as one big string
soup = BeautifulStoneSoup(xmlstring) # parse XML into tree
print soup.prettify() # print out in indented format

"soup" is a tree structure representing the XML, and there are
functions to easily find items in the tree by tag name, attribute,
and such. Work on the tree, not a file with the text of the indented
output.
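
The same "work on the tree" idea can also be sketched with xml.dom.minidom from the standard library (Python 3 syntax here). The thread does not say which attribute matters, so this example assumes it is the id attribute of the Id element from the sample data. The ACTIONS table and its entries are purely hypothetical stand-ins for "a set of commands".

```python
from xml.dom import minidom

RAW = (
    '<?xml version="1.0" ?><Blah><ABC><Id id="1"/>'
    "<Description>SomeText </Description>"
    "<Result>PassorFail</Result></ABC></Blah>"
)

# Hypothetical mapping from attribute value to the commands to run.
ACTIONS = {
    "1": lambda: "commands for id 1",
    "2": lambda: "commands for id 2",
}


def get_id(raw):
    """Return the id attribute of the first <Id> element, or None."""
    doc = minidom.parseString(raw)
    nodes = doc.getElementsByTagName("Id")
    if not nodes or not nodes[0].hasAttribute("id"):
        return None
    return nodes[0].getAttribute("id")


def dispatch(raw):
    """Run the action registered for the document's id, if there is one."""
    action = ACTIONS.get(get_id(raw))
    if action is None:
        return "no action for this id"
    return action()
```

Calling dispatch(RAW) looks up "1" in ACTIONS and runs that entry; nothing has to be written to disk first.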


John Nagle
 
elrondrules

Is there any other way to do this without using BeautifulStoneSoup,
using the existing minidom or ext modules?
I don't want to install anything new.
 
Gabriel Genellina

It appears that you already know the answer... Look at the minidom
documentation, toprettyxml method.
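
A minimal sketch of that suggestion, in Python 3 syntax and using only the standard library. The helper names prettify and write_pretty and the two-space indent are choices made for this example, not something taken from the thread.

```python
from xml.dom import minidom


def prettify(raw, indent="  "):
    """Parse an XML string and return it re-indented, one element per line."""
    return minidom.parseString(raw).toprettyxml(indent=indent)


def write_pretty(raw, path):
    """Write the indented form of raw to path as UTF-8 text."""
    with open(path, "w", encoding="utf-8") as out:
        out.write(prettify(raw))
```

This works best on input with no whitespace between tags, like the sample in the question. If the input is already indented, toprettyxml adds its own indentation on top of the existing whitespace, which tends to produce blank lines.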
 
