Newbie needs help extracting data from XML

Rodney

Hi,

I'm a Python newbie and am trying to get the data out of a series of XML
files. So for example the xml is:

<?xml version="1.0" encoding="utf-8"?><soap:Envelope
xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/"
xmlns:tns="http://www.ExchangeNetwork.net/schema/v1.0/node.wsdl"
xmlns:types="http://www.ExchangeNetwork.net/schema/v1.0/node.wsdl"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"><soap:Header><wsu:Timestamp
xmlns:wsu="http://schemas.xmlsoap.org/ws/2002/07/utility"><wsu:Created>2005-12-28T05:59:38Z</wsu:Created><wsu:Expires>2005-12-28T06:04:38Z</wsu:Expires></wsu:Timestamp></soap:Header><soap:Body
soap:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"><q1:NodePingResponse
xmlns:q1="http://www.ExchangeNetwork.net/schema/v1.0/node.xsd"><return
xsi:type="xsd:string">Ready</return></q1:NodePingResponse></soap:Body></soap:Envelope>


and I want to get the value from the element "return" which currently has a
value of "Ready".

Other XML files I want to work with may have several elements I want to pull
data from. This seems relatively easy, but I have been reading and searching
Google for hours and none of the examples make any sense.

I would appreciate any help writing the code for this.
 
Paul McGuire

Rodney said:
I'm a Python newbie and am trying to get the data out of a series of XML
files. [...] I want to get the value from the element "return" which
currently has a value of "Ready".

This is data in a set of files? It looks like the SOAP reply from a Web
Service described in the WSDL found at
http://www.ExchangeNetwork.net/schema/v1.0/node.wsdl, specifically the
NodePingResponse which is the return value from the web operation NodePing.
You might best generate the proper Python parsing classes using SOAPpy,
using the WSDL source.

You can also use Python's XML modules for loading the XML into a DOM tree,
or use Fredrik Lundh's ElementTree utility module.
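
As a rough illustration of the DOM route, here is a minimal sketch that uses only the standard library's xml.dom.minidom. SOAP_REPLY is a trimmed copy of the message quoted above, and get_return_value is a hypothetical helper name chosen for this example; it is not part of SOAPpy or of the original post.

```python
import xml.dom.minidom

# Trimmed copy of the SOAP reply from the original post (header omitted).
SOAP_REPLY = """<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <soap:Body>
    <q1:NodePingResponse
        xmlns:q1="http://www.ExchangeNetwork.net/schema/v1.0/node.xsd">
      <return xsi:type="xsd:string">Ready</return>
    </q1:NodePingResponse>
  </soap:Body>
</soap:Envelope>
"""


def get_return_value(xml_text):
    """Return the text of the first <return> element, or None if absent."""
    dom_tree = xml.dom.minidom.parseString(xml_text)
    nodes = dom_tree.getElementsByTagName("return")
    if not nodes:
        return None
    # An element's text lives in its child text nodes, so join them.
    return "".join(
        child.data
        for child in nodes[0].childNodes
        if child.nodeType == child.TEXT_NODE
    )


status = get_return_value(SOAP_REPLY)
print("Return status is: %r" % status)
```

To read from a file instead of a string, xml.dom.minidom.parse('my_xml_file.xml') returns the same kind of tree (the file name here is only a placeholder).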

-- Paul
 
Alan Kennedy

[Rodney]
I'm a Python newbie and am trying to get the data out of a series of XML
files.

As Paul McGuire already noted, it's unusual to extract information from
a SOAP message this way: it is more usual to use a SOAP toolkit to do
the job for you.

But, assuming that you know what you're doing, and that you're doing it
for good reasons, here's a snippet that uses xpath to do what you want.

#-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
document = """\
<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope
xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/"
xmlns:tns="http://www.ExchangeNetwork.net/schema/v1.0/node.wsdl"
xmlns:types="http://www.ExchangeNetwork.net/schema/v1.0/node.wsdl"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<soap:Header>
<wsu:Timestamp
xmlns:wsu="http://schemas.xmlsoap.org/ws/2002/07/utility">
<wsu:Created>2005-12-28T05:59:38Z</wsu:Created>
<wsu:Expires>2005-12-28T06:04:38Z</wsu:Expires>
</wsu:Timestamp>
</soap:Header>
<soap:Body
soap:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
<q1:NodePingResponse
xmlns:q1="http://www.ExchangeNetwork.net/schema/v1.0/node.xsd">
<return xsi:type="xsd:string">Ready</return>
</q1:NodePingResponse>
</soap:Body>
</soap:Envelope>
"""

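import xml.dom.minidom
import xml.xpath  # provided by PyXML, not by the standard library

# To read from a file instead of the string above:
#dom_tree = xml.dom.minidom.parse('my_xml_file.xml')
dom_tree = xml.dom.minidom.parseString(document)

# '//return' matches every <return> element in the tree; take the first.
return_node = xml.xpath.Evaluate('//return', dom_tree)[0]

# The element's text is stored in its first child (a text node).
print("Return status is: '%s'" % return_node.childNodes[0].nodeValue)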
#-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

You have to install PyXML to get xpath support: http://pyxml.sf.net

There are other ways to do it, e.g. using ElementTree, but I'll leave it
to others to suggest the best way to do that.
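
For what it's worth, here is one possible ElementTree version, sketched under a few assumptions: it uses xml.etree.ElementTree from the standard library rather than PyXML, and it pulls several elements at once since that was part of the original question. The NAMESPACES dict and the extract_fields helper are names made up for this example; the namespace URIs are copied from the message itself. Note that the "return" element carries no prefix and sits in no namespace, so it is matched by its plain name.

```python
import xml.etree.ElementTree as ET

# Prefix-to-URI map used in the search paths below. The prefixes are local
# to this script; only the URIs have to match the document.
NAMESPACES = {
    "wsu": "http://schemas.xmlsoap.org/ws/2002/07/utility",
    "q1": "http://www.ExchangeNetwork.net/schema/v1.0/node.xsd",
}

# The SOAP reply from the original post, minus the unused namespace
# declarations.
SOAP_REPLY = """<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <soap:Header>
    <wsu:Timestamp
        xmlns:wsu="http://schemas.xmlsoap.org/ws/2002/07/utility">
      <wsu:Created>2005-12-28T05:59:38Z</wsu:Created>
      <wsu:Expires>2005-12-28T06:04:38Z</wsu:Expires>
    </wsu:Timestamp>
  </soap:Header>
  <soap:Body>
    <q1:NodePingResponse
        xmlns:q1="http://www.ExchangeNetwork.net/schema/v1.0/node.xsd">
      <return xsi:type="xsd:string">Ready</return>
    </q1:NodePingResponse>
  </soap:Body>
</soap:Envelope>
"""


def extract_fields(xml_text):
    """Collect a few values from the reply; missing elements map to None."""
    root = ET.fromstring(xml_text)
    return {
        "created": root.findtext(".//wsu:Created", namespaces=NAMESPACES),
        "expires": root.findtext(".//wsu:Expires", namespaces=NAMESPACES),
        "status": root.findtext(
            ".//q1:NodePingResponse/return", namespaces=NAMESPACES
        ),
    }


fields = extract_fields(SOAP_REPLY)
for name, value in fields.items():
    print("%s: %s" % (name, value))
```

The same pattern extends to other messages by adding one path per element of interest; use root.findall() instead of findtext() where an element can repeat.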

HTH,
 
Rodney

Thanks for the help

This was a SOAP web service message. I used httplib instead of SOAPpy or ZSI
because SOAPpy can't do arrays of complex types and ZSI was confusing.

Thanks again
 
