XML help


chris

I'm 4 months new to Python and 4 hours new to XML. I've been trying to
understand and use the DOM tree walk sample shown at this site:
http://www.rexx.com/~dkuhlman/pyxmlfaq.html to walk through an XML file from
which I need to extract data for subsequent plotting.

I've repeated the functions from the site that I'm using here:

import sys, string
from xml.dom import minidom, Node

def walkTree(node):
    if node.nodeType == Node.ELEMENT_NODE:
        yield node
        for child in node.childNodes:
            for n1 in walkTree(child):
                yield n1

def test(inFileName):
    outFile = sys.stdout
    doc = minidom.parse(inFileName)
    rootNode = doc.documentElement
    level = 0
    for node in walkTree(rootNode):
        my_processing(node, outFile)

A piece of the XML file I want to process is here:

<XMLDocument>
<!-- ****************************************************************** -->
<!-- ****************************************************************** -->
<!--File Name: C:\Temp\slit_coarse_isowall_velocity.xml-->
<!-- ****************************************************************** -->
<!-- ****************************************************************** -->
  <HEADER>
    <NAME> Simulation Results XML Writer</NAME>
    <Version> 1.00</Version>
  </HEADER>
  <Dataset Name="Velocity" ID="1555">
    <DataType> ELDT(Element data)</DataType>
    <DeptVar Name="Velocity" Unit="m/s"/>
    <NumberOfComponents> 1</NumberOfComponents>
    <NumberOfIndpVariables> 2</NumberOfIndpVariables>
    <IndpVar Name="Time" Unit="s"/>
    <IndpVar Name="Normalized thickness" Unit=""/>
    <Blocks>
      <NumberOfBlocks> 231</NumberOfBlocks>
      <Block Index="1">
        <IndpVar Name="Time" Value="0.065320" Unit="s"/>
        <IndpVar Name="Normalized thickness" Value="0.000000" Unit=""/>
        <NumberOfDependentVariables> 32</NumberOfDependentVariables>
        <Data>
          <ElementData ID="1">
            <DeptValues> 1.5098e+000</DeptValues>
          </ElementData>
          <ElementData ID="2">
            <DeptValues> 1.4991e+000</DeptValues>
          </ElementData>
          <ElementData ID="7">
            <DeptValues> 1.4744e+000</DeptValues>
          ......
        </Data>
      </Block>
      <Block Index="2">
      ....

As can be seen, the data is organized in blocks, each holding one data point
per finite element ID. The number of entries in each block varies, and
element IDs are not necessarily contiguous.

I've managed to test for specific elements and extract values. I want to
place the results in arrays with the array index equal to the element ID. So
as I walk the tree I temporarily store IDs and DeptValues in lists. I'm ok so
far. I then intend to create an array whose size is determined by the maximum
value of ID. So in the sample above the array size will be 8 even though
only three entries exist.

At this point I'm stuck because I want to do this latter array creation and
processing when I "see" the </Block> end tag. However, I can't figure out
how to do that. Obviously I'm not understanding something about XML DOM
trees and elements, because when I try to print all elements I never see an
end tag for any of them. I'm obviously approaching this from a
read-a-line-and-process point of view, which is probably half the problem.

So how can I initiate array processing at the end of each block, prior to
reaching the next block? Of course I'm open to simpler ways too ;)

tia for any advice.
 

Diez B. Roggisch

I've managed to test for specific elements and extract values. I want to
place the results in arrays with the array index equal to the element ID. So
as I walk the tree I temporarily store IDs and DeptValues in lists. I'm ok so
far. I then intend to create an array whose size is determined by the maximum
value of ID. So in the sample above the array size will be 8 even though
only three entries exist.

That sounds wrong. If what you want is a mapping between keys (your IDs)
and values, you need to use a dictionary. Like this:

mapping[myid] = value

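To make the dictionary idea a bit more concrete, here is a minimal sketch. The names ids, values and dense are only illustrative, and the numbers are the three data points visible in the sample block; nothing here is taken from the poster's actual script.

```python
# Hypothetical ID/value pairs, copied from the sample block in the question.
ids = [1, 2, 7]
values = [1.5098, 1.4991, 1.4744]

# A dictionary stores only the entries that exist, so the IDs
# do not have to be contiguous.
mapping = {}
for myid, value in zip(ids, values):
    mapping[myid] = value

# If an index-addressable list is still wanted for plotting, it can be
# derived from the dictionary afterwards; None marks the missing IDs.
dense = [mapping.get(i) for i in range(max(mapping) + 1)]
```

With the sample data, dense has 8 slots (indices 0 to 7), of which only three hold a value, which is why the dictionary is usually the more natural container.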
At this point I'm stuck because I want to do this latter array creation and
processing when I "see" the </Block> end tag. However, I can't figure out
how to do that. Obviously I'm not understanding something about XML DOM
trees and elements, because when I try to print all elements I never see an
end tag for any of them. I'm obviously approaching this from a
read-a-line-and-process point of view, which is probably half the problem.

You're misunderstanding the nature of nodes in DOM: a node in DOM _is_ the
start and end tag. If you have a DOM tree that you serialize, a node in
there will be serialized either as

<name/>

if it has no children, or as

<name>
....
</name>

if it has children. So the other way round, for a well-formed XML document
parsed to a DOM, you end up with one node for each pair of
opening/closing tags.
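Since each Block node already contains everything between its start and end tag, the "end of block" is simply the point where you have finished looking at that node's children. A rough sketch with minidom, assuming the element and attribute names from the posted sample (SAMPLE is a cut-down stand-in for the real file, and read_blocks is a made-up helper name):

```python
from xml.dom import minidom

# Cut-down stand-in for the real file, using the tag names from the post.
SAMPLE = """<Blocks>
  <Block Index="1">
    <Data>
      <ElementData ID="1"><DeptValues> 1.5098e+000</DeptValues></ElementData>
      <ElementData ID="7"><DeptValues> 1.4744e+000</DeptValues></ElementData>
    </Data>
  </Block>
  <Block Index="2">
    <Data>
      <ElementData ID="2"><DeptValues> 1.4991e+000</DeptValues></ElementData>
    </Data>
  </Block>
</Blocks>"""

def read_blocks(doc):
    """Return {block index: {element ID: value}} for every Block node."""
    results = {}
    for block in doc.getElementsByTagName("Block"):
        mapping = {}
        for elem in block.getElementsByTagName("ElementData"):
            dept = elem.getElementsByTagName("DeptValues")[0]
            # The text inside <DeptValues> is the first child (a text node).
            mapping[int(elem.getAttribute("ID"))] = float(dept.firstChild.data)
        # All children of this Block have been visited here, so this is
        # the place for any per-block processing.
        results[int(block.getAttribute("Index"))] = mapping
    return results

blocks = read_blocks(minidom.parseString(SAMPLE))
```

For a file on disk, minidom.parse(inFileName) would take the place of minidom.parseString(SAMPLE).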

If an end tag is what you're after, you might want to look into the
event-driven XML API, SAX. But then you'll have other tradeoffs compared
to DOM: all state keeping has to be done by yourself. It's up to you to
choose.
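For comparison, a SAX version could look roughly like the sketch below. It assumes the same tag names as the posted sample; BlockHandler and its attributes are invented for illustration, and SAMPLE again stands in for the real file. Note how the handler has to remember by itself which block and which element ID it is currently inside.

```python
import xml.sax

class BlockHandler(xml.sax.ContentHandler):
    """Collects one {element ID: value} dictionary per Block."""

    def __init__(self):
        super().__init__()
        self.blocks = []        # finished blocks, in document order
        self.current = None     # dictionary for the block being read
        self.current_id = None  # ID of the enclosing ElementData
        self.text = None        # text pieces seen inside DeptValues

    def startElement(self, name, attrs):
        if name == "Block":
            self.current = {}
        elif name == "ElementData":
            self.current_id = int(attrs["ID"])
        elif name == "DeptValues":
            self.text = []

    def characters(self, content):
        # Text may arrive in several chunks, so collect and join later.
        if self.text is not None:
            self.text.append(content)

    def endElement(self, name):
        if name == "DeptValues":
            self.current[self.current_id] = float("".join(self.text))
            self.text = None
        elif name == "Block":
            # The end tag of a block: per-block processing goes here.
            self.blocks.append(self.current)
            self.current = None

SAMPLE = b"""<Blocks>
  <Block Index="1">
    <Data>
      <ElementData ID="1"><DeptValues> 1.5098e+000</DeptValues></ElementData>
      <ElementData ID="7"><DeptValues> 1.4744e+000</DeptValues></ElementData>
    </Data>
  </Block>
  <Block Index="2">
    <Data>
      <ElementData ID="2"><DeptValues> 1.4991e+000</DeptValues></ElementData>
    </Data>
  </Block>
</Blocks>"""

handler = BlockHandler()
xml.sax.parseString(SAMPLE, handler)
```

For a file on disk, xml.sax.parse(inFileName, handler) would be the equivalent call. SAX never builds the whole tree in memory, which may matter for large result files, at the price of the extra bookkeeping shown above.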

Regards,

Diez
 
