Jason Friedman
I have XML which looks like:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE KMART SYSTEM "my.dtd">
<LEVEL_1>
  <LEVEL_2 ATTR="hello">
    <ATTRIBUTE NAME="Property X" VALUE ="2"/>
  </LEVEL_2>
  <LEVEL_2 ATTR="goodbye">
    <ATTRIBUTE NAME="Property Y" VALUE ="NULL"/>
    <LEVEL_3 ATTR="aloha">
      <ATTRIBUTE NAME="Property X" VALUE ="3"/>
    </LEVEL_3>
    <ATTRIBUTE NAME="Property Z" VALUE ="welcome"/>
  </LEVEL_2>
</LEVEL_1>
The "Property X" string appears twice and I want to output the "path"
that leads to all such appearances. In this case the output would be:
LEVEL_1 {}, LEVEL_2 {"ATTR": "hello"}, ATTRIBUTE {"NAME": "Property X",
"VALUE": "2"}
LEVEL_1 {}, LEVEL_2 {"ATTR": "goodbye"}, LEVEL_3 {"ATTR": "aloha"},
ATTRIBUTE {"NAME": "Property X", "VALUE": "3"}
My actual XML file is 2000 lines and contains up to 8 levels of nesting.
I have tried this so far (partial code, using the xml.etree.ElementTree
module):
def get_path(data_dictionary, val, path):
    for node in data_dictionary[CHILDREN]:
        if node[CHILDREN]:
            if not path or node[TAG] != path[-1]:
                path.append(node[TAG])
            print(CR + "recursing ...")
            get_path(node, val, path)
        else:
            for k,v in node[ATTRIB].items():
                if v == val:
                    print("path- ",path)
                    print("---- " + node[TAG] + " " + str(node[ATTRIB]))
I'm really not even close to getting the output I am looking for.
Python 3.2.2.
Thank you.
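For reference, one possible way to get the output described above is to walk the ElementTree elements directly and carry the chain of ancestors down through the recursion. This is only a minimal sketch, not a tested solution for the real 2000-line file: the names SAMPLE, describe and find_paths are made up for illustration, and it assumes that a "match" means any attribute value equal to the search string.

```python
import json
import xml.etree.ElementTree as ET

# The sample document from the question, as bytes so the encoding
# declaration is honoured by the parser.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE KMART SYSTEM "my.dtd">
<LEVEL_1>
  <LEVEL_2 ATTR="hello">
    <ATTRIBUTE NAME="Property X" VALUE ="2"/>
  </LEVEL_2>
  <LEVEL_2 ATTR="goodbye">
    <ATTRIBUTE NAME="Property Y" VALUE ="NULL"/>
    <LEVEL_3 ATTR="aloha">
      <ATTRIBUTE NAME="Property X" VALUE ="3"/>
    </LEVEL_3>
    <ATTRIBUTE NAME="Property Z" VALUE ="welcome"/>
  </LEVEL_2>
</LEVEL_1>
"""


def describe(elem):
    """Format one element as: TAG {"ATTR": "value", ...} (hypothetical helper)."""
    return "{} {}".format(elem.tag, json.dumps(elem.attrib, sort_keys=True))


def find_paths(elem, val, ancestors=()):
    """Yield one path string per element that has an attribute value == val.

    `ancestors` is a tuple, so each branch of the recursion gets its own
    copy of the trail and nothing has to be popped on the way back up.
    """
    trail = ancestors + (elem,)
    if val in elem.attrib.values():
        yield ", ".join(describe(e) for e in trail)
    for child in elem:
        # Explicit loop instead of "yield from" so it also runs on Python 3.2.
        for path in find_paths(child, val, trail):
            yield path


root = ET.fromstring(SAMPLE)
for path in find_paths(root, "Property X"):
    print(path)
```

Passing the trail as an immutable tuple avoids the shared, ever-growing path list in the attempt above, where tags appended for one branch are still present when the next branch is visited. For a file on disk, ET.parse(filename).getroot() could be used in place of ET.fromstring(SAMPLE).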