Re: Finding all instances of a string in an XML file

Discussion in 'Python' started by Peter Otten, Jun 21, 2013.

  1. Peter Otten (Guest)

    Jason Friedman wrote:

    > I have XML which looks like:
    >
    > <?xml version="1.0" encoding="UTF-8"?>
    > <!DOCTYPE KMART SYSTEM "my.dtd">
    > <LEVEL_1>
    > <LEVEL_2 ATTR="hello">
    > <ATTRIBUTE NAME="Property X" VALUE ="2"/>
    > </LEVEL_2>
    > <LEVEL_2 ATTR="goodbye">
    > <ATTRIBUTE NAME="Property Y" VALUE ="NULL"/>
    > <LEVEL_3 ATTR="aloha">
    > <ATTRIBUTE NAME="Property X" VALUE ="3"/>
    > </LEVEL_3>
    > <ATTRIBUTE NAME="Property Z" VALUE ="welcome"/>
    > </LEVEL_2>
    > </LEVEL_1>
    >
    > The "Property X" string appears twice and I want to output the "path"
    > that leads to each appearance. In this case the output would be:
    >
    > LEVEL_1 {}, LEVEL_2 {"ATTR": "hello"}, ATTRIBUTE {"NAME": "Property X",
    > "VALUE": "2"}
    > LEVEL_1 {}, LEVEL_2 {"ATTR": "goodbye"}, LEVEL_3 {"ATTR": "aloha"},
    > ATTRIBUTE {"NAME": "Property X", "VALUE": "3"}
    >
    > My actual XML file is 2000 lines and contains up to 8 levels of nesting.


    That's still small, so you can load the whole document into memory and
    walk the tree recursively:

    xml = """<?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE KMART SYSTEM "my.dtd">
    <LEVEL_1>
    <LEVEL_2 ATTR="hello">
    <ATTRIBUTE NAME="Property X" VALUE ="2"/>
    </LEVEL_2>
    <LEVEL_2 ATTR="goodbye">
    <ATTRIBUTE NAME="Property Y" VALUE ="NULL"/>
    <LEVEL_3 ATTR="aloha">
    <ATTRIBUTE NAME="Property X" VALUE ="3"/>
    </LEVEL_3>
    <ATTRIBUTE NAME="Property Z" VALUE ="welcome"/>
    </LEVEL_2>
    </LEVEL_1>
    """

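    import xml.etree.ElementTree as etree

    tree = etree.fromstring(xml)

    def walk(elem, path, token):
        # Extend the path with the current element; tuples are immutable,
        # so every recursion level works on its own copy.
        path += (elem,)
        if token in elem.attrib.values():
            yield path
        # Iterate over the element directly; getchildren() was removed
        # in Python 3.9.
        for child in elem:
            for match in walk(child, path, token):
                yield match

    for path in walk(tree, (), "Property X"):
        print(", ".join("{} {}".format(elem.tag, elem.attrib) for elem in path))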
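    Note that printing elem.attrib directly gives Python's dict repr with
    single quotes, while the requested output uses double quotes. A minimal
    sketch of one way to get that format, assuming json.dumps is acceptable
    for rendering the attribute dict. The names SAMPLE, find_paths and
    format_path are made up for this illustration, and the sample omits the
    XML declaration and DOCTYPE for brevity:

```python
import json
import xml.etree.ElementTree as ET

# Shortened copy of the sample document (prolog omitted).
SAMPLE = """<LEVEL_1>
<LEVEL_2 ATTR="hello">
<ATTRIBUTE NAME="Property X" VALUE ="2"/>
</LEVEL_2>
<LEVEL_2 ATTR="goodbye">
<ATTRIBUTE NAME="Property Y" VALUE ="NULL"/>
<LEVEL_3 ATTR="aloha">
<ATTRIBUTE NAME="Property X" VALUE ="3"/>
</LEVEL_3>
<ATTRIBUTE NAME="Property Z" VALUE ="welcome"/>
</LEVEL_2>
</LEVEL_1>
"""


def find_paths(elem, token, path=()):
    """Yield a tuple of elements (root first) for every element that
    has an attribute whose value equals token."""
    path += (elem,)
    if token in elem.attrib.values():
        yield path
    for child in elem:
        yield from find_paths(child, token, path)


def format_path(path):
    # json.dumps renders the attribute dict with double quotes.
    return ", ".join(
        "{} {}".format(elem.tag, json.dumps(elem.attrib)) for elem in path
    )


root = ET.fromstring(SAMPLE)
lines = [format_path(p) for p in find_paths(root, "Property X")]
print("\n".join(lines))
```

    The traversal is the same recursive walk as above; only the formatting
    step differs.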
     
    Peter Otten, Jun 21, 2013