dealing with nested xml within nested xml within......

Discussion in 'Python' started by Ultrus, Jul 9, 2007.

  1. Ultrus

    Ultrus Guest

    Hello all,
    I don't need specific examples, but I'm trying to wrap my head around
    parsing XML nested within XML to an arbitrary depth, with no limit on
    how far someone will nest it. I'm already making great use of
    BeautifulSoup's BeautifulStoneSoup to parse XML, but what do I do if I
    come across something like this?

    <random>
      <li>This is a random response (once parsed)</li>
      <li>
        <random>
          <li>This is a random response within a random response</li>
          <li>
            <random>
              <li>This is a random response within a random response,
              within another random response</li>
              <li>Like above, this is another random response.</li>
            </random>
          </li>
        </random>
      </li>
    </random>

    Not knowing how far one will nest random responses, how would one
    manage digging into xml like this? Right now I'm thinking about not
    even going there. I would presently write scripts that would parse 3
    or so levels deep, but no further. :p It would make an interesting
    project, like an interactive adventure story.
    Ultrus, Jul 9, 2007
    #1

  2. Ultrus

    Guest

    On Jul 9, 3:03 pm, Ultrus wrote:
    > Not knowing how far one will nest random responses, how would one
    > manage digging into xml like this? Right now I'm thinking about not
    > even going there. I would presently write scripts that would parse 3
    > or so levels deep, but no further. :p It would make an interesting
    > project, like an interactive adventure story.


    You'd probably write a recursive function, one that calls itself for
    each nested <random> element, to parse something like this.
    Unfortunately, I am not a recursion expert. You can read up on it
    though:

    http://www.freenetpages.co.uk/hp/alan.gauld/tutrecur.htm
    http://pythonjournal.cognizor.com/pyj2.2/RecursionByAJChung.html

    I haven't used Beautiful Soup, but I think you can use lxml or
    ElementTree to get a tree object of the XML and then just iterate over
    the tree.
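
For illustration, here is a minimal sketch of that recursive idea using the standard library's xml.etree.ElementTree. The function name `walk` and the trimmed `SAMPLE` document are assumptions made for this example, not something from the thread; the sample only mimics the shape of the XML in the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample, shaped like the XML in the question but trimmed.
SAMPLE = """
<random>
  <li>first level</li>
  <li>
    <random>
      <li>second level</li>
      <li>
        <random>
          <li>third level</li>
        </random>
      </li>
    </random>
  </li>
</random>
"""


def walk(random_elem, depth=0):
    """Return (depth, text) pairs for every <li> that has text, at any depth."""
    found = []
    for li in random_elem.findall("li"):
        # li.text is None when the element has no leading text at all.
        text = (li.text or "").strip()
        if text:
            found.append((depth, text))
        # A nested <random> is handled by the same function: this is the recursion.
        for nested in li.findall("random"):
            found.extend(walk(nested, depth + 1))
    return found


root = ET.fromstring(SAMPLE.strip())
results = walk(root)
for depth, text in results:
    print("  " * depth + text)
```

Because the function calls itself for each nested element, the nesting depth never has to be hard-coded. Extremely deep nesting could eventually hit Python's recursion limit, but that is far beyond a few dozen levels.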

    Mike
    , Jul 9, 2007
    #2

  4. Ultrus wrote:
    > Not knowing how far one will nest random responses, how would one
    > manage digging into xml like this?


    I don't know what you want to do with this document, but you might want to
    consider using lxml.etree to handle it:

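    >>> from lxml import etree
    >>> tree = etree.parse("myfile.xml")
    >>> # iter() visits every <random> element, however deeply nested
    >>> for random in tree.iter("random"):
    ...     for li in random:
    ...         # li.text is None when <li> has no leading text
    ...         if li.text and li.text.strip():
    ...             print(li.text)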
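
If the goal is instead to resolve the document to a single response (one reading of the "interactive adventure story" idea, and an assumption here), a sketch along these lines might work. It uses the standard library's xml.etree.ElementTree rather than lxml; the function name `pick`, its `rng` parameter, and the tiny inline document are all hypothetical.

```python
import random
import xml.etree.ElementTree as ET


def pick(random_elem, rng=random):
    """Choose one <li> at random; if it wraps a <random>, choose again inside it.

    Raises IndexError if a <random> element has no <li> children.
    """
    li = rng.choice(random_elem.findall("li"))
    nested = li.find("random")
    if nested is not None:
        # Same function, one level deeper: depth is not limited in advance.
        return pick(nested, rng)
    return (li.text or "").strip()


# Hypothetical two-level document for demonstration.
doc = ET.fromstring(
    "<random>"
    "<li>A</li>"
    "<li><random><li>B</li><li>C</li></random></li>"
    "</random>"
)
print(pick(doc))
```

Passing a seeded `random.Random` instance as `rng` makes the choice repeatable, which can help when testing.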


    http://codespeak.net/lxml/

    Stefan
    Stefan Behnel, Jul 9, 2007
    #4
