Dealing with nested XML within nested XML within...

Ultrus

Hello all,
I don't need specific examples, but I'm trying to wrap my head around
parsing xml within xml and even further, not limiting how far someone
will nest xml. I'm already making great use of BeautifulSoup's
BeautifulStoneSoup to parse xml, but what do I do if I come across
something like this?

<random>
  <li>This is a random response (once parsed)</li>
  <li>
    <random>
      <li>This is a random response within a random response</li>
      <li>
        <random>
          <li>This is a random response within a random response,
          within another random response</li>
          <li>Like above, this is another random response.</li>
        </random>
      </li>
    </random>
  </li>
</random>

Not knowing how far one will nest random responses, how would one
manage digging into xml like this? Right now I'm thinking about not
even going there. I would presently write scripts that would parse 3
or so levels deep, but no further. :p It would make an interesting
project, like an interactive adventure story.
 
kyosohma

You'd probably write a function that calls itself to parse something
like this: it handles one <random> block, and whenever it finds another
<random> inside an <li> it calls itself on that block. I am not a
recursion expert, but you can read up on it:

http://www.freenetpages.co.uk/hp/alan.gauld/tutrecur.htm
http://pythonjournal.cognizor.com/pyj2.2/RecursionByAJChung.html

I haven't used Beautiful Soup, but I think you can use lxml or
ElementTree to get a tree object of the XML and then just iterate over
the tree.
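
A minimal sketch of that recursive idea, assuming the standard library's xml.etree.ElementTree rather than Beautiful Soup. The function name pick_response and its rng parameter are made up for illustration, and the sample is the markup from the question. It is one possible reading of "random response", not a definitive implementation.

```python
import random
import xml.etree.ElementTree as ET

# Sample markup copied from the question.
SAMPLE = """<random>
  <li>This is a random response (once parsed)</li>
  <li>
    <random>
      <li>This is a random response within a random response</li>
      <li>
        <random>
          <li>This is a random response within a random response,
          within another random response</li>
          <li>Like above, this is another random response.</li>
        </random>
      </li>
    </random>
  </li>
</random>"""


def pick_response(random_elem, rng=random):
    """Pick one <li> at random; if it wraps another <random>, recurse into it.

    pick_response is a hypothetical helper, not part of any library.
    """
    choice = rng.choice(random_elem.findall("li"))
    nested = choice.find("random")
    if nested is not None:
        # Same function, one level deeper: the nesting depth never
        # has to be known in advance.
        return pick_response(nested, rng)
    # Leaf <li>: collapse whitespace left over from line wrapping.
    return " ".join((choice.text or "").split())


root = ET.fromstring(SAMPLE)
print(pick_response(root))
```

Each call only looks at the direct <li> children of one <random>, so the same few lines cover three levels or thirty. Python's default recursion limit would only matter for documents nested hundreds of levels deep.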

Mike
 
Stefan Behnel

I don't know what you want to do with this document, but you might want to
consider using lxml.etree to handle it:
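    >>> from lxml import etree
    >>> tree = etree.parse("responses.xml")  # file name is a placeholder
    >>> for random in tree.iter("random"):
    ...     for li in random:
    ...         if li.text and li.text.strip():
    ...             print(li.text)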


http://codespeak.net/lxml/
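
If installing lxml is not an option, roughly the same walk can be sketched with only the standard library. This is an assumption-laden illustration, not lxml's API: it uses xml.etree.ElementTree, the helper name walk_responses is hypothetical, and the compact sample string is a stand-in with the same shape as the markup in the question. It also records how deep each response sits, using an explicit stack instead of recursion.

```python
import xml.etree.ElementTree as ET

# Stand-in document with the same shape as the one in the question.
SAMPLE = (
    "<random><li>a</li><li>"
    "<random><li>b</li><li>"
    "<random><li>c</li><li>d</li></random>"
    "</li></random>"
    "</li></random>"
)


def walk_responses(root):
    """Yield (depth, text) for every non-empty <li>, at any nesting level.

    walk_responses is a hypothetical helper, not part of any library.
    """
    stack = [(root, 1)]  # explicit stack instead of recursive calls
    while stack:
        random_elem, depth = stack.pop()
        for li in random_elem.findall("li"):
            text = (li.text or "").strip()
            if text:
                yield depth, text
            # Queue any nested <random> blocks one level deeper.
            for nested in li.findall("random"):
                stack.append((nested, depth + 1))


for depth, text in walk_responses(ET.fromstring(SAMPLE)):
    print(depth, text)
```

Because the pending <random> blocks live in a list rather than on the call stack, the depth of nesting is limited only by memory.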

Stefan
 
