libxml

Faith Greenwood

How do I return multiple nodes in an xpath search using LibXML? I have
the following xml:

<entry>
<set>
<nodeA>A</nodeA>
<book><name>Near Dead</name><date>1992</date></book>
<page>1262</page>
</set>

<set>
<nodeA>A</nodeA>
<book><name>Alive and Well</name><date>1973</date></book>
<page></page>
</set>

<set>
<nodeA>A</nodeA>
<book><name>Still Kicking</name><date>1968</date></book>
<page>1598</page>
</set>
</entry>


Here is my perl:

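use strict;
use warnings;
use XML::LibXML;

my $parser = XML::LibXML->new();
my $doc    = $parser->parse_file("xml.xml");
my @array;
# Select the text of every <page> inside a <set> whose <nodeA> is 'A'
my $search = "//entry/set[nodeA/text()='A']/page/text()";
push(@array, $_->data) for ($doc->findnodes($search));
print "@array\n";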
#######
When I print out @array, I get "1262 1598", as expected. The problem
is I don't know the book that 1598 belongs to. Does 1598 belong to
"Still Kicking" or "Alive and Well"? Further, I need to keep track of
the page AND the date AND the author, without mixing any of them up.
If I do 2 diff. searches for the page and dates and push those into 2
separate arrays, obviously I won't be able to use them in the correct
order.

Does anyone have any suggestions?

thx!
 
Faith Greenwood

I should also mention that this is only a small subset of the xml I am
dealing with. There are many page nodes that have the same page number
(different book) and many date nodes that have the same date. Simply
searching on "1598" would return many books and wouldn't do me any
good. thanks again!
 
Faith Greenwood

  Possibly the problem is in your XPath expression:

my $search="//entry/set[nodeA/text()='A']/page/text()";

  Should be:

my $search="//entry/ancestor::set[nodeA/text()='A']/page/text()";

  Or also:

my $search="//entry/set[nodeA/text()='A']/../page/text()";

  Should work...

  Take a look at http://www.w3.org/TR/xpath/ as a reference.
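
For anyone trying these, a quick check against the sample XML is worth doing before relying on either alternative. The sketch below is not part of the original reply: it inlines the sample data (an assumption, instead of reading xml.xml) and counts the matches for each of the three expressions. Since <set> is a child of <entry> rather than an ancestor, and <page> is a child of <set> rather than of <entry>, the two alternatives appear to select nothing here.

```perl
use strict;
use warnings;
use XML::LibXML;

# Sample data from the question, inlined so the check is self-contained.
my $xml = <<'XML';
<entry>
<set><nodeA>A</nodeA><book><name>Near Dead</name><date>1992</date></book><page>1262</page></set>
<set><nodeA>A</nodeA><book><name>Alive and Well</name><date>1973</date></book><page></page></set>
<set><nodeA>A</nodeA><book><name>Still Kicking</name><date>1968</date></book><page>1598</page></set>
</entry>
XML

my $doc = XML::LibXML->new->parse_string($xml);

# Count how many nodes each expression selects.
my %count;
for my $xpath (
    "//entry/set[nodeA/text()='A']/page/text()",
    "//entry/ancestor::set[nodeA/text()='A']/page/text()",
    "//entry/set[nodeA/text()='A']/../page/text()",
) {
    my @hits = $doc->findnodes($xpath);
    $count{$xpath} = scalar @hits;
    printf "%d match(es): %s\n", scalar @hits, $xpath;
}
```

Only the original expression finds the two non-empty page text nodes, so the XPath itself was not the problem.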


Best regards,
- --
| Daniel Molina <dmw [at] coder [dot] cl> |
| IT Consulting & Software Development    |
| Phone: +56 2 9790277 |http://coder.cl/|

thx, but my problem isn't with returning simply the page. I need to be
able to return the page, date, and the book in some sort of hash so as
to make sure that the pages and dates returned correspond w/ the right
book.
 
Faith Greenwood

Faith Greenwood said:
How do I return multiple nodes in an xpath search using LibXML?

I don't know, but I don't think that you need to...
<entry>
<set>
<nodeA>A</nodeA>
<book><name>Near Dead</name><date>1992</date></book>
<page>1262</page>
</set>
my $search="//entry/set[nodeA/text()='A']/page/text()";
When I print out @array, I get "1262 1598", as expected. The problem
is I don't know the book that 1598 belongs to.

Q: At what level DO you know that a <book> is associated with a <page> ?

A: At the <set> level.

Faith Greenwood said:
Does 1598 belong to
"Still Kicking" or "Alive and Well"? Further, I need to keep track of
the page AND the date AND the author, without mixing any of them up.

Then you want to search for <set> elements rather than for
the text inside of <page> elements.

Once you have a set of <set>s, loop over them to extract
the further information that you need.
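
The suggestion above can be sketched roughly as follows. This is only one way to do it, not something from the original reply: the sample XML is inlined for self-containment, and the @records array and its name/date/page hash keys are made-up names for illustration. Each <set> becomes one hash, so the page, date and book name stay together.

```perl
use strict;
use warnings;
use XML::LibXML;

# Sample data from the question, inlined so the sketch is self-contained.
my $xml = <<'XML';
<entry>
<set>
<nodeA>A</nodeA>
<book><name>Near Dead</name><date>1992</date></book>
<page>1262</page>
</set>
<set>
<nodeA>A</nodeA>
<book><name>Alive and Well</name><date>1973</date></book>
<page></page>
</set>
<set>
<nodeA>A</nodeA>
<book><name>Still Kicking</name><date>1968</date></book>
<page>1598</page>
</set>
</entry>
XML

my $doc = XML::LibXML->new->parse_string($xml);

# Find the <set> elements first, then query relative to each one.
my @records;    # hypothetical name: one hash per matching <set>
for my $set ($doc->findnodes("//entry/set[nodeA/text()='A']")) {
    push @records, {
        name => $set->findvalue('book/name'),
        date => $set->findvalue('book/date'),
        page => $set->findvalue('page'),    # empty string if <page> is empty
    };
}

for my $r (@records) {
    printf "%s (%s): page %s\n", $r->{name}, $r->{date},
        length($r->{page}) ? $r->{page} : '(none)';
}
```

Because the XPath passed to findvalue is relative to each <set>, a page can never be paired with a book from a different set, and the set with the empty <page> is still reported instead of silently dropped.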

Gotcha, thanks. Is there a way to return the tag name? I want to ensure
I don't get the pages and dates mixed up: 1975 could be either a page
or a date, for instance.
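
On the tag name question, one possibility (a sketch, not an answer given in the thread) is to select the elements themselves rather than their text() children and then call nodeName on each match. The trimmed one-set XML string and the @pairs array are assumptions made for illustration.

```perl
use strict;
use warnings;
use XML::LibXML;

# A single <set> from the question, inlined to keep the sketch short.
my $xml = '<entry><set><nodeA>A</nodeA>'
        . '<book><name>Still Kicking</name><date>1968</date></book>'
        . '<page>1598</page></set></entry>';
my $doc = XML::LibXML->new->parse_string($xml);

# Select the <date> and <page> elements (not their text nodes),
# so each match still carries its own tag name.
my @pairs;    # hypothetical name: [tag name, text] per matched element
for my $node ($doc->findnodes('//set/book/date | //set/page')) {
    push @pairs, [ $node->nodeName, $node->textContent ];
}

print "$_->[0] = $_->[1]\n" for @pairs;
```

With the tag name stored next to each value, a number like 1975 can be told apart as a date or a page without guessing.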
 
