xpath to get the first sibling

yawnmoth · May 22, 2009

<div>
<strong>Name: </strong> John Doe<br />
<strong>Phone Number: </strong> 111-111-1111<br />
<strong>Address: </strong> 111 Anywhere St.
</div>

Say that's my HTML. I can get the node whose value is "Name: " with
"//strong[.='Name: ']", but how do I get " John Doe"?
<http://www.w3schools.com/XPath/xpath_axes.asp> mentions "following-sibling",
but neither "//strong[.='Name: ']//following-sibling" nor
"//strong[.='Name: ']//following-sibling::*" yields any results.

Any ideas?

Joe Kesselman · May 22, 2009

yawnmoth said:
<div>
<strong>Name: </strong> John Doe<br />
<strong>Phone Number: </strong> 111-111-1111<br />
<strong>Address: </strong> 111 Anywhere St.
</div>

Say that's my HTML. I can get the node whose value is "Name: " with
"//strong[.='Name: ']", but how do I get " John Doe"?

* matches the principal node type of the axis, which is elements for
every axis except attribute and namespace. If you want it to match a
text node, you need to say so.

What you're looking for is:
//strong[.='Name: ']/following-sibling::text()[1]
... the first text node after the <strong> whose value is "Name: ".
(Note that if you don't specify [1], XPath will return all the following
sibling text nodes -- the text after each <br/> and each later <strong>,
up to the </div>. The <br/> and <strong> elements themselves are not
returned, since text() matches only text nodes.)
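
As a rough check of the expression above, here is a minimal Python sketch. It assumes the third-party lxml library is installed and that the snippet from the question is parsed as well-formed XML; the variable names are illustrative only and do not come from the thread.

```python
from lxml import etree  # assumption: third-party lxml is installed

# The markup from the question, parsed as XML (it happens to be well-formed).
SNIPPET = """<div>
<strong>Name: </strong> John Doe<br />
<strong>Phone Number: </strong> 111-111-1111<br />
<strong>Address: </strong> 111 Anywhere St.
</div>"""

root = etree.fromstring(SNIPPET)

# With [1]: only the first text node that follows the matching <strong>.
first_text = root.xpath("//strong[.='Name: ']/following-sibling::text()[1]")

# Without [1]: every following-sibling text node, whitespace-only ones included.
all_text = root.xpath("//strong[.='Name: ']/following-sibling::text()")

# With * instead of text(): following-sibling elements only, never text nodes.
elements = root.xpath("//strong[.='Name: ']/following-sibling::*")
element_tags = [el.tag for el in elements]

print(first_text)
print(len(all_text))
print(element_tags)
```

In this sketch first_text holds a single string with its leading space intact, while element_tags lists only the <br/> and <strong> siblings, which is why the * form never reaches " John Doe".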

Mixed content tends to be a pain to work with in XML and XSLT. You may
want to consider structuring this more semantically, eg as a two-column
table.

Peter Flynn · May 23, 2009

Joe said:
<snip>

Mixed content tends to be a pain to work with in XML and XSLT. You may
want to consider structuring this more semantically, eg as a two-column
table.

Alternative: use
<span class="label">Name:</span> <span class="name">John Doe</span>
etc
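
To illustrate why markup like this is easier to query, here is a small Python sketch. It assumes lxml is installed and uses only the one row Peter wrote out; the variable names are made up for the example.

```python
from lxml import etree  # assumption: third-party lxml is installed

# Only the row given in the post; further rows would follow the same pattern.
SNIPPET = """<div>
<span class="label">Name:</span> <span class="name">John Doe</span>
</div>"""

root = etree.fromstring(SNIPPET)

# The value lives in its own element, so no sibling axis is needed.
name = root.xpath("string(//span[@class='name'])")

print(name)
```

With the value wrapped in its own element, the query no longer depends on where the surrounding whitespace or <br/> elements happen to fall.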

The abuse of <strong> emphasis is one of the unfortunate results of the
politically-correct sanitisation of HTML undertaken by the W3C.

Adding leading and trailing spaces to text nodes in mixed content just
to make it look pretty is usually A Bad Idea if the document is going to
be reprocessed.
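
When the padding cannot be removed at the source, XPath 1.0's normalize-space() function is one way to cope with it. The sketch below is only an illustration, assuming lxml is installed and reusing the markup from the original question; the variable names are hypothetical.

```python
from lxml import etree  # assumption: third-party lxml is installed

SNIPPET = """<div>
<strong>Name: </strong> John Doe<br />
<strong>Phone Number: </strong> 111-111-1111<br />
<strong>Address: </strong> 111 Anywhere St.
</div>"""

root = etree.fromstring(SNIPPET)

# normalize-space() strips leading/trailing whitespace and collapses runs of it.
name = root.xpath(
    "normalize-space(//strong[.='Name: ']/following-sibling::text()[1])"
)

# It can also make the label match independent of the trailing space.
address = root.xpath(
    "normalize-space("
    "//strong[normalize-space(.)='Address:']/following-sibling::text()[1])"
)

print(name)
print(address)
```

Matching on normalize-space(.)='Address:' rather than .='Address: ' means the query keeps working if someone later tidies the whitespace inside the <strong>.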

However, if this is someone else's document, you just have to deal with it.

///Peter

yawnmoth · May 24, 2009

<snip>
However, if this is someone else's document, you just have to deal with it.

That's what it is. If I was parsing my own HTML, I'd just stick with
what I know and wouldn't have put myself in a position to work with
something I don't.

Of course, by doing that, I'd also be missing an opportunity to learn
something new, so I can't be too annoyed with it, heh.

Thanks, Joe and Peter!

fulvius · Sep 10, 2011

Joe Kesselman's syntax above worked for me once I removed the parentheses, as in:
//strong[.='Name: ']/following-sibling::text[1]

So to select the very next following sibling of any tag type, .../following-sibling::*[1]/... could be used. Or, for instance, if you wanted to select not the very next div sibling but the one after that, then .../following-sibling::div[2]/... should work.
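
Behaviour here may depend on the tool. In standard XPath 1.0, text() is a node test for text nodes, whereas text with no parentheses is a name test for elements called <text>, so a conforming engine would not be expected to return " John Doe" for the second form. The sketch below compares the variants using lxml (assumed to be installed; names are illustrative). It uses strong[2] in place of div[2] because the sample markup has no sibling divs.

```python
from lxml import etree  # assumption: third-party lxml is installed

SNIPPET = """<div>
<strong>Name: </strong> John Doe<br />
<strong>Phone Number: </strong> 111-111-1111<br />
<strong>Address: </strong> 111 Anywhere St.
</div>"""

root = etree.fromstring(SNIPPET)
base = "//strong[.='Name: ']/following-sibling::"

# text() selects text nodes; text (no parentheses) looks for <text> elements.
with_parens = root.xpath(base + "text()[1]")
without_parens = root.xpath(base + "text[1]")

# *[1] is the very next sibling element of any name.
next_element = root.xpath(base + "*[1]")

# strong[2] skips the nearest following <strong> and takes the one after it.
second_strong = root.xpath(base + "strong[2]")

print(with_parens, without_parens)
print(next_element[0].tag, second_strong[0].text)
```

In lxml the form without parentheses comes back empty for this markup, so if it returned the name elsewhere, that is likely specific to the tool in use.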
