ahogue at theory dot lcs dot mit dot edu
Hello -
Is there any way to match complex subtree patterns with XPath? The
functions I see all seem to match along a single path from root to leaf.
I would like to match full subtrees.
For example, given the XHTML:
<html>
<body>
<p>
<a>#text</a>
<br/>
#text
<b>#text</b>
#text
<br/>
<font>
<a>#text</a>
</font>
</p>
<p>
<a>#text</a>
<br/>
#text
<br/>
<font>
<a>#text</a>
</font>
</p>
</body>
</html>
I would like to construct a "pattern" using XPath to match all subtrees
like:
<p>
<a>*</a>
<br/>
*
(<b>*</b>)?
(*)?
<br/>
<font>
<a>*</a>
</font>
</p>
where the "*" means that any text can be matched, and the "?" means that
0 or 1 instances of the item may be matched, similar to a regular
expression.
Is there an easy way to do this kind of "subtree pattern matching" in
XPath? Would I be better off writing a wrapper over XPath and using
several XPath queries to represent and retrieve my pattern?
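To make the wrapper idea concrete, here is a rough sketch of what I have in mind, in Python with only the standard library. It does not use XPath for the matching itself: it flattens each <p> subtree into a "shape" string and tests that string against an ordinary regular expression. The names signature, PATTERN and matching_subtrees are hypothetical, and the sketch assumes the document has no namespaces and that "*" means non-blank text.

```python
import re
import xml.etree.ElementTree as ET


def signature(elem):
    """Serialize the shape of a subtree: tag names, plus '#text' for non-blank text."""
    parts = []
    if elem.text and elem.text.strip():
        parts.append("#text")
    for child in elem:
        parts.append(signature(child))
        # Text that follows a child element is stored as that child's tail.
        if child.tail and child.tail.strip():
            parts.append("#text")
    inner = " ".join(parts)
    return f"{elem.tag}({inner})" if inner else elem.tag


# Hypothetical encoding of the pattern above; "( ...)?" marks the optional items.
PATTERN = re.compile(
    r"p\(a\(#text\) br #text( b\(#text\))?( #text)? br font\(a\(#text\)\)\)"
)


def matching_subtrees(root):
    """Return every <p> element whose whole subtree fits PATTERN."""
    return [p for p in root.iter("p") if PATTERN.fullmatch(signature(p))]
```

Calling matching_subtrees(ET.fromstring(xhtml_string)) would return the matching <p> elements. XPath (for example //p) could still be used to pick the candidate nodes, with the shape check applied to each candidate afterwards.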
Thanks in advance,
Andrew Hogue