How do I match and replace text with XSL?

Alois Treindl

A simple XSL question from a newbie:

In an xml document which I transform via xsl into html output, I have
some text which I want to be suppressed.

The tags look like this:
<anchor_ref name="#B4">I. Introduction - page 4 </anchor_ref>
<anchor_ref name="#B4">II. Childhood - page 24 </anchor_ref>
<anchor_ref name="#B4">I. Later - page 42 </anchor_ref>

I want to define an XSL rule which gets rid of the page numbers, which
make no sense in the HTML version.

I.e. anything fitting the pattern ' - page NN '
where NN is a single or double digit number should be replaced by nothing.

How would this XSL rule look?

A later complication will be that the word 'page' can also appear in
other languages, e.g. 'Seite 4', 'pagina 4' etc.
 
Martin Honnen

XSLT 1.0/XPath 1.0 are not very powerful when it comes to string
manipulation, string matching, or string replacement.
You could write an XSLT template
<xsl:template match="text()[contains(., 'page')]">
to match text nodes which contain the string 'page', but there is no
way to do regular expression pattern matching for the one or two digits
after 'page'.
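As a rough sketch of how far plain XSLT 1.0 string functions get you for the sample data above: the stylesheet below assumes the marker is always the literal ' - page ' and that only digits and spaces follow it. It does not do real pattern matching; it emulates the digit check with translate() and then keeps what comes before the marker.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Match text inside anchor_ref that contains the literal marker
       ' - page ' followed only by digits and spaces.
       translate() deletes digits and spaces from the remainder; if
       nothing is left, the remainder was just a page number. -->
  <xsl:template match="anchor_ref/text()[contains(., ' - page ')
      and translate(substring-after(., ' - page '), '0123456789 ', '') = '']">
    <!-- Output only the part before the marker -->
    <xsl:value-of select="substring-before(., ' - page ')"/>
  </xsl:template>

</xsl:stylesheet>
```

All other text nodes fall through to the built-in rules (or to whatever templates you already have) and are output unchanged. For 'Seite' or 'pagina' you would need one such template per word, since XSLT 1.0 does not allow variables in match patterns.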


So unless you can use XSLT/XPath 2.0, which has regular expression
support, you have a lot of code to write in XSLT/XPath 1.0. It might help
to (re)use existing solutions for string replacement; see the replace
and tokenize solutions in EXSLT:
<http://www.exslt.org/str/index.html>
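For comparison, here is a minimal sketch of what the rule could look like with XSLT 2.0, using the XPath 2.0 replace() function. It needs an XSLT 2.0 processor, and the list of words (page, Seite, pagina) and the assumption that the page reference sits at the end of the text are taken from the examples in the question, so adjust the pattern to your real data.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- For every text node inside anchor_ref, strip a trailing
       ' - page N' or ' - page NN' (optionally followed by one space).
       The alternation covers the other languages mentioned. -->
  <xsl:template match="anchor_ref/text()">
    <xsl:value-of
        select="replace(., ' - (page|Seite|pagina) [0-9]{1,2} ?$', '')"/>
  </xsl:template>

</xsl:stylesheet>
```

Text nodes without a page reference pass through replace() unchanged, so the template can safely match all text inside anchor_ref.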
 
Alois Treindl

I use xsltproc, which says:
Using libxml 20510, libxslt 10033 and libexslt 722
xsltproc was compiled against libxml 20510, libxslt 10033 and libexslt 722
libxslt 10033 was compiled against libxml 20510
libexslt 722 was compiled against libxml 20510

I don't know whether this is XSLT/XPath 1.0 or 2.0.

If it is 2.0, I would of course be very happy to get explicit xsl rules.

So far, we have built a crutch and do the filtering with good old sed.
 
