saxon xsl: handle non escaped input?

Johannes Busse

Hello NG,

I'm struggling with the following problem. I think it can be
solved quite easily (in fact it should be a FAQ), but it
seems that I cannot solve it myself :-(


my source looks like this:

<?xml version="1.0" encoding="utf-8"?>
<database>
  <webpage id="0815" >
    <title>some title</title>
    <url>XXXX</url>
  </webpage>
</database>


and I want to have some output like this:

<a href="XXXX">some title</a>


with XXXX like

http://somwhere.nirwana.net/form?__s=2&dsc=anew/...

you can see: XXXX has a lot of non escaped characters.


My XSLT-approach:


<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <xsl:template match="/">
    <html>
      <body>
        <xsl:apply-templates select="//webpage" />
      </body>
    </html>
  </xsl:template>

  <xsl:template match="webpage">
    <p>
      <a>
        <xsl:attribute name="href"
                       saxon:disable-output-escaping="yes"
                       xmlns:saxon="http://icl.com/saxon">
          <xsl:copy-of select="url" />
        </xsl:attribute>
        <xsl:text>: </xsl:text>
        <xsl:copy-of select="title" />
      </a>
    </p>
  </xsl:template>

</xsl:stylesheet>


Saxon complains with the following message:

nowhere@web:test> saxon url.xml url.xsl

Error on line 6 column 57 of file url.xml:
Error reported by XML parser: unexpected character
after entity reference (found "=") (expected ";")
Transformation failed: Run-time errors were reported

How can I copy these input fields?



thanks!
Johannes
www.jbusse.de
 
David Carlisle

> with XXXX like
>
> http://somwhere.nirwana.net/form?__s=2&dsc=anew/...
>
> you can see: XXXX has a lot of non escaped characters.

No, it only has one unescaped character. disable-output-escaping is
about XML &-escaping, not URI %-escaping.

So the only unescaped character is the &, and you don't want that to
appear as a bare & in either HTML or XHTML: it is an error in both cases
(although browsers may silently ignore the error). The & needs to be
&amp; both in the input to your stylesheet and in the output.



> Saxon complains with the following message:
>
> nowhere@web:test> saxon url.xml url.xsl
>
> Error on line 6 column 57 of file url.xml:
> Error reported by XML parser: unexpected character

In any case, your input files need to be well-formed XML, so the & needs to
be &amp; there.
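
For illustration, a well-formed version of the input from the original post might look like the sketch below. The only change that matters is &amp; in place of the bare &; the URL, including its elided tail (...), is copied from the question as-is.

```xml
<?xml version="1.0" encoding="utf-8"?>
<database>
  <webpage id="0815">
    <title>some title</title>
    <!-- the bare & is written as &amp; so the parser no longer
         expects an entity reference named "dsc" at this point -->
    <url>http://somwhere.nirwana.net/form?__s=2&amp;dsc=anew/...</url>
  </webpage>
</database>
```

After parsing, the text of the url element contains a single plain & character again; the &amp; exists only in the serialized file.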

David
 
Johannes Busse

Hi

> No it only has one unescaped character disable-output-escaping is
> talking about XML &-escaping, not URI %-escaping.

this was helpful, thank you: now I know that there is a
debate about URI-escaping, and that there is a specific
limitation in the function disable-output-escaping.

But is this my problem? I am concerned with the input.
Let me try a new formulation of the problem:
if I test my string
form?__s=2&dsc=anew/
with
http://webcoder.info/reference/URIEsc.html

I get the other representation
form%3F__s%3D2%26dsc%3Danew

so, webcoder thinks, this is meaningful usage of the term
"escape".
Is there some possibility to tell Saxon to
perform this sort of encoding on some elements
*instead* of complaining about the syntax?

> So the only unescaped character is the & and you don't want that to
> appear as an & in either HTML or XHTML it is an error in both cases
> (although browsers may silently ignore the error) the & needs to be
> &amp; both in the input to your stylesheet and in the output.

no: your advice would mean to write
form?__s=2&amp;dsc=anew/
          -----
in the output; this seems to be very odd to me?

No, the problem concerns the input:

> Saxon complains with the following message:
>
> nowhere@web:test> saxon url.xml url.xsl
>
> Error on line 6 column 57 of file url.xml:
> Error reported by XML parser: unexpected character
> after entity reference (found "=") (expected ";")

The last line is very important: Saxon (6.5) thinks there
is an entity reference. What I want to tell it is that
there is some CDATA, without the need to provide a full
DTD or schema.

johannes
 
Johannes Busse

The problem was solved just one thread ahead, some
hours after my posting (see "embedding xml in xml as non-xml").

My problem was:

> my source looks like this:
> <webpage id="0815" >
> <title>some title</title>
> <url>XXXX</url>
> </webpage>
> and I want to have some output like this:
> <a href="XXXX">some title</a>
> with XXXX like http://somwhere.nirwana.net/form?__s=2&dsc=anew/...
> you can see: XXXX has a lot of non escaped characters.
> Saxon complains with the following message:
> Error ... file url.xml:
> Error reported by XML parser: unexpected character
> after entity reference (found "=") (expected ";")
> Transformation failed: Run-time errors were reported
> How can I copy this url-input-fields?

As soon as the element url contains the character "=", XML assumes
an entity reference. My input was in fact not well-formed
XML data.

Solution: mark parts of the input as CDATA like this:

<webpage id="0815" >
  <title>some title</title>
  <url><![CDATA[http://somwhere.nirwana.net/form?__s=2&dsc=anew/...]]></url>
</webpage>

and everything is OK.

(Well, i knew the cdata concept within a DTD;
but I didn't know how to use it in the input-stream).

Thank you for help!
Johannes
 
Kenneth Stephen

Johannes Busse wrote:

> As soon as the element url contains the character "=", XML assumes
> an entity reference. My input was in fact not well-formed
> XML data.
>
> Solution: mark parts of the input to be cdata like this:
>
> <webpage id="0815" >
> <title>some title</title>
> <url><![CDATA[http://somwhere.nirwana.net/form?__s=2&dsc=anew/...]]></url>
> </webpage>

Johannes,

No, you've misunderstood David's explanation. The '=' is not the
problem. It is the fact that you have an &dsc in the URL (instead of
&dsc;) that is causing the problem. The error message is telling you
just that: the parser found a '=' where it was expecting a ';'. This is
because whenever XML parsers encounter an '&', they assume it is the
start of an entity reference. For your URL to be correct, you would
indeed have to change the &dsc to &amp;dsc. If that seems odd to you,
it is only because you haven't fully understood the XML spec yet.
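
A minimal sketch of how the stylesheet could then look, assuming the input has been made well-formed with &amp; as described above. It uses an attribute value template ({url}) and xsl:value-of in place of xsl:attribute with xsl:copy-of, and it drops the saxon:disable-output-escaping attribute. The xsl:output declaration is an addition that is not in the original stylesheet.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <!-- assumption: HTML serialization is wanted; this declaration
       is not part of the original stylesheet -->
  <xsl:output method="html"/>

  <xsl:template match="/">
    <html>
      <body>
        <xsl:apply-templates select="//webpage"/>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="webpage">
    <p>
      <!-- {url} is an attribute value template: it inserts the
           string value of the url child element -->
      <a href="{url}">
        <xsl:value-of select="title"/>
      </a>
    </p>
  </xsl:template>

</xsl:stylesheet>
```

With the html output method the href in the result should be serialized with &amp; again, which a browser reads back as a plain & when it follows the link.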

Regards,
Kenneth
 
