Christian Roth
Hello,
When using this "identity" processing sheet:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" encoding="iso-8859-1" />
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
on this XML instance document:
<?xml version="1.0" encoding="iso-8859-1" ?>
<element attr="a&#9;tab" />
the result is:
<?xml version="1.0" encoding="iso-8859-1"?>
<element attr="a tab"/>
                ^^
Tabulator(0x9)--^^
That is, the numeric character reference from the input document is not
recreated at serialization time; the literal character, a tab, is
written in its place.
Unfortunately, this means that re-applying the identity stylesheet from
above to this result replaces the tab character with a single space,
according to the Attribute-Value Normalization rules
(<http://www.w3.org/TR/REC-xml#AVNormalize>):
<?xml version="1.0" encoding="iso-8859-1"?>
<element attr="a tab"/>
                ^
Space(0x20)-----^
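This normalization can be reproduced without any XSLT processor. The
following is a minimal sketch, assuming Python's standard-library
xml.etree.ElementTree (an expat-based parser) as a stand-in for the
parser used with Xalan; the helper name attr_value is made up for
illustration.

```python
import xml.etree.ElementTree as ET


def attr_value(xml_text):
    """Parse a one-element document and return the value of its 'attr' attribute."""
    return ET.fromstring(xml_text).get("attr")


# Tab written as a numeric character reference: the parser hands back a real tab.
from_reference = attr_value('<element attr="a&#9;tab"/>')

# Tab written as a literal character (the \t below is a real tab in the XML text):
# attribute-value normalization turns it into a single space.
from_literal = attr_value('<element attr="a\ttab"/>')

print(repr(from_reference))
print(repr(from_literal))
```

The two documents differ only in how the tab is written, yet they yield
different attribute values after parsing.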
In short: the above "identity" processing sheet does not deliver a
semantically identical document. If it did, the tab character in the
attribute value would have to be written as a numeric character
reference, so that a later parser would recreate the tab character in
the attribute value instead of normalizing it away to a single space.
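What a round-trip-safe serializer would have to do can be sketched as
follows. This is an illustration only, not Xalan's actual serializer
code: escape_attr is a hypothetical helper, written in Python for
brevity, that writes tab, line feed and carriage return as numeric
character references in addition to the usual markup escapes.

```python
import xml.etree.ElementTree as ET

# Characters that should not appear literally in a double-quoted attribute
# value, either because they are markup or because a parser would
# normalize them to a space.
_ATTR_ESCAPES = {
    "&": "&amp;",
    "<": "&lt;",
    '"': "&quot;",
    "\t": "&#9;",
    "\n": "&#10;",
    "\r": "&#13;",
}


def escape_attr(value):
    """Escape an attribute value so it survives a serialize/parse round trip."""
    # Mapping character by character avoids escaping the '&' of an
    # already-produced reference a second time.
    return "".join(_ATTR_ESCAPES.get(ch, ch) for ch in value)


original = "a\ttab"
serialized = '<element attr="%s"/>' % escape_attr(original)
reparsed = ET.fromstring(serialized).get("attr")

print(serialized)
print(reparsed == original)
```

With the tab written as a character reference, parsing the serialized
document again gives back the original attribute value.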
I'm using the Xalan J2 2.5D1 XSLT processor. Is this a bug in that
implementation (or rather in its XML serializer)?
Regards,
Christian