Dave Matthews
Hi folks,
I'm writing a web-page editing tool for my company which will allow
staff (with no "technical" expertise) to maintain their own Intranet sites.
The content for each webpage is stored in the form of XHTML in an XML
document (which, in turn, is stored in an XML database). So far so good.
However, the editing tool must allow users to paste in the contents of MS
Word documents. I soon discovered that Word does not generate well-formed
HTML; the main problem is that tags that should be nested often "overlap"
(as my example below shows). My solution is to store this "bad" data in
CDATA sections, so that it cannot break the well-formedness of the finished
XML document. My finished XML document looks something like this:
<page id="0001">
  <content>
    <p>
      <i>
        <font face="Arial">Properly-formed HTML</font>
      </i>
    </p>
    <![CDATA[<p><i><font face="Arial">The 'i' and 'font' end-tags are
wrong and there is no end-tag for 'p'</i></font>]]>
    <p>
      <i>
        <font face="Arial">This is OK.</font>
      </i>
    </p>
  </content>
</page>
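The wrapping step can be sketched roughly as follows. This is only an illustration in Python, not the editing tool's actual code: the function name `wrap_if_malformed` is made up, and it assumes a pasted fragment counts as "good" whenever a standard XML parser accepts it.

```python
import xml.etree.ElementTree as ET


def wrap_if_malformed(fragment: str) -> str:
    """Return the fragment unchanged if it parses as XML, else wrap it in CDATA.

    Hypothetical helper for illustration only.
    """
    try:
        # A dummy root lets a fragment with several top-level elements parse.
        ET.fromstring(f"<root>{fragment}</root>")
        return fragment
    except ET.ParseError:
        # A literal "]]>" would end the CDATA section early, so split it
        # across two adjacent CDATA sections.
        safe = fragment.replace("]]>", "]]]]><![CDATA[>")
        return "<![CDATA[" + safe + "]]>"


good = '<p><i><font face="Arial">Properly-formed HTML</font></i></p>'
bad = '<p><i><font face="Arial">Overlapped end-tags</i></font>'

print(wrap_if_malformed(good))
print(wrap_if_malformed(bad))
```

The well-formed fragment comes back untouched, while the overlapped one is returned inside a CDATA section and can be placed under <content> without breaking the document.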
On retrieving a document for formatting and display within the client
browser, my XSL template for the <content> nodes needs to be able to detect
whether each of its children is proper XML (and should therefore be
transformed into HTML) or a CDATA section whose contents should simply
be passed straight to the browser. So my template needs to look something
like this:
<xsl:template match="content">
  <xsl:for-each select="*">
    <xsl:choose>
      <xsl:when test="nodetype(.)=cdata()">
        <xsl:value-of select="."/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:apply-templates/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:for-each>
</xsl:template>
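To make the intended behaviour concrete, here is a rough sketch of the same dispatch written outside XSLT, in Python with the standard `xml.dom.minidom` parser. It is an illustration under assumptions, not a solution for the XSLT engine: the function name `render_content` is made up, and it relies on a DOM parser exposing CDATA sections as their own node type, which the XPath 1.0 data model used by XSLT 1.0 does not do (there, CDATA content is just text).

```python
from xml.dom import Node, minidom


def render_content(page_xml: str) -> str:
    """Pass CDATA children of <content> through raw; serialise element children.

    Hypothetical helper; child.toxml() merely stands in for "apply templates".
    """
    doc = minidom.parseString(page_xml)
    content = doc.getElementsByTagName("content")[0]
    out = []
    for child in content.childNodes:
        if child.nodeType == Node.CDATA_SECTION_NODE:
            # The "bad" markup goes to the browser exactly as stored.
            out.append(child.data)
        elif child.nodeType == Node.ELEMENT_NODE:
            # Proper XML would be transformed here; this sketch only serialises it.
            out.append(child.toxml())
        # Plain text nodes (e.g. whitespace between children) are skipped.
    return "".join(out)


page = (
    '<page id="0001"><content>'
    "<p><i>This is OK.</i></p>"
    "<![CDATA[<p><i>Overlapped</p></i>]]>"
    "</content></page>"
)
print(render_content(page))
```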
Of course it's the fourth line of code - <xsl:when
test="nodetype(.)=cdata()"> - that is giving me problems. Unfortunately I am
stuck with a fairly basic XSLT engine that has none of the fancy additional
functions MSXML, SAXON or Xalan offer. Try as I might, I can't find a way of
getting XSLT to tell when it's dealing with a CDATA section.
(I could simply hold everything as CDATA, but in the future I am going to
have to interface with other systems that will demand that as much content
as possible be presented as proper XML/XHTML.)
Any ideas would be very much appreciated!
--
Many thanks in advance!
Dave Matthews
'New Avengers' and 'Professionals' sites at:
http://www.mark-1.co.uk