XSL: automatically wrapping text in <p>


Sam Quigley

Hi,

I want to use XSL to wrap paragraphs of text in <p> tags
automatically, basically using the heuristic that two chunks of text
separated by 2 consecutive newlines are to be treated as two <p>s...
I do this right now by putting the input xml (the text to be wrapped)
in a <paraset> tag, and using the following XSL:

<xsl:template match="paraset">
<xsl:call-template name="wrapinp">
<xsl:with-param name="input">
<xsl:value-of select="."/>
</xsl:with-param>
</xsl:call-template>
</xsl:template>

<xsl:template name="wrapinp">
<xsl:param name="input"/>
<xsl:choose>
<!-- if the input string contains two consecutive newlines -->
<xsl:when test="contains($input,'&#10;&#10;')">
<!-- wrap first part in a <p> -->
<p>
<xsl:value-of select="substring-before($input,'&#10;&#10;')"/>
</p>
<!-- recurse into second part to find further paras -->
<xsl:call-template name="wrapinp">
<xsl:with-param name="input">
<xsl:value-of select="substring-after($input,'&#10;&#10;')"/>
</xsl:with-param>
</xsl:call-template>
</xsl:when>
<!-- string does not contain consecutive newlines, so just wrap it -->
<xsl:otherwise>
<p>
<xsl:value-of select="$input"/>
</p>
</xsl:otherwise>
</xsl:choose>
</xsl:template>

This works nicely, except for one hitch: tags within the
paragraphs don't get interpreted. So, if my input looks like

<paraset>
blah blah <emph>HEY</emph> blah

foo foo <emph>HO</emph> foo
</paraset>

I get as output

<p>blah blah HEY blah</p>
<p>foo foo HO foo</p>

despite an XSL template that turns <emph>s into <i>s:

<xsl:template name="emphasis" match="emph">
<i>
<xsl:apply-templates />
</i>
</xsl:template>

What am I doing wrong? I'm relatively new to XSL, so perhaps I'm just
getting my apply-templates and value-ofs confused -- but any help
would be appreciated...

Also, if there's some magic (not involving XSLT 2.0) that would allow
me to loosen the paragraph heuristic to allow optional whitespace
between the newlines, I'd love to hear about it.

Thanks,
-sq
(also: please reply directly via email if possible)
 

Martin Honnen

Sam Quigley wrote:

I want to use XSL to wrap paragraphs of text in <p> tags
automatically, basically using the heuristic that two chunks of text
separated by 2 consecutive newlines are to be treated as two <p>s...
I do this right now by putting the input xml (the text to be wrapped)
in a <paraset> tag, and using the following XSL:

<xsl:template match="paraset">
<xsl:call-template name="wrapinp">
<xsl:with-param name="input">
<xsl:value-of select="."/>

Here you pass the value of the <paraset> element to the named template
wrapinp. That throws away any structure the <paraset> content has and
reduces it to its string value, so by the time wrapinp runs there are no
<emph> elements left for your emph template to match.
If you want to process <emph> elements, for instance, you need to
process the text nodes and element nodes in <paraset> recursively,
e.g. with <xsl:apply-templates />.
Of course you also want to split the text at newlines, and that is hard
in XSLT when there are child elements as well. If possible, fix the
original markup so that it uses elements, not newlines, to indicate the
paragraph structure.
I realize I have only explained what goes wrong in your current solution
without providing a better approach in XSLT. I would start looking at
http://www.dpawson.co.uk/xsl/sect2/sect21.html
for a solution.
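
For what it's worth, here is a minimal two-pass sketch of one way to keep the child elements while still splitting on blank lines in XSLT 1.0. It is not taken from the page linked above. It assumes your processor supports the EXSLT exsl:node-set() extension function (libxslt, Xalan and Saxon 6 do; MSXML has msxsl:node-set() instead), and the <parabreak/> marker element, the "mark" mode and the "mark-breaks" template are names made up for illustration. Pass 1 copies the children of <paraset> and drops a marker wherever a text node contains two consecutive newlines; pass 2 wraps each run of nodes between markers in a <p> and applies templates to it, so the emph template fires.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    exclude-result-prefixes="exsl">

  <xsl:template match="paraset">
    <!-- Pass 1: build a temporary tree that starts with a marker and has
         one more marker at every blank line (parabreak is a made-up name) -->
    <xsl:variable name="marked">
      <parabreak/>
      <xsl:apply-templates mode="mark"/>
    </xsl:variable>
    <!-- Pass 2: each marker owns the nodes up to the next marker -->
    <xsl:for-each select="exsl:node-set($marked)/parabreak">
      <xsl:variable name="content"
          select="following-sibling::node()[not(self::parabreak)]
                  [generate-id(preceding-sibling::parabreak[1])
                   = generate-id(current())]"/>
      <!-- skip runs that hold nothing but whitespace -->
      <xsl:if test="$content[self::* or normalize-space()]">
        <p>
          <xsl:apply-templates select="$content"/>
        </p>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>

  <!-- Pass 1: child elements are copied unchanged -->
  <xsl:template match="*" mode="mark">
    <xsl:copy-of select="."/>
  </xsl:template>

  <!-- Pass 1: text nodes are scanned for blank lines -->
  <xsl:template match="text()" mode="mark">
    <xsl:call-template name="mark-breaks">
      <xsl:with-param name="text" select="."/>
    </xsl:call-template>
  </xsl:template>

  <!-- Recursively replace each pair of newlines with a marker element -->
  <xsl:template name="mark-breaks">
    <xsl:param name="text"/>
    <xsl:choose>
      <xsl:when test="contains($text, '&#10;&#10;')">
        <xsl:value-of select="substring-before($text, '&#10;&#10;')"/>
        <parabreak/>
        <xsl:call-template name="mark-breaks">
          <xsl:with-param name="text"
              select="substring-after($text, '&#10;&#10;')"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- The emph template from the question -->
  <xsl:template match="emph">
    <i>
      <xsl:apply-templates/>
    </i>
  </xsl:template>

</xsl:stylesheet>
```

With the sample <paraset> from the question this should give two <p> elements with the <i> elements intact, apart from some leading or trailing whitespace inside the paragraphs. Note that the sketch only looks at text nodes that are direct children of <paraset>, so a blank line inside an <emph> would not start a new paragraph, and it still requires exactly two consecutive newlines with nothing in between.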
 
