unwanted blank lines in output when using xalan

Jeff Calico

Hello everyone

I am transforming an XML document to text, basically only outputting
a small portion of it. When I run the following XSLT via Xalan's
processor, I get a bunch of unwanted blank lines in the output.
Here is a simplified XML and XSLT:
(Note the problem does not happen when testing in XMLSpy)

- - - - - - - - - - - - - - - - - - - - - - - -
....
<AAA>
<anne anneId="blah" annName="blah"/>
<anne anneId="blah" annName="blah"/>
</AAA>
<BBB>
junk
</BBB>
<CCC>
junk
</CCC>
....
- - - - - - - - - - - - - - - - - - - - - - - -
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE stylesheet [
<!ENTITY cr "<xsl:text>
</xsl:text>">
]>

<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="text" version="1.0" encoding="UTF-8" />

<xsl:template match="/AAA/anne">
&cr;
<xsl:value-of select="@anneId"/> <xsl:text> </xsl:text>
<xsl:value-of select="@anneName"/>
</xsl:template>
</xsl:stylesheet>

- - - - - - - - - - - - - - - - - - - - - - - -

I end up with a vast segment of blank lines before and after my data.
It is as if every time the processor reads any element, its default
behaviour is to output a blank line, and when it actually matches
the elements I want, then it behaves fine. If I put in "do nothing"
matches for all the other top level elements, then the number of blank
lines is cut down, but not entirely eliminated:

<xsl:template match="/BBB"></xsl:template>
<xsl:template match="/CCC"></xsl:template>

NOTE: The "cr" entity is necessary to put *desired* blank lines between
my lines of output data, and it shouldn't be triggered unless I match on
/AAA/anne, right?

Does anyone know why I get these blanks, and how I can totally
eliminate them?

thanks,
Jeff
 
Joe Kesselman

Remember, whitespace in the source document is a legitimate part of the
document!

Your stylesheet provides an explicit template for the AAA/anne
nodes... but it doesn't provide a template for the root node
(match="/"), which is where processing starts. That means your
stylesheet starts by applying the built-in default templates:

<xsl:template match="*|/">
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="text()|@*">
<xsl:value-of select="."/>
</xsl:template>

So, given your input document:

<doc>
<AAA>
<anne anneId="blah" annName="blah"/>
<anne anneId="blah" annName="blah"/>
</AAA>
<BBB>
junk
</BBB>
<CCC>
junk
</CCC>
</doc>

We first match the root node as "/". The built-in template processes
this by running apply-templates against its children.

That takes us to <doc>. It matches as "*", so we do another level of
apply-templates against the children.

The first child of <doc> is the whitespace text node between <doc> and
<AAA>. This matches as text(), and we output its value -- a newline.

The next child is <AAA>. That matches as "*", so we start applying
templates to its content.

The next child is whitespace again -- between <AAA> and <anne> -- so we
output it. That's another blank line.

And so on. Basically, because you're letting the defaults run, you're
going to get most of the text content of the source document, including
all its newlines.

The best fix is to write your own match="/" template which takes you
more directly to what you want to process. If all you care about is
those <anne> elements, I'd suggest you try adding this to override the
builtin default and take you right to the nodes you want to see:

<xsl:template match="/">
<xsl:apply-templates select="/AAA/anne"/>
</xsl:template>

By the way, note that your template for the <anne> elements can just say
match="anne". You only need to say "/AAA/anne" if you need to
distinguish these from <anne> elements that might be encountered
elsewhere in the document.
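
If restructuring around a match="/" template is not convenient, another
common way to silence the built-in text rule is to override it with an
empty template, optionally together with xsl:strip-space. A minimal
sketch follows. These two declarations are not from the stylesheet
above; they are standard XSLT 1.0 and are assumed to be added as
top-level children of xsl:stylesheet:

```xml
<!-- Sketch only: top-level declarations to add inside <xsl:stylesheet>. -->

<!-- Drop whitespace-only text nodes from the source tree before processing. -->
<xsl:strip-space elements="*"/>

<!-- Override the built-in rule for text nodes so that stray text such as
     "junk" inside BBB and CCC is not copied to the output. -->
<xsl:template match="text()"/>
```

The xsl:value-of calls inside the anne template are unaffected, because
they read the attributes directly rather than going through
apply-templates.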
 
Jeff Calico

Joe said:
Remember, whitespace in the source document is a legitimate part of the
document!

[snip]

Thanks, Joe! That was a very informative discussion, and the result
works great!

<xsl:template match="/">
<xsl:apply-templates select="/AAA/anne"/>
</xsl:template>

Joe said:
By the way, note that your template for the <anne> elements can just say
match="anne". You only need to say "/AAA/anne" if you need to
distinguish these from <anne> elements that might be encountered
elsewhere in the document.

I couldn't get it to work if I just specified the final element "anne"
(I suppose I could do "//anne"), but anyway the full path works, and I
had the idea that doing it that way saved the XSLT processor from having
to search through the DOM tree...

--Jeff
 
Joe Kesselman

Jeff Calico said:
....
I had the idea that doing it that way
saved the XSLT processor from having to search through the DOM tree...

Nope. If anything, you've added work, because now the processor has to
check that the <anne> element's parent is an <AAA>.

Select does the search. Match confirms that you've found the ones you're
interested in.
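
To make the select/match split concrete, here is a rough sketch of a
whole stylesheet. It assumes the document element is <doc>, as in the
reconstructed input earlier in the thread (the real wrapper element was
elided in the original post, so adjust the select path to suit), and it
reads the annName attribute as spelled in the sample XML:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text" encoding="UTF-8"/>

  <!-- select does the search: go straight to the anne elements.
       The /doc step is an assumption about the elided wrapper element. -->
  <xsl:template match="/">
    <xsl:apply-templates select="/doc/AAA/anne"/>
  </xsl:template>

  <!-- match only confirms: any anne element handed over by
       apply-templates is formatted here, one line per element. -->
  <xsl:template match="anne">
    <xsl:value-of select="@anneId"/>
    <xsl:text> </xsl:text>
    <xsl:value-of select="@annName"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

Because the match="/" template never applies templates to anything
except the selected anne elements, the built-in rules never get a
chance to copy whitespace or "junk" text to the output.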
 
