Problem with CDATAs

Collin VanDyck

I'm using Xalan's TransformerIdentityImpl class to provide a nice base
framework upon which to write SAX transformers.

However, using the identity transform, I'm getting some weird output issues.

My source xml looks like this:

....
<script>
<![CDATA[
<!--
some javascript here
//-->
]]>
</script>

And I am getting something out that looks like this:

<script type="text/javascript" language="JavaScript"><![CDATA[
]]><![CDATA[
<!--]]><![CDATA[
function MM_swapImgRestore() { //v3.0]]><![CDATA[
var i,x,a=document.MM_sr; for(i=0;a&&i<a.length&&(x=a[i])&&x.oSrc;i++)
x.src=x.oSrc;]]><![CDATA[
}]]><![CDATA[
........ ad nauseam

Of course, my first example left out the javascript, but it was javascript
without any CDATA tags inside of it. The result is that my javascript has
tons of CDATAs everywhere. How do I stop this? Is the source XML in
violation of some rule? This example was created using a plain identity
transform -- nothing should have been changed as I understand it.

Also, my transformers are constructed using:

setOutputProperties(OutputProperties.getDefaultMethodProperties("xml"));

I'd appreciate any help on this -- thanks!

Collin
 
Derek Harmon

Collin VanDyck said:
I'm using Xalan's TransformerIdentityImpl class to provide a nice base
framework upon which to write SAX transformers.

However, using the identity transform, I'm getting some weird output issues.

So you have one <![CDATA[ ... ]]> section wrapping the contents of a <script>
block, but Xalan is generating a separate CDATA section for every line of
that block. It's inefficient, but not incorrect.

<![CDATA[ some text ]]>

is equivalent to ...

<![CDATA[ some]]><![CDATA[ text]]>

as far as its information content goes. On the SAX end, you'll receive two
consecutive CDATA sections in your LexicalHandler, but you're expected to
handle it just like receiving adjoining characters within ContentHandler (which
you will also likely be experiencing when this happens).
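
To make the point concrete, here is a minimal sketch of a handler that treats
split CDATA sections as one run of text. It uses only the JDK's built-in SAX
parser, not Xalan's TransformerIdentityImpl, and the class name CdataCoalescer
is made up for illustration. It counts startCDATA() events but simply appends
everything delivered to characters(), so one section or several yield the same
text:

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical helper: DefaultHandler2 is both a ContentHandler and a LexicalHandler.
public class CdataCoalescer extends DefaultHandler2 {
    private final StringBuilder text = new StringBuilder();
    private int cdataSections = 0;

    @Override
    public void startCDATA() {
        // Fired once per CDATA section; the boundaries carry no information content.
        cdataSections++;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called many times for one logical run of text; always append.
        text.append(ch, start, length);
    }

    public String text() { return text.toString(); }

    public int cdataSections() { return cdataSections; }

    public static CdataCoalescer parse(String xml) {
        CdataCoalescer handler = new CdataCoalescer();
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            // Standard SAX2 property for registering a LexicalHandler.
            reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse: " + xml, e);
        }
        return handler;
    }

    public static void main(String[] args) {
        CdataCoalescer one = parse("<s><![CDATA[ some text ]]></s>");
        CdataCoalescer two = parse("<s><![CDATA[ some]]><![CDATA[ text ]]></s>");
        System.out.println(one.cdataSections() + " section(s) vs " + two.cdataSections());
        System.out.println("same text: " + one.text().equals(two.text()));
    }
}
```

A consumer written this way should not care how many CDATA sections the
serializer emitted, which is why the split output is ugly rather than wrong.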

What is the xml:space attribute on the source document? Perhaps a setting
of "preserve" here might be a nudge of encouragement to Xalan not to create
a CDATA section for each newline.

You might also want to search/ask on the users mailing list of Xalan-J or
Xalan-C, the archives of which are kept at:

http://marc.theaimsgroup.com/


Derek Harmon
 
Collin VanDyck

What is the xml:space attribute on the source document? Perhaps a setting
of "preserve" here might be a nudge of encouragement to Xalan not to create
a CDATA section for each newline.

The xml:space attribute is indeterminate, as my parsing and transforming
must be applied to a wide variety of input XML.

You might also want to search/ask on the users mailing list of Xalan-J or
Xalan-C, the archives of which are kept at:

http://marc.theaimsgroup.com/

Thanks for the heads up, and for the reply.
 
