javax.xml.transform.Transformer and HTML entities

Discussion in 'Java' started by Aéris, Oct 11, 2011.

  1. Aéris

    Aéris Guest

    Hi,

    I have a problem with Java XML Transformer escaping.

    I use Transformer to create an HTML file from a DOM Document.
    But in the generated HTML, every « & » in the document's text nodes that
    is part of an already-escaped HTML entity such as « &nbsp; » is
    re-escaped by Transformer.

    See this sample : http://pastebin.com/LfGpWMai
    Instead of expected
    <div>&mdash;</div>
    I get
    <div>&amp;mdash;</div>

    I searched the docs and Google but found nothing on disabling escaping.
    Can anybody help me?

    Thanks

    Aeris
    Aéris, Oct 11, 2011
    #1

  2. On 10/11/2011 6:57 PM, Aéris wrote:
    > I use Transformer to create a HTML file from an DOM Document.
    > But in generated HTML, all «& » on text nodes in the document, which
    > are parts of already escaped HTML entities like «&nbsp; », are
    > re-escaped by Transformer.
    >
    > See this sample : http://pastebin.com/LfGpWMai
    > Instead of expected
    > <div>&mdash;</div>
    > I get
    > <div>&amp;mdash;</div>
    >
    > I search on doc and Google, but nothing found to disable escaping.
    > Is there anybody to help me ?


    The code does exactly what it is supposed to do.

    document.createTextNode("&mdash;")

    creates a text node with those 7 characters.

    Try:

    document.createTextNode("\u2014")
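
A minimal sketch of the difference, assuming the JDK's built-in identity Transformer with the "html" output method. The class name `TextNodeEscapingDemo` and the helper `serializeDiv` are made up for illustration and are not taken from the pastebin sample. Whether the em dash is written as the raw character or as a character reference depends on the serializer and the output encoding, so the sketch does not rely on either form.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class TextNodeEscapingDemo {

    // Hypothetical helper: builds <div>text</div> in a DOM and serializes it as HTML.
    static String serializeDiv(String text) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element div = document.createElement("div");
            div.appendChild(document.createTextNode(text));
            document.appendChild(div);

            // Identity transform: copies the DOM to the output without a stylesheet.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.METHOD, "html");
            transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");

            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(document), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Literal "&mdash;": the ampersand is character data, so it is escaped on output.
        System.out.println(serializeDiv("&mdash;"));
        // The em dash character itself: the DOM holds no ampersand, so nothing is re-escaped.
        System.out.println(serializeDiv("\u2014"));
    }
}
```

The point of the sketch is that a DOM text node holds characters, not markup, so any entity syntax placed in it is treated as plain text by the serializer.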

    Arne
    Arne Vajhøj, Oct 12, 2011
    #2

  3. Jeff Higgins

    Jeff Higgins Guest

    On 10/11/2011 06:57 PM, Aéris wrote:
    > Hi,
    >
    > I have a problem with Java XML Transformer escaping.
    >
    > I use Transformer to create a HTML file from an DOM Document.
    > But in generated HTML, all «& » on text nodes in the document, which
    > are parts of already escaped HTML entities like «&nbsp; », are
    > re-escaped by Transformer.
    >
    > See this sample : http://pastebin.com/LfGpWMai
    > Instead of expected
    > <div>&mdash;</div>
    > I get
    > <div>&amp;mdash;</div>
    >
    > I search on doc and Google, but nothing found to disable escaping.
    > Is there anybody to help me ?
    >
    > Thanks
    >


    I'm sorry, I cannot help. I can only comment that I am experiencing
    the opposite problem with javax.xml.stream.XMLEventReader. Either I
    haven't figured out how to configure the reader or I haven't grokked
    the XML. Best of luck.
    Jeff Higgins, Oct 12, 2011
    #3
  4. markspace

    markspace Guest

    On 10/11/2011 3:57 PM, Aéris wrote:
    > I use Transformer to create a HTML file from an DOM Document.
    > But in generated HTML, all «& » on text nodes in the document, which
    > are parts of already escaped HTML entities like «&nbsp; », are
    > re-escaped by Transformer.
    >
    > See this sample : http://pastebin.com/LfGpWMai
    > Instead of expected
    > <div>&mdash;</div>
    > I get
    > <div>&amp;mdash;</div>



    I tried this:


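    // Identity transformer (an assumption; its creation is not shown here)
    final Transformer transformer = TransformerFactory.newInstance().newTransformer();
    final Writer out = new StringWriter();
    final Source in = new StreamSource(
        new StringReader( "<test><div>&mdash;</div></test>" ) );

    // The source is parsed as XML here, where &mdash; is not a declared entity
    transformer.transform( in, new StreamResult( out ) );
    System.out.println( out );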

    And got an error:

    [Fatal Error] :1:19: The entity "mdash" was referenced, but not declared.
    ERROR: 'The entity "mdash" was referenced, but not declared.'

    It's been a rather long while since I played with XSLT, but it seems
    to me that it might be your document builder that is protecting you, and
    the XSLT is just spitting out what it gets in. I forget how to get XSLT
    to recognize the HTML entities, though. Searching Google might offer
    some clues.
    markspace, Oct 12, 2011
    #4
  5. On 10/11/2011 8:58 PM, markspace wrote:
    > On 10/11/2011 3:57 PM, Aéris wrote:
    >> I use Transformer to create a HTML file from an DOM Document.
    >> But in generated HTML, all «& » on text nodes in the document, which
    >> are parts of already escaped HTML entities like «&nbsp; », are
    >> re-escaped by Transformer.
    >>
    >> See this sample : http://pastebin.com/LfGpWMai
    >> Instead of expected
    >> <div>&mdash;</div>
    >> I get
    >> <div>&amp;mdash;</div>

    >
    > I tried this:
    >
    > final Writer out = new StringWriter();
    > final Source in = new StreamSource(
    > new StringReader( "<test><div>&mdash;</div></test>") );
    >
    > transformer.transform( in, new StreamResult( out ) );
    > System.out.println( out );
    >
    > And got an error:
    >
    > [Fatal Error] :1:19: The entity "mdash" was referenced, but not declared.
    > ERROR: 'The entity "mdash" was referenced, but not declared.'
    >
    > So it's been a rather long while since I played with XSLT, but it seems
    > to me that it might be your document builder that is protecting you, and
    > the XSLT is just spitting out what it gets in. I forget though how to
    > get XSLT to recognize the HTML entities though. Search Google might
    > offer some clues.


    Parsing an XML file or string and building a DOM document
    are somewhat different.
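
To illustrate the parsing side, here is a rough sketch assuming the JDK's default DocumentBuilder with DOCTYPE declarations allowed. The class name `EntityParsingDemo`, the helper `divText`, and the idea of declaring `mdash` in an internal DTD subset are illustrative assumptions, not something proposed in this thread. When parsing, the parser resolves `&mdash;` only if the entity is declared; `createTextNode` involves no parser at all.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Optional;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class EntityParsingDemo {

    // Hypothetical helper: returns the text of the first <div>,
    // or empty if the input is not well-formed XML.
    static Optional<String> divText(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return Optional.of(
                    doc.getElementsByTagName("div").item(0).getTextContent());
        } catch (SAXException e) {
            // Raised for an undeclared entity reference, among other errors.
            return Optional.empty();
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // No declaration for mdash: the parser rejects the document.
        String undeclared = "<test><div>&mdash;</div></test>";
        System.out.println(divText(undeclared).isPresent());

        // Declared in an internal DTD subset: the parser expands the reference,
        // and the DOM text node holds the single em dash character.
        String declared = "<!DOCTYPE test [<!ENTITY mdash \"&#8212;\">]>"
                + "<test><div>&mdash;</div></test>";
        System.out.println(divText(declared).map(String::length).orElse(-1));
    }
}
```

Either way the resulting DOM contains the character U+2014 rather than the text `&mdash;`, which is why a serializer has nothing to re-escape in that case.").isPresent()) throw new AssertionError("expected parse failure");
String declared = "<!DOCTYPE test [<!ENTITY mdash \"&#8212;\">]><test><div>&mdash;</div></test>";
if (!EntityParsingDemo.divText(declared).equals(java.util.Optional.of("\u2014"))) throw new AssertionError("expected em dash");
</test>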

    Arne
    Arne Vajhøj, Oct 12, 2011
    #5