unintentionally modified whitespace in attribute values

Discussion in 'XML' started by Markus, Feb 9, 2005.

  1. Markus

    Markus Guest

    Hello *,

    While experimenting with dom4j, I see modifications of the
    &#10; and &#13; entities (line feed and carriage return)
    in attribute values.
    I run a filter that first deserialises the XML stream, performs some DOM transformations
    and then serialises the DOM again. In a first run, the entities mentioned get converted to
    literal line feeds (a fact I could live with, since it's nearly lossless); in a second run
    through the filter, those line feeds are converted to blanks. And this is what I don't want to live with.
    BTW: other SGML entities in attributes are not touched.

    After reading quite a bunch of docs and fiddling with the obvious parameters, I still can't find a way
    to leave those entities unmodified. Maybe this behaviour is not a peculiarity
    of dom4j alone, but of other XML processors too.

    Any ideas???

    Markus



    PS: setup of reader and writer:
    -------8<------
    // Configure the reader ...
    setReader(new SAXReader());
    getReader().setStripWhitespaceText(true);
    getReader().setMergeAdjacentText(true);
    getReader().setStringInternEnabled(true);

    // ... and give the writer some reasonable defaults:
    setEncoding("UTF-8");
    setOutput_format(new OutputFormat("\t", true, getEncoding()));
    getOutput_format().setExpandEmptyElements(false);
    setWriter(new XMLWriter(getOutput_format()));
    -------8<------
     
    Markus, Feb 9, 2005
    #1

  2. In article <>,
    Markus <> wrote:

    >doing some experiments with dom4j, I experience modifications of
    >&#10;- and &#13;-entities in attribute values:


    These should be unchanged by the parser. That is, the parsed data
    should contain linefeed and carriage-return characters. It's more
    likely to be a problem with the serializer: serializers tend to be
    less well tested than parsers, and some incorrectly don't output these
    characters as references when they would otherwise be changed on input
    (as they would be in attributes).

    -- Richard
     
    Richard Tobin, Feb 9, 2005
    #2
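
    The behaviour described in this thread can be reproduced without dom4j. The following is a minimal sketch that uses the JDK's built-in DOM parser instead; the class name AttrWhitespaceDemo and the helpers parseAttr and escapeAttr are hypothetical names made up for illustration, and the sketch makes no claim about how dom4j's XMLWriter is implemented. It shows that a character reference in an attribute value survives parsing as a real line feed, that a literal line feed written back into the attribute is normalized to a blank on the next parse, and that a serializer can avoid the loss by writing tab, line feed and carriage return as character references.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class AttrWhitespaceDemo {

    /** Parses a one-element document and returns the value of its "x" attribute. */
    public static String parseAttr(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getAttribute("x");
        } catch (Exception e) {
            // Wrapped so that callers need no checked-exception handling.
            throw new IllegalStateException(e);
        }
    }

    /** Hypothetical helper: escapes an attribute value so whitespace survives a re-parse. */
    public static String escapeAttr(String value) {
        StringBuilder sb = new StringBuilder();
        for (char c : value.toCharArray()) {
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '"':  sb.append("&quot;"); break;
                case '\t': sb.append("&#9;");   break;
                case '\n': sb.append("&#10;");  break;
                case '\r': sb.append("&#13;");  break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // First run: the reference is parsed into a real line-feed character.
        String first = parseAttr("<a x=\"one&#10;two\"/>");
        System.out.println(first.equals("one\ntwo"));   // true

        // Naive serialization writes the line feed literally; the next parse
        // normalizes it to a blank.
        String naive = parseAttr("<a x=\"" + first + "\"/>");
        System.out.println(naive.equals("one two"));    // true

        // Escaping the whitespace as character references keeps the value intact.
        String safe = parseAttr("<a x=\"" + escapeAttr(first) + "\"/>");
        System.out.println(safe.equals("one\ntwo"));    // true
    }
}
```

    If the serializer in use cannot be configured to emit such references, one possible workaround under the same assumptions is to post-process attribute values with a helper like escapeAttr before they are written.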
