Re: Space between ending and starting tag not ignorable in an XML document? ...

Discussion in 'XML' started by Mayeul, Jul 11, 2011.

    On 11/07/2011 01:38, lbrt chx _ gemale kom wrote:
    > Space between ending and starting tag not ignorable in an XML document? ...
    > ~
    > I am parsing an XML document, which is validated by a schema, using Apache Xerces
    > ~
    > What I don't get is that spaces between ending and starting tags are reported by:
    > ~
    > characters(char[] ch, int start, int length)
    > ~
    > instead of:
    > ~
    > ignorableWhitespace(char[] ch, int start, int length)
    > ~
    > I thought one of the aspects of well-formedness is that such spaces are not relevant (not even accessible) in an XML document.
    > ~
    > Have I forgotten to set some flag or something? In case this behavior is per spec. how do you tell apart such textual sequences?


    For starters, well-formedness does not make anything irrelevant or
    inaccessible. Anything between an end-tag and a start-tag is necessarily
    a direct child of the parent of the elements surrounding it. It is
    simply a sibling of these elements. Remember XHTML? Spaces between
    inline tags are rather relevant, aren't they?
    Bottom line: whitespace between tags is just the same as whitespace
    within an element, because it is indeed within an element (the parent).

    Which leaves us with the question of ignorable versus non-ignorable
    whitespace. Sometimes whitespace is not supposed to be ignorable: think
    again of inline XHTML, of any mixed content, or of PRE-like behaviour.
    In practice, whitespace between elements is usually meant to be
    ignorable. If it should be preserved, the xml:space attribute should be
    set to "preserve", which signals that intent to the application.
    But a parser can only report whitespace as ignorable when it is /able
    at all/ to distinguish ignorable from non-ignorable whitespace, that
    is, when it knows from the declared content model that an element may
    contain only child elements. Otherwise some important whitespace might
    be lost. A validating parser has to be able to distinguish. A
    non-validating parser may, but does not have to.

    Conclusion: most likely you need to configure your XML parser so that
    it performs DTD validation. Or, if such a thing exists, set it up so
    that it honours xml:space (and its absence) wherever it finds it.
    Remember, though, that a DTD may declare xml:space as a defaulted
    attribute on some elements, so the parser should still read the DTD
    when it finds one.
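
    To make the DTD-validation suggestion concrete, here is a minimal
    sketch, assuming the JAXP SAX parser bundled with the JDK rather than
    a standalone Xerces build. The class name WhitespaceDemo, the sample
    document and its internal DTD subset are all made up for illustration.
    With validation switched on and element-only content declared for
    <root>, the whitespace between the child elements should arrive through
    ignorableWhitespace() rather than characters().

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class WhitespaceDemo {
    // Hypothetical sample: the internal DTD subset declares that <root>
    // has element-only content, so whitespace between <a> and <b> is
    // "whitespace in element content".
    static final String XML =
        "<?xml version=\"1.0\"?>\n"
      + "<!DOCTYPE root [\n"
      + "  <!ELEMENT root (a, b)>\n"
      + "  <!ELEMENT a (#PCDATA)>\n"
      + "  <!ELEMENT b (#PCDATA)>\n"
      + "]>\n"
      + "<root>\n  <a>x</a>\n  <b>y</b>\n</root>";

    /**
     * Parses the document with DTD validation enabled and returns
     * { chars reported via ignorableWhitespace(),
     *   whitespace-only chars reported via characters() }.
     */
    static int[] count(String xml) throws Exception {
        final int[] counts = new int[2];
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void ignorableWhitespace(char[] ch, int start, int length) {
                counts[0] += length;
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Only count chunks that consist entirely of whitespace.
                if (new String(ch, start, length).trim().isEmpty()) {
                    counts[1] += length;
                }
            }

            @Override
            public void error(SAXParseException e) throws SAXParseException {
                throw e; // surface validation errors instead of ignoring them
            }
        };

        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true); // DTD validation
        SAXParser parser = factory.newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return counts;
    }

    public static void main(String[] args) throws Exception {
        int[] c = count(XML);
        System.out.println("ignorableWhitespace chars: " + c[0]);
        System.out.println("whitespace-only characters() chars: " + c[1]);
    }
}
```

    Note that this only covers the DTD case. Whether the same callback
    split happens under schema validation depends on the parser and its
    configuration, so check that separately against your own setup.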

    --
    Mayeul