XPath data model and the XML Declaration

Discussion in 'XML' started by David Carlisle, Dec 2, 2004.

  1. Does the XML Declaration map to an XPath node type or not?
    If yes, what type of node?

    No, it affects the parser (e.g. telling it what encoding has been used),
    and so affects the construction of the node tree, but no record of the
    declaration is left in the node tree.

    That's (normally) what you want. If you have a file in latin1 that
    starts
    <?xml version="1.0" encoding="iso-8859-1"?>
    ...

    and the same data encoded in utf-8 that starts

    <?xml version="1.0" encoding="utf-8"?>

    Then you want these two to be equivalent. The actual data in the nodes
    is just logical Unicode characters; the encoding used to store them
    originally in a file shouldn't be relevant to the logical view of the
    XML tree (and isn't available to XPath/XSLT).
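
    As a minimal sketch of that equivalence (Python's standard
    xml.etree.ElementTree is used here purely for illustration; the sample
    word and the variable names are made up), the two serializations differ
    as bytes but parse to the same character data:

```python
import xml.etree.ElementTree as ET

TEXT = "caf\u00e9"  # hypothetical sample: 'cafe' with an accented e

# The same logical document, stored in two different encodings.
latin1_doc = (
    '<?xml version="1.0" encoding="iso-8859-1"?><word>%s</word>' % TEXT
).encode("iso-8859-1")
utf8_doc = (
    '<?xml version="1.0" encoding="utf-8"?><word>%s</word>' % TEXT
).encode("utf-8")

latin1_root = ET.fromstring(latin1_doc)
utf8_root = ET.fromstring(utf8_doc)

# The files differ on disk, but the parsed text is identical.
same_bytes = latin1_doc == utf8_doc
same_text = latin1_root.text == utf8_root.text == TEXT
print(same_bytes, same_text)
```

    Neither parsed tree carries anything that says which encoding it came
    from.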

    Of course, in many cases it's not unreasonable to want to output a file
    with the same encoding as was used for input, but that information has
    gone, along with information about whether " or ' was used for
    attributes, or where CDATA sections were, etc.
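
    To illustrate the kind of detail that is lost, here is a small sketch,
    again assuming Python's xml.etree.ElementTree and two hypothetical
    sample documents. One uses single quotes and a CDATA section, the other
    double quotes and an entity reference; after parsing they cannot be
    told apart:

```python
import xml.etree.ElementTree as ET

# Two lexically different spellings of the same logical element.
doc_a = "<p class='x'><![CDATA[a < b]]></p>"
doc_b = '<p class="x">a &lt; b</p>'

root_a = ET.fromstring(doc_a)
root_b = ET.fromstring(doc_b)

# Attribute values and character data are identical after parsing.
same_attributes = root_a.attrib == root_b.attrib
same_text = root_a.text == root_b.text == "a < b"

# Re-serializing gives the same output for both, so the original quote
# style and the CDATA section cannot be recovered from the tree.
serialized_a = ET.tostring(root_a, encoding="unicode")
serialized_b = ET.tostring(root_b, encoding="unicode")
print(serialized_a == serialized_b)
```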

    David
     
    David Carlisle, Dec 2, 2004
    #1

  2. Hi

    Components of an XML file are mapped to the different node types of the
    XPath data model.
    An element is mapped to an element node, an attribute node represents an
    attribute and its value ... and so on.

    So far so good.

    But what happens to the XML declaration (<?xml version="1.0" ......?>)?
    Though it looks like a processing instruction, it isn't.

    The XPath specification [1] only says that the XML Declaration does not map
    to a processing instruction node (5.5).

    Does the XML Declaration map to an XPath node type or not?
    If yes, what type of node?

    thx

    Helmut


    [1] XPath Specification http://www.w3c.org/TR/xpath
     
    Helmut Dirtinger, Dec 2, 2004
    #2

  3. * Helmut Dirtinger wrote in comp.text.xml:
    >Does the XML Declaration map to an XPath node type or not?


    No, it does not. The XML declaration does not contain information that
    would make sense to be available in the data model.
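
    A quick way to see this is to list the top-level nodes of a parsed
    document. Python's standard library has no full XPath 1.0 engine, so
    this sketch inspects the DOM tree built by xml.dom.minidom instead, on
    the assumption that it treats the declaration the same way in this
    respect. The "render" processing instruction is a made-up example:

```python
from xml.dom import Node, minidom

# A declaration, a real processing instruction, and a root element.
SOURCE = (
    b'<?xml version="1.0" encoding="iso-8859-1"?>'
    b'<?render mode="fast"?>'
    b"<root/>"
)

doc = minidom.parseString(SOURCE)

# Collect (node type, node name) for every child of the document node.
top_level = [(node.nodeType, node.nodeName) for node in doc.childNodes]

for node_type, name in top_level:
    kind = {
        Node.PROCESSING_INSTRUCTION_NODE: "processing instruction",
        Node.ELEMENT_NODE: "element",
    }.get(node_type, "other")
    print(kind, name)
```

    Only the real processing instruction and the root element show up as
    nodes; the declaration produces no node of any type.
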
    --
    Björn Höhrmann · mailto: · http://bjoern.hoehrmann.de
    Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
    68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
     
    Bjoern Hoehrmann, Dec 2, 2004
    #3
  4. Bjoern Hoehrmann wrote:
    >>Does the XML Declaration map to an XPath node type or not?

    >
    >No, it does not.


    True.

    >The XML declaration does not contain information that
    >would make sense to be available in the data model.


    But this is an exaggeration. The encoding is a useful hint when
    serializing the data model, and the XML version number affects what
    assumptions you can make about the data (e.g. whether it might have
    C0 control characters in it).

    -- Richard
     
    Richard Tobin, Dec 2, 2004
    #4
  5. * Richard Tobin wrote in comp.text.xml:
    >>The XML declaration does not contain information that
    >>would make sense to be available in the data model.

    >
    >But this is an exaggeration. The encoding is a useful hint when
    >serializing the data model, and the XML version number affects what
    >assumptions you can make about the data (e.g. whether it might have
    >C0 control characters in it).


    The problem, however, is that the value of the encoding pseudo-attribute
    might not be the actual character encoding of the document. For example,
    if the document is delivered via HTTP, the actual encoding might be
    specified in the HTTP header, which would then override the value in the
    XML declaration (if any). So it would make more sense to use other means
    to get that information, for example DOM Level 3 Core methods.
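
    A rough sketch of that mismatch, assuming Python's
    xml.etree.ElementTree: the body below declares iso-8859-1 but is
    actually UTF-8, as if some hypothetical HTTP response had carried
    charset=utf-8 in its Content-Type header. No real HTTP request is made;
    the charset is passed to the parser by hand.

```python
import xml.etree.ElementTree as ET

# The declaration claims iso-8859-1, but the bytes are really UTF-8.
body = (
    '<?xml version="1.0" encoding="iso-8859-1"?><word>caf\u00e9</word>'
).encode("utf-8")

# Trusting the declaration misreads the two UTF-8 bytes of the accented
# letter as two separate Latin-1 characters.
trusting_declaration = ET.fromstring(body).text

# An encoding supplied from outside (here: the assumed HTTP charset)
# overrides the value in the declaration.
transport_parser = ET.XMLParser(encoding="utf-8")
trusting_transport = ET.fromstring(body, parser=transport_parser).text

print(len(trusting_declaration), len(trusting_transport))
```

    Either way, the resulting tree holds only characters; which of the two
    encodings was "really" used is not recorded in it.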

    Regarding C0 control characters, I do not think these may appear in the
    XPath 1.0 data model, which is only defined for XML 1.0. They are not
    allowed to appear in expressions either: strings are restricted to chars
    as defined in XML 1.0, which excludes C0 controls other than tab, line
    feed and carriage return. And if they are in the data model, you can
    infer that the XML version is 1.1, so this is of limited use, too.
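
    For reference, the difference between the two versions can be sketched
    as two predicates that follow the Char productions of the XML 1.0 and
    XML 1.1 recommendations. The function names are just illustrative:

```python
def is_xml10_char(cp: int) -> bool:
    """True if code point cp matches the Char production of XML 1.0."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def is_xml11_char(cp: int) -> bool:
    """True if code point cp matches the Char production of XML 1.1."""
    return (
        0x1 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


# C0 controls (other than tab, LF, CR) that only XML 1.1 admits.
only_in_11 = [
    cp for cp in range(0x20) if is_xml11_char(cp) and not is_xml10_char(cp)
]
print(len(only_in_11))
```

    In XML 1.1 these extra control characters may only appear as character
    references, and U+0000 is excluded in both versions.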

    Other uses would be possible too, for example if you want to write an
    XSLT document that discovers meta-data from XML documents, like:

    XML Declaration: yes
    Encoding Declaration: ISO-8859-1
    Standalone: no

    Elements:

    1 <html>
    1 <head>
    1 <body>
    32 <div>
    ...

    or whatever. So maybe I should rephrase: the value such functionality
    would add is not considered worth the additional complication, and such
    functionality might encourage false assumptions, such as that the
    encoding in the XML declaration is the actual document encoding.
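
    Outside XSLT, a report of that shape is easy to sketch. The following
    is not XSLT but a Python illustration using xml.dom.minidom, which
    copies the declaration's values onto the document object; describe()
    and the sample document are hypothetical. Note that the encoding
    printed is only what the declaration claims.

```python
from collections import Counter
from xml.dom import minidom


def describe(source: bytes) -> str:
    """Build a small metadata report for one XML document."""
    doc = minidom.parseString(source)

    # These attributes stay None when the document has no declaration.
    has_declaration = doc.version is not None
    standalone = {True: "yes", False: "no", None: "(not declared)"}[
        doc.standalone
    ]

    # Count element names, keeping first-seen (document) order.
    counts = Counter(el.tagName for el in doc.getElementsByTagName("*"))

    lines = [
        "XML Declaration: " + ("yes" if has_declaration else "no"),
        "Encoding Declaration: " + (doc.encoding or "(none)"),
        "Standalone: " + standalone,
        "",
        "Elements:",
    ]
    lines += [f"{n} <{name}>" for name, n in counts.items()]
    return "\n".join(lines)


SAMPLE = (
    b'<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>'
    b"<html><head/><body><div/><div/></body></html>"
)
report = describe(SAMPLE)
print(report)
```
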
    --
    Björn Höhrmann · mailto: · http://bjoern.hoehrmann.de
    Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
    68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
     
    Bjoern Hoehrmann, Dec 2, 2004
    #5
