XPath data model and the XML Declaration


David Carlisle

Does the XML declaration map to an XPath node type or not?
If yes, what type of node?

No. It affects the parser (e.g. by telling it what encoding has been used),
and so affects the construction of the node tree, but no record of the
declaration is left in the node tree.
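
As a rough illustration of this point, here is a small sketch using Python's standard xml.dom.minidom (my choice of tool, not something from the thread; the sample document and the "app" processing instruction are made up). It lists the children of the document node: the real processing instruction shows up as a node, the XML declaration does not.

```python
from xml.dom import minidom, Node

# Hypothetical sample: an XML declaration, a real processing instruction,
# and an empty root element.
SOURCE = b'<?xml version="1.0" encoding="iso-8859-1"?><?app hint?><root/>'

doc = minidom.parseString(SOURCE)

# Node types of the document node's children, in document order.
kinds = [child.nodeType for child in doc.childNodes]

# Only two children exist: the <?app hint?> processing instruction and <root/>.
# Nothing in the list corresponds to the XML declaration.
for child in doc.childNodes:
    print(child.nodeType, child.nodeName)
```

minidom builds a DOM tree rather than an XPath data model, but the two agree on this point.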

That's (normally) what you want. If you have a file in Latin-1 that
starts
<?xml version="1.0" encoding="iso-8859-1"?>
...

and the same data encoded in utf-8 that starts

<?xml version="1.0" encoding="utf-8"?>

then you want these two to be equivalent. The actual data in the nodes
is just logical Unicode characters; the encoding originally used to store
them in a file shouldn't be relevant to the logical view of the
XML tree (and isn't available to XPath/XSLT).
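
A minimal sketch of that equivalence, assuming Python's standard xml.etree.ElementTree as the parser (the <word> element and its content are invented for the example): the two byte streams differ, yet the parsed trees hold the same Unicode text.

```python
import xml.etree.ElementTree as ET

text = "caf\u00e9"  # 'café', contains one non-ASCII character

# The same logical document, stored in two different encodings.
latin1_doc = (
    '<?xml version="1.0" encoding="iso-8859-1"?><word>%s</word>' % text
).encode("iso-8859-1")
utf8_doc = (
    '<?xml version="1.0" encoding="utf-8"?><word>%s</word>' % text
).encode("utf-8")

latin1_root = ET.fromstring(latin1_doc)
utf8_root = ET.fromstring(utf8_doc)

print(latin1_doc == utf8_doc)               # the bytes on disk differ
print(latin1_root.text == utf8_root.text)   # the tree content does not
```

Nothing on either root element records which encoding the input used.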

Of course, in many cases it's not unreasonable to want to output a file
with the same encoding as was used for input, but that information has
gone, along with information about whether " or ' was used for
attributes, where CDATA sections were, etc.

David
 

Helmut Dirtinger

Hi

Components of an XML file are mapped to the different node types of the
XPath data model.
An element is mapped to an element node, an attribute node represents an
attribute and its value ... and so on.

So far so good.

But what happens to the XML declaration (<?xml version="1.0" ......?>)?
Though it looks like a processing instruction, it isn't one.

The XPath specification [1] only says that the XML declaration does not map
to a processing instruction node (5.5).

Does the XML declaration map to an XPath node type or not?
If yes, what type of node?

thx

Helmut


[1] XPath Specification http://www.w3c.org/TR/xpath
 

Bjoern Hoehrmann

* Helmut Dirtinger wrote in comp.text.xml:
Does the XML declaration map to an XPath node type or not?

No, it does not. The XML declaration does not contain information that
would make sense to be available in the data model.
 

Richard Tobin

No, it does not.
True.

The XML declaration does not contain information that
would make sense to be available in the data model.

But this is an exaggeration. The encoding is a useful hint when
serializing the data model, and the XML version number affects what
assumptions you can make about the data (e.g. whether it might have
C0 control characters in it).

-- Richard
 

Bjoern Hoehrmann

* Richard Tobin wrote in comp.text.xml:
But this is an exaggeration. The encoding is a useful hint when
serializing the data model, and the XML version number affects what
assumptions you can make about the data (e.g. whether it might have
C0 control characters in it).

The problem, however, is that the value of the encoding pseudo-attribute
might not be the actual character encoding of the document. For example,
if the document is delivered via HTTP, the actual encoding might be
specified in the HTTP header, which would then override the value in the
XML declaration (if any). So it would make more sense to get that
information by other means, for example using DOM Level 3 Core methods.
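
To make the mismatch concrete, here is a sketch in Python (standard library only; the payload and the idea that "utf-8" arrived in an HTTP header are assumptions for the example). minidom's `encoding` attribute plays roughly the role of the DOM Level 3 declared-encoding information, and the `encoding` argument of `ET.XMLParser` stands in for an external override.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

# The bytes are really UTF-8, but the declaration claims iso-8859-1.
payload = (
    '<?xml version="1.0" encoding="iso-8859-1"?><word>caf\u00e9</word>'
).encode("utf-8")

# What the declaration says (not necessarily the truth).
declared = minidom.parseString(payload).encoding

# Trusting the declaration decodes the two UTF-8 bytes of the accented
# letter as two separate Latin-1 characters.
trusting = ET.fromstring(payload).text

# Overriding with externally supplied information, here a hypothetical
# charset taken from an HTTP header.
overriding = ET.fromstring(payload, parser=ET.XMLParser(encoding="utf-8")).text

print(declared, repr(trusting), repr(overriding))
```

The declared value is available, but only the override recovers the intended text.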

Regarding C0 control characters, I do not think these may appear in the
XPath 1.0 data model, which is only defined for XML 1.0. (They are not
allowed to appear in expressions either; strings are restricted to chars
as defined in XML 1.0, which excludes C0 controls apart from tab, line
feed and carriage return.) And if they are in the data model, you can
infer that the XML version is 1.1, so this is of limited use, too.
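
The XML 1.0 restriction can be checked with a small sketch. It assumes Python's standard xml.etree.ElementTree, whose underlying expat parser implements XML 1.0, and `parses` is a hypothetical helper written for this example. A character reference to tab is accepted, one to U+0001 is rejected.

```python
import xml.etree.ElementTree as ET

def parses(xml_text):
    """Return True if xml_text is accepted by the XML 1.0 parser."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

tab_ok = parses("<a>&#x9;</a>")    # tab is a legal XML 1.0 character
ctrl_ok = parses("<a>&#x1;</a>")   # U+0001 is a C0 control, not legal in 1.0

print(tab_ok, ctrl_ok)
```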

Other uses would be possible too, for example, if you want to write an
XSLT document that discovers meta-data from XML documents like

XML Declaration: yes
Encoding Declaration: ISO-8859-1
Standalone: no

Elements:

1 <html>
1 <head>
1 <body>
32 <div>
...

or whatever. So maybe I should rephrase: the value such functionality
would add is not considered worth the additional complication, and it
might encourage false assumptions, such as that the encoding in the XML
declaration is the actual document encoding.
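
For what it's worth, a report like the one above is easy to produce outside XSLT. The following is only a sketch using Python's standard xml.dom.minidom, which records the declaration's values on the document object; `describe` and the sample document are hypothetical, and the `encoding` it reports is the declared value, with the caveat already mentioned.

```python
from collections import Counter
from xml.dom import minidom

def describe(xml_bytes):
    """Collect declaration details and element counts (hypothetical helper)."""
    doc = minidom.parseString(xml_bytes)
    return {
        # minidom leaves version as None when there is no XML declaration.
        "has_declaration": doc.version is not None,
        "encoding": doc.encoding,        # declared value only
        "standalone": doc.standalone,    # True, False, or None if absent
        "elements": Counter(
            el.tagName for el in doc.getElementsByTagName("*")
        ),
    }

SAMPLE = (
    b'<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>'
    b'<html><head/><body><div/><div/></body></html>'
)

info = describe(SAMPLE)
standalone_label = {True: "yes", False: "no", None: "not declared"}
print("XML Declaration:", "yes" if info["has_declaration"] else "no")
print("Encoding Declaration:", info["encoding"])
print("Standalone:", standalone_label[info["standalone"]])
print()
print("Elements:")
print()
for name, count in info["elements"].items():
    print(count, "<%s>" % name)
```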
 
