XML version without UTF8

Hapa

Hello all,
we are using msxml.dll (version 1) and Visual C++6.0.

There is a way to automatically write the processing instruction prior
to saving the XMLDOMDocument.

VARIANT NodeType;
NodeType.vt = VT_I4; V_I4(&NodeType) = MSXML::NODE_PROCESSING_INSTRUCTION;
CComBSTR PITarget = ("xml");

XMLDOMNodePtr pProcInstr;
m_pXMLDocumentNode->createNode(NodeType, PITarget, NULL, &pProcInstr);
m_pXMLDocumentNode->appendChild( pProcInstr, NULL );


Question: Why is the encoding always missing from our processing
instruction? It looks like <?xml version="1.0" ?> instead of
<?xml version="1.0" encoding="UTF-8" ?>.
Is this related to _UNICODE or _MBCS?

Thanks.
hapa
 
Martin Honnen

Hapa said:
Hello all,
we are using msxml.dll (version 1) and Visual C++6.0.

Version 1? The latest is MSXML 6; I don't think MSXML 1 is supported.

There is a way to automatically write the processing instruction prior
to saving the XMLDOMDocument.

VARIANT NodeType;
NodeType.vt = VT_I4; V_I4(&NodeType) = MSXML::NODE_PROCESSING_INSTRUCTION;
CComBSTR PITarget = ("xml");

XMLDOMNodePtr pProcInstr;
m_pXMLDocumentNode->createNode(NodeType, PITarget, NULL, &pProcInstr);
m_pXMLDocumentNode->appendChild( pProcInstr, NULL );


Question: Why is the encoding always missing from our processing
instruction? It looks like <?xml version="1.0" ?> instead of
<?xml version="1.0" encoding="UTF-8" ?>.
Is this related to _UNICODE or _MBCS?

I don't see you creating any encoding. You could do it using the
createProcessingInstruction method
(http://msdn.microsoft.com/en-us/library/ms755439(VS.85).aspx)
(JScript pseudo code, please translate to C++ yourself)
var pi = doc.createProcessingInstruction('xml',
'version="1.0" encoding="UTF-8"');

That way MSXML will write out the XML declaration with the specified
encoding when saving the DOM document to a file or stream. If you simply
access the xml property to get a string serialization of the DOM
document then I think (depending on the MSXML version) you will either
not get any encoding shown or you will get encoding="UTF-16" as that is
the encoding of the string.
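
For anyone who wants the JScript above in C++, here is a minimal sketch, not a definitive implementation. It assumes MSXML 3.0 or later with the msxml6.h and ATL headers available, and that the document is still empty when the declaration is appended. The names xmlDeclData, addXmlDeclaration and saveToFile are hypothetical helpers, not part of MSXML; the MSXML part only compiles on Windows.

```cpp
#include <string>
#ifdef _WIN32
#include <windows.h>
#include <atlbase.h>
#include <msxml6.h>   // assumption: MSXML 3.0+ SDK headers are installed
#endif

// Builds the data part of the declaration, e.g. version="1.0" encoding="UTF-8".
// Hypothetical helper, not an MSXML function.
std::wstring xmlDeclData(const std::wstring& encoding) {
    return L"version=\"1.0\" encoding=\"" + encoding + L"\"";
}

#ifdef _WIN32
// Appends <?xml version="1.0" encoding="..."?> to a still-empty document.
// Call it before appending the root element so the declaration comes first.
HRESULT addXmlDeclaration(IXMLDOMDocument* doc, const std::wstring& encoding) {
    CComPtr<IXMLDOMProcessingInstruction> pi;
    HRESULT hr = doc->createProcessingInstruction(
        CComBSTR(L"xml"), CComBSTR(xmlDeclData(encoding).c_str()), &pi);
    if (FAILED(hr)) return hr;
    return doc->appendChild(pi, NULL);
}

// Saves the document to a file path. Per the explanation above, this is
// where the encoding named in the declaration takes effect.
HRESULT saveToFile(IXMLDOMDocument* doc, const wchar_t* path) {
    return doc->save(CComVariant(path));
}
#endif
```

Typical use would be addXmlDeclaration(doc, L"UTF-8") right after creating the document, then building the tree, then saveToFile(doc, L"out.xml") (the file name is just an example).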
 
Joe Kesselman

I don't see you creating any encoding. You could do it using the
createProcessingInstruction method
(http://msdn.microsoft.com/en-us/library/ms755439(VS.85).aspx)
(JScript pseudo code, please translate to C++ yourself)
var pi = doc.createProcessingInstruction('xml',
'version="1.0" encoding="UTF-8"');

Uhm... No. The XML Declaration, while it has the syntax of a processing
instruction, is not a processing instruction. If MSXML is doing it this
way, MSXML is wrong. Modern versions of XML APIs (DOM, SAX, etc) should
all have an explicit mechanism for specifying encoding, and the
serializer should Do The Right Thing with that information.
 
Martin Honnen

Joe said:
Uhm... No. The XML Declaration, while it has the syntax of a processing
instruction, is not a processing instruction. If MSXML is doing it this
way, MSXML is wrong. Modern versions of XML APIs (DOM, SAX, etc) should
all have an explicit mechanism for specifying encoding, and the
serializer should Do The Right Thing with that information.

MSXML has DOM Level 1, but the only way to create an XML declaration is
to create it as a processing instruction, even though technically it is
not one.
 
Martin Honnen

Martin said:
MSXML has DOM Level 1, but the only way to create an XML declaration is
to create it as a processing instruction, even though technically it is
not one.

And with MSXML it is not only on output that you treat the XML
declaration as a pi: if you load an XML document with an XML declaration,
then in MSXML's DOM tree the declaration shows up as the first child
of the document node, and its nodeType is 7 (processing instruction).
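
A rough C++ sketch of how one could check that on a loaded document, under the same assumptions as before (MSXML 3.0 or later, msxml6.h and ATL headers, Windows only for the MSXML part). looksLikeXmlDeclaration and firstChildIsXmlDeclaration are hypothetical names for illustration.

```cpp
#include <string>
#ifdef _WIN32
#include <windows.h>
#include <atlbase.h>
#include <msxml6.h>   // assumption: MSXML 3.0+ SDK headers are installed
#endif

// 7 is the node type MSXML reports for processing instructions
// (NODE_PROCESSING_INSTRUCTION in its DOMNodeType enumeration).
const int kNodeProcessingInstruction = 7;

// True if a node with this type and name is the XML declaration "pi".
bool looksLikeXmlDeclaration(int nodeType, const std::wstring& nodeName) {
    return nodeType == kNodeProcessingInstruction && nodeName == L"xml";
}

#ifdef _WIN32
// Inspects the first child of the document node of an already loaded document.
bool firstChildIsXmlDeclaration(IXMLDOMDocument* doc) {
    CComPtr<IXMLDOMNode> first;
    if (doc->get_firstChild(&first) != S_OK || !first) return false;

    DOMNodeType type;
    CComBSTR name;
    if (FAILED(first->get_nodeType(&type))) return false;
    if (FAILED(first->get_nodeName(&name))) return false;

    std::wstring nodeName =
        name.m_str ? std::wstring(name.m_str, name.Length()) : std::wstring();
    return looksLikeXmlDeclaration(static_cast<int>(type), nodeName);
}
#endif
```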
 
Martin Honnen

And with the DOM Level 2 implementations in Firefox or Safari I think
the only way to suggest an encoding for serialization is to use
createProcessingInstruction to create an XML declaration as a pi and
insert that as the first child of the XML DOM document.

DOM Level 3 Load and Save kind of never made it into browsers I think,
besides Opera which has some support for that.
 
Joe Kesselman

Martin said:
MSXML has DOM Level 1

Ah. So they never upgraded to DOM Level 2 or Level 3? That's a distinct
shame -- it means their users are stuck working with a
non-namespace-aware DOM, don't have DOM event handling, don't have the
DOM serialization interface, and are missing some of the other items we
put in because they were clearly needed based on real-world experience
and the evolving standards.

I presume IE's DOM implementation is more up-to-date, though I haven't
checked recently.

(Since I don't use the MS tools, I haven't been keeping track of their
status.)
 
Martin Honnen

Joe said:
Ah. So they never upgraded to DOM Level 2 or Level 3? That's a distinct
shame -- it means their users are stuck working with a
non-namespace-aware DOM, don't have DOM event handling, don't have the
DOM serialization interface, and are missing some of the other items we
put in because they were clearly needed based on real-world experience
and the evolving standards.

No, MSXML is namespace aware, but not the way the W3C specifies it in
DOM Level 2 or 3. Instead of createElementNS or createAttributeNS MSXML
has a method createNode on the document interface that allows you to
create elements or attributes in a namespace by passing in the node
type, the name and the namespace if needed. And to find elements or
attributes in a namespace there is no getElementsByTagNameNS but rather
methods like selectNodes and selectSingleNode that take an XPath 1.0
expression.
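
To make that concrete, here is a hedged C++ sketch of both halves: creating an element in a namespace with createNode, and finding it again with selectNodes. It assumes MSXML 3.0 or later with msxml6.h and ATL headers. The SelectionLanguage and SelectionNamespaces properties set through IXMLDOMDocument2::setProperty are my addition and were not spelled out above; the function names, prefixes and URIs are made up for illustration.

```cpp
#include <string>
#ifdef _WIN32
#include <windows.h>
#include <atlbase.h>
#include <msxml6.h>   // assumption: MSXML 3.0+ SDK headers are installed
#endif

// Builds a value for the "SelectionNamespaces" property, e.g. xmlns:p='urn:example'.
// Hypothetical helper, not an MSXML function.
std::wstring selectionNamespaces(const std::wstring& prefix, const std::wstring& uri) {
    return L"xmlns:" + prefix + L"='" + uri + L"'";
}

#ifdef _WIN32
// Creates an element such as p:item in the given namespace (not yet inserted).
HRESULT createNamespacedElement(IXMLDOMDocument* doc, const wchar_t* qualifiedName,
                                const wchar_t* uri, IXMLDOMNode** out) {
    return doc->createNode(CComVariant(static_cast<long>(NODE_ELEMENT)),
                           CComBSTR(qualifiedName), CComBSTR(uri), out);
}

// Runs an XPath 1.0 query in which 'prefix' is bound to 'uri'.
HRESULT selectInNamespace(IXMLDOMDocument* doc, const std::wstring& prefix,
                          const std::wstring& uri, const wchar_t* xpath,
                          IXMLDOMNodeList** out) {
    CComQIPtr<IXMLDOMDocument2> doc2(doc);
    if (!doc2) return E_NOINTERFACE;

    // MSXML 3 defaults to its older pattern language, so ask for XPath explicitly.
    HRESULT hr = doc2->setProperty(CComBSTR(L"SelectionLanguage"), CComVariant(L"XPath"));
    if (FAILED(hr)) return hr;

    hr = doc2->setProperty(CComBSTR(L"SelectionNamespaces"),
                           CComVariant(selectionNamespaces(prefix, uri).c_str()));
    if (FAILED(hr)) return hr;

    return doc2->selectNodes(CComBSTR(xpath), out);
}
#endif
```

For example, selectInNamespace(doc, L"p", L"urn:example", L"//p:item", &list) would look for item elements in the urn:example namespace.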

I presume IE's DOM implementation is more up-to-date, though I haven't
checked recently.

For XML documents IE uses MSXML anyway. In my view the HTML DOM in IE is
also only on DOM Level 1 as far as compliance with the W3C DOM goes, but
it has rich proprietary extensions to deal with events, stylesheets, and
editing.
 
Joe Kesselman

Martin said:
No, MSXML is namespace aware, but not the way the W3C specifies it in
DOM Level 2 or 3.

So MS's customers can't code portable solutions. As I said, I consider
that a distinct pity, since the whole point of creating the DOM API was
to allow code to be moved/reused without rewriting.

(I will admit that my own code deliberately violated one detail of
another W3C recommendation ... but it's one that almost nobody uses and
that the W3C itself is now phasing out.)
 
