XML version without UTF8

Discussion in 'XML' started by Hapa, Jul 28, 2009.

  1. Hapa

    Hapa Guest

    Hello all,
    we are using msxml.dll (version 1) and Visual C++6.0.

    There is a way to automatically write the processing instruction prior
    to saving the XMLDOMDocument.

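    // Ask createNode for a processing-instruction node with target "xml".
    VARIANT NodeType;
    VariantInit(&NodeType);
    NodeType.vt = VT_I4;
    V_I4(&NodeType) = MSXML::NODE_PROCESSING_INSTRUCTION;
    CComBSTR PITarget("xml");

    // Create the node and append it to the document.
    XMLDOMNodePtr pProcInstr;
    m_pXMLDocumentNode->createNode(NodeType, PITarget, NULL, &pProcInstr);
    m_pXMLDocumentNode->appendChild(pProcInstr, NULL);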


    Question: Why is the encoding always missing from our processing
    instruction? It looks like this: <?xml version="1.0" ?> instead of
    <?xml version="1.0" encoding="UTF-8" ?>
    Is this related to _UNICODE or _MBCS?

    Thanks.
    hapa
     
    Hapa, Jul 28, 2009
    #1

  2. Joe Kesselman, Jul 28, 2009
    #2

  3. Hapa wrote:
    > Hello all,
    > we are using msxml.dll (version 1) and Visual C++6.0.


    Version 1? The latest is MSXML 6; I don't think MSXML 1 is supported.

    > There is a way to automatically write the processing instruction prior
    > to saving the XMLDOMDocument.
    >
    > VARIANT NodeType;
    > NodeType.vt = VT_I4; V_I4(&NodeType) = MSXML::NODE_PROCESSING_INSTRUCTION;
    > CComBSTR PITarget("xml");
    >
    > XMLDOMNodePtr pProcInstr;
    > m_pXMLDocumentNode->createNode(NodeType, PITarget, NULL, &pProcInstr);
    > m_pXMLDocumentNode->appendChild(pProcInstr, NULL);
    >
    >
    > Question: Why is the encoding always missing from our processing
    > instruction? It looks like this: <?xml version="1.0" ?> instead of
    > <?xml version="1.0" encoding="UTF-8" ?>
    > Is this related to _UNICODE or _MBCS?


    I don't see you creating any encoding. You could do it using the
    createProcessingInstruction method
    (http://msdn.microsoft.com/en-us/library/ms755439(VS.85).aspx)
    (JScript pseudo code, please translate to C++ yourself)
    var pi = doc.createProcessingInstruction('xml',
    'version="1.0" encoding="UTF-8"');

    That way MSXML will write out the XML declaration with the specified
    encoding when saving the DOM document to a file or stream. If you simply
    access the xml property to get a string serialization of the DOM
    document then I think (depending on the MSXML version) you will either
    not get any encoding shown or you will get encoding="UTF-16", as that is
    the encoding of the string.
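
    A minimal sketch of what the C++ translation of that JScript call might
    look like, using the raw COM interfaces. The helper names xml_decl_data
    and add_xml_declaration are hypothetical and not part of MSXML. The
    sketch assumes the document is still empty, so that the declaration
    ends up as its first child, and that <msxml.h> declares IXMLDOMDocument
    in your SDK. The COM part is only compiled on Windows.

```cpp
#include <cassert>
#include <string>

// Builds the data part of an XML declaration written as a processing
// instruction, e.g. version="1.0" encoding="UTF-8".
// Hypothetical helper, not part of MSXML.
std::wstring xml_decl_data(const std::wstring& version,
                           const std::wstring& encoding) {
    assert(!version.empty());  // the version pseudo-attribute is mandatory
    std::wstring data = L"version=\"" + version + L"\"";
    if (!encoding.empty()) {
        data += L" encoding=\"" + encoding + L"\"";
    }
    return data;
}

#ifdef _WIN32
#include <windows.h>
#include <oleauto.h>
#include <msxml.h>

// Creates <?xml ...?> as a processing instruction and appends it to doc.
// Assumption: doc has no children yet, so the new node becomes the first child.
HRESULT add_xml_declaration(IXMLDOMDocument* doc, const std::wstring& data) {
    if (!doc) return E_POINTER;

    BSTR target = SysAllocString(L"xml");
    BSTR text = SysAllocString(data.c_str());
    HRESULT hr = (target && text) ? S_OK : E_OUTOFMEMORY;

    IXMLDOMProcessingInstruction* pi = NULL;
    if (SUCCEEDED(hr)) {
        hr = doc->createProcessingInstruction(target, text, &pi);
    }
    if (SUCCEEDED(hr) && pi) {
        IXMLDOMNode* appended = NULL;
        hr = doc->appendChild(pi, &appended);
        if (appended) appended->Release();
    }

    if (pi) pi->Release();
    SysFreeString(target);   // SysFreeString accepts NULL
    SysFreeString(text);
    return hr;
}
#endif
```

    With an existing document pointer this would be called as
    add_xml_declaration(pDoc, xml_decl_data(L"1.0", L"UTF-8")) before
    creating the root element and saving.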





    --

    Martin Honnen
    http://msmvps.com/blogs/martin_honnen/
     
    Martin Honnen, Jul 28, 2009
    #3
  4. > I don't see you creating any encoding. You could do it using the
    > createProcessingInstruction method
    > (http://msdn.microsoft.com/en-us/library/ms755439(VS.85).aspx)
    > (JScript pseudo code, please translate to C++ yourself)
    > var pi = doc.createProcessingInstruction('xml',
    > 'version="1.0" encoding="UTF-8"');


    Uhm... No. The XML Declaration, while it has the syntax of a processing
    instruction, is not a processing instruction. If MSXML is doing it this
    way, MSXML is wrong. Modern versions of XML APIs (DOM, SAX, etc) should
    all have an explicit mechanism for specifying encoding, and the
    serializer should Do The Right Thing with that information.
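
    To make the distinction concrete: the XML 1.0 grammar excludes the
    three-letter name "xml", in any mix of upper and lower case, from the
    set of legal processing instruction targets, while longer names such as
    xml-stylesheet remain legal. A small sketch of that check, where the
    function name is_excluded_pi_target is hypothetical:

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Returns true if target is exactly "xml" in any letter case, i.e. the one
// name the XML 1.0 PITarget production excludes.
// Hypothetical helper for illustration only.
bool is_excluded_pi_target(const std::string& target) {
    if (target.size() != 3) {
        return false;
    }
    const char expected[] = {'x', 'm', 'l'};
    for (std::string::size_type i = 0; i < 3; ++i) {
        const unsigned char c = static_cast<unsigned char>(target[i]);
        if (std::tolower(c) != expected[i]) {
            return false;
        }
    }
    return true;
}
```

    A conforming API could use a check like this to reject "xml" as a
    processing instruction target and route encoding through a separate
    setting instead.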
     
    Joe Kesselman, Jul 28, 2009
    #4
  5. Joe Kesselman wrote:
    >> I don't see you creating any encoding. You could do it using the
    >> createProcessingInstruction method
    >> (http://msdn.microsoft.com/en-us/library/ms755439(VS.85).aspx)
    >> (JScript pseudo code, please translate to C++ yourself)
    >> var pi = doc.createProcessingInstruction('xml',
    >> 'version="1.0" encoding="UTF-8"');

    >
    > Uhm... No. The XML Declaration, while it has the syntax of a processing
    > instruction, is not a processing instruction. If MSXML is doing it this
    > way, MSXML is wrong. Modern versions of XML APIs (DOM, SAX, etc) should
    > all have an explicit mechanism for specifying encoding, and the
    > serializer should Do The Right Thing with that information.


    MSXML has DOM Level 1, but the only way to create an XML declaration is
    to create it as a processing instruction, even though it technically is
    not one.

    --

    Martin Honnen
    http://msmvps.com/blogs/martin_honnen/
     
    Martin Honnen, Jul 28, 2009
    #5
  6. Martin Honnen wrote:

    >> Uhm... No. The XML Declaration, while it has the syntax of a
    >> processing instruction, is not a processing instruction. If MSXML is
    >> doing it this way, MSXML is wrong. Modern versions of XML APIs (DOM,
    >> SAX, etc) should all have an explicit mechanism for specifying
    >> encoding, and the serializer should Do The Right Thing with that
    >> information.

    >
    > MSXML has DOM Level 1, but the only way to create an XML declaration is
    > to create it as a processing instruction, even though it technically is
    > not one.


    And with MSXML it is not only on output that the XML declaration is
    treated as a pi. If you load an XML document that has an XML declaration,
    then in MSXML's DOM tree the declaration shows up as the first child
    of the document node, and its nodeType is 7 (processing instruction).

    --

    Martin Honnen
    http://msmvps.com/blogs/martin_honnen/
     
    Martin Honnen, Jul 28, 2009
    #6
  7. Martin Honnen wrote:
    > Martin Honnen wrote:
    >
    >>> Uhm... No. The XML Declaration, while it has the syntax of a
    >>> processing instruction, is not a processing instruction. If MSXML is
    >>> doing it this way, MSXML is wrong. Modern versions of XML APIs (DOM,
    >>> SAX, etc) should all have an explicit mechanism for specifying
    >>> encoding, and the serializer should Do The Right Thing with that
    >>> information.

    >>
    >> MSXML has DOM Level 1, but the only way to create an XML declaration is
    >> to create it as a processing instruction, even though it technically is
    >> not one.


    And with the DOM Level 2 implementations in Firefox or Safari I think
    the only way to suggest an encoding for serialization is to use
    createProcessingInstruction to create an XML declaration as a pi and
    insert that as the first child of the XML DOM document.

    DOM Level 3 Load and Save kind of never made it into browsers I think,
    besides Opera which has some support for that.


    --

    Martin Honnen
    http://msmvps.com/blogs/martin_honnen/
     
    Martin Honnen, Jul 28, 2009
    #7
  8. Martin Honnen wrote:
    > MSXML has DOM Level 1


    Ah. So they never upgraded to DOM Level 2 or Level 3? That's a distinct
    shame -- it means their users are stuck working with a
    non-namespace-aware DOM, don't have DOM event handling, don't have the
    DOM serialization interface, and are missing some of the other items we
    put in because they were clearly needed based on real-world experience
    and the evolving standards.

    I presume IE's DOM implementation is more up-to-date, though I haven't
    checked recently.

    (Since I don't use the MS tools, I haven't been keeping track of their
    status.)
     
    Joe Kesselman, Jul 28, 2009
    #8
  9. Joe Kesselman wrote:
    > Martin Honnen wrote:
    >> MSXML has DOM Level 1

    >
    > Ah. So they never upgraded to DOM Level 2 or Level 3? That's a distinct
    > shame -- it means their users are stuck working with a
    > non-namespace-aware DOM, don't have DOM event handling, don't have the
    > DOM serialization interface, and are missing some of the other items we
    > put in because they were clearly needed based on real-world experience
    > and the evolving standards.


    No, MSXML is namespace aware, but not the way the W3C specifies it in
    DOM Level 2 or 3. Instead of createElementNS or createAttributeNS MSXML
    has a method createNode on the document interface that allows you to
    create elements or attributes in a namespace by passing in the node
    type, the name and the namespace if needed. And to find elements or
    attributes in a namespace there is no getElementsByTagNameNS but rather
    methods like selectNodes and selectSingleNode that take an XPath 1.0
    expression.

    > I presume IE's DOM implementation is more up-to-date, though I haven't
    > checked recently.


    For XML documents IE uses MSXML anyway. In my view the HTML DOM in IE is
    also only at DOM Level 1 as far as compliance with the W3C DOM goes, but
    it has rich proprietary extensions to deal with events, stylesheets and
    editing.


    --

    Martin Honnen
    http://msmvps.com/blogs/martin_honnen/
     
    Martin Honnen, Jul 29, 2009
    #9
  10. Martin Honnen wrote:
    > No, MSXML is namespace aware, but not the way the W3C specifies it in
    > DOM Level 2 or 3.


    So MS's customers can't code portable solutions. As I said, I consider
    that a distinct pity, since the whole point of creating the DOM API was
    to allow code to be moved/reused without rewriting.

    (I will admit that my own code deliberately violated one detail of
    another W3C recommendation ... but it's one that almost nobody uses and
    that the W3C itself is now phasing out.)
     
    Joe Kesselman, Jul 29, 2009
    #10