copy XML file -- three extra bytes???

Discussion in 'ASP .Net' started by martin, Mar 5, 2004.

  1. martin

    martin Guest

    Hi,

    I am copying an XML file like so:

    Dim xmlDoc As New XmlDocument
    xmlDoc.Load("C:\Program Files\Templates\message.msg")
    Console.WriteLine("Template loaded")
    xmlDoc.Save("C:\Program Files\Templates\copy.xml")
    Console.WriteLine("message saved")

    The XML file copies and can be opened with IE; however, the XML
    file that is produced cannot itself be copied using the method above.

    The reason is that the produced XML file has three additional bytes at the start
    of it (i.e. before the "<?xml" part).

    My question is:

    Does anybody know why this is, and how to get rid of the three additional
    bytes at the start of the file?

    Many thanks in advance.

    martin.
     
    martin, Mar 5, 2004
    #1

  2. martin

    mikeb Guest

    The file is being saved in a Unicode encoding. The 3 additional bytes are
    a Unicode BOM (Byte Order Mark).

    You can probably solve the problem by either specifying in the <?xml ...?>
    declaration that the file is encoded with Unicode, or by saving the
    file in the system's default ANSI code page (Encoding.Default, which is
    not strictly ASCII but does not write a BOM):

    ' Requires Imports System.IO and Imports System.Xml
    Dim stream As StreamWriter = Nothing
    Try
        ' Overwrite (append = False) using the system default encoding
        stream = New StreamWriter("C:\Program Files\Templates\copy.xml", _
                                  False, System.Text.Encoding.Default)

        xmlDoc.Save(stream)
        Console.WriteLine("message saved")
    Catch
        Console.WriteLine("Error saving file")
    Finally
        If Not stream Is Nothing Then
            stream.Close()
        End If
    End Try
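
    If the file should stay UTF-8 but lose the three BOM bytes, another thing
    worth trying is a UTF8Encoding constructed with its BOM flag set to False.
    The following is only a minimal sketch under that assumption: the module
    name, the SaveWithoutBom helper and the temp-file path are made up for
    illustration, and Using needs VB 2005 or later.

```vbnet
Imports System
Imports System.IO
Imports System.Text
Imports System.Xml

Module NoBomExample
    ' Hypothetical helper: saves doc as UTF-8 without a byte order mark.
    Sub SaveWithoutBom(ByVal doc As XmlDocument, ByVal outputPath As String)
        ' Passing False tells UTF8Encoding not to emit the EF BB BF preamble.
        Dim utf8NoBom As New UTF8Encoding(False)
        Using writer As New StreamWriter(outputPath, False, utf8NoBom)
            doc.Save(writer)
        End Using
    End Sub

    Sub Main()
        Dim doc As New XmlDocument()
        doc.LoadXml("<?xml version=""1.0"" encoding=""UTF-8"" standalone=""yes""?>" & _
                    "<Message version=""1.1"" id="""" />")

        ' Illustrative output location, not the path from the thread.
        Dim outputPath As String = Path.Combine(Path.GetTempPath(), "copy-no-bom.xml")
        SaveWithoutBom(doc, outputPath)

        ' Print the first byte so the result can be checked by eye.
        Dim bytes As Byte() = File.ReadAllBytes(outputPath)
        Console.WriteLine("First byte: 0x{0:X2}", bytes(0))
    End Sub
End Module
```

    Note that when saving through a TextWriter, the encoding of the writer
    replaces whatever encoding the <?xml ...?> declaration named.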


    --
    mikeb
     
    mikeb, Mar 5, 2004
    #2

  3. martin

    mikeb Guest

    Clarification: the Unicode encoding that you're seeing is probably UTF-8.

    In any case, I played around a little bit more with your sample code,
    and I had to manually change the encoding specified in the input file to
    be incorrect to get xmlDoc.Load() to throw an exception. In other words,
    xmlDoc.Load() does not seem to mind the BOM header, unless the encoding
    attribute in the <?xml ...?> tag is lying.

    Can you post a very, very small XML file that causes the problem you're
    seeing?
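
    To confirm that the three bytes really are a UTF-8 BOM (EF BB BF), a small
    check along these lines might help. It is just a sketch: HasUtf8Bom and
    the BomCheck module are names invented here, and File.ReadAllBytes needs
    .NET 2.0 or later.

```vbnet
Imports System
Imports System.IO

Module BomCheck
    ' Hypothetical helper: True if the data starts with the UTF-8 BOM EF BB BF.
    Function HasUtf8Bom(ByVal data As Byte()) As Boolean
        Return data.Length >= 3 AndAlso _
               data(0) = &HEF AndAlso data(1) = &HBB AndAlso data(2) = &HBF
    End Function

    Sub Main(ByVal args As String())
        If args.Length = 0 Then
            Console.WriteLine("Usage: BomCheck <file>")
            Return
        End If

        ' Read the whole file and report whether it begins with the BOM.
        Dim data As Byte() = File.ReadAllBytes(args(0))
        Console.WriteLine("UTF-8 BOM present: {0}", HasUtf8Bom(data))
    End Sub
End Module
```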
    --
    mikeb
     
    mikeb, Mar 5, 2004
    #3
  4. martin

    martin Guest

    You are correct.
    The problem now becomes how to create an XML file with the line

    <?xml version="1.0" encoding="UTF-8" standalone="yes" ?>

    (with the encoding set to UTF-8)

    using XmlDocument.Load,

    or do I just have to revert to your ASCII method?

    Many thanks for the help. I have included a sample below that demonstrates my
    problem.
    The XML files are generated in code rather than attached to this message.


    ================
    Sub Main()
        Try
            Dim doc As New XmlDocument

            ' With encoding="UTF-8" declared, the saved file has the three extra bytes
            doc.LoadXml("<?xml version=""1.0"" encoding=""UTF-8"" standalone=""yes""?>" & _
                        "<Message version=""1.1"" id="""">" & _
                        "<Attributes>" & _
                        "<Priority></Priority>" & _
                        "<DeleteAttaches></DeleteAttaches>" & _
                        "</Attributes>" & _
                        "</Message>")
            doc.Save("C:\Program Files\Templates\ThreeByteError.xml")
            Console.WriteLine("Saved the dodgy xml file")

            ' Without an encoding attribute, the saved file has no extra bytes
            doc.LoadXml("<?xml version=""1.0"" standalone=""yes""?>" & _
                        "<Message version=""1.1"" id="""">" & _
                        "<Attributes>" & _
                        "<Priority></Priority>" & _
                        "<DeleteAttaches></DeleteAttaches>" & _
                        "</Attributes>" & _
                        "</Message>")
            doc.Save("C:\Program Files\Templates\NoThreeByteError.xml")
            Console.WriteLine("Saved the fine xml file")
        Catch ex As Exception
            Console.WriteLine("***ERROR***")
            Console.WriteLine(ex.Message)
        End Try

        Console.WriteLine("Press a key to close")
        Console.ReadLine()
    End Sub
    ================

    Now run the following at the command line to see the problem:

    type "C:\Program Files\Templates\ThreeByteError.xml"

    type "C:\Program Files\Templates\NoThreeByteError.xml"

    fc "C:\Program Files\Templates\NoThreeByteError.xml" "C:\Program Files\Templates\ThreeByteError.xml"


    cheers

    martin.
     
    martin, Mar 6, 2004
    #4
  5. martin

    mikeb Guest

    martin wrote:

    > You are correct.
    > The problem now becomes how to create an XML file with the line
    >
    > <?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
    >
    > (with the encoding set to UTF-8)


    Well, the documentation for StreamWriter indicates that a BOM will be
    written whenever the encoding supplies one (Encoding.UTF8 and
    Encoding.Unicode do); Encoding.Default does not.

    However, at least for XmlDocument.Load(), the BOM poses no problem on my
    machine - it loads just fine.

    If there's some other software that you need to load the XML document
    into that does not handle the BOM, I suppose you have a few options:

    - write the file using Encoding.Default.
    - post-process the output file to remove the BOM
    - upgrade the software that doesn't like the BOM to handle it properly

    I'm sure there are others, too.
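
    For the post-processing option, something like the following could be a
    starting point. It is a sketch, not a tested tool: StripUtf8Bom and the
    BomStrip module are hypothetical names, it only looks for the UTF-8 BOM
    (EF BB BF), and it rewrites the file in place, so try it on a copy first.

```vbnet
Imports System
Imports System.IO

Module BomStrip
    ' Hypothetical helper: returns the data without a leading UTF-8 BOM.
    ' Data that does not start with EF BB BF is returned unchanged.
    Function StripUtf8Bom(ByVal data As Byte()) As Byte()
        Dim hasBom As Boolean = data.Length >= 3 AndAlso _
            data(0) = &HEF AndAlso data(1) = &HBB AndAlso data(2) = &HBF
        If Not hasBom Then Return data

        ' VB array bounds are upper bounds, so this holds data.Length - 3 bytes.
        Dim result(data.Length - 4) As Byte
        Array.Copy(data, 3, result, 0, result.Length)
        Return result
    End Function

    Sub Main(ByVal args As String())
        If args.Length = 0 Then
            Console.WriteLine("Usage: BomStrip <file>")
            Return
        End If

        ' Read, strip, and write back to the same file.
        Dim cleaned As Byte() = StripUtf8Bom(File.ReadAllBytes(args(0)))
        File.WriteAllBytes(args(0), cleaned)
        Console.WriteLine("Wrote {0} bytes", cleaned.Length)
    End Sub
End Module
```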

    --
    mikeb
     
    mikeb, Mar 6, 2004
    #5