writing \feff at the beginning of a file

Discussion in 'Python' started by Jean-Michel Pichavant, Aug 13, 2010.

  1. Hello python world,

    I'm trying to update the content of $Microsoft$ VC2005 project files
    using a Python application.
    Since those files are XML data, I assumed I could easily do that.

    My problem is that VC somehow thinks that the file is corrupted and
    updates the file as follows:

    -<?xml version='1.0' encoding='UTF-8'?>
    +?<feff><?xml version="1.0" encoding="UTF-8"?>


    Actually, <feff> is displayed in a different color by vim, telling me
    that this is some kind of special character code (I'm not familiar with
    such things).
    After googling that, I have a clue: it could be some Unicode character used
    to indicate something ... well, I don't know in fact ("UTF-8 files
    sometimes start with a byte-order marker (BOM) to indicate that they are
    encoded in UTF-8.").

    My problem is however simpler: how do I add such a character at the
    beginning of the file?
    I tried

    f = open('paf', 'w')
    f.write(u'\ufeff')

    UnicodeEncodeError: 'ascii' codec can't encode character u'\ufeff' in
    position 0: ordinal not in range(128)

    The error may be explicit, but I have no idea how to proceed further. Any
    clue?

    JM
     
    Jean-Michel Pichavant, Aug 13, 2010
    #1

  2. Jean-Michel Pichavant wrote:
    > My problem is however simpler: how do I add such a character [a BOM]
    > at the beginning of the file?
    > I tried
    >
    > f = open('paf', 'w')
    > f.write(u'\ufeff')
    >
    > UnicodeEncodeError: 'ascii' codec can't encode character u'\ufeff' in
    > position 0: ordinal not in range(128)


    Try opening the file with the codecs module, which will then do all the
    transcoding between internal text and external UTF-8 for you.
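
One way to read that advice, sketched for Python 3 (the thread itself uses Python 2). It first reproduces the failure from the question by encoding U+FEFF as ASCII, then wraps a binary file in a UTF-8 writer from the codecs module so that text is encoded on the way out. The names `path`, `writer`, `ascii_ok` and `first_bytes` are illustrative only, and the scratch file location is an assumption.

```python
import codecs
import os
import tempfile

BOM = '\ufeff'

# Encoding U+FEFF as ASCII fails, which is the error quoted in the question.
try:
    BOM.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Hypothetical scratch file; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), 'paf')

# Wrap a binary file in a UTF-8 StreamWriter so that text is encoded on write.
with open(path, 'wb') as raw:
    writer = codecs.getwriter('utf-8')(raw)
    writer.write(BOM)
    writer.write('<?xml version="1.0" encoding="UTF-8"?>\n')

# Read the raw bytes back to see what actually landed at the start of the file.
with open(path, 'rb') as raw:
    first_bytes = raw.read(3)
```

Here `ascii_ok` ends up False, and `first_bytes` holds the three bytes that UTF-8 uses for U+FEFF.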

    Uli

    --
    Sator Laser GmbH
    Geschäftsführer: Thorsten Föcking, Amtsgericht Hamburg HR B62 932
     
    Ulrich Eckhardt, Aug 13, 2010
    #2

  3. Nobody (Guest)

    On Fri, 13 Aug 2010 11:45:28 +0200, Jean-Michel Pichavant wrote:

    > I'm trying to update the content of $Microsoft$ VC2005 project files
    > using a Python application.
    > Since those files are XML data, I assumed I could easily do that.
    >
    > My problem is that VC somehow thinks that the file is corrupted and
    > updates the file as follows:
    >
    > -<?xml version='1.0' encoding='UTF-8'?>
    > +?<feff><?xml version="1.0" encoding="UTF-8"?>
    >
    >
    > Actually, <feff> is displayed in a different color by vim, telling me
    > that this is some kind of special character code (I'm not familiar with
    > such things).


    U+FEFF is a "byte order mark" or BOM. Each Unicode-based encoding (UTF-8,
    UTF-16, UTF-16-LE, etc) will encode it differently, so it enables a
    program reading the file to determine the encoding before reading any
    actual data.
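
A small sketch of that idea, assuming Python 3: it encodes U+FEFF with a few codecs and compares the results with the BOM constants in the codecs module, then guesses an encoding from the leading bytes of some data. The helper `sniff_bom` is hypothetical, not a standard library function.

```python
import codecs

bom = '\ufeff'

# The same character encodes to different byte sequences per encoding.
utf8 = bom.encode('utf-8')
utf16le = bom.encode('utf-16-le')
utf16be = bom.encode('utf-16-be')
utf32le = bom.encode('utf-32-le')


def sniff_bom(data):
    """Return an encoding name guessed from a leading BOM, or None."""
    # UTF-32 is checked before UTF-16 because the UTF-32-LE BOM
    # begins with the same two bytes as the UTF-16-LE BOM.
    candidates = [
        ('utf-32-le', codecs.BOM_UTF32_LE),
        ('utf-32-be', codecs.BOM_UTF32_BE),
        ('utf-8', codecs.BOM_UTF8),
        ('utf-16-le', codecs.BOM_UTF16_LE),
        ('utf-16-be', codecs.BOM_UTF16_BE),
    ]
    for name, mark in candidates:
        if data.startswith(mark):
            return name
    return None
```

For example, `sniff_bom(b'\xef\xbb\xbf<?xml')` returns `'utf-8'`, while data without a BOM gives None, in which case the reader has to fall back on some other way of determining the encoding.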

    > My problem is however simpler: how do I add such a character at the
    > beginning of the file?
    > I tried


    Either:

    1. Open the file as binary and write '\xef\xbb\xbf' to the file:

    f = open('foo.txt', 'wb')
    f.write('\xef\xbb\xbf')  # the UTF-8 encoding of U+FEFF (Python 2 byte string)
    f.close()

    [You can also use the constant BOM_UTF8 from the codecs module.]

    2. Open the file as utf-8 and write u'\ufeff' to the file:

    import codecs
    f = codecs.open('foo.txt', 'w', 'utf-8')
    f.write(u'\ufeff')  # encoded to '\xef\xbb\xbf' by the codec
    f.close()

    3. Open the file as utf-8-sig and don't write anything (or write an empty
    string):

    import codecs
    f = codecs.open('foo.txt', 'w', 'utf-8-sig')
    f.write('')  # the first write, even an empty one, emits the BOM
    f.close()

    The utf-8-sig codec automatically writes a BOM at the beginning of the
    file. It is present in Python 2.5 and later.
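
To check the result, here is a minimal sketch assuming Python 3, where the built-in open accepts an encoding argument. It writes an XML declaration through utf-8-sig, then reads the file back three ways: as raw bytes, as utf-8-sig (which drops the BOM) and as plain utf-8 (which keeps it as U+FEFF). The path and the variable names are illustrative.

```python
import codecs
import os
import tempfile

XML_DECL = '<?xml version="1.0" encoding="UTF-8"?>\n'

# Hypothetical scratch file; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), 'foo.txt')

# utf-8-sig prepends the BOM to the first chunk of text written.
with open(path, 'w', encoding='utf-8-sig') as f:
    f.write(XML_DECL)

# Raw view: the file starts with the three BOM bytes.
with open(path, 'rb') as f:
    raw = f.read()

# utf-8-sig on read strips a leading BOM if one is present.
with open(path, 'r', encoding='utf-8-sig') as f:
    text_sig = f.read()

# Plain utf-8 on read leaves the BOM in the text as U+FEFF.
with open(path, 'r', encoding='utf-8') as f:
    text_plain = f.read()
```

So a file written this way should be read back with utf-8-sig as well, otherwise the first character of the text is U+FEFF rather than `<`.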
     
    Nobody, Aug 13, 2010
    #3
