Re: how to write a unicode string to a file ?

Discussion in 'Python' started by Stef Mientki, Oct 17, 2009.

  1. Stef Mientki


    Stephen Hansen wrote:
    > On Thu, Oct 15, 2009 at 4:43 PM, Stef Mientki wrote:
    >
    > hello,
    >
    > By writing the following unicode string (I hope it can be sent on
    > this mailing list)
    >
    > Bücken
    >
    > to a file
    >
    > fh.write ( line )
    >
    > I get the following error:
    >
    > UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc'
    > in position 9: ordinal not in range(128)
    >
    > How should I write such a string to a file ?
    >
    >
    > First, you have to understand that a file never really contains
    > unicode-- not in the way that it exists in memory / in python when you
    > type line = u'Bücken'. It contains a series of bytes that are an
    > encoded form of that abstract unicode data.
    >
    > There are various encodings you can use; UTF-8 and UTF-16 are in my
    > experience the most common. UTF-8 is an ASCII superset, and it's the
    > one I see most often.
    >
    > So, you can do:
    >
    > import codecs
    > f = codecs.open('filepath', 'w', 'utf-8')
    > f.write(line)
    > f.close()  # close the file so the encoded bytes are flushed to disk
    >
    > To read such a file, you'd use codecs.open as well, just with an 'r'
    > mode instead of a 'w' mode.
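The write-then-read round trip described in the reply can be sketched as follows. This is a minimal sketch under some assumptions: it uses `io.open` instead of `codecs.open` (it takes the same kind of `encoding` argument and is the built-in `open` in Python 3), and the temporary file path is a hypothetical stand-in for a real file.

```python
import io
import os
import tempfile

# The string from the question, with the u-umlaut written as an escape.
line = u"B\u00fccken"

# Hypothetical scratch file; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "example.txt")

# Writing: the file object encodes unicode text to UTF-8 bytes.
with io.open(path, "w", encoding="utf-8") as f:
    f.write(line)

# Reading: the same encoding decodes the bytes back to unicode text.
with io.open(path, "r", encoding="utf-8") as f:
    round_tripped = f.read()

# The raw bytes on disk: in UTF-8 the u-umlaut is the two bytes C3 BC.
with io.open(path, "rb") as f:
    raw = f.read()
```

The text read back equals the original string, while the file itself holds seven bytes for six characters.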

    Thanks guys,
    I didn't know about the codecs module,
    and it seems to be a good solution;
    at least it can safely write a file.
    But now I have to open that file in Excel 2000 ... 2007,
    and I get something completely wrong.
    After changing the encoding to latin-1 or windows-1252,
    everything works fine.

    Which of the two should I use, latin-1 or windows-1252?
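One way to see how the two candidates differ is to encode the same text with both. The sketch below assumes Python's codec names `latin-1` and `cp1252` (the latter is the name Python uses for windows-1252); the euro example string is made up for illustration.

```python
umlaut = u"B\u00fccken"   # the string from the question
euro = u"\u20ac 5"        # hypothetical text containing a euro sign

# For the u-umlaut both encodings produce the same single byte, 0xFC.
same_bytes = umlaut.encode("latin-1") == umlaut.encode("cp1252")

# windows-1252 has a slot for the euro sign (byte 0x80) ...
cp1252_euro = euro.encode("cp1252")

# ... while latin-1 has no euro sign at all, so encoding fails.
try:
    euro.encode("latin-1")
    latin1_handles_euro = True
except UnicodeEncodeError:
    latin1_handles_euro = False
```

So for text like Bücken the choice makes no difference on disk. It only matters for the handful of characters, such as the euro sign, that windows-1252 covers and latin-1 does not.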

    And a more general question: how should I organize my Python programs?
    In general I have data coming from Excel, Delphi, and SQLite.
    In Python I always use wxPython, so I'm forced to use unicode.
    My output often needs to be exported to Excel, SPSS, or SQLite.
    So would this be a good design?

    Excel  |     convert       wxPython      convert       Excel
    Delphi |===>    to    ===>    in    ===>    to    ===> SQLite
    SQLite |     unicode       unicode       latin-1       SPSS

    thanks,
    Stef Mientki
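The design in the diagram (decode on input, work in unicode, encode on output) might look roughly like this in code. It is only a sketch: the helper names `from_source` and `to_target`, the sample bytes, and the use of the `"replace"` error handler are assumptions made for illustration, not something from the thread.

```python
def from_source(raw_bytes, encoding):
    """Input boundary: decode bytes from an external source into unicode."""
    return raw_bytes.decode(encoding)


def to_target(text, encoding):
    """Output boundary: encode unicode for an external consumer.

    Characters the target encoding cannot represent become '?'
    instead of raising UnicodeEncodeError.
    """
    return text.encode(encoding, "replace")


# Hypothetical input: a cell value exported as windows-1252 bytes.
excel_bytes = b"B\xfccken"

text = from_source(excel_bytes, "cp1252")   # unicode from here on
processed = text.upper()                    # stand-in for application logic
spss_bytes = to_target(processed, "latin-1")

# A character outside latin-1 is replaced rather than crashing the export.
lossy = to_target(u"\u20ac", "latin-1")
```

Whether to replace unencodable characters or let the error propagate is a design choice; dropping the `"replace"` argument gives the strict behaviour that produced the error in the original question.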

    >
    > Now, that uses a file object created with the "codecs" module which
    > operates with theoretical unicode streams. It will automatically take
    > any passed in unicode strings, encode them in the specified encoding
    > (utf8), and write the resulting bytes out.
    >
    > You can also do that manually with a regular file object, via:
    >
    > f.write(line.encode("utf8"))
    >
    > If you are reading such a file later with a normal file object (i.e.,
    > not one created with codecs.open), you would do:
    >
    > f = open('filepath', 'rb')
    > byte_data = f.read()
    > uni_data = byte_data.decode("utf8")
    >
    > That will convert the byte-encoded data back to real unicode strings.
    > If the file contains encoded unicode data (something you can only know
    > from the documentation of whatever produced that file), be sure to do
    > this even if it doesn't seem necessary. For example, a UTF-8
    > encoded file might look and work like a completely normal ASCII file,
    > but if it's really UTF-8, your code will eventually break the first
    > time someone puts in a non-ASCII character. Since UTF-8 is an ASCII
    > superset, it's indistinguishable from ASCII until it contains a
    > non-ASCII character.
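The ASCII-superset point can be checked directly. The sketch below uses two made-up sample strings, one pure ASCII and one with a u-umlaut, and compares how they behave under the `ascii` and `utf-8` codecs.

```python
ascii_text = u"Bucken"           # pure ASCII sample
non_ascii_text = u"B\u00fccken"  # same word with a u-umlaut

# Pure-ASCII text gives byte-for-byte identical output under both codecs,
# which is why such a UTF-8 file looks like a plain ASCII file.
identical = ascii_text.encode("utf-8") == ascii_text.encode("ascii")

# The u-umlaut becomes two bytes in UTF-8.
utf8_bytes = non_ascii_text.encode("utf-8")

# Treating those bytes as ASCII is where the code eventually breaks.
try:
    utf8_bytes.decode("ascii")
    ascii_decode_ok = True
except UnicodeDecodeError:
    ascii_decode_ok = False

# Decoding with the right codec recovers the original text.
decoded = utf8_bytes.decode("utf-8")
```

Decoding as UTF-8 from the start works for both samples, so nothing is lost by always doing it.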


    >
    > HTH,
    >
    > --S
     
    Stef Mientki, Oct 17, 2009