how to write unicode to a txt file?

Discussion in 'Python' started by Frank Potter, Jan 17, 2007.

  1. Frank Potter

    Frank Potter Guest

    I want to convert an srt file to Unicode so that mplayer can display
    Chinese subtitles properly.
    I did it like this:

    txt=open('dmd-guardian-cd1.srt').read()
    txt=unicode(txt,'gb18030')
    open('dmd-guardian-cd1.srt','w').write(txt)

    But it seems that Python can't write unicode directly to a file;
    I got an error on the third line:
    UnicodeEncodeError: 'ascii' codec can't encode characters in position
    85-96: ordinal not in range(128)

    How can I save the unicode string to the file, please?
    Thanks!
     
    Frank Potter, Jan 17, 2007
    #1

  2. Peter Otten

    Peter Otten Guest

    Frank Potter wrote:

    > I want to convert an srt file to Unicode so that mplayer can display
    > Chinese subtitles properly.
    > I did it like this:
    >
    > txt=open('dmd-guardian-cd1.srt').read()
    > txt=unicode(txt,'gb18030')
    > open('dmd-guardian-cd1.srt','w').write(txt)
    >
    > But it seems that Python can't write unicode directly to a file;
    > I got an error on the third line:
    > UnicodeEncodeError: 'ascii' codec can't encode characters in position
    > 85-96: ordinal not in range(128)
    >
    > How can I save the unicode string to the file, please?
    > Thanks!


    You have to tell Python which encoding to use (i.e., how to translate
    the code points into bytes):

    >>> txt = u"ähnlicher als gewöhnlich üblich"
    >>> import codecs
    >>> codecs.open("tmp.txt", "w", "utf8").write(txt)
    >>> codecs.open("tmp.txt", "r", "utf8").read()
    u'\xe4hnlicher als gew\xf6hnlich \xfcblich'

    You would perhaps use 'gb18030' instead of 'utf8'.
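
Applied to the subtitle file from the question, the same idea can be sketched as follows. This is only a sketch under assumptions: the helper name convert_encoding and its parameters are made up for illustration, the source file is assumed to really be gb18030, and the with statement needs Python 2.6 or later.

```python
import codecs

def convert_encoding(src_path, dst_path,
                     src_encoding="gb18030", dst_encoding="utf-8"):
    """Hypothetical helper: re-encode a text file from one encoding to another."""
    # Reading through codecs.open decodes the bytes into a unicode string.
    with codecs.open(src_path, "r", src_encoding) as src:
        text = src.read()
    # Writing through codecs.open encodes the unicode string back into bytes.
    with codecs.open(dst_path, "w", dst_encoding) as dst:
        dst.write(text)
    return text
```

Writing to a separate output file first, rather than overwriting the original, keeps the source intact if the assumed input encoding turns out to be wrong.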

    Peter
     
    Peter Otten, Jan 17, 2007
    #2

  3. Facundo Batista

    Facundo Batista Guest

    Frank Potter wrote:

    > But it seems that python can't directly write unicode to a file,


    You need to use the open function from the codecs module:

    >>> import codecs
    >>> a = codecs.open("pru_uni.txt", "w", "utf-8")
    >>> txt = unicode("campeón\n", "utf-8")
    >>> a.write(txt)
    >>> a.close()
    >>>


    Then, from the command line:

    facundo@expiron:~$ file pru_uni.txt
    pru_uni.txt: UTF-8 Unicode text

    :)
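
For comparison, the encoding step can also be done by hand: encode the unicode string to bytes explicitly and write those bytes to a file opened in binary mode. The sketch below is roughly what happens behind a codecs.open write, not a description of its internals; the helper name write_text_as_bytes is made up for illustration.

```python
def write_text_as_bytes(path, text, encoding="utf-8"):
    """Hypothetical helper: encode text explicitly, then write the raw bytes."""
    # Explicit encoding avoids any implicit fallback to the ASCII codec.
    data = text.encode(encoding)
    # Binary mode writes the bytes exactly as given.
    with open(path, "wb") as f:
        f.write(data)
    return data
```

For u"campe\xf3n\n" this writes 9 bytes, because the single accented character takes two bytes in UTF-8.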

    Regards,

    --
    .. Facundo
    ..
    Blog: http://www.taniquetil.com.ar/plog/
    PyAr: http://www.python.org/ar/
     
    Facundo Batista, Jan 17, 2007
    #3
