Changing default encoding

Discussion in 'Python' started by jean.moser, Oct 8, 2003.

  1. jean.moser

    jean.moser Guest

    Hi !

    I need help to solve the problem of the special characters used by european western languages, for example French.
    Word is my word-processing tool. I can save the files in txt format, but special characters like é are transformed into \xe9 when I read the files in Python. How do I proceed to get the original files in latin-1?
    Thanks for your help.

    Jean
     
    jean.moser, Oct 8, 2003
    #1

  2. "jean.moser" <> writes:

    > Word is my word-processing tool. I can save the files in txt format,
    > but special characters like é are transformed into \xe9 when I read
    > the files in Python.


    That is not the case. They are not transformed to \xe9. Why do you
    believe such a transformation happens?

    > How do I proceed to get the original files in latin-1 ?


    They still have the original latin-1.

    Regards,
    Martin
     
    Martin v. Löwis, Oct 9, 2003
    #2

  3. Martin v. Löwis wrote:

    >"jean.moser" <> writes:
    >
    >>Word is my word-processing tool. I can save the files in txt format,
    >>but special characters like é are transformed into \xe9 when I read
    >>the files in Python.
    >
    >That is not the case. They are not transformed to \xe9. Why do you
    >believe such a transformation happens?

    He is likely looking at a repr(value) and seeing the (safe)
    representation with hexadecimal escapes. Many people new to
    programming get confused by this. That is, he sees this:

    >>> 'áí' # implicit repr
    '\xe1\xed'
    >>>

    and doesn't realise that those escapes are simply the latin-1 byte
    values of the accented characters; he was expecting the accented
    characters themselves to show up. Printing the value shows that the
    data is unchanged and is still an ordinary string (not unicode):

    >>> print 'áí'
    áí

    Knowing that Python supports unicode, new programmers may very easily
    get confused by the escapes and assume they are part of some weird
    "unicode encoding".
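
For anyone reading this later on Python 3, the same point can be sketched roughly as below. This is only an illustration, not what the original poster ran: the sessions above are Python 2, where a plain string held raw bytes, and the byte value and the latin-1 decoding here are assumptions chosen to match the example.

```python
# Python 3 sketch (the thread's sessions are Python 2, where str held raw bytes).
raw = b"\xe9"                  # one byte, as a latin-1 text file would store "é"

escaped = repr(raw)            # what the interactive prompt echoes back
text = raw.decode("latin-1")   # interpret that byte as a character

print(escaped)                 # b'\xe9'  (an escape on screen, still one byte)
print(len(raw), len(text))     # 1 1
print(text == "\u00e9")        # True: the decoded character is é
```

The escape only exists in the displayed representation; the stored data is a single byte either way.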

    BTW, original poster, the actual encoding is quite probably not latin-1,
    but the default Microsoft Windows encoding, such as 'cp1252'. Luckily,
    as long as you're not trying to convert to Unicode you don't have to
    care :) .
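
If you do end up converting to Unicode, the difference between the two encodings can be seen in a small Python 3 sketch like the one below. The sample bytes are made up for illustration; the only claim is that the accented letters decode identically while bytes in the 0x80-0x9F range (such as the euro sign at 0x80 in cp1252) do not.

```python
# Hypothetical sample: "café " followed by byte 0x80.
sample = b"caf\xe9 \x80"

as_latin1 = sample.decode("latin-1")
as_cp1252 = sample.decode("cp1252")

# The accented letter decodes to the same character under both encodings.
print(as_latin1[:4] == as_cp1252[:4])   # True

# Byte 0x80 is a control character in latin-1 but the euro sign in cp1252.
print(ascii(as_latin1[-1]))             # '\x80'
print(ascii(as_cp1252[-1]))             # '\u20ac'
```

So for text that only contains letters like é, either name works; the choice starts to matter for characters such as the euro sign or curly quotes, which Word tends to insert.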

    Enjoy,
    Mike

    _______________________________________
    Mike C. Fletcher
    Designer, VR Plumber, Coder
    http://members.rogers.com/mcfletch/
     
    Mike C. Fletcher, Oct 9, 2003
    #3