Re: Encoding conundrum

Discussion in 'Python' started by Dave Angel, Nov 20, 2012.


    On 11/20/2012 04:49 PM, Daniel Klein wrote:
    > With the assistance of this group I am understanding unicode encoding
    > issues much better; especially when handling special characters that are
    > outside of the ASCII range. I've got my application working perfectly now
    > :)
    >
    > However, I am still confused as to why I can only use one specific encoding.


    Who says you can only use one? You need to use the right encoding for
    the device or file you're talking with, and if different devices want
    different encodings, then you must use multiple ones. Only one can be
    the default, however, and that's where some problems come about.
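As a rough sketch of what "multiple ones" looks like in practice, the snippet below writes the same text with three explicit encodings and shows the bytes that land on disk. The helper name `write_and_read_bytes` and the sample string are made up for illustration; the point is only that the encoding is chosen per file, rather than left to the default.

```python
import os
import tempfile

TEXT = "f\u00fcr"  # 'für'; U+00FC is outside the ASCII range


def write_and_read_bytes(text, encoding):
    """Write text with an explicit encoding and return the raw bytes on disk.

    Hypothetical helper, not part of any library.
    """
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        # The encoding is stated per file instead of relying on the default.
        with open(path, "w", encoding=encoding) as handle:
            handle.write(text)
        with open(path, "rb") as handle:
            return handle.read()
    finally:
        os.remove(path)


for enc in ("latin1", "cp437", "utf-8"):
    print(enc, write_and_read_bytes(TEXT, enc))
```

The same character comes out as a different byte sequence in each case, which is why the encoding has to match whatever will read the data.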

    >
    > I've done some research and it appears that I should be able to use any of
    > the following codecs with codepoints '\xfc' (chr(252)) '\xfd' (chr(253))
    > and '\xfe' (chr(254)) :
    >
    > ISO-8859-1 [ note that I'm using this codec on my Linux box ]
    > cp1252
    > cp437
    > latin1
    > utf-8
    >
    > If I'm not mistaken, all of these codecs can handle the complete 8bit
    > character set.


    What 8-bit character set? There is no single one. If you mean that
    each of the single-byte codecs can convert an 8-bit byte to SOME Unicode
    character, then fine (utf-8 is the exception: a lone byte above 0x7F is
    not valid utf-8). But they won't convert each such byte to the SAME
    Unicode character, or they'd be the same encoding.
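A minimal way to see this is to decode the three bytes from the question with each codec on the list. The names `CODECS` and `decode_each` are only illustrative; the codec names are the ones quoted above.

```python
import unicodedata

CODECS = ["iso-8859-1", "latin1", "cp1252", "cp437", "utf-8"]
RAW = bytes([0xFC, 0xFD, 0xFE])


def decode_each(byte_value):
    """Return {codec: decoded character, or None if the byte is invalid}."""
    results = {}
    for codec in CODECS:
        try:
            results[codec] = bytes([byte_value]).decode(codec)
        except UnicodeDecodeError:
            # The byte on its own is not valid in this codec.
            results[codec] = None
    return results


for b in RAW:
    for codec, ch in decode_each(b).items():
        name = unicodedata.name(ch) if ch else "<not decodable>"
        print("0x{:02X} {:<10} -> {}".format(b, codec, name))
```

iso-8859-1 and latin1 agree because they are two names for the same encoding. cp437 maps the same bytes to entirely different characters, and utf-8 rejects each of these bytes when it stands alone.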


    > However, on Windows 7, I am only able to use 'cp437' to display (print)
    > data with those characters in Python. If I use any other encoding, Windows
    > laughs at me with this error message:
    >
    > File "C:\Python33\lib\encodings\cp437.py", line 19, in encode
    > return codecs.charmap_encode(input,self.errors,encoding_map)[0]
    > UnicodeEncodeError: 'charmap' codec can't encode character '\xfd' in
    > position 3: character maps to <undefined>
    >
    > Furthermore I get this from IDLE:
    >
    > >>> import locale
    > >>> locale.getdefaultlocale()
    > ('en_US', 'cp1252')
    >
    > I also get 'cp1252' when running the same script from a Windows command
    > prompt.
    >
    > So there is a contradiction between the error message and the default
    > encoding.
    >
    > Why am I restricted from using just that one codec? Is this a Windows or
    > Python restriction? Please enlighten me.
    >
    >
    >

    I don't know much about Windows quirks anymore. I haven't had to use it
    much for years.
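One thing that can be checked without any Windows knowledge: the traceback names cp437.py, which suggests the encoder attached to stdout is cp437, while the locale call reports cp1252. Those are two separate settings and need not agree. The sketch below prints both and shows which of the three characters cp437 can actually represent. The helper names `describe_encodings` and `printable` are made up for illustration, and the values you see will depend on your Python version and console.

```python
import locale
import sys


def describe_encodings():
    """Report the encoding used for printing and the locale's preference."""
    return {
        "stdout": getattr(sys.stdout, "encoding", None),
        "preferred": locale.getpreferredencoding(False),
    }


def printable(text, encoding):
    """Round-trip text through an encoding, replacing what it cannot hold."""
    return text.encode(encoding, errors="replace").decode(encoding)


print(describe_encodings())

# cp437 has a slot for U+00FC but none for U+00FD or U+00FE,
# so the last two become '?' instead of raising UnicodeEncodeError.
print(ascii(printable("\xfc\xfd\xfe", "cp437")))

# Anything passed through stdout's own encoding this way is safe to print.
target = getattr(sys.stdout, "encoding", None) or "ascii"
print(printable("\xfc\xfd\xfe", target))
```

If the first line shows two different codecs, that would account for the apparent contradiction in the question.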

    --

    DaveA