Python beginner, unicode encode/decode Q

Discussion in 'Python' started by anonymous, Jul 14, 2008.

  1. anonymous

    anonymous Guest

    1 Objective: to write little programs to help me learn German. See the
    code after the numbered comments. Thanks in advance for any direction or
    suggestions.

    tk

    2 I want keyboard input for the answers, for example:

    answer_str = raw_input(' Enter answer > ')
    [ I type the following characters: Herr Üü ]
    print answer_str
    Output on screen is > Herr Üü

    3 The history 1 and history 2 code, run interactively under Debian Linux
    (Python 2.4) and interactively under Windows 98, first edition (IDLE,
    Python 2.3.5), works.

    4 The history 3 and history 4 code, run from within a .py file, produces
    output different from the example in the book.

    5 I want to work under Debian Linux, but because the program failed there
    when I ran the code from a file, I thought I should fire up IDLE/Python
    on Windows 98 to see if it ran there. It failed there too when run from
    a file.

    6 The sample code is from pages 108-109 of "Python for Dummies". The book
    says: "Python's file objects and StringIO objects don't support raw
    Unicode; the usual workaround is to encode Unicode as UTF-8 before saving
    it to a file or StringIO object." The sample code from the book uses
    French, as shown here, but trying German produces the same result.

    7 I have searched the net under all the keywords, but this is as close as
    I have come to accomplishing my task. I suspect I may not be understanding
    the statement that StringIO objects don't support raw Unicode, but I
    don't know.
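    The workaround quoted in item 6 (encode to UTF-8 before storing, decode
    after reading back) can be sketched roughly as follows. This is an
    illustration only, not the book's code: io.BytesIO is an assumed
    stand-in for a file or StringIO object, and the variable names are
    made up for the example.

```python
import io

# Unicode text: 7 characters, the last one is e-acute.
word = u"Libert\u00e9"

# In-memory byte buffer, used here as a stand-in for a file
# opened in binary mode (an assumption for this sketch).
buffer = io.BytesIO()

# Encode the Unicode text to UTF-8 bytes before storing it.
buffer.write(word.encode("utf-8"))

# Read the raw bytes back; e-acute occupies two bytes in UTF-8.
stored = buffer.getvalue()

# Decode the bytes to get Unicode text again.
restored = stored.decode("utf-8")

print(len(stored))
print(restored == word)
```

    The stored bytes end in \xc3\xa9, the same two bytes that appear in the
    interactive session under history 1.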


    #_*_ coding: utf-8 _*_

    # history 1: code run interactively from a terminal under Debian Linux; it works

    print " u'Libert\u00e9' "

    # y = raw_input('Enter >') commented out

    y = u'Lbert\u00e9'
    y.encode('utf-8')
    q = y.encode('utf-8')
    q.decode('utf-8')
    print q.decode('utf-8')

    History 1 works. Here is a screen copy of the interactive session:

    >>> y = raw_input ('>')
    >Libert\xc3\xa9
    >>> q = 'Libert\xc3\xa9'
    >>> q.decode('utf-8')
    u'Libert\xe9'
    >>> print q
    Liberté
    >>>

    [ the screen output is on the next line ]

    Lberté



    history 2
    # code run interactively within IDLE under Windows 98, first edition; it
    # succeeded in producing correct results.


    # y = raw_input('Enter >') commented out

    y = u'Lbert\u00e9'
    y.encode('utf-8')
    q = y.encode('utf-8')
    q.decode('utf-8')
    print q.decode('utf-8')

    History 1 works. Here is a screen copy of the interactive session:

    >>> y = raw_input ('>')
    >Libert\xc3\xa9
    >>> q = 'Libert\xc3\xa9'
    >>> q.decode('utf-8')
    u'Libert\xe9'
    >>> print q
    Liberté
    >>>

    [ the screen output is on the next line ]

    Lberté




    # history 3

    # This code is run from inside a Python file within IDLE on Windows 98.
    # The code DOES NOT produce the proper output.

    #_*_ coding: utf-8 _*_

    # print "u'Libert\u00e9'" printed to screen

    y = raw_input('Enter >')

    # y = u'Lbert\u00e9' commented out

    y.encode('utf-8')
    q = y.encode('utf-8')
    q.decode('utf-8')
    print q.decode('utf-8')

    # The output on the lines below was produced on the screen after the run.

    I entered u'Libert\u00e9' at the prompt to copy it into the string y:
    Enter >u'Libert\u00e9'

    u'Libert\u00e9'

    The code DOES NOT produce Liberté but instead produces u'Libert\u00e9'

    # history 4

    # This code is run from inside a Python file in a terminal on Debian Linux.
    # The code does not produce the proper output but produces the same output
    # as on Windows.

    #_*_ coding: utf-8 _*_

    print "u'Libert\u00e9'"  # printed to screen

    y = raw_input('Enter >')

    # y = u'Lbert\u00e9' commented out

    y.encode('utf-8')
    q = y.encode('utf-8')
    q.decode('utf-8')
    print q.decode('utf-8')

    # The output on the lines below was produced on the screen after the run.

    I entered u'Libert\u00e9' at the prompt to copy it into the string y:
    Enter >u'Libert\u00e9'
    u'Libert\u00e9'

    The code DID NOT produce Liberté but instead produced u'Libert\u00e9'
     
    anonymous, Jul 14, 2008
    #1

  2. anonymous

    MRAB Guest


    raw_input returns exactly the characters you typed, without interpreting
    escape sequences. You entered u'Libert\u00e9', so that's what was
    printed out.
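
    To make the difference concrete, here is a small sketch comparing text
    as the keyboard delivers it with the same text written as a string
    literal in source code. A raw string literal stands in for the typed
    input (an assumption for the demonstration, so that no prompt is
    needed), and the variable names are made up.

```python
# What the keyboard delivers: separate characters, including a literal
# backslash. A raw string literal reproduces that without a prompt.
typed = r"Libert\u00e9"

# What the interpreter builds when the text appears as a string literal
# in source code: the \u00e9 escape is resolved to a single character.
literal = u"Libert\u00e9"

print(len(typed))
print(len(literal))
print(typed == literal)
```

    The typed version keeps the six characters of the escape sequence as-is,
    so the two strings differ in both length and content.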

    If you want to be able to enter escape sequences like \u00e9 and have
    them decoded to the appropriate character then you must do something
    like this:

    # The code
    text = raw_input('Enter >')
    decoded_text = text.decode("unicode-escape")
    print decoded_text


    # The output
    Enter >Libert\u00e9
    Liberté
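
    The snippet above is Python 2. In Python 3, where raw_input is named
    input and str has no decode method, the same idea might be sketched as
    below. The helper name decode_escapes is hypothetical, and the Latin-1
    round trip is just one common way to reach the unicode_escape codec
    from a str, not the only one.

```python
def decode_escapes(text):
    # Convert the text to bytes first, because the unicode_escape codec
    # decodes bytes. Characters outside Latin-1 are written out as
    # backslash escapes, which the decode step turns back into characters.
    raw = text.encode("latin-1", "backslashreplace")
    # Interpret backslash escapes such as \u00e9 in the byte string.
    return raw.decode("unicode_escape")


# Simulated keyboard input: a literal backslash followed by u00e9.
typed = "Libert\\u00e9"
print(decode_escapes(typed))
```

    With a real prompt, the call would look like
    decode_escapes(input('Enter >')). Text typed directly with accented
    characters, such as Herr Üü, passes through unchanged.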

    HTH
     
    MRAB, Jul 14, 2008
    #2