Python beginner, unicode encode/decode Q

anonymous

1 My objective is to write little programs to help me learn German. See
the code after the numbered comments. Thanks in advance for any direction
or suggestions.

tk

2 I want to take answers from the keyboard, for example:

answer_str = raw_input(' Enter answer > ')

[ I type in the following characters: Herr Üü ]
print answer_str
Output on screen is > Herr Üü

3 The code in history 1 and 2 was run interactively under Debian Linux
(Python 2.4) and interactively under Windows 98, first edition (IDLE,
Python 2.3.5), and it works.

4 The code in history 3 and 4, run from within a .py file, produces
different output from the example in the book.

5 I want to work under Debian Linux, but because the program failed
when I ran the code from a file under Linux Python, I thought I should
fire up the win98 IDLE/Python program to see if it ran there. It failed
there too when run from within a file.

6 The sample code is from pages 108-109 of "Python for Dummies".
It says in the book: "Python's file objects and StringIO objects
don't support raw Unicode; the usual workaround is to encode Unicode as
UTF-8 before saving it to a file or StringIO object."
The sample code from the book is in French, as indicated here, but trying
German produces the same result.

7 I have searched the net under all the keywords, but this is as close as
I get to accomplishing my task. I suspect I may not be understanding the
statement that StringIO objects don't support raw Unicode, but I don't know.


#_*_ coding: utf-8 _*_

# code run under linux debian interactively from a terminal and works

print " u'Libert\u00e9' "

# y = raw_input('Enter >') commented out

y = u'Lbert\u00e9'
y.encode('utf-8')
q = y.encode('utf-8')
q.decode('utf-8')
print q.decode('utf-8')

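history 1 works, and here is the screen copy of the interactive session:

 >>> y = raw_input ('>')
 >Libert\xc3\xa9
 >>> q = 'Libert\xc3\xa9'
 >>> q.decode('utf-8')
u'Libert\xe9'
 >>> print q
Liberté
 >>>

[ screen output is on the next line ]

Lberté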



history 2
# code run under win98, first edition, interactively within IDLE; it
# succeeded in producing correct results.


# y = raw_input('Enter >') commented out

y = u'Lbert\u00e9'
y.encode('utf-8')
q = y.encode('utf-8')
q.decode('utf-8')
print q.decode('utf-8')

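history 2 works, and here is the screen copy of the interactive session:

 >>> y = raw_input ('>')
 >Libert\xc3\xa9
 >>> q = 'Libert\xc3\xa9'
 >>> q.decode('utf-8')
u'Libert\xe9'
 >>> print q
Liberté
 >>>

[ screen output is on the next line ]

Lberté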




# history 3

# this code is run from within IDLE on win98, inside a python file.
# The code DOES NOT produce the proper output.

#_*_ coding: utf-8 _*_

# print "u'Libert\u00e9'" printed to screen

y = raw_input('Enter >')

# y = u'Lbert\u00e9' commented out

y.encode('utf-8')
q = y.encode('utf-8')
q.decode('utf-8')
print q.decode('utf-8')

# the output on the lines below was produced on the screen after the run

[ I enter u'Libert\u00e9' at the prompt to copy it into the string y ]
Enter >u'Libert\u00e9'

u'Libert\u00e9'

The code DOES NOT produce Liberté but instead produces u'Libert\u00e9'

# history 4

# this code is run from a terminal on Debian Linux, inside a python file.
# The code does not produce the proper output but produces the same output
# as when run on Windows.

#_*_ coding: utf-8 _*_

# print "u'Libert\u00e9'" printed to screen

y = raw_input('Enter >')

# y = u'Lbert\u00e9' commented out

y.encode('utf-8')
q = y.encode('utf-8')
q.decode('utf-8')
print q.decode('utf-8')

# the output on the lines below was produced on the screen after the run

[ I enter u'Libert\u00e9' at the prompt to copy it into the string y ]
Enter >u'Libert\u00e9'
u'Libert\u00e9'

The code DID NOT produce Liberté but instead produced u'Libert\u00e9'
 
MRAB

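raw_input returns exactly what you typed, character for character. You
typed u'Libert\u00e9', so that's what was printed out; encoding that text
to UTF-8 and decoding it again just gives you the same characters back.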
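
To make the difference concrete, here is a small sketch comparing the two strings. The names `typed` and `literal` are only for illustration: `typed` stands in for what raw_input hands back when you type u'Libert\u00e9' at the prompt, and `literal` is the same text written in source code as in history 1. It is written so that it should run on both Python 2 and Python 3, which is why print is called with parentheses.

```python
# -*- coding: utf-8 -*-
# Stand-in for what raw_input returns when you type  u'Libert\u00e9'
# at the prompt: every keystroke is kept as-is, including the u, the
# quotes and the backslash.
typed = "u'Libert\\u00e9'"

# The same text written in source code: here \u00e9 is an escape
# sequence, so the interpreter turns it into the single character é.
literal = u'Libert\u00e9'

print(len(typed))        # 15 characters: u ' L i b e r t \ u 0 0 e 9 '
print(len(literal))      # 7 characters:  L i b e r t é
print(typed == literal)  # False: they are different strings
```

Escape sequences are only interpreted by the Python compiler when it reads source code, not when text arrives from the keyboard.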

If you want to be able to enter escape sequences like \u00e9 and have
them decoded to the appropriate character then you must do something
like this:

# The code
text = raw_input('Enter >')
decoded_text = text.decode("unicode-escape")
print decoded_text


# The output
Enter >Libert\u00e9
Liberté
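
If the real goal is the one in point 2, typing Herr Üü directly rather than escape sequences, then the unicode-escape step should not be needed. Under Python 2, raw_input returns a byte string in whatever encoding the terminal uses, so the usual approach is to decode it with that encoding. The following is only a sketch: `to_unicode` is a hypothetical helper name, and the UTF-8 fallback is an assumption for cases where Python cannot tell the terminal's encoding. It is written to run on Python 2 and 3.

```python
# -*- coding: utf-8 -*-
import sys

def to_unicode(raw, encoding=None):
    """Decode keyboard input into a unicode object (hypothetical helper)."""
    if isinstance(raw, type(u'')):
        return raw  # already unicode text, nothing to decode
    # Assumption: use the terminal's encoding, falling back to UTF-8 when
    # Python cannot tell (for example when input is piped in).
    encoding = encoding or getattr(sys.stdin, 'encoding', None) or 'utf-8'
    return raw.decode(encoding)

# Stand-in for what raw_input returns on a UTF-8 terminal for  Herr Üü
raw_answer = u'Herr \u00dc\u00fc'.encode('utf-8')
answer = to_unicode(raw_answer, 'utf-8')

print(len(raw_answer))   # 9 bytes: Ü and ü take two bytes each in UTF-8
print(len(answer))       # 7 characters
print(answer == u'Herr \u00dc\u00fc')
```

In a real program the stand-in line would become something like answer = to_unicode(raw_input(' Enter answer > ')), after which the answer can be compared character by character with the expected German word.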

HTH
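
On the book passage in point 6: that workaround is about storing text, not about reading it from the keyboard, so it is probably not what is going wrong here. A minimal sketch of the idea follows. It assumes io.BytesIO as a stand-in for the file or StringIO object, chosen so that the sketch runs on Python 2.6+ and 3; the book's own sample may differ.

```python
# -*- coding: utf-8 -*-
import io

text = u'Libert\u00e9'            # a unicode object, 7 characters

# Encode to UTF-8 before saving: the result is a byte string.
encoded = text.encode('utf-8')    # 8 bytes, because é takes two bytes

buf = io.BytesIO()                # stand-in for a file or StringIO object
buf.write(encoded)                # the buffer stores bytes, not unicode

# Decode with the same encoding when reading the data back.
restored = buf.getvalue().decode('utf-8')
print(restored == text)
```

The encode call goes just before the write and the decode call just after the read, so the rest of the program can work with unicode objects throughout.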
 
