does raw_input() return unicode?

Stuart McGraw

In the announcement for Python-2.3
http://groups.google.com/group/comp.lang.python/msg/287e94d9fe25388d?hl=en
it says "raw_input(): can now return Unicode objects".

But I didn't see anything about this in Andrew Kuchling's
"2.3 What's New", nor do the current Python docs for
raw_input() say anything about it. A test on an MS
Windows system with a cp932 (Japanese) default locale
shows that the object returned by raw_input() is a str object
containing cp932-encoded text. This remained true even
when I set Python's default encoding to cp932 (in
sitecustomize.py).

So, does raw_input() ever return unicode objects and if
so, under what conditions?
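
The settings that are easy to confuse here can be inspected side by side. A minimal sketch using only standard-library calls; `describe_encodings` is a hypothetical helper name, and the values it reports depend entirely on the machine it runs on:

```python
import locale
import sys


def describe_encodings(stream=None):
    """Collect the encodings that are often mistaken for one another."""
    if stream is None:
        stream = sys.stdin
    return {
        # The interpreter-wide default; in Python 2 this is the value
        # that sitecustomize.py can change.
        "default": sys.getdefaultencoding(),
        # What Python believes the terminal behind the stream uses;
        # may be missing or None for pipes and replaced streams.
        "stream": getattr(stream, "encoding", None),
        # The encoding suggested by the user's locale settings.
        "locale": locale.getpreferredencoding(),
    }


if __name__ == "__main__":
    for name, value in sorted(describe_encodings().items()):
        print("%-8s %s" % (name, value))
```

The "default" entry is a separate setting from the "stream" entry, which would be consistent with the observation that changing the default encoding left the result of raw_input() unchanged.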
 
Duncan Booth

Stuart McGraw said:
So, does raw_input() ever return unicode objects and if
so, under what conditions?

It returns unicode if reading from sys.stdin returns unicode.

Unfortunately, I can't tell you how to make sys.stdin return unicode for
use with raw_input. I tried what I thought should work and as you can see
it messed up the buffering on stdin. Does anyone else know how to wrap
sys.stdin so it returns unicode but is still unbuffered?

Python 2.5 (r25:51908, Sep 19 2006, 09:52:17) [MSC v.1310 32 bit (Intel)]
on win32
Type "help", "copyright", "credits" or "license" for more information.
hello world
still going?
^Z
^Z
u'hello world'
 
Guest

Stuart said:
So, does raw_input() ever return unicode objects and if
so, under what conditions?

At the moment, it only returns unicode objects when invoked
in the IDLE shell, and only if the character entered cannot
be represented in the locale's charset.

Regards,
Martin
 
Theerasak Photha

At the moment, it only returns unicode objects when invoked
in the IDLE shell, and only if the character entered cannot
be represented in the locale's charset.

Why only IDLE? Does urwid or another console UI toolkit avoid this somehow?

-- Theerasak
 
Fredrik Lundh

Theerasak said:
Why only IDLE? Does urwid or another console UI toolkit avoid this somehow?

Martin was probably thinking of the standard distribution.

The 2.3 note says that "raw_input() *can* return Unicode", not that it
"should" or "must" do it.

</F>
 
Theerasak Photha

Martin was probably thinking of the standard distribution.

The 2.3 note says that "raw_input() *can* return Unicode", not that it
"should" or "must" do it.

Practically speaking, at the heart of the matter: as of Python 2.5
final, does or can raw_input() return Unicode under the appropriate
circumstances, according to user wishes?

(Yes, I would test, but I am presently away from my Linux box with
Python, and can't install it here.)

-- Theerasak
 
Fredrik Lundh

Theerasak said:
Practically speaking, at the heart of the matter: as of Python 2.5
final, does or can raw_input() return Unicode under the appropriate
circumstances, according to user wishes?

didn't Martin just answer that question?

</F>
 
Martin v. Löwis

Theerasak said:
Why only IDLE? Does urwid or another console UI toolkit avoid this somehow?

I admit I don't know what urwid is; from a shallow description I find
("a console user interface library") I can't see the connection to
raw_input(). How would raw_input() ever use urwid?

Regards,
Martin
 
Stuart McGraw

Martin v. Löwis said:
At the moment, it only returns unicode objects when invoked
in the IDLE shell, and only if the character entered cannot
be represented in the locale's charset.

Thanks for the answer.

Also, if anyone has a solution for Duncan Booth's attempt
to wrap stdin, I would find it very useful too!

Duncan Booth said:
Stuart McGraw said:
So, does raw_input() ever return unicode objects and if
so, under what conditions?

It returns unicode if reading from sys.stdin returns unicode.

Unfortunately, I can't tell you how to make sys.stdin return unicode for
use with raw_input. I tried what I thought should work and as you can see
it messed up the buffering on stdin. Does anyone else know how to wrap
sys.stdin so it returns unicode but is still unbuffered?

Python 2.5 (r25:51908, Sep 19 2006, 09:52:17) [MSC v.1310 32 bit (Intel)]
on win32
Type "help", "copyright", "credits" or "license" for more information.
hello world
still going?
^Z
^Z
u'hello world'
 
Theerasak Photha

I admit I don't know what urwid is; from a shallow description I find
("a console user interface library") I can't see the connection to
raw_input(). How would raw_input() ever use urwid?

The other way around: would urwid use raw_input() or other Python
input functions anywhere?

And what causes Unicode input to work in IDLE alone?

-- Theerasak
 
Leo Kislov

Theerasak said:
The other way around: would urwid use raw_input() or other Python
input functions anywhere?

And what causes Unicode input to work in IDLE alone?

Applications other than Python are free to implement a unicode
stdin. Python itself cannot, because of backward compatibility. You can
argue that the Python interactive console could do it too, but think about
it this way: the interactive console deliberately behaves the way a
running Python program would.
 
Leo Kislov

Duncan said:
It returns unicode if reading from sys.stdin returns unicode.

Unfortunately, I can't tell you how to make sys.stdin return unicode for
use with raw_input. I tried what I thought should work and as you can see
it messed up the buffering on stdin. Does anyone else know how to wrap
sys.stdin so it returns unicode but is still unbuffered?

Considering that all consoles are ascii based, the following should
work where python was able to determine terminal encoding:

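import sys

class ustdio(object):
    """Wrap a byte stream so that readline() returns unicode."""
    def __init__(self, stream):
        self.stream = stream
        self.encoding = stream.encoding
    def readline(self):
        # Decode each line with the encoding Python detected for the stream.
        return self.stream.readline().decode(self.encoding)

sys.stdin = ustdio(sys.stdin)

answer = raw_input()
print type(answer)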
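
The wrapper can be exercised without a console by feeding it an in-memory byte stream. A sketch under stated assumptions: `DecodingStdin` and `FakeConsole` are hypothetical names, the fake stream merely imitates a byte-oriented stdin that carries an `encoding` attribute, and the wrapper is called directly here instead of through raw_input():

```python
import io


class FakeConsole(io.BytesIO):
    """Hypothetical stand-in for a byte-oriented stdin with an encoding."""
    encoding = "cp932"


class DecodingStdin(object):
    """Same idea as the wrapper above: decode each line as it is read."""

    def __init__(self, stream):
        self.stream = stream
        self.encoding = stream.encoding

    def readline(self):
        # Delegate to the stream's own readline(), so only one line is
        # consumed per call, then decode the bytes to text.
        return self.stream.readline().decode(self.encoding)


# Five hiragana characters followed by a second, plain ASCII line.
greeting = u"\u3053\u3093\u306b\u3061\u306f\n"
console = FakeConsole((greeting + u"second line\n").encode("cp932"))
wrapped = DecodingStdin(console)

line = wrapped.readline()
print(type(line))

# Only the first line has been consumed from the underlying stream.
print(console.tell() == len(greeting.encode("cp932")))
```

Because decoding happens after the stream's own readline(), the wrapper adds no read-ahead of its own.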
 
Martin v. Löwis

Theerasak said:
The other way around: would urwid use raw_input() or other Python
input functions anywhere?

Since I still don't know what urwid is, I can't answer the question.
It should be easy enough to grep its source code to find out whether
it ever uses raw_input.

Theerasak said:
And what causes Unicode input to work in IDLE alone?

Because in IDLE, it is possible to enter characters that are not
in the user's charset. For example, if the user's charset is
cp-1252 (Western European), you can still enter Cyrillic characters
into IDLE. This is not possible in a regular terminal.

Regards,
Martin
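
The behaviour described for IDLE, a byte string when the text fits the locale's charset and a unicode object otherwise, can be sketched as follows. This illustrates the description only and is not taken from IDLE's source; `encode_if_possible` is a hypothetical name:

```python
def encode_if_possible(text, charset):
    """Return encoded bytes when every character fits charset, else the text."""
    try:
        return text.encode(charset)
    except UnicodeEncodeError:
        # At least one character has no representation in this charset,
        # so the text is returned unchanged as a unicode object.
        return text


# Western European text fits cp1252; Cyrillic text does not.
western = encode_if_possible(u"caf\xe9", "cp1252")
cyrillic = encode_if_possible(u"\u043f\u0440\u0438\u0432\u0435\u0442", "cp1252")

print(repr(western))
print(repr(cyrillic))
```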
 
Neil Cerutti

Considering that all consoles are ascii based, the following
should work where python was able to determine terminal
encoding:

class ustdio(object):
    def __init__(self, stream):
        self.stream = stream
        self.encoding = stream.encoding
    def readline(self):
        return self.stream.readline().decode(self.encoding)

sys.stdin = ustdio(sys.stdin)

answer = raw_input()
print type(answer)

This interesting discussion led me to a weird discovery:

PythonWin 2.4.3 (#69, Apr 11 2006, 15:32:42) [MSC v.1310 32 bit (Intel)] on win32.
Portions Copyright 1994-2004 Mark Hammond ([email protected]) - see 'Help/About PythonWin' for further copyright information.
Traceback (most recent call last):
  File "<interactive input>", line 1, in ?
  File "C:\edconn32\Python24\Lib\site-packages\pythonwin\pywin\mfc\object.py", line 18, in __getattr__
    return getattr(o, attr)
AttributeError: encoding

I'm all mindboggley. Just when I thought I was starting to
understand how this character encoding stuff works. Are
PythonWin's stdout and stdin implementations incomplete?
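
A wrapper can guard against streams that lack an `encoding` attribute, as the object in the traceback above does. A minimal sketch; `stream_encoding` and `NoEncodingStream` are hypothetical names, and falling back to the locale's preferred encoding is an assumption that may not match what the terminal really sends:

```python
import locale


def stream_encoding(stream, fallback=None):
    """Best-effort guess at the encoding of a stdin-like object."""
    # getattr() with a default also covers objects whose __getattr__
    # raises AttributeError, as in the traceback above.
    encoding = getattr(stream, "encoding", None)
    if encoding:
        return encoding
    # Assumption: the locale's preferred encoding is a reasonable guess.
    return fallback or locale.getpreferredencoding()


class NoEncodingStream(object):
    """Hypothetical stream that has no encoding attribute at all."""

    def __getattr__(self, attr):
        raise AttributeError(attr)


print(stream_encoding(NoEncodingStream(), fallback="cp1252"))
```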
 
Martin v. Löwis

Neil said:
I'm all mindboggley. Just when I thought I was starting to
understand how this character encoding stuff works. Are
PythonWin's stdout and stdin implementations incomplete?

Simple and easy: yes, they are.

Regards,
Martin
 
