os.popen encoding!

SMALLp · Feb 18, 2009

Hi.
I'm playing with the os.popen function.
a = os.popen("somecmd").read()

If one of the lines contains characters like "è", "æ" or any other, it looks
like this: "velja\xe8a 2009", with that "\xe8". It prints fine if I do:

for i in a:
    print i

How do I solve this, and where exactly is the problem: with print or with
read? Windows XP, Python 2.5.4

Thanks!

Gabriel Genellina · Feb 18, 2009

SMALLp said:
I'm playing with the os.popen function.
a = os.popen("somecmd").read()

If one of the lines contains characters like "è", "æ" or any other, it looks
like this: "velja\xe8a 2009", with that "\xe8". It prints fine if I do:

for i in a:
    print i

'\xe8' is a *single* byte (not four). It is the 'LATIN SMALL LETTER E WITH
GRAVE' Unicode code point u'è' encoded in the Windows-1252 encoding (and
latin-1, and others too). This is the usual Windows encoding (in "Western
Europe" but seems to cover a much larger part of the world... most of
America, if not all).

When you *look* at some string in the interpreter, you see its repr()
(note the surrounding quotes). When you *print* some string, you get its
contents:

py> s = "ma mère"
py> s
'ma m\x8are'
py> print s
ma mère
py> print repr(s)
'ma m\x8are'
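The same distinction can be sketched in Python 3, where text (`str`) and encoded data (`bytes`) are separate types. This is only an illustration, not part of the original session: the choice of `cp850` is an assumption, picked because the `\x8a` in the session above is consistent with an OEM console code page of that kind.

```python
# Python 3 sketch: str holds Unicode text, bytes holds encoded data.
s = "ma m\u00e8re"             # the text "ma mere" with U+00E8 (e with grave)

shown = repr(s)                # what the interactive prompt echoes (quoted)
escaped = ascii(s)             # like repr(), but non-ASCII characters are escaped
oem_bytes = s.encode("cp850")  # assumption: console uses an OEM code page

print(escaped)                 # the quoted, escaped form
print(oem_bytes)               # the byte string, shown via its repr
print(len(s), len(oem_bytes))  # one character became exactly one byte here
```

The escape sequence is just how one byte (or one character) is displayed inside quotes; the data itself is unchanged.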

How do I solve this, and where exactly is the problem: with print or with
read? Windows XP, Python 2.5.4

There is *no* problem. You should read the Unicode howto:
<http://docs.python.org/howto/unicode.html>
If you still think there is a problem, please provide more details.

Hrvoje Niksic · Feb 18, 2009

Gabriel Genellina said:
'\xe8' is a *single* byte (not four). It is the 'LATIN SMALL LETTER E
WITH GRAVE' Unicode code point u'è' encoded in the Windows-1252
encoding (and latin-1, and others too).

Note that it is also 'LATIN SMALL LETTER C WITH CARON' (U+010D or
u'č'), encoded in Windows-1250, which is what the OP is likely using.
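A minimal Python 3 sketch of that ambiguity: the same single byte decodes to different characters depending on which code page is assumed. The variable names are illustrative only.

```python
import unicodedata

raw = b"\xe8"  # one byte, as it might arrive from the external command

# Decode the same byte under three candidate encodings.
decoded = {enc: raw.decode(enc) for enc in ("cp1252", "latin-1", "cp1250")}

# Look up the official Unicode name of each result.
names = {enc: unicodedata.name(ch) for enc, ch in decoded.items()}

for enc in sorted(names):
    print(enc, hex(ord(decoded[enc])), names[enc])
```

Nothing in the byte itself says which reading is correct; that knowledge has to come from whoever knows how `somecmd` produces its output.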

The rest of your message stands regardless: there is no problem, at
least as long as the OP only prints out the character received from
somecmd to something else that also expects Windows-1250. The problem
would arise if the OP wanted to store the string in a PyGTK label
(which expects UTF8) or send it to a web browser (which expects
explicit encoding, probably defaulting to UTF8), in which case he'd
have to disambiguate whether '\xe8' refers to U+010D or to U+00E8 or
something else entirely.

That is the problem that Python 3 solves by requiring (or strongly
suggesting) that such disambiguation be performed as early in the
program as possible, preferably while the characters are being read
from the outside source. A similar approach is possible using Python
2 and its unicode type, but since the OP never specified exactly which
problem he had (except for the repr/str confusion), it's hard to tell
if using the unicode type would help.
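As a rough Python 3 sketch of "decode as early as possible": read the command's output as bytes, decode it once with an explicitly chosen encoding, and re-encode only when handing it to a consumer that wants UTF-8. The helper `read_command_output` and the stand-in child process are hypothetical; `cp1250` is the guess made above about the OP's system, not something the thread confirms.

```python
import subprocess
import sys


def read_command_output(cmd, encoding):
    """Run cmd and return its stdout decoded with an explicit encoding."""
    completed = subprocess.run(cmd, stdout=subprocess.PIPE, check=True)
    return completed.stdout.decode(encoding)


# Stand-in for "somecmd": a child Python process that writes raw bytes
# containing 0xE8, as a Windows-1250 program might.
fake_cmd = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(b'velja\\xe8a 2009')",
]

text = read_command_output(fake_cmd, "cp1250")  # assumption: cp1250 output
utf8_bytes = text.encode("utf-8")               # for a UTF-8 consumer

print(ascii(text))
print(utf8_bytes)
```

Once `text` is a real Unicode string, the question of what `\xe8` meant has been answered in exactly one place, and every later consumer can be given whatever encoding it expects.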

SMALLp · Feb 19, 2009

Thanks for the help!

My problem was actually:

a = ["velja\xe8a 2009"]
print a     # will print ['velja\xe8a 2009']
print a[0]  # will print veljaèa 2009

Gabriel Genellina · Feb 19, 2009

SMALLp said:
My problem was actually:

a = ["velja\xe8a 2009"]
print a     # will print ['velja\xe8a 2009']
print a[0]  # will print veljaèa 2009

And why is that a problem?

Almost the only reason to print a list is when debugging a program. To
print a list, Python uses repr() on each of its elements. Otherwise, [5,
"5", u'5'] would be indistinguishable from [5, 5, 5], and you usually want
to know exactly *what* the list contains.

Perhaps if you tell us exactly what you want to do, someone can offer
more advice.
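For illustration, here is a Python 3 sketch of the same behavior: converting a list to a string uses repr() on each element, so to display the contents you format the elements yourself. The helper `show` and the `cp1250` default are assumptions made for this example.

```python
items = [5, "5", b"velja\xe8a 2009"]

# str() of a list applies repr() to every element, so the types stay visible.
as_list = str(items)


def show(item, encoding="cp1250"):
    """Return a display string for one element (hypothetical helper)."""
    if isinstance(item, bytes):
        return item.decode(encoding)  # assumption: bytes are cp1250 text
    return str(item)


# Build the human-readable line explicitly instead of printing the list.
line = ", ".join(show(item) for item in items)

print(as_list)
print(ascii(line))
```

The list form is for the programmer; the joined form is for the user.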
