printing a list with non-ascii strings

Helmut Jarausch

Hi,

I don't understand Python's behaviour when printing a list.
The following example uses 2 German non-ascii characters.

#!/usr/bin/python
# -*- coding: latin1 -*-
L = ["abc", "süß", "def"]
print L[1], L

The output of L[1] is correct, while the output of L shows up as
['abc', 's\xfc\xdf', 'def']

How can this be changed?

Thanks for any hints,
Helmut.
 
Peter Otten

Use unicode and follow Martin's recipe:

http://mail.python.org/pipermail/python-list/2011-January/1263783.html
 
Arnaud Delobelle

That's because when you print a list, the code executed is roughly:

print "[" + ", ".join(repr(x) for x in L) + "]"

Now try:

print repr("süß")
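As a rough sketch of the point above, the snippet below rebuilds the list's string form from repr() of each element and compares it with str(L). It then joins the strings directly, which is one way to get the characters shown as they are. The names list_repr and format_list are hypothetical helpers made up for this illustration, and the code is written so that it should run under both Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-

def list_repr(items):
    """Rebuild what str(list) produces: repr() of every element."""
    return "[" + ", ".join(repr(x) for x in items) + "]"


def format_list(items):
    """Hypothetical helper: join the strings themselves, not their repr()."""
    return "[" + ", ".join(items) + "]"


# "s\xfc\xdf" is "süß": Latin-1 bytes in Python 2, code points in Python 3.
L = ["abc", "s\xfc\xdf", "def"]

# The list's own string form is built from repr() of each element.
assert list_repr(L) == str(L)

# Joining the elements directly leaves the characters unescaped.
formatted = format_list(L)
```

Printing formatted instead of L sends the original characters to the terminal, at the cost of losing the quotes around each element.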

I don't think this can be changed in Python 2.x. I vaguely remember
discussions about this issue for Python 3, but I can't remember the
outcome, and the situation is different there anyway: Python 3 strings
are not the same as Python 2 strings (they correspond to Python 2
unicode strings).

The underlying issue is that the Python interpreter doesn't know which
encoding a string is supposed to be in: a string in Python 2.x is just a
sequence of bytes. If you print the string, the bytes are written out
unchanged and the terminal decodes them according to its own settings,
which has nothing to do with Python, so the appearance depends on the
locale configuration of the terminal. The repr() of a string, however,
needs to be consistent irrespective of the terminal's configuration, so
the only viable option is to use nothing but ASCII characters. Hence the
difference.
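To make the bytes-versus-terminal point concrete, here is a small sketch in Python 3 syntax, where the bytes type plays the role of a Python 2 string. The choice of cp437 as the "other" terminal encoding is only an assumption for illustration; any encoding that maps these byte values to different characters would do.

```python
text = "s\xfc\xdf"             # the word "süß" as Unicode text
raw = text.encode("latin-1")   # the three bytes a Python 2 str would hold

# repr() of the bytes stays pure ASCII, whatever the terminal is set to.
assert repr(raw) == r"b's\xfc\xdf'"

# The same bytes turn into different characters depending on the decoder,
# which is what a terminal does when the bytes are printed unchanged.
as_latin1 = raw.decode("latin-1")
as_cp437 = raw.decode("cp437")   # assumed alternative terminal encoding
assert as_latin1 == text
assert as_cp437 != text

# ascii() gives the ASCII-only form for text strings as well.
assert ascii(text) == r"'s\xfc\xdf'"
```

The escaped form is the only one of these that looks identical on every terminal, which is the consistency argument made above.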

HTH
 
