unicode to human readable format

tomasz.kaczorek · Dec 22, 2013

Hi,
I'm looking for a way to translate a unicode string into a more readable format.
I've got a string like s=[u'\u0105\u017c\u0119\u0142\u0144'] and have no idea how to change it to a human-readable format. Please help!

regards,
tomasz

Chris “Kwpolska” Warrick · Dec 22, 2013

When you print the string itself, rather than the list (which shows the list's repr),
Python shows a nice human-friendly representation.

>>> s=[u'\u0105\u017c\u0119\u0142\u0144']
>>> s
[u'\u0105\u017c\u0119\u0142\u0144']
>>> s[0]
u'\u0105\u017c\u0119\u0142\u0144'
>>> print s
[u'\u0105\u017c\u0119\u0142\u0144']
>>> print s[0]
ążęłń

However, that is only the case with Python 2, as Python 3 has a
human-friendly representation in the repr, too:

>>> s=[u'\u0105\u017c\u0119\u0142\u0144']
>>> s
['ążęłń']

Peter Otten · Dec 22, 2013

tomasz.kaczorek wrote:

I've got a string like s=[u'\u0105\u017c\u0119\u0142\u0144'] and have no idea
how to change it to a human-readable format.

No, you have a list of strings:

>>> list_of_strings = [u'\u0105\u017c\u0119\u0142\u0144']
>>> print list_of_strings
[u'\u0105\u017c\u0119\u0142\u0144']

When a list is printed, the individual items are converted to strings with
repr() to avoid ambiguous output, e.g. for strings with embedded commas.

If you want human readable strings print them individually instead of the
whole list at once:
>>> for string in list_of_strings:
...     print string
...
ążęłń
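
A minimal sketch of the difference described above, written for Python 3 (an assumption; the thread itself uses Python 2 syntax). In Python 3 the built-in ascii() gives roughly the escaped form that Python 2's repr() produced, while the list items themselves hold the readable text.

```python
# Python 3 sketch; the variable names are illustrative only.
list_of_strings = ['\u0105\u017c\u0119\u0142\u0144']

# ascii() works like repr() but escapes every non-ASCII character,
# which is close to what Python 2 showed when printing the whole list.
escaped = ascii(list_of_strings)

# Joining the items gives the text itself, with no quotes or escapes.
readable = ', '.join(list_of_strings)

# The escaped form is pure ASCII, so it prints on any terminal.
print(escaped)
```

Calling print(readable) shows the five Polish letters, provided the output stream can encode them.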

tomasz.kaczorek · Dec 27, 2013

hello,
Can I ask you for help? When I try to print s[0] I get the message: UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128).
How do I solve this problem, please?

regards,
t.

Steven D'Aprano · Dec 27, 2013

What version of Python?

What operating system?

What environment are you running in? IDLE? The shell or cmd.exe? Powershell?
xterm? Something else?

Please copy and paste the complete traceback, starting from the line

Traceback (most recent call last):

to the end.

Please print repr(s[0]) and show us the output.

Ned Batchelder · Dec 27, 2013

For help with the fundamentals, you can read or watch this PyCon
presentation: Pragmatic Unicode, or, How Do I Stop the Pain?
http://nedbatchelder.com/text/unipain.html

Dave Angel · Dec 27, 2013

First, what version of which OS, and what version of Python?

Next, what terminal or IDE are you running in, and do you have
stdout redirected?

Finally, what does your program look like? Or at least tell us the
type and repr of s[0].

The bottom line is that s[0] contains a code point larger than 0x7f,
and print is convinced that your terminal can handle only ASCII.
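
One way to gather the details asked for above is to have Python report what it believes about its output stream. This is only a sketch for Python 3; the function name and the dictionary keys are made up for illustration.

```python
import locale
import sys


def output_encoding_report(stream=None):
    """Collect the settings that influence how print() encodes text."""
    stream = sys.stdout if stream is None else stream
    return {
        # Encoding Python uses when writing text to this stream (may be None).
        "stream encoding": getattr(stream, "encoding", None),
        # False usually means the output is redirected to a file or a pipe.
        "is a terminal": bool(getattr(stream, "isatty", lambda: False)()),
        # Encoding suggested by the operating system locale settings.
        "locale preferred encoding": locale.getpreferredencoding(False),
    }


report = output_encoding_report()
for name, value in report.items():
    print("%s: %r" % (name, value))
```

If the stream encoding turns out to be ASCII, setting the PYTHONIOENCODING environment variable (for example to utf-8) before starting Python is one way to override it; whether the terminal then renders the characters correctly is a separate question.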

wxjmfauth · Dec 28, 2013

On Friday, December 27, 2013 at 12:37:17 UTC+1, Steven D'Aprano wrote:

Please print repr(s[0]) and show us the output.
What do you expect?
The representation is - and should be -

>>> print repr(s[0])
u'\u0105\u017c\u0119\u0142\u0144'

independently of the tool one uses to process such code.

Now, if one prints s[0], the result may, and should, differ
from tool to tool.

win console, cp850

>>> print s[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "c:\python27\lib\encodings\cp850.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-4: cha

win console, cp1252

>>> print s[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "c:\python27\lib\encodings\cp1252.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_table)
UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-4: cha

win console, cp1250

>>> s = [u'\u0105\u017c\u0119\u0142\u0144']
>>> print s[0]
ążęłń
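
The three Windows console results above can be checked without switching code pages, by asking each codec directly whether it can encode the string. A small Python 3 sketch (the helper name can_encode is hypothetical, and the list of encodings is just the ones mentioned in this thread plus UTF-8):

```python
TEXT = '\u0105\u017c\u0119\u0142\u0144'


def can_encode(text, encoding):
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# Map each encoding name to whether it can represent the whole string.
results = {
    name: can_encode(TEXT, name)
    for name in ('ascii', 'cp850', 'cp1252', 'cp1250', 'utf-8')
}

for name, ok in results.items():
    print(name, 'ok' if ok else 'cannot encode')
```

Only cp1250 (Central European) and UTF-8 contain all five Polish letters, which matches the tracebacks for cp850 and cp1252.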

SciTE editor, output pane "locale", cp1252 for me.

Traceback (most recent call last):
  File "utrick.py", line 18, in <module>
    print u'\u0105\u017c\u0119\u0142\u0144'
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-4: ordinal not in range(128)

Exit code: 1

SciTE editor, output pane 65001

Traceback (most recent call last):
  File "utrick.py", line 18, in <module>
    print u'\u0105\u017c\u0119\u0142\u0144'
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-4: ordinal not in range(128)

Exit code: 1

Now in IDLE, on a Western European version of Windows,
one gets this:

>>> print s[0]
ążęłń

Note that it prints something only by chance. It could just as
well fail to print, understand, or render the characters
at all. *This is wrong*.

With the interactive interpreter I wrote for Py2.*
(full of dirty tricks):

>>> print repr(s[0])
u'\u0105\u017c\u0119\u0142\u0144'
>>> print s[0]
?????

*This is correct*, it is an expected result and it
works for all chars.

A (the) correct way to print s[0] on a console (all
platforms):

>>> import sys
>>> print s[0].encode(sys.stdout.encoding, 'replace')
?????

See the other thread about printing repr().

jmf
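
The encode-with-'replace' approach from the last post can be sketched in Python 3 as follows. This is an illustration under assumptions, not jmf's own code: the helper name encode_for_stream is invented here, and falling back to ASCII when the stream reports no encoding is a choice made for the example.

```python
import sys


def encode_for_stream(text, stream=None, errors='replace'):
    """Encode text for a stream, substituting characters it cannot represent."""
    stream = sys.stdout if stream is None else stream
    # Assumption: fall back to ASCII when the stream reports no encoding.
    encoding = getattr(stream, 'encoding', None) or 'ascii'
    return text.encode(encoding, errors)


text = '\u0105\u017c\u0119\u0142\u0144'

# 'replace' turns every unencodable character into a question mark.
as_question_marks = text.encode('ascii', 'replace')

# 'backslashreplace' keeps the information as \uXXXX escapes instead.
as_escapes = text.encode('ascii', 'backslashreplace')

print(as_question_marks)
print(as_escapes)
```

With 'replace' the output is lossy but never raises; 'backslashreplace' is an alternative when the original code points should stay recoverable.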
