JKPeck
I am trying to understand why, with non-Western strings, I sometimes get
a hex display and sometimes get the string printed as characters.
With my Python locale set to Japanese, and with or without a
# coding: cp932 declaration (this is Windows) at the top of the file,
I read a set of Japanese strings into a list, say, catlis.
With this code
for item in catlis:
    print item
print catlis
print " ".join(catlis)
the first print (print item) displays Japanese text as characters.
The second print (print catlis) displays a list with the double-byte
characters in hex notation.
The third print (print " ".join(catlis)) prints a combined string of
Japanese characters properly.
According to the print documentation,
"If an object is not a string, it is first converted to a string using
the rules for string conversions"
but the result is different with a list of strings.
The hex display looks like this:
['id', '\x90\xab\x95\xca', '\x90\xb6\x94N\x8c\x8e\x93\xfa',
'\x8fA\x8aw\x94N\x90\x94', '\x90E\x8e\xed', '\x8b\x8b\x97^',
'\x8f\x89\x94C\x8b\x8b', '\x8d\xdd\x90\xd0\x8c\x8e\x90\x94',
'\x90E\x96\xb1\x8co\x97\xf0', '\x90l\x8e\xed']
and correctly shows the hex values of the Japanese characters.
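A minimal sketch that reproduces the difference without a Japanese console, assuming the strings are cp932-encoded byte strings. The two-element catlis here is a hypothetical stand-in built from the first two entries of the output above, and the names item_repr, list_text and decoded are only for illustration. It compares the text produced for the list with the repr() of a single element:

```python
# -*- coding: utf-8 -*-
# Hypothetical stand-in for catlis: the first two entries from the
# hex display above, written as raw cp932 byte strings.
catlis = [b"id", b"\x90\xab\x95\xca"]

item = catlis[1]

# Escaped form of a single element.
item_repr = repr(item)

# Text that print would emit for the whole list.
list_text = str(catlis)

# Decoding the bytes as cp932 gives the actual characters
# (not printed here, since the console may not be able to show them).
decoded = item.decode("cp932")

print(item_repr)
print(list_text)
print(len(decoded))
```

In this sketch the escaped form of the element appears verbatim inside the text for the list, while the element itself decodes to two characters.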
Why are these different?
TIA,
Jon Peck