How to decode a string

Lad

To decode a string successfully, I need to know which encoding it is in.
The string may be encoded in UTF-8, windows-1250, or some other
encoding.
Is there a way to find out a string's encoding?
Thank you for your help.
L.
 
Fredrik Lundh

Lad said:
To be able to decode a string successfully, I need to know what coding
it is in.

ask whoever provided the string.
The string can be coded in utf8 or in windows-1250 or in another
coding. Is there a method how to find out the string coding.

in general, no. if you have enough text, you may guess, but the right
approach for that depends on the application.

</F>
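
A minimal sketch of the "guess" approach mentioned above, in Python 3 syntax (the thread itself uses Python 2). The helper name `guess_decode` and the candidate list are assumptions for illustration only. It tries each candidate in order and keeps the first one that decodes without error:

```python
def guess_decode(raw, candidates=("utf-8", "windows-1250")):
    """Return (text, encoding) for the first candidate that decodes raw.

    Hypothetical helper: the candidate list must come from knowledge of
    the application; the bytes themselves do not name their encoding.
    """
    for encoding in candidates:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings fit")


# Sample name written with escapes: \u0159 is r with caron, \xe1 is a with acute.
sample = "P\u0159ibylov\xe1"
utf8_bytes = sample.encode("utf-8")
cp1250_bytes = sample.encode("windows-1250")

print(guess_decode(utf8_bytes))
print(guess_decode(cp1250_bytes))
```

The order matters: UTF-8 is strict and rejects most text written in a single-byte encoding, while windows-1250 accepts almost any byte sequence, so a successful windows-1250 decode proves very little on its own.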
 
Lad

Fredrik said:
ask whoever provided the string.


in general, no. if you have enough text, you may guess, but the right
approach for that depends on the application.

</F>

Fredrik,
Thank you for your reply.
The text is from a MySQL table field that uses the utf8_czech_ci collation,
but when I try
`RealName`.decode('utf8'), where RealName is that MySQL field,

I get:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 3: ordinal not in range(128)

Can you please suggest a solution?
Thank you
L.
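
For reference, 0xc3 is the first byte UTF-8 uses for characters such as á, so a message like the one above suggests that UTF-8 bytes are being run through the ASCII codec somewhere. A minimal sketch in Python 3 syntax (the thread itself uses Python 2); the sample value is made up and is not the actual database content:

```python
# Hypothetical sample: 'Fri' followed by a with acute, encoded as UTF-8.
raw = "Fri\xe1".encode("utf-8")
print(raw)

message = None
try:
    # Decoding UTF-8 bytes with the ASCII codec fails on the first byte >= 0x80.
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    message = str(exc)
print(message)

# Decoding with the codec the bytes were written in succeeds.
text = raw.decode("utf-8")
```

This only reproduces the shape of the message; it does not show which statement in the original program triggered it.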
 
Marc 'BlackJack' Rintsch

The text is from Mysql table field that uses utf8_czech_ci collation,
but when I try
`RealName`.decode('utf8'),where RealName is that field of MySQL

I will get:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 3: ordinal not in range(128)

Can you please suggest the solution?

Do you get this from converting the value from the database or from trying
to print the unicode string? Can you give us the output of

print repr(RealName)

Ciao,
Marc 'BlackJack' Rintsch
 
Lad

Marc said:
Do you get this from converting the value from the database or from trying
to print the unicode string? Can you give us the output of

print repr(RealName)

Ciao,
Marc 'BlackJack' Rintsch

For the command
print repr(RealName)
I get

P?ibylov\xe1 Ludmila
where there should be another character in place of the ?.
Thank you for your help.
L.
 
Fredrik Lundh

Lad said:
for
print repr(RealName) command
I will get

P?ibylov\xe1 Ludmila
where instead of ? should be also a character

that's not very likely; repr() always includes quotes, always escapes
non-ASCII characters, and optionally includes a Unicode prefix.

please try this

print "*", repr(RealName), type(RealName), "*"

and post the entire output; that is, *everything* between the asterisks.

</F>
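
A rough Python 3 counterpart of the diagnostic line above, as a sketch. In Python 3, repr() no longer escapes printable non-ASCII characters, so ascii() is used here instead; the helper name `describe` and the sample values are assumptions for illustration:

```python
def describe(value):
    """Return an ASCII-only representation of value plus its type name."""
    return "* %s %s *" % (ascii(value), type(value).__name__)


# A byte string and a text string with the same escaped appearance.
print(describe(b"Fritschov\xe1 Laura"))
print(describe("Fritschov\xe1 Laura"))
```

The quotes, the optional prefix, and the type name together show whether the value is raw bytes or already decoded text, which is the information needed before choosing a codec.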
 
Lad

Fredrik said:
that's not very likely; repr() always includes quotes, always escapes
non-ASCII characters, and optionally includes a Unicode prefix.

please try this

print "*", repr(RealName), type(RealName), "*"

and post the entire output; that is, *everything* between the asterisks.

The result of print "*", repr(RealName), type(RealName), "*" is

* 'Fritschov\xe1 Laura' <type 'str'> *


Best regards,
L
 
Fredrik Lundh

Lad said:
The result of print "*", repr(RealName), type(RealName), "*" is

* 'Fritschov\xe1 Laura' <type 'str'> *

looks like the MySQL interface is returning 8-bit strings using ISO-8859-1
encoding (or some variation of that; \xE1 is "LATIN SMALL LETTER A
WITH ACUTE" in 8859-1).

have you tried passing "use_unicode=True" to the connect() call ?

</F>
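
The reading of \xE1 given above can be checked with a short sketch in Python 3 syntax (the thread itself uses Python 2). The byte string is the one posted earlier in the thread; the variable names are assumptions:

```python
import unicodedata

raw = b"Fritschov\xe1 Laura"

# 0xE1 followed by a space is not a valid UTF-8 sequence.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# Decoded as ISO-8859-1, byte 0xE1 maps to U+00E1.
text = raw.decode("iso-8859-1")
name = unicodedata.name(text[9])

# windows-1250 maps 0xE1 to the same character, so this sample
# cannot tell the two encodings apart.
same_in_cp1250 = raw.decode("windows-1250") == text

print(valid_utf8, name, same_in_cp1250)
```

This is why the answer says "or some variation of that": several single-byte encodings agree on this byte, and only a character on which they differ would separate them.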
 
Lad

Fredrik said:
looks like the MySQL interface is returning 8-bit strings using ISO-8859-1
encoding (or some variation of that; \xE1 is "LATIN SMALL LETTER A
WITH ACUTE" in 8859-1).

have you tried passing "use_unicode=True" to the connect() call ?

</F>

Fredrik,
Thank you for your reply.
I found out that if I do not decode the string at all, it looks
correct, but I do not know why it works without decoding.
I use Django, and I do not pass use_unicode=True to the connect() call.
 
