How to decode a string

Discussion in 'Python' started by Lad, Aug 21, 2006.

  1. Lad


    To decode a string successfully, I need to know which encoding it is in.
    The string may be encoded in UTF-8, in windows-1250, or in some other
    encoding.
    Is there a way to find out a string's encoding?
    Thank you for your help.
    L.
     
    Lad, Aug 21, 2006
    #1

  2. Lad wrote:

    > To be able to decode a string successfully, I need to know what coding
    > it is in.


    ask whoever provided the string.

    > The string can be coded in utf8 or in windows-1250 or in another
    > coding. Is there a method how to find out the string coding.


    in general, no. if you have enough text, you may guess, but the right
    approach for that depends on the application.

    </F>
     
    Fredrik Lundh, Aug 21, 2006
    #2
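
A minimal sketch of the "guess" approach mentioned above, assuming the only candidates are UTF-8 and windows-1250 (the two encodings named in the question). It is written for Python 3, whereas the thread itself uses Python 2. The helper name `decode_with_fallback` and the sample name are illustrative, not taken from the thread. UTF-8 is tried first because it is strict: most byte sequences produced by a single-byte code page are not valid UTF-8, while a single-byte codec will accept almost anything and so gives no signal.

```python
def decode_with_fallback(raw, candidates=("utf-8", "windows-1250")):
    """Return (text, encoding) for the first candidate that decodes raw cleanly."""
    for encoding in candidates:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue  # this candidate does not fit, try the next one
    raise ValueError("none of the candidate encodings fit the data")


# Hypothetical sample: a Czech surname containing r-caron and a-acute.
name = "P\u0159ibylov\u00e1"
utf8_bytes = name.encode("utf-8")
cp1250_bytes = name.encode("windows-1250")

print(decode_with_fallback(utf8_bytes)[1])
print(decode_with_fallback(cp1250_bytes)[1])
```

This only tells you that the bytes are consistent with an encoding, not that the guess is right, which is why asking the data's provider remains the more reliable route.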

  3. Lad


    Fredrik Lundh wrote:
    > Lad wrote:
    >
    > > To be able to decode a string successfully, I need to know what coding
    > > it is in.

    >
    > ask whoever provided the string.
    >
    > > The string can be coded in utf8 or in windows-1250 or in another
    > > coding. Is there a method how to find out the string coding.

    >
    > in general, no. if you have enough text, you may guess, but the right
    > approach for that depends on the application.
    >
    > </F>

    Fredrik,
    Thank you for your reply.
    The text comes from a MySQL table field that uses the utf8_czech_ci
    collation, but when I try
    `RealName`.decode('utf8'), where RealName is that MySQL field,

    I get:
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 3: ordinal not in range(128)

    Can you please suggest a solution?
    Thank you
    L.
     
    Lad, Aug 21, 2006
    #3
  4. Lad wrote:

    > The text is from Mysql table field that uses utf8_czech_ci collation,
    > but when I try
    > `RealName`.decode('utf8'),where RealName is that field of MySQL
    >
    > I will get:
    > UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 3: ordinal not in range(128)
    >
    > Can you please suggest the solution?


    Do you get this from converting the value from the database or from trying
    to print the unicode string? Can you give us the output of

    print repr(RealName)

    Ciao,
    Marc 'BlackJack' Rintsch
     
    Marc 'BlackJack' Rintsch, Aug 21, 2006
    #4
  5. Lad


    Marc 'BlackJack' Rintsch wrote:
    > Lad wrote:
    >
    > > The text is from Mysql table field that uses utf8_czech_ci collation,
    > > but when I try
    > > `RealName`.decode('utf8'),where RealName is that field of MySQL
    > >
    > > I will get:
    > > UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 3: ordinal not in range(128)
    > >
    > > Can you please suggest the solution?

    >
    > Do you get this from converting the value from the database or from trying
    > to print the unicode string? Can you give us the output of
    >
    > print repr(RealName)
    >
    > Ciao,
    > Marc 'BlackJack' Rintsch

    For the
    print repr(RealName) command
    I get

    P?ibylov\xe1 Ludmila
    where there should be another character instead of the ?.
    Thank you for your help.
    L.
     
    Lad, Aug 22, 2006
    #5
  6. Lad wrote:

    > for
    > print repr(RealName) command
    > I will get
    >
    > P?ibylov\xe1 Ludmila
    > where instead of ? should be also a character


    that's not very likely; repr() always includes quotes, always escapes
    non-ASCII characters, and optionally includes a Unicode prefix.

    please try this

    print "*", repr(RealName), type(RealName), "*"

    and post the entire output; that is, *everything* between the asterisks.

    </F>
     
    Fredrik Lundh, Aug 22, 2006
    #6
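
A rough Python 3 counterpart of the diagnostic above, as a sketch only (the thread uses the Python 2 print statement). The helper name `describe` and the sample values are hypothetical. In Python 3, `ascii()` plays the role that `repr()` played in Python 2: it always quotes the value and escapes every non-ASCII character, so the output survives a terminal that cannot display the characters.

```python
def describe(value):
    """Return an unambiguous, ASCII-only description of value and its type."""
    return "* %s %s *" % (ascii(value), type(value).__name__)


# Hypothetical sample values: the same word as raw bytes and as decoded text.
print(describe(b"caf\xe9"))
print(describe("caf\u00e9"))
```

The type name is the important part: it shows whether the database layer handed back undecoded bytes or an already decoded string.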
  7. Lad


    Fredrik Lundh wrote:
    > Lad wrote:
    >
    > > for
    > > print repr(RealName) command
    > > I will get
    > >
    > > P?ibylov\xe1 Ludmila
    > > where instead of ? should be also a character

    >
    > that's not very likely; repr() always includes quotes, always escapes
    > non-ASCII characters, and optionally includes a Unicode prefix.
    >
    > please try this
    >
    > print "*", repr(RealName), type(RealName), "*"
    >
    > and post the entire output; that is, *everything* between the asterisks.
    >

    The result of print "*", repr(RealName), type(RealName), "*" is

    * 'Fritschov\xe1 Laura' <type 'str'> *


    Best regards,
    L
     
    Lad, Aug 22, 2006
    #7
  8. "Lad" wrote:

    > The result of print "*", repr(RealName), type(RealName), "*" is
    >
    > * 'Fritschov\xe1 Laura' <type 'str'> *


    looks like the MySQL interface is returning 8-bit strings using ISO-8859-1
    encoding (or some variation of that; \xE1 is "LATIN SMALL LETTER A
    WITH ACUTE" in 8859-1).

    have you tried passing "use_unicode=True" to the connect() call ?

    </F>
     
    Fredrik Lundh, Aug 22, 2006
    #8
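
The reasoning above can be checked with a short Python 3 sketch (the thread itself uses Python 2). It takes the bytes from the posted output, confirms that they decode as ISO-8859-1, and shows that they are not valid UTF-8. The variable names are illustrative.

```python
import unicodedata

# The byte string from the posted output.
raw = b"Fritschov\xe1 Laura"

# Decoded as ISO-8859-1, byte 0xE1 becomes U+00E1.
text = raw.decode("iso-8859-1")
print(unicodedata.name(text[9]))

# The same bytes are not valid UTF-8: a lone 0xE1 is an incomplete sequence.
try:
    raw.decode("utf-8")
    is_valid_utf8 = True
except UnicodeDecodeError:
    is_valid_utf8 = False
print(is_valid_utf8)

# For comparison, the UTF-8 encoding of the same character takes two bytes.
print("\u00e1".encode("utf-8"))
```

The last line gives a two-byte sequence beginning with 0xC3, the byte named in the earlier UnicodeDecodeError. One plausible reading is that the earlier value really was UTF-8 and was being pushed through the default ASCII codec, but the thread does not confirm this.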
  9. Lad


    Fredrik Lundh wrote:
    > "Lad" wrote:
    >
    > > The result of print "*", repr(RealName), type(RealName), "*" is
    > >
    > > * 'Fritschov\xe1 Laura' <type 'str'> *

    >
    > looks like the MySQL interface is returning 8-bit strings using ISO-8859-1
    > encoding (or some variation of that; \xE1 is "LATIN SMALL LETTER A
    > WITH ACUTE" in 8859-1).
    >
    > have you tried passing "use_unicode=True" to the connect() call ?
    >
    > </F>


    Fredrik,
    Thank you for your reply.
    I found out that if I do not decode the string at all, it looks
    correct. But I do not know why it is OK without decoding.
    I use Django, and I do not pass "use_unicode=True" to the connect() call.
     
    Lad, Aug 22, 2006
    #9
