Re: problem with unicode

Discussion in 'Python' started by John Machin, Apr 25, 2008.

    On Apr 25, 9:15 pm, ""
    <> wrote:
    > Hi everybody,
    >
    > I'm using the win32 console and have the following short program
    > excerpt
    >
    > # media is a binary string (mysql escaped zipped file)
    >
    > >> print media
    >
    > xワユロ[ヨ ...
    > (works)
    >
    > >> print unicode(media)
    >
    > UnicodeDecodeError: 'ascii' codec can't decode byte 0x9c in position
    > 1: ordinal not in range(128)
    > (ok i guess print assumes you want to print to ascii)


    Guessing is no substitute for reading the manual.

    print has nothing to do with your problem; the problem is
    unicode(media) -- as you specified no encoding, it uses the default
    encoding, which is ascii [unless you have been mucking about, which is
    not recommended]. As the 2nd byte (position 1) is 0x9c, which lies
    outside the ascii range 0x00-0x7f, the ascii codec cannot decode it.
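
A minimal sketch of that failure, assuming the default encoding is ascii, in which case unicode(media) behaves like media.decode('ascii'). The value of media here is a hypothetical stand-in that only reproduces the 'x' followed by 0x9c seen in the post, and try_decode is a helper name invented for illustration. It is written to run on Python 2.6+ and Python 3.

```python
# Hypothetical stand-in for the poster's data: 'x' followed by byte 0x9c.
media = b"x\x9c\x95\x5b"


def try_decode(data, encoding):
    """Attempt an explicit decode; return (ok, detail)."""
    try:
        return True, data.decode(encoding)
    except UnicodeDecodeError as exc:
        # exc.start is the offset of the first byte the codec rejected.
        bad_byte = bytearray(data)[exc.start]
        return False, "byte 0x%02x at position %d" % (bad_byte, exc.start)


ascii_result = try_decode(media, "ascii")
print(ascii_result)
```

The reported byte and position match the traceback in the original post: the codec stops at the first byte above 0x7f.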


    >
    > >> print unicode(media).encode('utf-8')
    >
    > UnicodeDecodeError: 'ascii' codec can't decode byte 0x9c in position
    > 1: ordinal not in range(128)
    > (why does this not work?)


    Already unicode(media) "doesn't work", so naturally(?)
    unicode(media).whatever() won't be better -- whatever won't be called.

    >
    > # mapString is a unicode string (i think at least)
    > >> print "'" + mapString + "'"
    >
    > ' yu_200703_hello\ 831 v1234.9874 '
    >
    > >> mystr = "%s %s" % (mapString, media)
    >
    > UnicodeDecodeError: 'ascii' codec can't decode byte 0x9c in position
    > 1: ordinal not in range(128)
    >
    > >> mystr = "%s %s" % (mapString.encode('utf-8'), media.encode('utf-8'))
    >
    > UnicodeDecodeError: 'ascii' codec can't decode byte 0x9c in position
    > 1: ordinal not in range(128)


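    This is merely repeating the original problem. Assuming mapString
    really is unicode, "%s %s" % (mapString, media) produces a unicode
    result, so media is implicitly decoded using the default (ascii)
    encoding. Likewise, media.encode('utf-8') on a str object first
    decodes it implicitly with ascii before it can encode anything.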
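
One possible way to avoid the implicit ascii decode, sketched under two assumptions: that mapString is text and that media is raw binary that is not text in any encoding. Both values below are hypothetical stand-ins, since the real ones were not shown in the thread. The idea is to encode the text explicitly and join bytes with bytes, so media is never decoded at all. It is written to run on Python 2.6+ and Python 3.

```python
# Hypothetical stand-ins; the real values were not shown in the thread.
map_string = u"yu_200703_hello"   # text (unicode)
media = b"x\x9c\x95\x5b"          # binary data, not text in any encoding

# Encode the text explicitly, then concatenate bytes with bytes.
# media is left untouched, so no ascii decode is ever attempted.
combined = map_string.encode("utf-8") + b" " + media

print(repr(combined))
```

Whether a byte string is the right result depends on where the combined value is going, which the original post does not say.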

    >
    > I don't know what to do. I just want to concatenate two strings where
    > apparently one is a binary string, the other one is a unicode string
    > and I always seem to get this error.
    >
    > Any help is appreciated :)


    We need a clue or two; do this and let us know what it says:

    print type(media), repr(media)
    print type(mapString), repr(mapString)
    import sys; print sys.stdout.encoding

    Also you say that "print media" works. Do you mean that it produces
    some meaningful text that you understand? What I see on the screen in
    Google Groups is the following 6 characters:
    LATIN SMALL LETTER X
    KATAKANA LETTER WA
    KATAKANA LETTER YU
    KATAKANA LETTER RO
    LEFT SQUARE BRACKET
    KATAKANA LETTER YO
    Is that what you see?

    What is it that you call "win32 console"?
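
The character names listed above can be reproduced with the standard unicodedata module. This is a small sketch, assuming the six characters are the code points written as escapes below (they are spelled out as escapes so the example does not depend on how this page was encoded).

```python
import unicodedata

# The six characters as displayed in the post, written as escapes.
shown = u"x\u30ef\u30e6\u30ed[\u30e8"

# Look up the official Unicode name of each character.
names = [unicodedata.name(ch) for ch in shown]
for name in names:
    print(name)
```

Running the same lookup on whatever the console actually displays is a quick way to compare what two people see on screen.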
     
    John Machin, Apr 25, 2008