UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position

Discussion in 'Python' started by iMath, Dec 6, 2012.

  1. iMath

    iMath Guest

    The following code is originally from http://zetcode.com/databases/mysqlpythontutorial/,
    in the "Writing images" part.


    import MySQLdb as mdb
    import sys

    try:
        fin = open("Chrome_Logo.svg.png",'rb')
        img = fin.read()
        fin.close()

    except IOError as e:

        print ("Error %d: %s" % (e.args[0],e.args[1]))
        sys.exit(1)


    try:
        conn = mdb.connect(host='localhost',user='testuser',
            passwd='test623', db='testdb')
        cursor = conn.cursor()
        cursor.execute("INSERT INTO Images SET Data='%s'" % \
            mdb.escape_string(img))

        conn.commit()

        cursor.close()
        conn.close()

    except mdb.Error as e:

        print ("Error %d: %s" % (e.args[0],e.args[1]))
        sys.exit(1)


    I ported it to Python 3 and also changed
        fin = open("chrome.png")
    to
        fin = open("Chrome_Logo.png",'rb')
    but when I run it, it gives the following error:

    Traceback (most recent call last):
      File "E:\Python\py32\itest4.py", line 20, in <module>
        mdb.escape_string(img))
    UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte

    So how do I fix it?
    iMath, Dec 6, 2012
    #1

  2. Terry Reedy

    Terry Reedy Guest

    Re: UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in

    On 12/6/2012 5:07 AM, iMath wrote:
    > the following code originally from http://zetcode.com/databases/mysqlpythontutorial/
    > within the "Writing images" part .
    >
    >
    > import MySQLdb as mdb


    Not part of stdlib. 'MySQLdb' should be in the subject line to get
    attention of someone who is familiar with it. I am not.

    > import sys
    >
    > try:
    > fin = open("Chrome_Logo.svg.png",'rb')
    > img = fin.read()
    > fin.close()
    >
    > except IOError as e:
    >
    > print ("Error %d: %s" % (e.args[0],e.args[1]))
    > sys.exit(1)
    >
    >
    > try:
    > conn = mdb.connect(host='localhost',user='testuser',
    > passwd='test623', db='testdb')
    > cursor = conn.cursor()
    > cursor.execute("INSERT INTO Images SET Data='%s'" % \
    > mdb.escape_string(img))


    From the name, I would expect that escape_string expects text. From the
    error, it seems to specifically expect utf-8 encoded bytes. After
    decoding, I expect that it does some sort of 'escaping'. An image does
    not qualify as that sort of input. If escape_string takes an encoding
    arg, latin1 *might* work.
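
    To illustrate the latin1 remark: Latin-1 maps each of the 256 byte
    values to the code point with the same number, so decoding arbitrary
    bytes with it never fails, and encoding the result gives back the
    original bytes. A minimal sketch using only built-in bytes/str methods;
    the sample bytes are made up, and it says nothing about whether
    escape_string actually accepts an encoding argument.

```python
# Made-up sample: the start of a PNG signature plus two extra bytes.
img = b"\x89PNG\x00\xff"

# UTF-8 rejects 0x89 as a start byte, which is the error in the traceback.
try:
    img.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Latin-1 accepts every byte value, so decoding cannot fail...
text = img.decode("latin-1")

# ...and encoding the text again recovers the original bytes exactly.
round_trip = text.encode("latin-1")

print(utf8_ok, len(text), round_trip == img)
```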

    > conn.commit()
    >
    > cursor.close()
    > conn.close()
    >
    > except mdb.Error as e:
    >
    > print ("Error %d: %s" % (e.args[0],e.args[1]))
    > sys.exit(1)
    >
    >
    > I port it to python 3 ,and also change
    > fin = open("chrome.png")
    > to
    > fin = open("Chrome_Logo.png",'rb')
    > but when I run it ,it gives the following error :
    >
    > Traceback (most recent call last):
    > File "E:\Python\py32\itest4.py", line 20, in <module>
    > mdb.escape_string(img))
    > UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte
    >
    > so how to fix it ?
    >



    --
    Terry Jan Reedy
    Terry Reedy, Dec 6, 2012
    #2

  3. Hans Mulder

    Hans Mulder Guest

    On 6/12/12 11:07:51, iMath wrote:
    > the following code originally from http://zetcode.com/databases/mysqlpythontutorial/
    > within the "Writing images" part .
    >
    >
    > import MySQLdb as mdb
    > import sys
    >
    > try:
    > fin = open("Chrome_Logo.svg.png",'rb')
    > img = fin.read()
    > fin.close()
    >
    > except IOError as e:
    >
    > print ("Error %d: %s" % (e.args[0],e.args[1]))
    > sys.exit(1)
    >
    >
    > try:
    > conn = mdb.connect(host='localhost',user='testuser',
    > passwd='test623', db='testdb')
    > cursor = conn.cursor()
    > cursor.execute("INSERT INTO Images SET Data='%s'" % \
    > mdb.escape_string(img))


    You shouldn't call mdb.escape_string directly. Instead, you
    should put placeholders in your SQL statement and let MySQLdb
    figure out how to properly escape whatever needs escaping.

    Somewhat confusingly, placeholders are written as %s in MySQLdb.
    They differ from strings in not being enclosed in quotes.
    The other difference is that you'd provide two arguments to
    cursor.execute; the second of these is a tuple; in this case
    a tuple with only one element:

    cursor.execute("INSERT INTO Images SET Data=%s", (img,))
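
    A runnable sketch of the same idea, using the standard-library sqlite3
    module as a stand-in because MySQLdb needs a running server. Assumptions:
    sqlite3 writes placeholders as ? rather than %s, and the table layout
    and sample bytes here are invented for illustration.

```python
import sqlite3

# Invented sample data standing in for the contents of a PNG file.
img = b"\x89PNG\r\n\x1a\n" + bytes(range(256))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Images (Id INTEGER PRIMARY KEY, Data BLOB)")

# The SQL contains only a placeholder; the bytes travel as a separate
# parameter, so no manual escaping or decoding takes place.
conn.execute("INSERT INTO Images (Data) VALUES (?)", (img,))
conn.commit()

# Read the row back to check that the binary data survived unchanged.
stored = conn.execute("SELECT Data FROM Images").fetchone()[0]
conn.close()

print(stored == img)
```

    With MySQLdb the call keeps the shape shown above: the statement with
    an unquoted %s as the first argument, the tuple as the second.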

    > conn.commit()
    >
    > cursor.close()
    > conn.close()
    >
    > except mdb.Error as e:
    >
    > print ("Error %d: %s" % (e.args[0],e.args[1]))
    > sys.exit(1)
    >
    >
    > I port it to python 3 ,and also change
    > fin = open("chrome.png")
    > to
    > fin = open("Chrome_Logo.png",'rb')
    > but when I run it ,it gives the following error :
    >
    > Traceback (most recent call last):
    > File "E:\Python\py32\itest4.py", line 20, in <module>
    > mdb.escape_string(img))
    > UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte
    >
    > so how to fix it ?


    Python 3 distinguishes between binary data and Unicode text.
    Trying to apply string functions to images or other binary
    data won't work.

    Maybe correcting this bytes/strings confusion and porting
    to Python 3 in one go is too large a transformation. In
    that case, your best bet would be to go back to Python 2
    and fix all the bytes/string confusion there. When you've
    got it working again, you may be ready to port to Python 3.


    Hope this helps,

    -- HansM
    Hans Mulder, Dec 6, 2012
    #3
  4. Re: UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in

    On Thu, 06 Dec 2012 02:07:51 -0800, iMath wrote:

    > the following code originally from
    > http://zetcode.com/databases/mysqlpythontutorial/ within the "Writing
    > images" part .
    >
    >
    > import MySQLdb as mdb
    > import sys
    >
    > try:
    > fin = open("Chrome_Logo.svg.png",'rb')
    > img = fin.read()
    > fin.close()
    > except IOError as e:
    > print ("Error %d: %s" % (e.args[0],e.args[1]))
    > sys.exit(1)


    Every time a programmer catches an exception, only to merely print a
    vague error message and then exit, God kills a kitten. Please don't do
    that.

    If all you are going to do is print an error message and then exit,
    please don't bother. All you do is make debugging harder. When Python
    detects an error, by default it prints a full traceback, which gives you
    lots of information to track down the error. By catching that exception
    as you do, you lose that information and make it harder to debug.
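
    A minimal sketch of the alternative, assuming you do want your own
    message: print it to stderr and then re-raise with a bare raise, so the
    exception and its traceback are not lost. The function name read_image
    and the file name are made up for the example.

```python
import sys

def read_image(path):
    """Return the file's bytes; report and re-raise on I/O errors."""
    try:
        # The with block closes the file even if read() fails.
        with open(path, "rb") as fin:
            return fin.read()
    except OSError as e:
        # Add context, then let the original exception keep propagating.
        print("could not read %r: %s" % (path, e), file=sys.stderr)
        raise

# Demonstration only: catch the re-raised error so this script finishes.
try:
    read_image("no_such_file_for_this_example.png")
    outcome = "read"
except FileNotFoundError:
    outcome = "re-raised"

print(outcome)
```

    In a real script you would leave out the outer try/except and let
    Python print the full traceback.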

    Moving on to the next thing:


    [snip code]
    > I port it to python 3 ,and also change fin = open("chrome.png")
    > to
    > fin = open("Chrome_Logo.png",'rb')
    > but when I run it ,it gives the following error :
    >
    > Traceback (most recent call last):
    > File "E:\Python\py32\itest4.py", line 20, in <module>
    > mdb.escape_string(img))
    > UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 0:
    > invalid start byte
    >
    > so how to fix it ?


    I suggest you start by reading the documentation for
    MySQLdb.escape_string. What does it do? What does it expect? A byte
    string or a unicode text string?

    It seems very strange to me that you are reading a binary file, then
    passing it to something which appears to be expecting a string. It looks
    like what happens is that the PNG image starts with a 0x89 byte, and the
    escape_string function tries to decode those bytes into Unicode text:

    py> img = b"\x89\x00\x23\xf2" # fake PNG binary data
    py> img.decode('utf-8') # I'm expecting text
    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 0:
    invalid start byte

    Without knowing more about escape_string, I can only make a wild guess.
    Try this:

    import base64
    img = fin.read() # read the binary data of the PNG file
    data = base64.encodebytes(img) # turn the binary image into text
    cursor.execute("INSERT INTO Images SET Data='%s'" % \
    mdb.escape_string(data))


    and see what that does.
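
    For reference, a sketch of what the base64 step produces, using the
    fake bytes from the session above in place of the file contents. Note
    that in Python 3 base64.encodebytes returns bytes containing only ASCII
    characters, not a str; decoding them as ASCII gives text. Anything
    stored this way has to go through base64.decodebytes when it is read
    back.

```python
import base64

img = b"\x89\x00\x23\xf2"  # fake PNG binary data

data = base64.encodebytes(img)   # ASCII-only bytes, ending in a newline
text = data.decode("ascii")      # a str that is safe to treat as text

# Reverse the process to get the original binary data back.
restored = base64.decodebytes(text.encode("ascii"))

print(type(data).__name__, repr(text), restored == img)
```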


    --
    Steven
    Steven D'Aprano, Dec 6, 2012
    #4
  5. iMath

    iMath Guest

    Re: UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in

    On Thursday, December 6, 2012 at 7:07:35 PM UTC+8, Hans Mulder wrote:
    > On 6/12/12 11:07:51, iMath wrote:
    > > [snip code]
    > > cursor.execute("INSERT INTO Images SET Data='%s'" % \
    > > mdb.escape_string(img))
    >
    > You shouldn't call mdb.escape_string directly. Instead, you
    > should put placeholders in your SQL statement and let MySQLdb
    > figure out how to properly escape whatever needs escaping.
    >
    > Somewhat confusingly, placeholders are written as %s in MySQLdb.
    > They differ from strings in not being enclosed in quotes.
    > The other difference is that you'd provide two arguments to
    > cursor.execute; the second of these is a tuple; in this case
    > a tuple with only one element:
    >
    > cursor.execute("INSERT INTO Images SET Data=%s", (img,))

    Thanks, but it still doesn't work.
    iMath, Dec 7, 2012
    #5
