Strange unicode / no unicode phenomen with mysql

Discussion in 'Python' started by Hans Müller, Dec 3, 2009.

  1. Hans Müller

    Hans Müller Guest

    I have a strange unicode problem with mySQL and sqlite.

    In my application I get a table as a sqlite table which is being compared to an existing mySQL Table.

    The sqlite drive returns all strings from the table as a unicode string which is Ok.
    The mysql drive returns all strings as utf-8 coded strings (no unicode!).

    When opening the mySQL database, use unicode is set to true, so the driver should return
    unicode strings.

    Any ideas ?

    This is the mySQL table definition:
    CREATE TABLE `USERNAMES` (
    `NAME` varchar(256) COLLATE utf8_bin NOT NULL,
    `ID` mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
    PRIMARY KEY (`NAME`),
    KEY `BYID` (`ID`)
    ) ENGINE=MyISAM AUTO_INCREMENT=59325 DEFAULT CHARSET=utf8 COLLATE=utf8_bin COMMENT='Table for mapping user names to IDs'


    The sqlite Table was created this way:

    sq3Cursor.execute("create table USERNAMES(NAME text, ID integer)")

    When I query a value from both tables I get:

    sqlite:
    >>> SrcCursor.execute("select * from USERNAMES where ID=49011")

    <sqlite3.Cursor object at 0x2b6096bfc240>
    >>> SrcCursor.fetchone()

    (u'J\xd6RG R\xd6\xdfMANN', 49011)
    >>> print u'J\xd6RG R\xd6\xdfMANN'.encode("utf8")

    JÖRG RÖßMANN

    This is Ok.

    Now mysql:

    >>> DstCursor.execute("select * from USERNAMES where ID=49011")

    1L
    >>> DstCursor.fetchone()

    ('J\xc3\x96RG R\xc3\x96\xc3\x9fMANN', 49011)
    This is the same result, but returned as a utf-8 coded string, not unicode
    >>> 'J\xc3\x96RG R\xc3\x96\xc3\x9fMANN'.decode("utf8")

    u'J\xd6RG R\xd6\xdfMANN'


    The mySQL database has been opened this way:

    DstCon = MySQLdb.connect(host = DstServer, user = config["DBUser"], passwd = config["DBPasswd"], db = DstDBName, use_unicode = True, charset = "utf8")
    DstCursor = DstCon.cursor()

    Since use_unicode is set to True, I expect query results to be unicode (for string data types).

    Trying another table,
    the result for a query is as aspected a unicode string.
    Hans Müller, Dec 3, 2009
    #1
    1. Advertising

  2. Hans Müller

    Hans Müller Guest

    I found the bug, it's in the mysql module.

    When a column has COLLATE=utf8_bin set,

    the column is NOT returned as unicode.

    It's a known bug #541198

    Thanks all for reading.

    Greetings

    Hans
    Hans Müller, Dec 3, 2009
    #2
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. JL
    Replies:
    0
    Views:
    1,126
  2. Robert Mark Bram
    Replies:
    0
    Views:
    3,897
    Robert Mark Bram
    Sep 28, 2003
  3. Ravi
    Replies:
    6
    Views:
    1,390
    Suchandra Thapa
    Jul 21, 2003
  4. Hans Müller
    Replies:
    1
    Views:
    1,729
    Dennis Lee Bieber
    Dec 4, 2009
  5. Grzegorz ¦liwiñski
    Replies:
    2
    Views:
    927
    Grzegorz ¦liwiñski
    Jan 19, 2011
Loading...

Share This Page