Strange MySQL / sqlite3 Problem with unicode

Hans Müller

I have a strange unicode problem with MySQL and sqlite.

In my application I receive a table as a sqlite table, which is compared to an existing MySQL table.

The sqlite driver returns all strings from the table as unicode strings, which is OK.
The MySQL driver returns all strings as UTF-8 encoded byte strings (not unicode!).

When opening the MySQL database, use_unicode is set to True, so the driver should return
unicode strings.

Any ideas?

This is the MySQL table definition:
CREATE TABLE `USERNAMES` (
`NAME` varchar(256) COLLATE utf8_bin NOT NULL,
`ID` mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`NAME`),
KEY `BYID` (`ID`)
) ENGINE=MyISAM AUTO_INCREMENT=59325 DEFAULT CHARSET=utf8 COLLATE=utf8_bin COMMENT='Table for mapping user names to IDs'


The sqlite table was created this way:

sq3Cursor.execute("create table USERNAMES(NAME text, ID integer)")

When I query a value from both tables I get:

sqlite: JÖRG RÖßMANN

This is OK.
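
For reference, the sqlite side can be reproduced with an in-memory database. This is a minimal sketch, assuming only the standard sqlite3 module; the inserted row is the sample value from this post (with the ID taken from the MySQL result below), not a dump of the real table.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("create table USERNAMES(NAME text, ID integer)")

# Sample value from this post, passed as a unicode string.
cur.execute("insert into USERNAMES values (?, ?)",
            (u"J\xd6RG R\xd6\xdfMANN", 49011))

cur.execute("select NAME, ID from USERNAMES")
name, user_id = cur.fetchone()

# sqlite3 hands TEXT columns back as unicode text, not as encoded bytes.
print(type(name), repr(name))
con.close()
```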

Now MySQL:
('J\xc3\x96RG R\xc3\x96\xc3\x9fMANN', 49011)
This is the same result, but returned as a UTF-8 encoded string, not as the unicode string u'J\xd6RG R\xd6\xdfMANN'.
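
To make the difference concrete: the two values hold the same text, one as UTF-8 encoded bytes and one as a unicode string. A small sketch using the literal values above (the variable names are only for illustration):

```python
# Byte string as returned by the MySQL driver (UTF-8 encoded).
mysql_name = b"J\xc3\x96RG R\xc3\x96\xc3\x9fMANN"

# Unicode string as returned by the sqlite driver.
sqlite_name = u"J\xd6RG R\xd6\xdfMANN"

# Decoding the bytes as UTF-8 yields the same text.
decoded = mysql_name.decode("utf-8")
print(decoded == sqlite_name)

# The byte string is longer because each umlaut and the sharp s
# take two bytes in UTF-8.
print(len(mysql_name), len(sqlite_name))
```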


The MySQL database has been opened this way:

DstCon = MySQLdb.connect(host = DstServer, user = config["DBUser"], passwd = config["DBPasswd"], db = DstDBName, use_unicode = True, charset = "utf8")
DstCursor = DstCon.cursor()

Since use_unicode is set to True, I expect query results to be unicode (for string data types).
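
Until the driver returns unicode directly, one possible workaround is to decode byte strings after fetching, so that rows from both databases can be compared. This is only a sketch: `to_text_row` is a hypothetical helper, not part of MySQLdb, and it assumes every byte string in a row is UTF-8 encoded text.

```python
def to_text_row(row, encoding="utf-8"):
    """Return a copy of row with byte strings decoded to unicode text.

    Hypothetical helper; assumes all byte strings hold text in `encoding`.
    """
    return tuple(
        col.decode(encoding) if isinstance(col, bytes) else col
        for col in row
    )

# Rows as shown in the query results above.
mysql_row = (b"J\xc3\x96RG R\xc3\x96\xc3\x9fMANN", 49011)
sqlite_row = (u"J\xd6RG R\xd6\xdfMANN", 49011)

# After normalizing, the two rows compare equal.
rows_match = to_text_row(mysql_row) == sqlite_row
print(rows_match)
```

Values that are already unicode, and non-string columns such as the ID, pass through unchanged.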
 
Dennis Lee Bieber

> When opening the mySQL database, use unicode is set to true, so the driver should return
> unicode strings.

From the MySQLdb documentation:
http://mysql-python.sourceforge.net/MySQLdb.html

doc> use_unicode
doc>
doc> If True, CHAR and VARCHAR and TEXT columns are returned as
doc> Unicode strings, using the configured character set.

Note the last clause. And...

doc> charset
doc>
doc> If present, the connection character set will be changed to
doc> this character set, if they are not equal.

And for MySQL itself:
http://dev.mysql.com/doc/refman/5.0/en/charset-unicode.html

doc> MySQL 5.0 supports two character sets for storing Unicode data:
doc>
doc> * ucs2, the UCS-2 encoding of the Unicode character set
doc>   using 16 bits per character
doc> * utf8, a UTF-8 encoding of the Unicode character set
doc>   using one to three bytes per character
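
As a rough illustration of the two encodings named in the quoted MySQL documentation, the byte lengths can be checked in Python. This is a sketch under one assumption: Python's "utf-16-be" codec is used as a stand-in for ucs2, which it matches only for characters in the Basic Multilingual Plane.

```python
# Sample characters: J, O-umlaut, sharp s, euro sign.
samples = [u"J", u"\xd6", u"\xdf", u"\u20ac"]

# Bytes per character in UTF-8 (variable width).
utf8_lengths = [len(ch.encode("utf-8")) for ch in samples]

# Bytes per character in UTF-16 big-endian, standing in for ucs2 (fixed width here).
ucs2_lengths = [len(ch.encode("utf-16-be")) for ch in samples]

for ch, n8, n16 in zip(samples, utf8_lengths, ucs2_lengths):
    print(repr(ch), n8, n16)
```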

> Since use_unicode is set to True, I expect query results to be unicode (for string data types).

Based upon the documentation I referenced, you are getting exactly
what you asked for -- string data returned as unicode encoded in UTF-8.
 
