Re: 'ascii' codec can't encode character u'\xf3'

oziko

I solved the problem using

print str.encode('iso-8859-1')

Now I can print the tags with no apparent problem. But when I tried to
insert that value into a PostgreSQL database I got the same error. I
created the PostgreSQL database with Unicode as its default encoding with

createdb -E UNICODE oggtest

The data I am putting into the database is in the u'Perfeccion' format, so
I understand it is Unicode, but I get the same error:

Traceback (most recent call last):
File "./ogg2sql.py", line 82, in ?
db_cursor.execute(do)
File "/usr/lib/python2.3/site-packages/pyPgSQL/PgSQL.py", line 3035,
in execute
_qstr = self.__unicodeConvert(_qstr)
File "/usr/lib/python2.3/site-packages/pyPgSQL/PgSQL.py", line 2740,
in __unicodeConvert
return obj.encode(*self.conn.client_encoding)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf3' in
position 102: ordinal not in range(128)


my insert query is:

tracks_insert_values =(unicode(coments['TITLE']),coments['TRACKNUMBER'])

I also tried with:

tracks_insert_values=(coments['TITLE'].encode('utf-8'),coments['TRACKNUMBER'])

insert_query = '''insert into tracks(titulo,no_pista)values(%s %i)''' % tracks_insert_values
 
Diez B. Roggisch

oziko said:
Now I can print the tags with no apparent problem. But when I tried to
insert that value into a PostgreSQL database I got the same error. I
created the PostgreSQL database with Unicode as its default encoding with

There seems to be a general misunderstanding about what Unicode, an
encoding, and the combination of the two mean in Python.

Unicode is only an abstract definition of a character set - the usual
suspects like what is in ASCII, but also nearly everything anybody on this
planet of ours cares to write down once in a while.

Now an actual encoding is how these totally abstract characters are
mapped to actual values. So for the capital letter "A", the ASCII encoding
maps it to the well-known value 65.

BUT: You can define another encoding, call it oziko or whatever, and map "A"
to 1 - if you like it.
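
To make the made-up "oziko" encoding concrete, here is a minimal sketch. The table and the helper names `oziko_encode` and `oziko_decode` are invented for illustration only; nothing like them exists in Python itself.

```python
# Hypothetical toy encoding: "A" maps to 1, "B" to 2, "C" to 3.
OZIKO_TABLE = {u'A': 1, u'B': 2, u'C': 3}
OZIKO_REVERSE = dict((number, char) for char, number in OZIKO_TABLE.items())

def oziko_encode(text):
    """Turn abstract characters into the numbers this toy encoding assigns."""
    return [OZIKO_TABLE[char] for char in text]

def oziko_decode(numbers):
    """Turn the numbers back into characters."""
    return u''.join(OZIKO_REVERSE[number] for number in numbers)

print(oziko_encode(u'ABCA'))
print(oziko_decode([1, 2, 3, 1]))
print(ord(u'A'))  # ASCII and Unicode both place "A" at 65
```

The characters are the same in both cases; only the numbers chosen to represent them differ.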

Now UTF-8 is also only an encoding - with the capability to map all of
ASCII onto the usual numbers where you expect them, and a few escape values
that allow multi-byte sequences to appear in the text, each encoding one
character. So it can encode the whole Unicode set, at the price of not
being able to determine the length of a string by dividing the number of
bytes it contains by the number of bytes a character uses - usually one.
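
A short sketch of that length difference, using the title from the question with its accent restored (the variable names are just for illustration). It should run unchanged on Python 2 and Python 3.

```python
title = u'Perfecci\xf3n'                 # 10 characters; \xf3 is the accented o

as_utf8 = title.encode('utf-8')          # the accented o needs two bytes here
as_latin1 = title.encode('iso-8859-1')   # one byte per character

print(len(title))      # number of characters
print(len(as_utf8))    # number of bytes in UTF-8
print(len(as_latin1))  # number of bytes in ISO-8859-1
```

The character count stays at 10, while the UTF-8 byte count is 11 because of the single non-ASCII character.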

So this is an extremely important lesson: unicode is _not_ - I repeat, _not_
- UTF-8.

Now Python has unicode objects. They are sequences of characters - what
shape these have internally is opaque to you and not your concern. They
are _not_ strings!!!! Strings in Python are sequences of bytes - as we are
used to from C.

Now whenever you want to use a string that is encoded in a specific encoding,
you can get it from a unicode object by invoking encode on it. That's what

u.encode('iso-8859-1')

does, if u is a unicode object.

The other way round, if you have a byte-sequence - conveniently stored in a
string - and want to get a unicode object from it, use decode

s.decode('iso-8859-1')
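
As a small round-trip sketch of encode and decode (variable names are illustrative; the code should run on both Python 2 and Python 3, where the unicode type is called `str` and the byte string type is called `bytes`):

```python
text = u'Perfecci\xf3n'

raw = text.encode('iso-8859-1')    # unicode object -> bytes
back = raw.decode('iso-8859-1')    # bytes -> unicode object
print(back == text)

# Decoding with a codec other than the one used for encoding does not
# necessarily fail; it can silently give the wrong characters instead.
wrong = text.encode('utf-8').decode('iso-8859-1')
print(wrong == text)
```

The second comparison is false: the two UTF-8 bytes of the accented o come back as two separate characters.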

Now if you pass a unicode object to a function that wants a _string_, Python
automatically applies an encode for you - with the default encoding!!!! As
this is usually ASCII, you get the problems you had.
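
The failure can be reproduced by doing explicitly what the automatic conversion does, namely encoding with ASCII. This is only a sketch; the `except ... as` syntax assumes Python 2.6 or later, which is newer than the Python 2.3 in the traceback.

```python
title = u'Perfecci\xf3n'

failed = False
try:
    title.encode('ascii')            # what the implicit conversion amounts to
except UnicodeEncodeError as error:
    failed = True
    print(error)                     # same kind of message as in the traceback

# An explicit encode at least lets you choose what happens to characters
# that do not fit, here by substituting a question mark.
lossy = title.encode('ascii', 'replace')
print(lossy)
```

Replacing characters loses information, so it is shown only to illustrate that the codec, not the data, is the limiting factor.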

So what do you need to solve your problem at hand? You need to know which
encoding the SQL driver wants for transmitting strings - most probably
UTF-8, so that it can encode all possible characters. And thus you have to
encode the strings you pass beforehand, or set the default encoding
properly.
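
A minimal sketch of encoding beforehand, under two assumptions that are not confirmed in this thread: that the driver expects UTF-8, and that it passes an already-encoded byte string through untouched. The sample values are hypothetical stand-ins for the `coments` dictionary from the question.

```python
# Hypothetical tag values standing in for the real Ogg comments.
coments = {'TITLE': u'Perfecci\xf3n', 'TRACKNUMBER': u'3'}

# Build the whole statement as unicode text first ...
insert_query = u"insert into tracks(titulo, no_pista) values('%s', %i)" % (
    coments['TITLE'], int(coments['TRACKNUMBER']))

# ... then encode it exactly once, with the codec the driver is assumed to want.
encoded_query = insert_query.encode('utf-8')

print(repr(encoded_query))
```

Building SQL with string formatting is kept here only because the question does it; quoting of the values is not handled by this sketch.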

The last thing is to explain where the u''-thingies fit in. They are a
shortcut for getting a unicode object - whatever characters are encountered
inside the u'' are interpreted with the encoding the Python interpreter
uses to parse the file at hand. Which one that is can either be specified
implicitly (system settings) or explicitly using the


# -*- coding: <codec> -*-

line on top of the source file.
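
For example, a source file saved as UTF-8 could look like the sketch below. It assumes the editor really does save the file as UTF-8; if the declared codec and the actual file encoding disagree, the literal will come out wrong.

```python
# -*- coding: utf-8 -*-
# The declaration above has to be on the first or second line of the file.

title = u'Perfección'   # decoded using the codec declared at the top

print(len(title))
print(title == u'Perfecci\xf3n')
```

The escaped form `u'Perfecci\xf3n'` names the same character without depending on the file encoding at all.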

You might want to start reading about Unicode and Python on the net; Google
is, as always, your friend.
 
Diez B. Roggisch

So what do you need to solve your problem at hand? You need to know which
encoding the SQL driver wants for transmitting strings - most probably
UTF-8, so that it can encode all possible characters. And thus you have to
encode the strings you pass beforehand, or set the default encoding
properly.

Just saw that setting the encoding doesn't work - sorry for suggesting it.
 
