Unicode in writing to a file

Carbon Man

Py 2.5
Trying to write a string to a file.
self.dataUpdate.write(u"\nentry."+node.tagName+ u" = " + cValue)
cValue contains a unicode character. node.tagName is also a unicode string
though it has no special characters in it.
Getting the error:
UnicodeEncodeError: 'ascii' codec can't encode character u'\x93' in position
22: ordinal not in range(128)

Thanks.
 
Marco Mariani

Carbon said:
Py 2.5
Trying to write a string to a file.
self.dataUpdate.write(u"\nentry."+node.tagName+ u" = " + cValue)
cValue contains a unicode character. node.tagName is also a unicode string
though it has no special characters in it.

So what's the encoding of your file?

If you didn't open dataUpdate with codecs.open, and you don't encode the
string yourself, e.g. text.encode('utf-8'), Python has no way to know which
encoding to use.
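
A minimal sketch of the second option, encoding by hand before the write. The path and the sample value are made up for illustration (they stand in for dataUpdate and cValue); the only point is that the file receives bytes in an encoding you chose yourself.

```python
import os
import tempfile

# Hypothetical stand-in for cValue: a unicode string with non-ASCII characters.
text = u"\nentry.title = \u201cquoted\u201d"

# Hypothetical output path, used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "data_update.txt")

# Open in binary mode and encode explicitly, so no implicit ASCII step is involved.
f = open(path, "wb")
try:
    f.write(text.encode("utf-8"))
finally:
    f.close()

# Read the raw bytes back and decode them with the same encoding.
f = open(path, "rb")
try:
    round_trip = f.read().decode("utf-8")
finally:
    f.close()
```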
 
Ulrich Eckhardt

Carbon said:
self.dataUpdate.write(u"\nentry."+node.tagName+ u" = " + cValue)
cValue contains a unicode character. node.tagName is also a unicode string
though it has no special characters in it.
Getting the error:
UnicodeEncodeError: 'ascii' codec can't encode character u'\x93' in
position 22: ordinal not in range(128)

There are two operations:
1. Concatenating the strings.
2. Invoking write().

The first step I would take is to find out which of the two is raising the
exception. Anyway, it is probably the second one, i.e. the call to write().
What is happening there is that the file's codec is trying to convert the
Unicode string to bytes for the configured encoding ('ascii') but fails
because there is no representation for u'\x93' there (Note: ASCII only uses
the byte values from 0 to 127, the above is 147).
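
The failing step can be reproduced in isolation. This is only a sketch of the encoding step, not of the original program; the variable names are illustrative.

```python
# The character from the error message.
ch = u"\x93"
code_point = ord(ch)  # 147, outside the ASCII range

# Encoding to ASCII fails with the same exception type as in the traceback.
try:
    ch.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Encodings that cover this code point succeed.
latin1_bytes = ch.encode("latin-1")  # one byte: 0x93
utf8_bytes = ch.encode("utf-8")      # two bytes: 0xC2 0x93
```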

The remedy is to set the output encoding to e.g. UTF-8 (default for XML) or
ISO8859-1 (default for HTML) or whichever encoding you want. Otherwise,
just throw the error message at the search engine of your least distrust to
find a bazillion other users who had similar problems. ;)

Uli
 
Diez B. Roggisch

Carbon said:
Py 2.5
Trying to write a string to a file.
self.dataUpdate.write(u"\nentry."+node.tagName+ u" = " + cValue)
cValue contains a unicode character. node.tagName is also a unicode string
though it has no special characters in it.
Getting the error:
UnicodeEncodeError: 'ascii' codec can't encode character u'\x93' in
position 22: ordinal not in range(128)

Please don't confuse utf-8 with unicode. The former is an encoding of the
latter. That is a crucial difference, as something that is encoded (either
in utf-8 or any other encoding) needs to be *decoded* before it can be dealt
with in Python as a unicode object.

Which is what's causing your troubles here. cValue is a *byte*string
containing some non-ascii-characters.

If you are sure cValue is utf-8-encoded, you can do

u" = " + cValue.decode("utf-8")

to remedy the problem.
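
As a small sketch of that decode step, assuming the value really is a UTF-8 byte string (the sample bytes below are invented for illustration, not taken from the original data):

```python
# Hypothetical byte string: the UTF-8 bytes for a left double quotation mark plus "hi".
raw = u"\u201chi".encode("utf-8")

# Decode the bytes to a unicode object before mixing it with other unicode strings.
decoded = raw.decode("utf-8")
line = u" = " + decoded
```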

Diez
 
Carbon Man

Thanks yes that did it.

Peter Otten said:
You have to decide in what encoding you want to store the data in your
file.
UTF-8 is usually a good choice. Then open it with codecs.open() instead of
the built-in open():

import codecs

f = codecs.open(filename, "w", "UTF-8")
f.write(u"\nentry." + node.tagName + u" = " + cValue)
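
A self-contained variant of the same idea, as a sketch: the path, tag_name and c_value are placeholders for filename, node.tagName and cValue, which are not shown in the thread. It writes through codecs.open and reads the file back with the same encoding to check the round trip.

```python
import codecs
import os
import tempfile

# Placeholders for the values used in the thread.
path = os.path.join(tempfile.mkdtemp(), "entries.txt")
tag_name = u"title"
c_value = u"\u201cquoted\u201d"

# codecs.open returns a file object that encodes unicode strings on write.
f = codecs.open(path, "w", "utf-8")
try:
    f.write(u"\nentry." + tag_name + u" = " + c_value)
finally:
    f.close()

# Reading with the same encoding gives unicode strings back.
f = codecs.open(path, "r", "utf-8")
try:
    content = f.read()
finally:
    f.close()
```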

Peter
 
