UnicodeError for join()

tmallen

This line of code is throwing a UnicodeError for a handful of the few
hundred files I'm processing:

rc_file.write("\n\n".join([self.title, "### BEGIN CONTENT ###",
self.content]))
..................
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position
442: ordinal not in range(128)

What should I change to make this unicode-safe?

Thanks,
Thomas
 
Martin v. Löwis

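One of self.title and self.content is a Unicode string, the other is
a byte string. When the two are joined, Python tries to decode the byte
string as ASCII, which fails on the non-ASCII byte 0xe2. Both need to
have the same type.

How can I do that?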

First, you need to find out what the respective types are:

print type(self.title), type(self.content), repr(self.title), repr(self.content)

With that information, as a very important next step, you need to
understand why the error occurs.
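
As a rough illustration of the cause, the implicit step that fails during the join can be spelled out explicitly. This is only a sketch: the sample bytes are made up (they are the UTF-8 encoding of a right single quote, which starts with 0xe2), and the real data in self.title or self.content may well differ.

```python
# Hypothetical sample data: a byte string containing the byte 0xe2,
# the byte named in the traceback.
raw = b"it\xe2\x80\x99s"

try:
    # Python 2 performs this ASCII decode implicitly when a byte string
    # is combined with a unicode string; here it is written out.
    raw.decode("ascii")
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

print(message)
```

The message has the same shape as the one in the traceback, with the position pointing at the first non-ASCII byte.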

Then, you need to fix it, e.g. by converting all strings to byte
strings.
Suppose title is a unicode string, and further suppose the output
is to be encoded in cp1252, then you change the line to

rc_file.write("\n\n".join([self.title.encode("cp1252"),
"### BEGIN CONTENT ###",
self.content]))
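
If it is not known in advance which of the values is unicode, the same conversion can be applied to every part before joining. The following is a minimal sketch, not a definitive fix: the helper name to_bytes, the sample strings, and the choice of cp1252 are all assumptions for illustration.

```python
def to_bytes(value, encoding="cp1252"):
    """Return value as a byte string, encoding it only if it is text."""
    if isinstance(value, bytes):
        return value  # already a byte string: pass through unchanged
    return value.encode(encoding)

# Hypothetical stand-ins for self.title and self.content.
title = u"Caf\xe9 menu"        # unicode string
content = b"it\x92s open"      # byte string, already cp1252-encoded

parts = [title, u"### BEGIN CONTENT ###", content]

# Every part is a byte string by the time it is joined, so no implicit
# ASCII decode takes place.
data = b"\n\n".join(to_bytes(p) for p in parts)
```

The resulting data could then be passed to rc_file.write(), assuming the file accepts byte strings and that the existing byte strings are already in the target encoding.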

Regards,
Martin