Is str/unicode.encode supposed to work with replace/ignore?


BerlinBrown

With this code, ignore/replace still generate an error

# Encode to simple ascii format.
field.full_content = field.full_content.encode('ascii', 'replace')

Error:

[0/1] 'ascii' codec can't decode byte 0xe2 in position 14317: ordinal not in range(128)

The document in question is a Wikipedia document. I believe they use
latin-1, unicode, or something similar. I thought replace and ignore
were supposed to replace and ignore?
 

Matt Nordhoff


Is field.full_content a str or a unicode? You probably haven't decoded
it from a byte string yet.

Why do you want to use ASCII? UTF-8 is great. :)
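
A note on why the error says "decode" even though the call is encode: in Python 2, calling .encode() on a str (a byte string) first decodes it implicitly with the strict ASCII codec, and the 'replace'/'ignore' handler only applies to the encode step that follows. The sketch below shows the decode-then-encode order that avoids this. The sample bytes are a hypothetical stand-in for field.full_content, and UTF-8 is an assumption about the page's encoding (0xe2 is a common lead byte in UTF-8 punctuation), not something confirmed in this thread.

```python
# Hypothetical stand-in for field.full_content: raw bytes as fetched.
# \xe2\x80\x99 is the UTF-8 encoding of a right single quotation mark.
raw = b"Wikipedia\xe2\x80\x99s article"

# Step 1: turn the bytes into text. UTF-8 is assumed here; use the
# encoding the page actually declares.
text = raw.decode("utf-8")

# Step 2: encode the text. The error handler applies at this step only.
ascii_replace = text.encode("ascii", "replace")  # non-ASCII becomes '?'
ascii_ignore = text.encode("ascii", "ignore")    # non-ASCII is dropped

print(ascii_replace)
print(ascii_ignore)
```

If the source encoding is not known for certain, raw.decode("utf-8", "replace") will also get past undecodable bytes, at the cost of substituting U+FFFD for them.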