Error when trying to write unicode xml to zipfile

Martin

I get the error below when trying to write unicode xml to a zipfile.

zip.writestr('content.xml', content.toxml())
File "/usr/lib/python2.4/zipfile.py", line 460, in writestr
zinfo.CRC = binascii.crc32(bytes) # CRC-32 checksum
UnicodeEncodeError: 'ascii' codec can't encode character u'\u25cf' in
position 2848: ordinal not in range(128)

Any ideas?

Martin
 
Gabriel Genellina

Martin said:
I get the error below when trying to write unicode xml to a zipfile.

zip.writestr('content.xml', content.toxml())
File "/usr/lib/python2.4/zipfile.py", line 460, in writestr
zinfo.CRC = binascii.crc32(bytes) # CRC-32 checksum
UnicodeEncodeError: 'ascii' codec can't encode character u'\u25cf' in
position 2848: ordinal not in range(128)

Any ideas?

Encode before writing. Assuming you want to use utf-8:
zip.writestr('content.xml', content.toxml().encode('utf-8'))

In general, when working with unicode, it's best to decode bytes into
unicode as early as possible (when reading input), process only unicode
inside the program, and encode into bytes at the last step (when writing
output).
Some non-unicode-aware libraries may interfere with this flow,
unfortunately.
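
A minimal sketch of that flow, written for a current Python rather than the 2.4 release in the traceback. The element name `content`, the sample text, and the in-memory `io.BytesIO` archive are assumptions made for illustration only; the point is that text stays unicode inside the program and is encoded exactly once, just before `writestr`.

```python
import io
import zipfile
from xml.dom import minidom

# Build a small document in memory; all text handled here is unicode.
doc = minidom.Document()
root = doc.createElement('content')          # hypothetical element name
root.appendChild(doc.createTextNode(u'\u25cf bullet'))
doc.appendChild(root)

# toxml() without encoding= returns a unicode string.
xml_text = doc.toxml()

# Encode at the output boundary, right before handing bytes to zipfile.
xml_bytes = xml_text.encode('utf-8')

buf = io.BytesIO()                           # in-memory stand-in for a real file
zf = zipfile.ZipFile(buf, 'w')
zf.writestr('content.xml', xml_bytes)
zf.close()

# Reading back: decode as early as possible.
zf = zipfile.ZipFile(io.BytesIO(buf.getvalue()))
roundtrip = zf.read('content.xml').decode('utf-8')
zf.close()
```

Here `roundtrip` compares equal to `xml_text`, so the U+25CF character survives the trip through the archive.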
 
Stefan Behnel

Gabriel said:
Encode before writing. Assuming you want to use utf-8:
zip.writestr('content.xml', content.toxml().encode('utf-8'))

Unless, obviously, you were serialising to a non-utf8 encoding. But since the
"toxml()" method seems to return unicode here (which sounds surprising), I
expect it a) to provide no XML declaration at all or b) to be broken anyway.

Stefan
 
Martin v. Löwis

Stefan said:
Unless, obviously, you were serialising to a non-utf8 encoding. But since the
"toxml()" method seems to return unicode here (which sounds surprising), I
expect it a) to provide no XML declaration at all or b) to be broken anyway.

Or c) the user forgot to specify the encoding= parameter in toxml().
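
As a rough illustration of option c), the sketch below compares `toxml()` with and without the `encoding=` parameter on a current Python. The element name `content` and the sample text are assumptions for the example, not taken from the original poster's code.

```python
from xml.dom import minidom

doc = minidom.Document()
root = doc.createElement('content')          # hypothetical element name
root.appendChild(doc.createTextNode(u'\u25cf bullet'))
doc.appendChild(root)

# Without encoding=: a unicode string whose XML declaration names no encoding.
as_text = doc.toxml()

# With encoding=: an encoded byte string whose declaration names the encoding,
# so it can be passed to zip.writestr() without a separate .encode() call.
as_bytes = doc.toxml(encoding='utf-8')
```

With `encoding='utf-8'` the declaration and the payload agree, which avoids the mismatch Stefan describes.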

Regards,
Martin
 
Stefan Behnel

Martin said:
Or c) the user forgot to specify the encoding= parameter in toxml().

Then I would expect it a) to serialise to a UTF-8 compatible encoding that
does not require a declaration (which excludes Python unicode) or b) to be
broken. :)

Stefan
 
