Bruno Desthuilliers
Johannes Bauer wrote:
Dear all,
I've some applications which fetch HTML documents off the web, parse
their content and do stuff with it. Every once in a while it happens
that the web site administrators put up files which are encoded in a
wrong manner.
Thus my Python script dies a horrible death:
File "./update_db", line 67, in <module>
for line in open(tempfile, "r"):
File "/usr/local/lib/python3.1/codecs.py", line 300, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x92 in position
3286: unexpected code byte
This is well and ok usually, but I'd like to be able to tell Python:
"Don't worry, some idiot encoded that file, just skip over such
parts/replace them by some character sequence".
Is that possible? If so, how?
This might get you started:
"""decode(...)
S.decode([encoding[,errors]]) -> object
Decodes S using the codec registered for encoding. encoding defaults
to the default encoding. errors may be given to set a different error
handling scheme. Default is 'strict' meaning that encoding errors raise
a UnicodeDecodeError. Other possible values are 'ignore' and 'replace'
as well as any other name registered with codecs.register_error that is
able to handle UnicodeDecodeErrors.
"""
HTH
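
Applied to the traceback above, a minimal sketch might look like the following. It assumes Python 3, where open() in text mode takes the same errors argument as decode(); the sample bytes and the temporary file are made up for illustration and are not from the original script.

```python
import os
import tempfile

# Hypothetical sample data: 0x92 is not valid on its own in UTF-8.
raw = b"It\x92s broken"

# Decoding bytes directly: choose an error handler instead of 'strict'.
replaced = raw.decode("utf-8", "replace")  # bad byte becomes U+FFFD
ignored = raw.decode("utf-8", "ignore")    # bad byte is silently dropped

# Write the bytes to a throwaway file so the open() call can be shown.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(raw + b"\n")

# Text-mode open() accepts the same handler names via 'errors'.
with open(path, "r", encoding="utf-8", errors="replace") as f:
    lines = [line.rstrip("\n") for line in f]

os.remove(path)

print(repr(replaced))
print(repr(ignored))
print(lines)
```

With 'replace' each undecodable byte shows up as U+FFFD, so you can still see where the damage was; 'ignore' drops it without a trace. A byte like 0x92 often means the file was really written as Windows-1252, in which case decoding with "cp1252" may recover the intended character, but that is a guess about the source and not something the error message proves.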