codecs.register_error for "strict", unicode.encode() and str.decode()

Discussion in 'Python' started by Alan Franzoni, Jul 27, 2012.

  1. Hello,
    I think I'm missing a piece here.

    I'm trying to register a default error handler so that encoding/decoding
    errors are handled instead of raised (I know how this works, and that
    making it global is probably not good practice, but I found this
    strange behaviour while writing a proof of concept of how to let Python
    work in a more forgiving way).

    What I discovered is that register_error() for "strict" seems to work
    the way I expect for string decoding, but not for unicode encoding.

    This is what happens on Mac, Python 2.7.1 from Apple:

    melquiades:tmp alan$ cat minimal_test_encode.py
    # -*- coding: utf-8 -*-

    import codecs

    def handle_encode(e):
        return ("ASD", e.end)

    codecs.register_error("strict", handle_encode)

    print u"à".encode("ascii")

    melquiades:tmp alan$ python minimal_test_encode.py
    Traceback (most recent call last):
      File "minimal_test_encode.py", line 10, in <module>
        u"à".encode("ascii")
    UnicodeEncodeError: 'ascii' codec can't encode character u'\xe0' in position 0: ordinal not in range(128)


    OTOH this works properly:

    melquiades:tmp alan$ cat minimal_test_decode.py
    # -*- coding: utf-8 -*-

    import codecs

    def handle_decode(e):
        return (u"ASD", e.end)

    codecs.register_error("strict", handle_decode)

    print "à".decode("ascii")

    melquiades:tmp alan$ python minimal_test_decode.py
    ASDASD


    What piece am I missing? The doc at
    http://docs.python.org/library/codecs.html says "For encoding
    error_handler will be called with a UnicodeEncodeError instance,
    which contains information about the location of the error." Is there
    any reason why the standard "strict" handler cannot be replaced?


    Thanks for any clue.

    File links:
    https://dl.dropbox.com/u/249926/minimal_test_decode.py
    https://dl.dropbox.com/u/249926/minimal_test_encode.py

    --
    Alan Franzoni
    contact me at public@[mysurname].eu
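
For comparison, here is a minimal sketch that registers the same kind of handler under a new name instead of overriding "strict", and passes that name explicitly. The name "forgiving" and the function handle_forgiving are made up for illustration and do not come from the post. The sketch does not explain why the "strict" override is ignored on the encoding path; it only shows that a custom-named handler is consulted in both directions.

```python
# -*- coding: utf-8 -*-
import codecs

def handle_forgiving(e):
    # Hypothetical handler: substitute u"ASD" for the offending span and
    # resume right after it. e is a UnicodeEncodeError when encoding and a
    # UnicodeDecodeError when decoding; both carry an .end attribute.
    return (u"ASD", e.end)

# "forgiving" is an assumed name; any name not already registered would do.
codecs.register_error("forgiving", handle_forgiving)

# Encoding: u"\xe0" is the character "à", which ASCII cannot represent.
encoded = u"\xe0".encode("ascii", "forgiving")

# Decoding: b"\xc3\xa0" is "à" in UTF-8, i.e. two non-ASCII bytes, so the
# handler runs once per byte.
decoded = b"\xc3\xa0".decode("ascii", "forgiving")
```

Because the handler name is passed at each call site, the forgiving behaviour stays opt-in rather than global.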
