Re: A 'raw' codec for binary "strings" in Python?

Discussion in 'Python' started by Jeff Epler, Mar 1, 2004.

    You have to understand the difference between
    "\xc0".encode('US-ASCII', 'replace')
    and
    u"\xc0".encode('US-ASCII', 'replace')
    ... the latter returns the string '?'; the former probably raises an
    error, assuming that your default encoding is 'ascii'.
    That's because ''.encode(...) on a byte string is really the same as
    ''.decode(sys.getdefaultencoding()).encode(...). It's in the decode step
    that the error is being raised.
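
To make the two steps visible, here is a minimal sketch that performs the decode and the encode separately. It assumes the default encoding is 'ascii' and spells that out explicitly instead of calling sys.getdefaultencoding(); the b"" and u"" prefixes are only there so the same lines also run on a current Python 3, and the variable names are made up for illustration.

```python
# Step 1: what the implicit decode does to a byte string holding 0xC0.
raw = b"\xc0"
try:
    raw.decode("ascii")          # strict decode, no 'replace' applied here
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

# Step 2 on its own: encoding a unicode string honours 'replace'.
replaced = u"\xc0".encode("ascii", "replace")

print(decode_failed)
print(replaced == b"?")
```

The 'replace' argument only reaches the encode call, which is why it cannot rescue the failing decode.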

    You could decode explicitly with an encoding that accepts every byte
    value:
    "\xc0".decode("iso-8859-1").encode('US-ASCII', 'replace')
    or you could use ''.translate:
    import string
    s = ''.join([chr(x) for x in range(128,256)])
    t = '?' * 128
    replace_map = string.maketrans(s, t)

    >>> "abc\xc0\xff".translate(replace_map)
    'abc??'
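
The two workarounds above can be wrapped up side by side, roughly like this. It is only a sketch: the function names and TABLE are hypothetical, and the table is built from a bytearray rather than string.maketrans so that the same code also runs on Python 3, where string.maketrans no longer exists.

```python
def to_ascii_via_latin1(data):
    """Decode every byte as ISO-8859-1, then encode with 'replace'."""
    return data.decode("iso-8859-1").encode("ascii", "replace")


# 256-entry translation table: bytes below 128 map to themselves,
# everything else maps to '?'.
TABLE = bytes(bytearray(b if b < 128 else ord("?") for b in range(256)))


def to_ascii_via_translate(data):
    """Replace non-ASCII bytes directly, without any unicode round trip."""
    return data.translate(TABLE)


sample = b"abc\xc0\xff"
print(to_ascii_via_latin1(sample) == to_ascii_via_translate(sample))
```

The translate version never leaves the byte domain, so it may be the better fit when the data is not really text in any particular encoding.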

    Jeff
     
