replace text in unicode string

Discussion in 'Python' started by Svennglenn, May 14, 2005.

  1. Svennglenn

    Svennglenn Guest

    I'm having problems replacing text in a
    unicode string.
    Here's the code:

    # -*- coding: cp1252 -*-

    titel = unicode("ä", "iso-8859-1")
    print titel
    print type(titel)

    titel.replace("ä", "a")


    When I run this program I get this error:

    titel.replace("ä", "a")
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 0:
    ordinal not in range(128)


    How can I replace text in the Unicode string?
     
    Svennglenn, May 14, 2005
    #1

  2. Svennglenn

    Dan Bishop Guest

    Svennglenn wrote:
    > I'm having problems replacing text in a
    > unicode string.
    > Here's the code:
    >
    > # -*- coding: cp1252 -*-
    >
    > titel = unicode("ä", "iso-8859-1")
    > print titel
    > print type(titel)
    >
    > titel.replace("ä", "a")
    >
    > When I run this program I get this error:
    >
    > titel.replace("ä", "a")
    > UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 0:
    > ordinal not in range(128)
    >
    > How can I replace text in the Unicode string?


    titel = titel.replace(u"ä", "a")
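Why the one-line fix works, as a rough sketch: in Python 2, passing the byte string "ä" to a method of a unicode object makes the interpreter decode it implicitly with the ASCII codec, and byte 0xe4 is not valid ASCII. The snippet below reproduces that decode step explicitly so it can be run on Python 2 or 3. The names `raw`, `message`, and `result` are illustrative only, and escapes are used instead of literal characters so the source file encoding does not matter.

```python
# Byte 0xe4 is "ä" in cp1252 (and in iso-8859-1), but it is outside ASCII.
raw = b"\xe4"

# This is the decode that Python 2 attempts implicitly when a byte string
# is mixed with a unicode string.
try:
    raw.decode("ascii")
    decode_failed = False
except UnicodeDecodeError as exc:
    decode_failed = True
    message = str(exc)

# Decoding with the real encoding gives a proper unicode string.
titel = raw.decode("cp1252")

# replace() returns a new string; titel itself is left unchanged,
# which is why the result has to be assigned.
result = titel.replace(u"\u00e4", u"a")
```

Note that the original script also discarded the return value of `replace`, so even without the error `titel` would not have changed.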
     
    Dan Bishop, May 14, 2005
    #2

  3. Svennglenn

    John Machin Guest

    On 14 May 2005 02:23:55 -0700, "Dan Bishop" wrote:

    >Svennglenn wrote:
    >> I'm having problems replacing text in a
    >> unicode string.
    >> Here's the code:
    >>
    >> # -*- coding: cp1252 -*-
    >>
    >> titel = unicode("ä", "iso-8859-1")


    To the OP:
    This is not causing the later problem, but it's evidence of the wrong
    mindset for a start. You have just lied to the interpreter. You said
    that your script was encoded using cp1252, but then you tried to pass
    off a string constant as iso-8859-1!!! They are not exactly the same
    repertoire. You need to make up your mind what character repertoire
    your application should be confined to, and then apply that
    restriction rigorously.
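To illustrate the point about the two repertoires differing, here is a small sketch (variable names are illustrative). Both encodings agree on byte 0xe4, which is why the mismatch did not bite in this script, but they disagree in the 0x80 to 0x9f range.

```python
# Byte 0xe4 decodes to the same character (ä, U+00E4) in both encodings.
same_for_a_umlaut = b"\xe4".decode("cp1252") == b"\xe4".decode("iso-8859-1")

# Byte 0x80 does not: cp1252 maps it to the euro sign,
# iso-8859-1 maps it to the C1 control character U+0080.
as_cp1252 = b"\x80".decode("cp1252")
as_latin1 = b"\x80".decode("iso-8859-1")
```

Text containing characters such as the euro sign would therefore come out wrong if the file were saved as cp1252 but decoded as iso-8859-1.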

    To get over the error message, all you need to do is this:

    titel = u"ä"

    ... and didn't I (and/or somebody else) tell you this only a few days
    ago?


    >> print titel
    >> print type(titel)
    >>
    >> titel.replace("ä", "a")
    >>
    >> When I run this program I get this error:
    >>
    >> titel.replace("ä", "a")
    >> UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 0:
    >> ordinal not in range(128)
    >>
    >> How can I replace text in the Unicode string?

    >
    >titel = titel.replace(u"ä", "a")


    To Dan:
    Fortuitously this works, because "a" is plain ASCII. But if the OP
    wanted to replace it with (say) a u-umlaut, the byte string "ü" would
    throw another UnicodeDecodeError.

    Everybody please get into the habit of using u"blah blah" when you're
    working with Unicode.

    Like this:

    titel = titel.replace(u"ä", u"a")
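A short sketch of the all-unicode habit, assuming a sample word chosen here for illustration ("Bär"). With both arguments as unicode strings, the replacement may itself be non-ASCII without triggering any implicit decode. Escapes are used so the snippet does not depend on the source file encoding.

```python
# Sample value for illustration: "Bär" written with an escape.
titel = u"B\u00e4r"

# ASCII replacement, both arguments unicode.
to_a = titel.replace(u"\u00e4", u"a")

# Non-ASCII replacement (ü, U+00FC) works the same way,
# because no byte string is involved.
to_u = titel.replace(u"\u00e4", u"\u00fc")
```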
     
    John Machin, May 14, 2005
    #3
