Re: how to decode rtf characterset ?

Discussion in 'Python' started by MRAB, Feb 1, 2010.

    Stef Mientki wrote:
    > hello,
    >
    > I want to translate rtf files to unicode strings.
    > I succeeded in removing all the tags,
    > but now I'm stuck on the special accented characters,
    > like :
    >
    > "Vóór"
    >
    > the character "ó" is represented by the string r"\'f3",
    > or in bytes: 92, 39, 102, 51
    >
    > so I think I need a way to translate that into the string r"\xf3"
    > but I can't find a way to accomplish that.
    >
    > Any suggestions are very welcome.
    >

    Change r"\'f3" to r"\xf3" and then decode it to Unicode with the "unicode_escape" codec (Python 2):

    >>> s = r"\'f3"
    >>> s = s.replace(r"\'", r"\x").decode("unicode_escape")
    >>> print s
    ó
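
The snippet above is Python 2, where a byte string has a .decode() method. In Python 3 a str has no .decode(), so the same idea needs a different shape. What follows is a minimal sketch, not a full RTF parser: the helper name decode_rtf_hex_escapes and the default code page cp1252 are assumptions for illustration (the real code page comes from the file's \ansicpg control word, if it has one). It also differs from "unicode_escape" in that it touches only \'hh sequences and leaves any other backslashes alone.

```python
import re

# One or more consecutive RTF hex escapes, e.g. \'f3 or \'f3\'f3.
_HEX_RUN = re.compile(r"(?:\\'[0-9a-fA-F]{2})+")
# A single escape, capturing just its two hex digits.
_HEX_ONE = re.compile(r"\\'([0-9a-fA-F]{2})")


def decode_rtf_hex_escapes(text, codepage="cp1252"):
    # Hypothetical helper: replaces each run of \'hh escapes in `text`
    # with the characters those bytes encode in `codepage`.
    # Assumption: cp1252, a common default; adjust to the document's code page.
    def _decode_run(match):
        hex_digits = "".join(_HEX_ONE.findall(match.group(0)))
        raw = bytes.fromhex(hex_digits)
        # Decode the whole run at once so multi-byte code pages also work.
        return raw.decode(codepage, errors="replace")

    return _HEX_RUN.sub(_decode_run, text)


# The word from the question, with both accented letters escaped.
word = decode_rtf_hex_escapes(r"V\'f3\'f3r")
```

Decoding a whole run of escapes in one call, rather than one escape at a time, is a deliberate choice: with a multi-byte code page a single character may span two \'hh escapes.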
    MRAB, Feb 1, 2010