unicode string alteration

Discussion in 'Python' started by BAvant Garde, Aug 12, 2010.

  1. BAvant Garde

    HELP!!!
    I need help with a Unicode issue that has me stumped. I must be doing something wrong, because I don't believe this condition would have slipped through testing.

    Wherever the string u'\udbff\udc00' occurs, u'\U0010fc00' (that is, unichr(1113088)) is substituted and the file loses one character, so all trailing characters are shifted out of position. No other corrupted strings have been detected.
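
The two values above are related by the standard UTF-16 surrogate-pair arithmetic: u'\udbff' is a high surrogate and u'\udc00' is a low surrogate, and together they encode a single code point. A minimal sketch of that calculation follows. The function name combine_surrogates is hypothetical, and the code is written for Python 3 rather than the Python 2 versions in the post. It suggests, without confirming, that the "lost" character is the two code units being merged into one.

```python
def combine_surrogates(high, low):
    """Combine a UTF-16 high/low surrogate pair into a single code point.

    Hypothetical helper for illustration; not part of the original post.
    """
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    # Each surrogate carries 10 bits of the offset above U+10000.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


code_point = combine_surrogates(0xDBFF, 0xDC00)
print(hex(code_point), code_point)  # prints: 0x10fc00 1113088
```

Two code units in, one code point out, which would account for a length difference of exactly one per occurrence.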
       
    The condition was noticed while testing in Python 2.6.5 on Ubuntu 10.04, where the maximum ord() value is 1114111 (wide Python build).
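
The maximum ord() value mentioned here is exposed as sys.maxunicode, so the build type can be checked directly. The sketch below is an assumption about how one might do that; the helper name describe_build is hypothetical. On Python 3.3 and later sys.maxunicode is always 1114111, so the narrow case only applies to older interpreters such as those in the post.

```python
import sys


def describe_build(maxunicode=sys.maxunicode):
    """Classify a Python build from its largest supported code point.

    Hypothetical helper: 65535 (0xFFFF) indicates a narrow build that
    stores text as UTF-16 code units; anything larger is a wide build.
    """
    return "wide" if maxunicode > 0xFFFF else "narrow"


print(sys.maxunicode, describe_build())
```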
       
    Using Python 2.5.4 on Windows ME, where the maximum ord() value is 65535 (narrow Python build), the string u'\U0010fc00' also occurs, and it "seems" that the substitution takes place, but no characters are lost and file sizes are OK. Note that ord(u'\U0010fc00') causes the following error:
                 "TypeError: ord() expected a character, but string of length 2 found"
    The condition is otherwise invisible in 2.5.4 and is handled internally without any apparent effect on processing, with the characters u'\udbff' and u'\udc00' each being separately accessible.
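
The narrow-build behaviour described here is consistent with u'\U0010fc00' being stored as two UTF-16 code units, which is why ord() sees a string of length 2. The following sketch reproduces that view on a modern interpreter. It assumes Python 3, and the names split_to_surrogates and utf16_length are hypothetical.

```python
def split_to_surrogates(code_point):
    """Split a code point above U+FFFF into its UTF-16 surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is outside the supplementary range")
    offset = code_point - 0x10000
    # Top 10 bits go to the high surrogate, bottom 10 bits to the low one.
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


def utf16_length(text):
    """Count UTF-16 code units, which is what len() reports on a narrow build."""
    return len(text.encode("utf-16-le")) // 2


high, low = split_to_surrogates(0x10FC00)
print(hex(high), hex(low))          # prints: 0xdbff 0xdc00
print(utf16_length("\U0010fc00"))   # prints: 2
```

Comparing utf16_length(text) with len(text) on a wide build gives the number of positions by which trailing characters would shift.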

    The first part of the attachment repeats this email but also has examples and illustrates other related oddities.
       
    Any help would be greatly appreciated.
    Bruce
    BAvant Garde, Aug 12, 2010
