Unicode 4.0 updates to unicodedata?

Discussion in 'Python' started by David Opstad, Sep 18, 2003.

  1. David Opstad

    Hi, all! I'm relatively new to Python, but have definitely fallen in
    love with it. It reminds me of Mesa (old Xerox development language) and
    LISP a bit.

    Anyway, on to the question. Now that Unicode 4.0 has been released (just
    got my copy today), any guesses on how long before the unicodedata
    module will be updated to include all the new names? How do things like
    that work, anyway; is there somebody whose task it is to update that, or
    are they awaiting volunteers to help out? And once the module is
    updated, is it generally usable on earlier Python releases (I'm running
    the 2.2 that came with the OS X developer package for Jaguar)?

    Cheers!

    Dave Opstad
     
    David Opstad, Sep 18, 2003
    #1

  2. David Opstad writes:

    > Anyway, on to the question. Now that Unicode 4.0 has been released (just
    > got my copy today), any guesses on how long before the unicodedata
    > module will be updated to include all the new names?


    It might happen for Python 2.4, but by the time Python 2.4 is
    released, Unicode 4.0 might get skipped in favor of a later version
    of the database (Unicode 4.2 or some such).

    The tricky part is that IDNA specifies Unicode 3.2 as the basis of
    internationalized domain names, so some technique must be found to
    incorporate two versions of the database in Python without adding
    too much overhead.
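
As a rough illustration of what "two versions of the database" can look like in practice: later CPython releases (2.5 onward) expose a frozen Unicode 3.2 view as `unicodedata.ucd_3_2_0` alongside the main database. That attribute is not mentioned in this post, and the sketch below assumes a modern Python 3; the helper name `name_in_both` is made up for the example.

```python
import unicodedata

# Frozen Unicode 3.2 view that later Python releases ship next to the
# main database (assumption: not available on the Python 2.2 discussed here).
old_db = unicodedata.ucd_3_2_0


def name_in_both(ch):
    """Return (current name, Unicode 3.2 name); None where no name exists."""
    return unicodedata.name(ch, None), old_db.name(ch, None)


print(unicodedata.unidata_version)  # version of the main database
print(old_db.unidata_version)       # version of the frozen view
# U+0221 was added in Unicode 4.0, so the 3.2 view has no name for it.
print(name_in_both("\u0221"))
```

Both objects offer the same lookup functions (`name`, `category`, `normalize`, and so on), so code that must follow the IDNA rules can call the 3.2 view while everything else uses the current data.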

    > How do things like that work, anyway; is there somebody whose task
    > it is to update that, or are they awaiting volunteers to help out?


    In general, it would be somebody's task (i.e. mine) to incorporate a
    new version. However, since this is more than running the generator
    again (as actual code changes have to go with it), contributions are
    welcome.

    > And once the module is updated, is it generally usable on earlier
    > Python releases (I'm running the 2.2 that came with the OS X
    > developer package for Jaguar)?


    If you want to backport that database yourself, you could just as well
    create your own version of the Unicode 4.0 database. Just run the
    generator, and rename the unicodedata module to unicodedata40 (inside
    the module's source code). Python then won't use this database
    internally (for the .is* methods, .upper, and so on), but you could
    readily invoke the unicodedata40 functions yourself.
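
A minimal sketch of calling such a renamed module, assuming it was built under the name `unicodedata40` as suggested above and keeps the same functions as `unicodedata`. The module only exists if you compile it yourself, so the sketch falls back to the bundled database; `char_name` and `is_uppercase_letter` are hypothetical helper names.

```python
try:
    # Hypothetical module from the suggestion above; present only if you
    # ran the generator and built the renamed extension yourself.
    import unicodedata40 as ucd
except ImportError:
    # Fall back to the database bundled with the interpreter.
    import unicodedata as ucd


def char_name(ch, default="<unnamed>"):
    """Look up a character name in whichever database was imported."""
    return ucd.name(ch, default)


def is_uppercase_letter(ch):
    """Explicit stand-in for str.isupper(), which keeps using the
    interpreter's built-in database regardless of this module."""
    return ucd.category(ch) == "Lu"


print(char_name("A"))
print(is_uppercase_letter("A"), is_uppercase_letter("a"))
```

Routing every lookup through small helpers like these keeps the choice of database in one place, since the string methods themselves cannot be pointed at the newer data.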

    Regards,
    Martin
     
    Martin v. Löwis, Sep 19, 2003
    #2
