Unicode 4.0 updates to unicodedata?

David Opstad · Sep 18, 2003

Hi, all! I'm relatively new to Python, but have definitely fallen in
love with it. It reminds me of Mesa (old Xerox development language) and
LISP a bit.

Anyway, on to the question. Now that Unicode 4.0 has been released (just
got my copy today), any guesses on how long before the unicodedata
module will be updated to include all the new names? How do things like
that work, anyway; is there somebody whose task it is to update that, or
are they awaiting volunteers to help out? And once the module is
updated, is it generally usable on earlier Python releases (I'm running
the 2.2 that came with the OS X developer package for Jaguar)?

Cheers!

Dave Opstad

Martin v. Löwis · Sep 19, 2003

David Opstad said:
Anyway, on to the question. Now that Unicode 4.0 has been released (just
got my copy today), any guesses on how long before the unicodedata
module will be updated to include all the new names?

It might happen for Python 2.4, but by the time Python 2.4 is
released, the Unicode 4.0 database might get skipped, and Python might
incorporate Unicode 4.2 (or some such) instead.

The tricky part is that IDNA specifies Unicode 3.2 as the basis of
internationalized domain names, so some technology must be found to
incorporate two versions of the database in Python, without adding too
much overhead.
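For readers on a much later Python 3: the standard unicodedata module there exposes a frozen Unicode 3.2 view as unicodedata.ucd_3_2_0 next to the current database, which is one way the two-versions problem can look from user code. The sketch below only compares the two views for a single character; the choice of U+0221 (added in Unicode 4.0, as far as I know) and the helper name compare_views are illustrative assumptions, not anything from this thread.

```python
import unicodedata

# Frozen Unicode 3.2 view, available in later Python releases.
old_ucd = unicodedata.ucd_3_2_0


def compare_views(ch):
    """Return (name, category) for ch under the 3.2 view and the current view.

    compare_views is a hypothetical helper used only for illustration.
    """
    return {
        "frozen": (old_ucd.name(ch, None), old_ucd.category(ch)),
        "current": (unicodedata.name(ch, None), unicodedata.category(ch)),
    }


frozen_version = old_ucd.unidata_version        # version string of the frozen view
current_version = unicodedata.unidata_version   # depends on the Python build

# U+0221 is assumed here to postdate Unicode 3.2, so the frozen view
# should treat it as unassigned while the current view knows its name.
views = compare_views("\u0221")
print(frozen_version, current_version, views)
```

The frozen view answers the same queries as the module-level functions, so code that must follow IDNA can call old_ucd while everything else uses the current data.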

David Opstad said:
How do things like that work, anyway; is there somebody whose task
it is to update that, or are they awaiting volunteers to help out?

In general, it would be somebody's task (i.e. mine) to incorporate a
new version. However, since this is more than running the generator
again (as actual code changes have to go with it), contributions are
welcome.

David Opstad said:
And once the module is updated, is it generally usable on earlier
Python releases (I'm running the 2.2 that came with the OS X
developer package for Jaguar)?

If you want to backport that database yourself, you could just as well
create your own version of the Unicode 4.0 database. Just run the
generator, and rename the unicodedata module to unicodedata40 (inside
the module's source code). Python then won't use this database
internally (for the .is* methods, .upper, and so on), but you could
readily invoke the unicodedata40 functions yourself.
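A minimal sketch of what calling such a renamed module might look like, written for a current Python 3 rather than 2.2. It assumes a locally built extension named unicodedata40, as described above, which is not something Python ships; load_ucd is a hypothetical helper, and the code falls back to the standard unicodedata module when the custom build is absent.

```python
import importlib
import unicodedata


def load_ucd(preferred="unicodedata40"):
    """Return the locally built module if importable, else the stock one.

    "unicodedata40" is the hypothetical renamed build; nothing here
    creates it, so on most systems the fallback branch is taken.
    """
    try:
        return importlib.import_module(preferred)
    except ImportError:
        return unicodedata


ucd = load_ucd()

# Explicit lookups go through whichever database was loaded.
letter_name = ucd.name("A")
letter_category = ucd.category("A")

# Built-in string methods still consult Python's internal database,
# regardless of which module load_ucd() returned.
upper_a = "a".upper()
print(ucd.__name__, letter_name, letter_category, upper_a)
```

Only the explicit ucd.* calls see the newer data; str methods such as isalpha() and upper() are unaffected by the swap.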

Regards,
Martin
