Vyz said:
It's a module to transliterate Telugu written in roman script into
native Unicode. Right now it runs in a browser window at
www.lekhini.org. I intend to translate it into Python so that I can
integrate it into other tools I have. Being able to pass arguments to
the script and get output back would also be OK. Or how about ways to
wrap these JavaScript functions with Python?
Leaving aside the code that manipulates the display and user interaction,
the code should be pretty straightforward logic (if-else statements) and
table lookups, so translation to Python should be straightforward also.
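To make the "table lookups" point concrete, here is a minimal Python 3
sketch of a greedy longest-match transliterator. The roman spellings in
TABLE and the names TABLE, MAX_KEY and transliterate are my own
assumptions for illustration, not Lekhini's actual scheme or code.

```python
# Tiny illustrative table (assumed spellings, not Lekhini's real mapping).
TABLE = {
    "pha": "\u0c2b",  # TELUGU LETTER PHA
    "ka": "\u0c15",   # TELUGU LETTER KA
    "a": "\u0c05",    # TELUGU LETTER A
}
MAX_KEY = max(len(key) for key in TABLE)


def transliterate(text):
    """Replace the longest matching roman chunk at each position."""
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shorter ones.
        for size in range(min(MAX_KEY, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in TABLE:
                out.append(TABLE[chunk])
                i += size
                break
        else:
            # No table entry starts here: pass the character through.
            out.append(text[i])
            i += 1
    return "".join(out)
```

The real parser surely has more rules (vowel signs, conjuncts), but the
core loop would probably have this shape.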
I checked parser.js. I don't know javascript but it looks to me like a
mixture of C and Python. The for loop headers have to be rewritten, and
the switch changed to if-elif. What looks different is that method
functions are attached as attributes to functions rather than to classes.
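As a rough illustration of those rewrites (Python 3), the JavaScript in
the comment below is made up for the example and is not copied from
parser.js; count_kinds and vowel_letters are hypothetical names.

```python
# JavaScript (illustrative only):
#   for (var i = 0; i < text.length; i++) {
#       switch (text.charAt(i)) {
#           case "a": case "e": vowels++; break;
#           case " ": spaces++; break;
#           default: others++;
#       }
#   }

def count_kinds(text):
    """Python rendering of the loop and switch sketched above."""
    counts = {"vowels": 0, "spaces": 0, "others": 0}
    for ch in text:               # replaces the three-part for header
        if ch in ("a", "e"):      # replaces the stacked case labels
            counts["vowels"] += 1
        elif ch == " ":
            counts["spaces"] += 1
        else:                     # replaces default:
            counts["others"] += 1
    return counts


# JavaScript can hang a value or helper on a function object.
# Python functions accept attributes too, though a class is more usual.
count_kinds.vowel_letters = ("a", "e")
```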
As for 'wrapping': can you get a standard javascript interpreter? If so,
you could possibly adjust the js so you can pipe a roman string to the js
program and have it pipe back the Telugu unicode version.
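A minimal sketch of the piping idea in Python 3 (3.7 or later for
capture_output), assuming you have some command-line JavaScript
interpreter and have adapted the script to read stdin and write stdout.
The function name pipe_through, and the "node" and "translit.js" names in
the usage comment, are assumptions, not anything from Lekhini.

```python
import subprocess


def pipe_through(command, text):
    """Send text to an external program's stdin and return its stdout."""
    result = subprocess.run(
        command,
        input=text,
        capture_output=True,
        encoding="utf-8",  # encode stdin and decode stdout as UTF-8
        check=True,        # raise CalledProcessError on a non-zero exit
    )
    return result.stdout


# Hypothetical usage, assuming Node.js and an adapted script exist:
# telugu = pipe_through(["node", "translit.js"], "some roman text")
```

Starting a new process for every string is slow, so this suits batch use
better than interactive typing.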
unicode.js is mostly a few hundred verbose lines like
Unicode.codePoints[Padma.lang_TELUGU].letter_PHA = "\u0C2B";
that set up the translation dict. Because the object model is different, I
suspect that these all need to be changed, but also, I suspect, in a
mechanical way.
If one were starting in Python, one might either just define a dict more
compactly like
TEL_uni = {'letter_PHA': u"\u0C2B", ...}
*or* probably better, use the builtin unicodedata module as much as
possible.
>>> import unicodedata as u
>>> u.lookup('TELUGU LETTER PHA')
u'\u0c2b'
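For instance, a dict with the same letter_XXX keys could be built from
the official character names instead of typed-in escapes. This is a
Python 3 sketch; the helper name telugu_letters and the choice of three
letters are mine.

```python
import unicodedata


def telugu_letters(names):
    """Map 'letter_<NAME>' to the character named 'TELUGU LETTER <NAME>'."""
    return {
        "letter_" + name: unicodedata.lookup("TELUGU LETTER " + name)
        for name in names
    }


# Only three letters here; a full table would list every name needed.
TEL_uni = telugu_letters(["KA", "KHA", "PHA"])
```

unicodedata.lookup raises KeyError for a name it does not know, so a
misspelled letter name fails at once rather than producing a wrong code
point.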
I don't know what you do with a js statement like this:
Unicode.toPadma[Unicode.codePoints[Padma.lang_TELUGU].misc_VIRAMA +
Unicode.codePoints[Padma.lang_TELUGU].letter_KA] = Padma.vattu_KA;
where a constant seems to be assigned to a sum. But whatever these do
might correspond to the unicodedata.normalize function.
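One possible reading: in JavaScript, + applied to two strings
concatenates them, so the statement may simply store an entry keyed by
the two-character sequence virama + KA. Under that assumption a Python 3
equivalent is an ordinary dict entry. VATTU_KA below is a placeholder,
since I don't know the real value of Padma.vattu_KA, and lookup_pair is a
hypothetical helper.

```python
import unicodedata

VIRAMA = unicodedata.lookup("TELUGU SIGN VIRAMA")
KA = unicodedata.lookup("TELUGU LETTER KA")

# Placeholder standing in for Padma.vattu_KA (actual value unknown).
VATTU_KA = "<vattu_KA>"

# The key is the two-code-point string virama + KA, not a numeric sum.
to_padma = {VIRAMA + KA: VATTU_KA}


def lookup_pair(text, i):
    """Return the entry for the two characters at position i, or None."""
    return to_padma.get(text[i:i + 2])
```

If that reading is right, the statement is a table entry rather than
arithmetic, and unicodedata.normalize may not be needed for it.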
This appears to be based on a generic Indian-script transliteration program
(Padma), so there may be functions not really needed for Telugu. (I am
familiar with Devanagari but know nothing of Telugu and its script except
that it is Dravidian rather than Indo-European-Sanskritic.)
Good luck.
Terry Jan Reedy