Ben Bullock
Does anyone know of a parser for Jim Breen's kanjidic written in Perl?
David said:
| Does anyone know of a parser for Jim Breen's kanjidic written in
| Perl?
<URL: http://search.cpan.org> is a nice tool for finding all things
Perl. Maybe you can use the module Lingua::JP::Kanjidic by Simon
Cozens?
Apud Ben Bullock said:
| Does anyone know of a parser for Jim Breen's kanjidic written in Perl?
Why parse kanjidic, when there is an XML edition available? (see
http://www.csse.monash.edu.au/~jwb/kanjidic2/index.html)
There are lashings of XML parsers.
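For anyone wondering what the XML route looks like, here is a minimal sketch in Python using the standard library's ElementTree. The element names (character, literal, misc/stroke_count, reading, meaning) and the m_lang attribute are assumptions taken from the published KANJIDIC2 layout, and the sample document is a cut-down illustration rather than a real extract, so check both against the current file before relying on them.

```python
import xml.etree.ElementTree as ET

# Cut-down, illustrative sample; the real file has many more elements per entry.
SAMPLE_XML = """<kanjidic2>
  <character>
    <literal>亜</literal>
    <misc><grade>8</grade><stroke_count>7</stroke_count></misc>
    <reading_meaning><rmgroup>
      <reading r_type="ja_on">ア</reading>
      <reading r_type="ja_kun">つ.ぐ</reading>
      <meaning>Asia</meaning>
      <meaning>rank next</meaning>
    </rmgroup></reading_meaning>
  </character>
</kanjidic2>"""


def read_characters(xml_text):
    """Return one dict per <character> element (assumed element names)."""
    root = ET.fromstring(xml_text)
    entries = []
    for ch in root.iter("character"):
        entries.append({
            "literal": ch.findtext("literal"),
            "strokes": int(ch.findtext("misc/stroke_count")),
            "readings": [r.text for r in ch.iter("reading")],
            # Meanings without an m_lang attribute are assumed to be English.
            "meanings": [m.text for m in ch.iter("meaning")
                         if "m_lang" not in m.attrib],
        })
    return entries


if __name__ == "__main__":
    for entry in read_characters(SAMPLE_XML):
        print(entry)
```

The XML parser handles tokenising and character encoding; the code still has to know which elements it wants, so a change in the file's structure would still need a matching change here.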
I read that page yesterday and saw the comment
"At this stage the KANJIDIC2 file is officially released, but please
understand that it is still early days for the project and changes in the
structure may occur, so don't assume anything is set in concrete if you use
the file in a project."
So I assumed the format of kanjidic was the more stable of the two.
Can you tell us if the format of the XML kanjidic is likely to change enough
to break existing software?
Also, while I'm at it, a small erratum. On both the kanjidic and kanjidic2
documentation pages, De Roo's kanji book is listed as being published by
Bojinsha, but this should be "Bonjinsha".
Thanks.
In the end I copied out an old C file from my former cjdic project which
contained most of the codes for kanjidic, and edited it to parse kanjidic
completely. The next job is to plug the information into MySQL.
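For comparison, a rough idea of what parsing one line of the plain-text kanjidic involves, sketched in Python rather than C. This is not the cjdic code. It assumes the usual layout: the kanji, its JIS code, space-separated fields with a letter prefix (U for Unicode, G for grade, S for stroke count, F for frequency), readings in kana, and meanings in braces. The sample line is shortened for illustration, and only a handful of the field codes are handled.

```python
import re

# Shortened, illustrative line; real entries carry many more coded fields.
SAMPLE_LINE = "亜 3021 U4e9c B1 G8 S7 F1509 ア つ.ぐ {Asia} {rank next}"


def parse_kanjidic_line(line):
    """Parse one kanjidic line into a dict (a subset of the fields only)."""
    # Meanings sit in braces and may contain spaces, so extract them first.
    meanings = re.findall(r"\{([^}]*)\}", line)
    tokens = re.sub(r"\{[^}]*\}", "", line).split()

    entry = {
        "literal": tokens[0],
        "jis": tokens[1],
        "readings": [],
        "meanings": meanings,
    }
    for tok in tokens[2:]:
        if ord(tok[0]) > 127:
            # Non-ASCII tokens are taken to be kana readings. T1/T2 markers
            # are not interpreted, so name readings are not separated out.
            entry["readings"].append(tok)
        elif tok[0] == "U":
            entry["unicode"] = tok[1:]
        elif tok[0] == "G":
            entry["grade"] = int(tok[1:])
        elif tok[0] == "S":
            # Keep only the first stroke count; any later ones are ignored.
            entry.setdefault("strokes", int(tok[1:]))
        elif tok[0] == "F":
            entry["frequency"] = int(tok[1:])
        # Every other coded field is skipped in this sketch.
    return entry


if __name__ == "__main__":
    print(parse_kanjidic_line(SAMPLE_LINE))
```

A dict like this maps fairly directly onto a row in a database table, with readings and meanings going into their own tables keyed on the kanji.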
In sci.lang.japan said: Not to mention lashings of ginger beer.
Hurrah!
I don't know anything about XML, but surely the same amount of
parsing work is required for either format.