[email protected] uttered
I've taken a look at perlunicode and it seems to me that it's possible
to match Japanese characters by checking the class property.
I'm just wondering whether there is a way to check if the string
contains Japanese characters but not Chinese characters since some
Japanese characters are also Chinese characters.
Unicode uses the same code point for a given character regardless of
what language it's in. So, for instance, the character
大
is Unicode code point U+5927 regardless of whether you're writing Chinese or Japanese.
As I understand it, all the kanji characters (along with others) are
members of the Han Unicode script, so \p{Han} will match them
regardless of whether they are used in Japanese, Chinese, both, or
neither. If you want to differentiate them, it looks as though you
are going to have to compile (or find) lists of what you consider to
be Chinese Chinese characters and Japanese Chinese characters. =)
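
To make that concrete, here is a rough Perl sketch, not a definitive answer. It checks the code point mentioned above and shows \p{Han} matching a kanji. The classify() helper and its kana test are my own assumptions rather than anything from perlunicode: the idea is that kana never appear in Chinese text, so finding some suggests Japanese, while a Han-only string stays ambiguous.

```perl
use strict;
use warnings;
use utf8;    # this source file contains literal UTF-8 characters

# Hypothetical helper (an assumption for illustration, not a standard API):
# guess what a string contains based on Unicode script properties.
sub classify {
    my ($s) = @_;
    # Hiragana or Katakana never occur in Chinese text.
    return 'has kana' if $s =~ /[\p{Hiragana}\p{Katakana}]/;
    # Han characters alone could be Chinese, or all-kanji Japanese.
    return 'Han only' if $s =~ /\p{Han}/;
    return 'neither';
}

binmode STDOUT, ':encoding(UTF-8)';

# The same code point is used whichever language the character is in.
printf "code point: U+%04X\n", ord('大');

for my $s ('大', '大きい', 'hello') {
    printf "%s: %s\n", $s, classify($s);
}
```

A result of 'Han only' tells you nothing about the language by itself, so for those strings you are back to the lists described above.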
Rick