Decomposing combined characters

Ian Pilcher

Does the Java API provide a way to identify and/or decompose combined
characters? For example, I need to identify characters such as U+00C2
(LATIN CAPITAL LETTER A WITH CIRCUMFLEX) and decompose it into U+0041
(LATIN CAPITAL LETTER A) and U+0302 (COMBINING CIRCUMFLEX ACCENT).

Thanks!
 
John O'Conner

Ian said:
Does the Java API provide a way to identify and/or decompose combined
characters? For example, I need to identify characters such as U+00C2
(LATIN CAPITAL LETTER A WITH CIRCUMFLEX) and decompose it into U+0041
(LATIN CAPITAL LETTER A) and U+0302 (COMBINING CIRCUMFLEX ACCENT).

Thanks!


Although there is no public API for this, you could try
sun.text.Normalizer. Again, it is not a documented API, but I can tell
you that there is a normalize method.

Alternatively, you can use ICU4J, a publicly available library from
IBM.

Regards,
John O'Conner
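
As a sketch of the decomposition asked about: assuming Java 6 or newer, the JDK has a public java.text.Normalizer class. This is an assumption about the runtime and is not the sun.text.Normalizer mentioned in the reply. Canonical decomposition (NFD) splits U+00C2 into U+0041 and U+0302. The class name DecomposeDemo and its helper methods are hypothetical names used only for illustration.

```java
import java.text.Normalizer;

class DecomposeDemo {
    // Returns the canonical decomposition (Unicode form NFD) of the input.
    static String decompose(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    // True if the single code point is changed by NFD, i.e. it has a
    // canonical decomposition (as U+00C2 does).
    static boolean hasCanonicalDecomposition(int codePoint) {
        String s = new String(Character.toChars(codePoint));
        return !Normalizer.isNormalized(s, Normalizer.Form.NFD);
    }

    public static void main(String[] args) {
        String decomposed = decompose("\u00C2");
        // Print each resulting UTF-16 unit as a U+XXXX value.
        for (int i = 0; i < decomposed.length(); i++) {
            System.out.printf("U+%04X%n", (int) decomposed.charAt(i));
        }
        // prints U+0041 then U+0302
    }
}
```

Checking whether NFD changes a character covers the "identify" half of the question, and normalize covers the "decompose" half. On older runtimes without java.text.Normalizer, ICU4J is the option to look at.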
 
