Unicode Chinese


John W. Kennedy

Crouchez said:
cheers.

If I do

byte[] b = chinese.getBytes( "UTF-8" );

b.length = 6. But why 6, when I thought Chinese characters take up 2 bytes
per character?

So Chinese characters take up 3 bytes with UTF-8 and 2 with 'native
encodings'? Imagine the extra bandwidth for a Chinese server if it uses
UTF-8! +50%!

Which is why sensible people do not use UTF-8 for Chinese. UTF-8 is
designed to be efficient for text that is mostly ASCII with only the
occasional non-ASCII character. That does not describe Chinese. Use UTF-16.
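To make the byte counts concrete, here is a minimal sketch that measures the same strings in both encodings. The class and method names are made up for illustration, the two-character string is only a stand-in for the poster's `chinese` variable, and `StandardCharsets` assumes Java 7 or later. It uses UTF-16BE rather than "UTF-16" because Java's "UTF-16" encoder prepends a 2-byte byte-order mark, which would inflate the count.

```java
import java.nio.charset.StandardCharsets;

class EncodingSizes {
    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes in UTF-16 (big-endian, no byte-order mark).
    static int utf16Length(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        // Two Chinese characters, written as escapes so the source file's
        // own encoding does not matter. Hypothetical sample data.
        String chinese = "\u4e2d\u6587";
        String ascii = "hello";

        System.out.println("Chinese, UTF-8:  " + utf8Length(chinese));  // 6
        System.out.println("Chinese, UTF-16: " + utf16Length(chinese)); // 4
        System.out.println("ASCII, UTF-8:    " + utf8Length(ascii));    // 5
        System.out.println("ASCII, UTF-16:   " + utf16Length(ascii));   // 10
    }
}
```

The trade-off runs both ways: the Chinese sample is half as large again in UTF-8, while the ASCII sample doubles in UTF-16.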

--
John W. Kennedy
"The pathetic hope that the White House will turn a Caligula into a
Marcus Aurelius is as naïve as the fear that ultimate power inevitably
corrupts."
-- James D. Barber (1930-2004)
 

bugbear

Crouchez said:
I prefer the experiments personally - those technical manuals are usually
way too wordy

Them's the breaks.

Unless you're patient enough to experiment diligently,
there will probably be cases you haven't considered.

I'm not sure how long you'd have had to experiment
to discover that getBytes() is locale-dependent.
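A rough sketch of that pitfall, assuming Java 7 or later; the class and method names are invented for illustration. The no-argument getBytes() encodes with the platform default charset, so its result can differ from machine to machine, while naming the charset pins it down. No expected output is shown for the default case because it depends on where the code runs (recent JDKs default to UTF-8, older ones typically take the default from the OS locale).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DefaultCharsetDemo {
    // Encodes with an explicitly named charset: same bytes on every machine.
    static byte[] encodeExplicit(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical sample: two Chinese characters written as escapes.
        String chinese = "\u4e2d\u6587";

        // The charset that the no-argument getBytes() will use on this JVM.
        System.out.println("default charset:  " + Charset.defaultCharset());

        // Platform-dependent: the length varies with the default charset.
        System.out.println("getBytes():       " + chinese.getBytes().length + " bytes");

        // Portable: the charset is stated in the call.
        System.out.println("getBytes(UTF_8):  " + encodeExplicit(chinese).length + " bytes");
    }
}
```

Experimenting on a single machine would only ever show one of the possible defaults, which is why this is easy to miss.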

BugBear
 
