utf-8 conversion

Gerry Lawrence

Newbie question: I have a method that takes utf-8 data as input, but
it is not working correctly. Can someone tell me what variable type I
should use to accept the utf-8 properly and then what I need to do to
use it with the StringTokenizer class? I'm currently using:

public static void MyParse(String line)

Thanks,

Gerry
 
Minh Tran-Le

Gerry Lawrence said the following on 2003-12-03 16:18:
> Newbie question: I have a method that takes utf-8 data as input, but
> it is not working correctly. Can someone tell me what variable type I
> should use to accept the utf-8 properly and then what I need to do to
> use it with the StringTokenizer class? I'm currently using:
>
> public static void MyParse(String line)
>
> Thanks,
>
> Gerry

In Java, the String class already stores Unicode characters internally.
So if you are really getting input in UTF-8 encoding, you need to read
it into a byte[] array and then create the String from those bytes,
naming the encoding.

Something like this:
byte[] bytes = new byte[256];
// ... read your input into the bytes buffer, keeping the number of bytes read in n
String myString = new String(bytes, 0, n, "UTF-8"); // decode only the n bytes actually read
// ... at this point your string contains Unicode characters.

The StringTokenizer deals with characters and does not see the
byte-level details, so it works on the decoded String as usual.
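
As a rough sketch of how the two steps fit together, the following decodes a UTF-8 byte buffer and passes the resulting String to StringTokenizer. The class name Utf8TokenizeSketch and the helper tokenize are made up for illustration, and StandardCharsets assumes Java 7 or later; on older JVMs the "UTF-8" string form shown above would be used instead.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class Utf8TokenizeSketch {
    // Hypothetical helper: decode the first `length` bytes as UTF-8,
    // then split the decoded text on whitespace.
    static List<String> tokenize(byte[] utf8Bytes, int length) {
        String line = new String(utf8Bytes, 0, length, StandardCharsets.UTF_8);
        StringTokenizer tokenizer = new StringTokenizer(line);
        List<String> tokens = new ArrayList<>();
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Sample input standing in for bytes read from a file or socket.
        byte[] input = "caf\u00e9 na\u00efve r\u00e9sum\u00e9".getBytes(StandardCharsets.UTF_8);
        System.out.println(tokenize(input, input.length));
    }
}
```

The tokenizer only ever sees the decoded characters, so accented letters that take two bytes in UTF-8 stay inside their tokens.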

If you need to convert back to a UTF-8 byte array, you can use:
bytes = myString.getBytes("UTF-8");
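
To illustrate the round trip, here is a small sketch that encodes a String to UTF-8 and decodes it again. The class and method names are invented for this example, and StandardCharsets again assumes Java 7 or later. It also shows that the byte count and the character count can differ.

```java
import java.nio.charset.StandardCharsets;

public class Utf8RoundTripSketch {
    // Hypothetical helpers wrapping the two conversions.
    static byte[] encode(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    static String decode(byte[] utf8Bytes) {
        return new String(utf8Bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "na\u00efve";                      // 5 characters
        byte[] bytes = encode(original);
        System.out.println(original.length());               // prints 5
        System.out.println(bytes.length);                    // prints 6 (U+00EF takes 2 bytes)
        System.out.println(decode(bytes).equals(original));  // prints true
    }
}
```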
 
