Newbie problem: conversion with unicode

Francois

Hi all,

I have characters represented with escape sequences in a file:
\u00B3
\u2074
\u2075

These represent superscript 3, 4, and 5, respectively.
Now I read the file and get each of these as a String such as "\u00B3".
I would like to convert this representation into a new String object that
contains the actual superscript 3, 4, and 5.

What is the correct code for this? I get compilation errors or
conversion errors ...

Thanks a lot for any help!

Francois Rappaz
 
Harald

Francois said:
I have characters represented with escape sequences in a file:
\u00B3
\u2074
\u2075

Now I read the file and get each of these as a String such as "\u00B3".
I would like to convert this representation into a new String object that
contains the actual superscript 3, 4, and 5.

What is the correct code for this? I get compilation errors or
conversion errors ...

Strip the backslash, strip the u, convert the resulting four hex digits
to an int (Integer.parseInt() with radix 16 or the like should help), cast
the resulting int to char, append it to a StringBuilder (StringBuffer), and
convert the resulting StringBuilder to a String.
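
The steps above could be sketched roughly as follows. This is only a minimal illustration: the class name UnescapeOne and the method decode are hypothetical, and it assumes the input is exactly a backslash, a u, and four hex digits, with no checking.

```java
class UnescapeOne {
    // Hypothetical helper: turns the six-character escape text into the one character it names.
    // Assumes the input is a backslash, the letter u, and four hex digits (not validated here).
    static String decode(String rawCode) {
        // Strip the leading backslash and the u, leaving the four hex digits.
        String hex = rawCode.substring(2);
        // Convert the hex digits to an int (radix 16).
        int value = Integer.parseInt(hex, 16);
        // Cast to char, append to a StringBuilder, convert to String.
        StringBuilder sb = new StringBuilder();
        sb.append((char) value);
        return sb.toString();
    }

    public static void main(String[] args) {
        // The doubled backslash makes this the six characters as they would be read from the file.
        String fromFile = "\\u00B3";
        System.out.println(decode(fromFile).equals("\u00B3")); // prints true
    }
}
```

Appending to one StringBuilder in a loop would handle several escapes in a row.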

Harald.
 
Francois

Thanks a lot !

That is exactly what I needed, and it gives something like

String s = rawCode;
byte[] c = new byte[1];
try {
c[0] = Integer.decode("0x" + rawCode.substring(2)).byteValue();
} catch (NumberFormatException e) { e.printStackTrace(); }
String result = new String(c);

Could I have encoding problems with that code? Or should it work in
any situation?
TIA
Francois
 
HK

Francois said:
Thanks a lot !

That is exactly what I needed, and it gives something like

String s = rawCode;

You don't use the s below, but rawCode (should not matter).
byte[] c = new byte[1];

This should definitely be char[], not byte[].

try {
c[0] = Integer.decode("0x" + rawCode.substring(2)).byteValue();

You certainly want .intValue(), not .byteValue(). Remember that
char (in Java, and very much unlike C/C++) is 2 bytes long.
In addition, I would rather use

c[0] = Integer.parseInt(rawCode.substring(2), 16);
} catch (NumberFormatException e) { e.printStackTrace(); }
String result = new String(c);

Could I have encoding problems with that code? Or should it work in
any situation?

I assume you checked beforehand that rawCode starts with '\\' and 'u'
and that it contains at most 4 hex digits. If it contains 5 digits,
you lose something in the cast to char.
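
As a rough illustration of that check and of what gets lost with five digits, here is a small sketch. The names EscapeCheck, isBmpEscape, and decodeChecked are made up for this example, and the check is stricter than "at most 4": it assumes exactly four hex digits are wanted. StringBuilder.appendCodePoint is shown as one possible way to keep a larger value intact.

```java
class EscapeCheck {
    // Hypothetical check: a backslash, the letter u, then exactly four hex digits.
    static boolean isBmpEscape(String rawCode) {
        return rawCode.matches("\\\\u[0-9A-Fa-f]{4}");
    }

    // Decodes only after the check passes; rejects anything else.
    static String decodeChecked(String rawCode) {
        if (!isBmpEscape(rawCode)) {
            throw new IllegalArgumentException("not a four-digit escape: " + rawCode);
        }
        return String.valueOf((char) Integer.parseInt(rawCode.substring(2), 16));
    }

    public static void main(String[] args) {
        // Five hex digits do not fit in a 16-bit char: the cast keeps only the low 16 bits.
        int fiveDigits = Integer.parseInt("1D7D1", 16);
        char truncated = (char) fiveDigits;
        System.out.println(Integer.toHexString(truncated)); // prints d7d1

        // appendCodePoint keeps the full value by storing it as two chars (a surrogate pair).
        String full = new StringBuilder().appendCodePoint(fiveDigits).toString();
        System.out.println(full.length()); // prints 2
        System.out.println(full.codePointAt(0) == fiveDigits); // prints true
    }
}
```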

Harald.
 
Francois

Well, probably nobody cares, but the code above does not work: assigning
the int from parseInt to a char needs an explicit cast.
The following seems to be all right:

char c[] = new char[n];
s has been read from a file and contains "\u2079", for example
....
c[i] = (char) Integer.parseInt(s.substring(2), 16);
....
and for the whole array c
String result = new String(c);
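
For a line that mixes ordinary text with several such escapes, one possible approach is a regular expression instead of a fixed-size array. This is only a sketch under assumptions: the class UnescapeLine and method unescape are hypothetical names, and only four-digit escapes are recognized.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnescapeLine {
    // A backslash, the letter u, and four hex digits captured as group 1.
    private static final Pattern ESCAPE = Pattern.compile("\\\\u([0-9A-Fa-f]{4})");

    // Replaces every four-digit escape in the line with the character it names.
    static String unescape(String line) {
        Matcher m = ESCAPE.matcher(line);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            char c = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement guards against characters that appendReplacement treats specially.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(c)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Doubled backslashes give the literal text as it would be read from the file.
        String line = "x\\u00B3 + y\\u2074 + z\\u2075";
        System.out.println(unescape(line).equals("x\u00B3 + y\u2074 + z\u2075")); // prints true
    }
}
```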

Francois
 
