Convert UTF-8 encoded data from file into unicode escape

Fritz Bayer

Hi,

I'm looking for a small program that reads UTF-8 data from a file
and writes it as Unicode escapes into another text file.

Why am I looking for something like this? Well, I have a file that
contains UTF-8 encoded data.

I would like to build that data statically into my program, so I would
like to copy and paste it into a String constant (String s =
"\uxxxx\uxxxx...").

However, since I can't just open a viewer and copy and paste the
contents, I would have to convert them into Unicode escapes first.

Then I could copy and paste those escape codes into my program. I
thought that there must be some source code or a program around that
does this?

Fritz
 
Alex Kizub

Fritz:
Conversion from UTF-8 to Unicode is the job of Java's localization support.
So, let Java do what it's supposed to do.

public class a {
    public static void main(String[] a) throws Exception {
        java.text.DecimalFormat f;
        f = new java.text.DecimalFormat();
        f.applyPattern("\\u0000");

        java.io.FileReader fr = new java.io.FileReader("a.java");
        while (fr.ready()) {
            System.out.println(f.format(fr.read()));
        }
        fr.close();
    }
}

Alex Kizub.
 
Fritz Bayer

Thank you, Alex. I'm experiencing a small problem, though. Some of the
escapes look like:

\u65533

i.e. they are too long. I also noticed that none of the escapes contain
hexadecimal digits, which seems wrong, since Unicode escapes are written
in hexadecimal.
 
Alex Kizub

My apologies. Of course it should be hex numbers.
It's hard to think in the middle of the night.
But it's obvious, and you could have done the hex numbers yourself.

Here is one solution.
import java.io.FileInputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

public class a {
    public static void main(String[] args) throws Exception {
        // Decode explicitly as UTF-8; FileReader would use the platform default charset.
        try (Reader in = new InputStreamReader(
                new FileInputStream("a.java"), StandardCharsets.UTF_8)) {
            int c;
            // read() returns one UTF-16 code unit (0..65535), or -1 at end of input.
            while ((c = in.read()) != -1) {
                // Zero-pad to the four hex digits a Unicode escape requires.
                System.out.println(String.format("\\u%04x", c));
            }
        }
    }
}

Alex Kizub.
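
A side note on the \u65533 values reported earlier in the thread: 65533 is the decimal form of U+FFFD, the Unicode replacement character. A Java decoder emits it for bytes it cannot map, so a likely (though not confirmed) cause is that the file was decoded with a charset other than UTF-8. The sketch below illustrates the effect under that assumption; the class name ReplacementCharDemo and the helper escape are made up for this example, and US-ASCII merely stands in for whatever the platform default charset was.

```java
import java.nio.charset.StandardCharsets;

public class ReplacementCharDemo {
    // Hypothetical helper: formats each UTF-16 code unit as a four-digit hex escape.
    public static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            sb.append(String.format("\\u%04x", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The two bytes that encode U+00E4 (a with diaeresis) in UTF-8.
        byte[] utf8 = {(byte) 0xC3, (byte) 0xA4};

        // Decoded with the right charset: one character.
        String right = new String(utf8, StandardCharsets.UTF_8);
        // Decoded with a charset that cannot map these bytes:
        // each byte is replaced by U+FFFD.
        String wrong = new String(utf8, StandardCharsets.US_ASCII);

        System.out.println(escape(right));
        System.out.println(escape(wrong));
        // The decimal value seen in the thread, printed in hex.
        System.out.println(Integer.toHexString(65533));
    }
}
```

Run as written, the first line shows a single escape for U+00E4, the second shows two escapes for U+FFFD, and the third prints fffd. If the real input produces U+FFFD even when decoded as UTF-8, the file itself probably contains malformed byte sequences.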
 
Michael Borgwardt

Fritz said:
I'm looking for a little program, which reads utf-8 data from a file
and writes it in the form of unicode escape into another text file.

Sun distributes that program with its JDK/SDK. It's called
"native2ascii".
 
Fritz Bayer

Michael Borgwardt said:
Sun distributes that program with its JDK/SDK. It's called
"native2ascii".

Thanks for the tip. I looked at it, and there are just two issues with
the program. The first is that it does not encode all ASCII characters
as Unicode escapes.

So if non-printable characters occur, I will not be able to copy and
paste them into my source code. That's why I'm looking for something
that converts everything.

The second issue I ran into is that even if I copy and paste only
Unicode escapes into a string, I still have to escape some of the
characters, for example the ". This seems very cumbersome. I guess I
would also have to escape newlines, tabs, and so on to be sure that
everything gets imported correctly. Oh my...
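
One way to address both issues is sketched below. The Java compiler translates Unicode escapes before it tokenizes the source, so writing the double quote as \u0022 or a line feed as \u000a inside a string literal behaves exactly like the raw character and breaks the literal. The sketch therefore uses the ordinary backslash escapes for those few characters and a Unicode escape for everything else that is not printable ASCII. The class name JavaLiteralEscaper and the method toLiteralBody are hypothetical names chosen for this example; this is a minimal sketch, not a complete tool.

```java
public class JavaLiteralEscaper {
    // Hypothetical helper: returns text that can be pasted between the
    // double quotes of a Java string literal.
    public static String toLiteralBody(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                // These must use backslash escapes; a Unicode escape would be
                // translated back to the raw character before tokenizing.
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c >= 0x20 && c < 0x7f) {
                        // Printable ASCII is kept readable.
                        sb.append(c);
                    } else {
                        // Control characters and non-ASCII code units become
                        // four-digit hex escapes.
                        sb.append(String.format("\\u%04x", (int) c));
                    }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Sample input: quotes, a line feed, and one non-ASCII character.
        String sample = "say \"hi\"\n" + (char) 0xE4;
        System.out.println('"' + toLiteralBody(sample) + '"');
    }
}
```

Characters outside the Basic Multilingual Plane come out as two escapes (a surrogate pair), which is what a Java string literal expects. To escape printable ASCII as well, the range check in the default branch could simply be dropped.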
 