Unicode escapes and String literals?

Lew

markspace said:
David said:
Cute. But presupposing that the OP isn't the idiot some people seem to
have assumed, I suspect he meant something more like

String line = someBufferedFile.readLine();
... change all \u escapes into unicode in "line" ... [1]

That was not obvious to me, hence my question as to what he did mean.
Maybe. But your code above is obvious, imo. Either Knute had a brain
fart and forgot about \\ to escape a backslash, or he ran into some other
problem.

My point was that there's a very simple pre-compiler for Java. It
translates all \u-escapes into characters before the compiler proper
sees it. There's no difference to the Java compiler between "fed" and
"\u0066\u0065\u0064". It literally can't tell the difference.

That was also the point of my SSCCE.
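
For anyone who wants to see the pre-compiler point in action, here is a minimal sketch. The class and field names are made up for illustration. It only shows that the two literals compile to the same three-character string.

```java
public class EscapeDemo {
    // Plain literal.
    public static final String PLAIN = "fed";

    // Same literal spelled with Unicode escapes; they are translated
    // before the compiler proper ever looks at the token.
    public static final String ESCAPED = "\u0066\u0065\u0064";

    public static void main(String[] args) {
        System.out.println(PLAIN.equals(ESCAPED)); // true
        System.out.println(ESCAPED.length());      // 3
    }
}
```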
 
Roedy Green

I just had a great revelation as I was putting together my SSCCE for the
question I was going to ask. So it has changed my question. How do I
do the conversion of unicode escape sequences to a String that are done
by string literals?

String s = "\u0066\u0065\u0064";

becomes "fed" but if you create a String with \u0066\u0065\u0064 in it
without using the literal it stays \u0066\u0065\u0064. Is there a built
in mechanism in Java for doing that translation to a String?

have a look at native2ascii

IIRC it uses sequences like that in its ASCII representation which you
can then convert to any encoding you like.

see http://mindprod.com/jgloss/encoding.html#NATIVE2ASCII

A little finite state machine should handle that fairly easily.
If you find that difficult, I would write one for you.
 
Lew

rossum said:
StringBuilder sb = new StringBuilder(18);
sb.append('\\');
sb.append("u0066");
sb.append('\\');
sb.append("u0065");
sb.append('\\');
sb.append("u0064");

String ss = sb.toString();
System.out.println(ss);

Produces: \u0066\u0065\u0064

Which still leaves the question why?

This has been explained to death upthread already.

Those are not Unicode escapes, that's why.

You have created a String that comprises backslashes, the letter "u" and
various digits. That happens at runtime, so it is not a literal at all.

There is no way for the pre-compiler to see those and convert them.

That code sequence is exactly equivalent to this one:

StringBuilder sb = new StringBuilder(\u0031\u0038);
sb.append('\u005c\u005c\u0027)\u003b
sb.append("\u0075\u0030\u0030\u0036\u0036");
sb.append('\u005c\u005c\u0027)\u003b
sb.append("u006\u0035\u0022);
sb.append('\u005c\u005c\u0027)\u003b
sb.append(\u0022\u00750064");

Unicode escape sequence processing is a pre-compiler operation, not a compiler
operation and not a run-time operation.

To do what you want you have to parse the string and convert it yourself.
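
A rough sketch of what "parse the string and convert it yourself" could look like follows. The class and method names are hypothetical, and this is a simplification rather than a faithful copy of what the pre-compiler does: it only handles a backslash, a single "u" and exactly four hex digits. It does not treat a doubled backslash specially, and it does not accept repeated "u" characters, both of which the language rules allow for. Anything that does not match is copied through unchanged.

```java
public class UnicodeUnescaper {

    /** Hypothetical helper: replaces backslash-u-XXXX runs with the char they name. */
    public static String decode(String in) {
        StringBuilder out = new StringBuilder(in.length());
        int i = 0;
        while (i < in.length()) {
            char c = in.charAt(i);
            // Candidate escape: backslash, 'u', then four more characters.
            if (c == '\\' && i + 5 < in.length() && in.charAt(i + 1) == 'u') {
                int value = 0;
                boolean ok = true;
                for (int k = i + 2; k < i + 6; k++) {
                    int digit = Character.digit(in.charAt(k), 16);
                    if (digit < 0) {      // not a hex digit, so not an escape
                        ok = false;
                        break;
                    }
                    value = value * 16 + digit;
                }
                if (ok) {
                    out.append((char) value);
                    i += 6;               // skip the whole six-character escape
                    continue;
                }
            }
            out.append(c);                // ordinary character, copy as-is
            i++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Built at runtime, so the string really holds backslashes and digits.
        String raw = "\\u0066\\u0065\\u0064";
        System.out.println(raw);          // the 18 raw characters
        System.out.println(decode(raw));  // fed
    }
}
```

A line read from a file can be passed to decode() the same way as the raw string in main().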
 
Arne Vajhøj

have a look at native2ascii

IIRC it uses sequences like that in its ASCII representation which you
can then convert to any encoding you like.

see http://mindprod.com/jgloss/encoding.html#NATIVE2ASCII

First: it does not do what Knute asked for. It actually
generates the escape sequences that Knute is trying to
convert from.

Second: even if it had done what Knute asked for, then:
- create a file with the String
- use Runtime exec (or ProcessBuilder) to run native2ascii
- read a new String from the new file
seems like the least efficient solution possible.

A little finite state machine should handle that fairly easily.
If you find that difficult, I would write one for you.

Based on the above: hmmmmmmm.

Arne
 
Roedy Green

I just had a great revelation as I was putting together my SSCCE for the
question I was going to ask. So it has changed my question. How do I
do the conversion of unicode escape sequences to a String that are done
by string literals?

The code you want exists inside Quoter.

see FromJavaStringLiteral
and ToJavaStringLiteral classes.

Source is available from http://mindprod.com/products.html#QUOTER
you can play with it as an Applet at
http://mindprod.com/applet/quoter.html
 
