Unicode support

zakasbanda

Hello All

I need some help from you folks. I want my J2EE webapp to accept
international characters.

Here is what I have done so far:
1. Wrote a filter that sets the request and response encoding to UTF-8.
2. The Oracle database uses UTF-8.

Now when I enter a Hindi character on the JSP page and save it to the
database, I see the characters stored in &#dddd; format. Here are my questions:
1. Is it okay to see characters in this format in the database?
2. We HTML-escape all our string text fields, which escapes the &, so
&#dddd; shows as-is in the browser, which is a problem.

Any help or suggestions will be appreciated.

Thanks,
-Sandy
 

Stanimir Stamenkov

Mon, 6 Oct 2008 01:01:02 -0700 (PDT), zakasbanda wrote:
Now when I enter a Hindi character on the JSP page and save it to the
database, I see the characters stored in &#dddd; format. Here are my questions:
1. Is it okay to see characters in this format in the database?
2. We HTML-escape all our string text fields, which escapes the &, so
&#dddd; shows as-is in the browser, which is a problem.

It is a browser issue. For backwards compatibility, browsers encode
submitted HTML form data using the document encoding or, if
specified, the encoding given in the 'accept-charset' attribute of the
FORM element (though I remember the latter not working in all browsers
at some point in the past). Whenever a character can't be encoded
in the target encoding, it is converted to an HTML numeric character
reference. The only sure fix in this case is to serve the HTML
document in a UTF encoding such as UTF-8, which can encode the entire
Unicode repertoire.
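
To make the fallback above concrete, here is a rough Java sketch. It only
imitates the behaviour described and is not how any particular browser is
implemented. The names FormEncodingSketch, encodeLikeBrowser and escapeHtml
are made up for illustration, and ISO-8859-1 is just an assumed page
encoding. It also shows why escaping the & afterwards makes the reference
appear literally on the page.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

class FormEncodingSketch {

    // Imitates the fallback: code points the target charset cannot
    // encode are replaced by decimal character references (&#dddd;).
    static String encodeLikeBrowser(String input, Charset target) {
        CharsetEncoder encoder = target.newEncoder();
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            String ch = new String(Character.toChars(cp));
            if (encoder.canEncode(ch)) {
                out.append(ch);
            } else {
                out.append("&#").append(cp).append(';');
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    // Minimal HTML escaping (hypothetical helper, not the poster's code).
    static String escapeHtml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        String hindi = "\u0905"; // DEVANAGARI LETTER A, code point 2309

        // Page served as ISO-8859-1 (assumed): the character is not encodable.
        String submitted = encodeLikeBrowser(hindi, StandardCharsets.ISO_8859_1);
        System.out.println(submitted);             // &#2309;

        // Escaping the stored value turns & into &amp;, so the browser
        // displays the text "&#2309;" instead of the Hindi letter.
        System.out.println(escapeHtml(submitted)); // &amp;#2309;

        // Page served as UTF-8: the character passes through unchanged.
        System.out.println(
            encodeLikeBrowser(hindi, StandardCharsets.UTF_8).equals(hindi)); // true
    }
}
```

If the form page really is served as UTF-8, the first case should not
arise, and the database would hold the actual characters rather than
references.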
 

Roedy Green

The first thing I would do is have a look at the page sent to the
browser.

You should see something like this in it:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

Further, check that the Hindi characters in the page are indeed UTF-8
encoded. See http://mindprod.com/jgloss/utf.html
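
One way to do that check, as a rough sketch in Java: dump the bytes and see
whether they decode as strict UTF-8. The class and method names (Utf8Check,
utf8Hex, isValidUtf8) are my own, and U+0905 is just a sample Hindi letter.
In a real test you would feed in the bytes captured from the page or the
request instead.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class Utf8Check {

    // Returns the UTF-8 bytes of a string as space-separated hex pairs.
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // True if the bytes decode as UTF-8 with no malformed sequences.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // DEVANAGARI LETTER A should be three bytes in UTF-8.
        System.out.println(utf8Hex("\u0905")); // E0 A4 85

        byte[] good = {(byte) 0xE0, (byte) 0xA4, (byte) 0x85};
        byte[] truncated = {(byte) 0xE0, (byte) 0xA4};
        System.out.println(isValidUtf8(good));      // true
        System.out.println(isValidUtf8(truncated)); // false
    }
}
```

If the page contains the ASCII text &#2309; rather than the bytes E0 A4 85,
the character was already turned into a reference before it reached the page.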

Then use Wireshark to snoop on the message from the browser to the
server. See http://mindprod.com/jgloss/wireshark.html

Make sure the browser is including UTF-8 as one of its preferred
response encodings, and that the message itself is UTF-8 encoded. See
http://mindprod.com/jgloss/http.html

In any problem, the first job is to localise who is screwing up: the
browser, the server, or the database. Then you can work on correcting it.
 
