how to post UTF-8 values to a servlet

Andy Fish

Hi,

I have a form with some text boxes, and I'm trying to post the data to a
servlet as UTF-8 (which I would have thought was the default, but apparently
it isn't).

The HTML file containing the form is definitely encoded in UTF-8, and
the form tag looks like this:

<form action="http://localhost:8080/foo/servlet" method="post" id="form1"
charset="UTF-8" name="form1">
<INPUT type="text" name="foo">
</form>

In the servlet I'm just calling request.getParameter("foo");

If I type in an English pound sign £ (this is the English currency symbol,
not #), I get Â£ (which is A circumflex followed by the pound sign).

I've been trying different variations for half a day now and it's
starting to get rather frustrating. Can anyone point me in the right
direction?

TIA

Andy
 
Collin VanDyck

How are you checking the value of getParameter("foo")? If you are
printing it to the console and you are using Windows, you will very
likely get gibberish, as the Windows console does not display Unicode
properly.

If that is the case (console), then try writing the value to a file instead
and open the text file with a Unicode-capable text editor.

Sorry if you've tried this already -- I beat my head into the wall for a
week before asking this same question on alt.text.xml and getting this
answer rather quickly.

Collin
 
Andy Fish

Collin VanDyck said:
How are you checking the value of getParameter("foo")? If you are
printing it to the console and you are using Windows, you will very likely
get gibberish, as the Windows console does not display Unicode properly.

If that is the case (console), then try writing the value to a file instead
and open the text file with a Unicode-capable text editor.

No, I'm looking at it in the debugger. I can see that the string is
2 characters long.

Andy
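
A two-character string is what you would expect if the browser sent the pound sign as UTF-8 (two bytes, 0xC2 0xA3) and something on the server side decoded those bytes as ISO-8859-1, one character per byte. The sketch below reproduces just that effect with the standard library. The class and method names are made up for illustration, and it makes no claim about what any particular servlet container does internally.

```java
import java.nio.charset.StandardCharsets;

class PoundSignMojibake {
    // Encode the text as UTF-8 (what a UTF-8 form would send),
    // then decode the same bytes as ISO-8859-1 (one char per byte).
    static String misdecode(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // U+00A3 is the pound sign; written as an escape so the
        // source file's own encoding cannot affect the result.
        String garbled = misdecode("\u00A3");

        System.out.println(garbled.length()); // 2
        // Print the code points in hex rather than the characters,
        // so console encoding does not confuse the picture.
        System.out.println(Integer.toHexString(garbled.charAt(0))
                + " " + Integer.toHexString(garbled.charAt(1))); // c2 a3
    }
}
```

U+00C2 is A circumflex and U+00A3 is the pound sign, which matches the value seen in the debugger.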
 
Andy Fish

FWIW, my solution was to call request.setCharacterEncoding("UTF-8") at the
top of the doPost method.

I'm not entirely happy with hard-coding it this way, but hey it seems to
work.
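
One thing to watch: setCharacterEncoding only takes effect if it is called before the first getParameter call (or before getReader), because that is when the container decodes the body. To show why the charset matters, here is a rough stand-in that decodes a posted value the way a form body is percent-decoded, using java.net.URLDecoder from the standard library. It is only an illustration of the decoding step, not the container's actual code, and the class and method names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class FormDecodeDemo {
    // Percent-decode one form value using the named charset.
    static String decodeValue(String rawValue, String charsetName) {
        try {
            return URLDecoder.decode(rawValue, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // A UTF-8 page posts the pound sign as the two bytes C2 A3.
        String rawValue = "%C2%A3";

        String asLatin1 = decodeValue(rawValue, "ISO-8859-1");
        String asUtf8 = decodeValue(rawValue, "UTF-8");

        System.out.println(asLatin1.length()); // 2
        System.out.println(asUtf8.length());   // 1
        System.out.println(Integer.toHexString(asUtf8.charAt(0))); // a3
    }
}
```

If hard-coding the call in every doPost is the annoyance, the same call can be made once in a servlet filter that runs ahead of the servlets, with the charset name read from configuration.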
 
