POSTing: can character encoding be specified?

Mickey Segal

When you use a URLConnection, can you impose a Unicode character encoding
for the text POSTed? I imagine one could add the encoding like this:

BufferedWriter bufferedWriter = new BufferedWriter(
    new OutputStreamWriter(urlConnection.getOutputStream(), "UTF8"));
bufferedWriter.write(query, 0, query.length());

but I don't see an encoding parameter in the servlet's
ServletRequest.getParameter() method. Is encoding supposed to be recognized
automatically? If not, is there a way to specify character encoding when
POSTing to a servlet and have it recognized?
 
Bryce

In the header of your post, set the "Content-Type" to something like:
application/x-www-form-urlencoded;charset=UTF8
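
For what it's worth, here is a minimal sketch of how that header might be set on a URLConnection. The class and method names (PostWithCharset, prepare, send) are made up for illustration and are not from any library. The sketch spells the charset "UTF-8" with a hyphen, since that is the registered name; "UTF8" is a Java alias that a given server may or may not accept.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.URLConnection;

class PostWithCharset {
    // Hypothetical helper: opens (but does not yet connect) a URLConnection
    // and labels the request body as UTF-8 form data.
    static URLConnection prepare(String url) {
        try {
            URLConnection conn = URI.create(url).toURL().openConnection();
            conn.setDoOutput(true); // we intend to write a request body
            conn.setRequestProperty("Content-Type",
                    "application/x-www-form-urlencoded; charset=UTF-8");
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes an already form-encoded body as UTF-8 bytes.
    // Calling getOutputStream() is what actually contacts the server.
    static void send(URLConnection conn, String formBody) throws IOException {
        OutputStream out = conn.getOutputStream();
        try {
            out.write(formBody.getBytes("UTF-8"));
        } finally {
            out.close();
        }
    }
}
```

Something like send(prepare("http://localhost/servlet/Echo"), body) would then do the POST; the URL here is only a placeholder.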
 
Mickey Segal

Bryce said:
In the header of your post, set the "Content-Type" to something like:
application/x-www-form-urlencoded;charset=UTF8

Thanks. I knew it had to be there somewhere but this kind of stuff doesn't
get covered well in most Java materials.
 
Mickey Segal

I also see there is a form of URLEncoder.encode with character encoding
specified in addition to URL encoding. However, this is not available in
Java 1.1 so if we use it we need to leave our Microsoft JVM users behind,
and they are still about half of our users.
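
One possible workaround for the Java 1.1 users, sketched under the assumption that String.getBytes("UTF8") is available there: do the percent-encoding by hand on the UTF-8 bytes. The class name Utf8FormEncoder is hypothetical, and the set of characters left unescaped is meant to mirror what URLEncoder leaves alone.

```java
import java.io.UnsupportedEncodingException;

class Utf8FormEncoder {
    private static final String HEX = "0123456789ABCDEF";

    // Percent-encodes s as UTF-8 form data using only Java 1.1-era APIs.
    static String encode(String s) {
        byte[] bytes;
        try {
            bytes = s.getBytes("UTF8"); // assumption: "UTF8" is the name a 1.1 JVM accepts
        } catch (UnsupportedEncodingException e) {
            throw new RuntimeException("UTF8 not supported: " + e.getMessage());
        }
        StringBuffer out = new StringBuffer(bytes.length * 3);
        for (int i = 0; i < bytes.length; i++) {
            int b = bytes[i] & 0xFF; // treat the byte as unsigned
            boolean safe = (b >= 'a' && b <= 'z') || (b >= 'A' && b <= 'Z')
                    || (b >= '0' && b <= '9')
                    || b == '-' || b == '_' || b == '.' || b == '*';
            if (safe) {
                out.append((char) b);
            } else if (b == ' ') {
                out.append('+'); // form encoding writes a space as '+'
            } else {
                out.append('%').append(HEX.charAt(b >> 4)).append(HEX.charAt(b & 0x0F));
            }
        }
        return out.toString();
    }
}
```

With this, Utf8FormEncoder.encode("caf\u00e9") gives "caf%C3%A9", which is what the two-argument URLEncoder.encode would produce for "UTF-8".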

This is getting complicated. To send UTF-8 information via URL connection,
which of the following three encodings need to be done (and do I have the
right UTF-8 versus UTF8 syntax):

1. Use URLEncoder.encode("my string contents", "UTF-8")
2. Set "Content-Type" to: application/x-www-form-urlencoded;charset=UTF8
3. Use new OutputStreamWriter(myOutputStream, "UTF8")
 
Bryce

Mickey Segal said:
This is getting complicated. To send UTF-8 information via URL connection,
which of the following three encodings need to be done (and do I have the
right UTF-8 versus UTF8 syntax):

1. Use URLEncoder.encode("my string contents", "UTF-8")

That is for URL-encoding query strings. It doesn't really apply to POSTing.

2. Set "Content-Type" to: application/x-www-form-urlencoded;charset=UTF8

This works.

3. Use new OutputStreamWriter(myOutputStream, "UTF8")

This will encode the stream as UTF-8, but I don't think the server will
necessarily know what the encoding is.
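
To make the three steps concrete, here is a rough sketch of how the request body might be built. One caveat on step 1: as far as I understand the application/x-www-form-urlencoded format, a POST body uses the same percent-encoding as a query string, so the values would still need to go through URLEncoder. The class FormBody and its methods are hypothetical names.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class FormBody {
    // Encodes one name=value pair; non-ASCII characters become %XX
    // escapes of their UTF-8 bytes.
    static String pair(String name, String value) {
        try {
            return URLEncoder.encode(name, "UTF-8") + "="
                    + URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always present in practice
        }
    }

    // Joins {name, value} pairs with '&' into a complete form body.
    static String build(String[][] fields) {
        StringBuffer body = new StringBuffer();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                body.append('&');
            }
            body.append(pair(fields[i][0], fields[i][1]));
        }
        return body.toString();
    }
}
```

The resulting string contains only ASCII characters, so the encoding passed to OutputStreamWriter in step 3 makes little difference to the bytes on the wire. The charset in the Content-Type header (step 2) is what tells the server how to interpret the %XX escapes.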
 
Bryce

Mickey Segal said:
Thanks. I knew it had to be there somewhere but this kind of stuff doesn't
get covered well in most Java materials.

Yep, because it's not a Java issue; it's a standard HTTP issue.
 
Mickey Segal

The more I look into this, the more confused I get. I thought one had to do
URL encoding when POSTing from a Java applet, as in this JavaWorld code:
http://www.javaworld.com/javaworld/javatips/jw-javatip34.html
Is this JavaWorld article just wrong about the need to use URL encoding?

Things seem much more complicated with UTF-8 encoding on top of the URL
encoding.

Java 2 has URLEncoder and URLDecoder classes that allow you to specify
"UTF-8" encoding. However, the URL decoding happens automatically in the
receiving servlet, and that gets in the way of the UTF-8 decoding.

It is hard to imagine that I am the first one trying to use the UTF-8 form of
URL encoding, particularly since the non-UTF-8 form is deprecated. Is there
some reasonable way to get this all to work? I am trying to do the
following:

1. POST strings from an applet to a servlet. The strings sometimes have
non-Latin characters, so I presume I need UTF-8 encoding.
2. Receive the strings in the servlet, reconstituting each string from the
UTF-8 encoding.
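
A rough illustration of what seems to go wrong on the receiving side, using plain URLDecoder to stand in for the servlet container's automatic decoding (that stand-in is an assumption, and the class FormDecode is a hypothetical name). If the container decodes the %XX escapes as ISO-8859-1 rather than UTF-8, each non-Latin character arrives as several garbage characters. In a real servlet the usual remedy, if the container supports Servlet 2.3 or later, is to call request.setCharacterEncoding("UTF-8") before the first getParameter() call; otherwise the string can be repaired after the fact as in recover() below.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class FormDecode {
    // Decodes one percent-encoded value using the given charset name.
    static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Repairs a value that was decoded as ISO-8859-1 although the client
    // sent UTF-8: recover the original bytes, then reinterpret them.
    static String recover(String misdecoded) {
        try {
            return new String(misdecoded.getBytes("ISO-8859-1"), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

For example, decode("caf%C3%A9", "UTF-8") returns the intended four-character string, decode("caf%C3%A9", "ISO-8859-1") returns five characters with two wrong ones at the end, and passing that result to recover() restores the original.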
 
