how to post UTF-8 values to a servlet

Discussion in 'Java' started by Andy Fish, Dec 20, 2004.

  1. Andy Fish

    Andy Fish Guest

    Hi,

    I have a form with some text boxes, and I'm trying to post the data to a
    servlet in utf-8 format (which I would have thought would be the default but
    it appears not?)

    the HTML file containing the form itself is definitely encoded in UTF-8, and
    the form tag looks like this:

    <form action="http://localhost:8080/foo/servlet" method="post" id="form1"
    charset="UTF-8" name="form1">
    <INPUT type="text" name="foo">
    </form>

    In the servlet I'm just calling request.getParameter("foo");

    If I type in an English pound sign £ (this is the English currency symbol,
    not #), I get Â£ (which is A circumflex followed by the pound sign).

    I've been playing with various variations for 1/2 a day now and it's
    starting to get me rather frustrated. Can anyone point me in the right
    direction.

    TIA

    Andy
     
    Andy Fish, Dec 20, 2004
    #1

  2. > the HTML file containing the form itself is definitely encoded in UTF-8, and
    > the form tag looks like this:
    >
    > <form action="http://localhost:8080/foo/servlet" method="post" id="form1"
    > charset="UTF-8" name="form1">
    > <INPUT type="text" name="foo">
    > </form>
    >
    > In the servlet I'm just calling request.getParameter("foo");
    >
    > If I type in an English pound sign £ (this is the English currency symbol,
    > not #), I get Â£ (which is A circumflex followed by the pound sign).
    >
    > I've been playing with various variations for 1/2 a day now and it's
    > starting to get me rather frustrated. Can anyone point me in the right
    > direction.
    >



    How are you testing the value of getParameter("foo")? If you are
    printing it to the console and you are using Windows, you will very
    likely get gibberish, as the Windows console does not display Unicode
    properly.

    If that is the case (console output), try writing the value to a file
    instead and open the text file with a Unicode-capable text editor.

    Sorry if you've tried this already -- I beat my head into the wall for a
    week before asking this same question on alt.text.xml and getting this
    answer rather quickly.

    Collin
     
    Collin VanDyck, Dec 20, 2004
    #2

  3. Andy Fish

    Andy Fish Guest

    "Collin VanDyck" wrote in message
    news:ZNGxd.5038469$...
    >> the HTML file containing the form itself is definitely encoded in UTF-8,
    >> and the form tag looks like this:
    >>
    >> <form action="http://localhost:8080/foo/servlet" method="post" id="form1"
    >> charset="UTF-8" name="form1">
    >> <INPUT type="text" name="foo">
    >> </form>
    >>
    >> In the servlet I'm just calling request.getParameter("foo");
    >>
    >> If I type in an English pound sign £ (this is the English currency
    >> symbol, not #), I get Â£ (which is A circumflex followed by the pound
    >> sign).
    >>
    >> I've been playing with various variations for 1/2 a day now and it's
    >> starting to get me rather frustrated. Can anyone point me in the right
    >> direction.
    >>

    >
    >
    > How are you testing the value of getParameter("foo")? If you are
    > printing it to the console and you are using Windows, you will very likely
    > get gibberish, as the Windows console does not display Unicode properly.
    >
    > If that is the case (console output), try writing the value to a file
    > instead and open the text file with a Unicode-capable text editor.
    >


    No, I'm looking at it in the debugger. I can see that the string is of
    length 2 characters.
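
    A length of 2 is what you would expect if the two UTF-8 bytes of the
    pound sign (0xC2 0xA3) were each decoded as a separate ISO-8859-1
    character; ISO-8859-1 is the default the Servlet specification gives for
    a request that declares no charset. Below is a minimal sketch of that
    effect in plain Java, with no servlet involved. The class and method
    names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any servlet API.
class PoundSignDemo {

    // Encode the text as UTF-8 (what a browser sends for a UTF-8 page),
    // then decode those same bytes with the given charset.
    static String decodeUtf8BytesAs(String text, Charset charset) {
        byte[] wire = text.getBytes(StandardCharsets.UTF_8);
        return new String(wire, charset);
    }

    // Render each char as a hex value so the result does not depend on
    // the console's own encoding.
    static String hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(Integer.toHexString(c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String pound = "\u00A3"; // the pound sign
        String wrong = decodeUtf8BytesAs(pound, StandardCharsets.ISO_8859_1);
        String right = decodeUtf8BytesAs(pound, StandardCharsets.UTF_8);
        System.out.println(wrong.length() + " chars: " + hex(wrong)); // 2 chars: c2 a3
        System.out.println(right.length() + " chars: " + hex(right)); // 1 chars: a3
    }
}
```

    U+00C2 is A circumflex and U+00A3 is the pound sign, which matches the
    two characters seen in the debugger.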

    Andy

    > Sorry if you've tried this already -- I beat my head into the wall for a
    > week before asking this same question on alt.text.xml and getting this
    > answer rather quickly.
    >
    > Collin
    >
     
    Andy Fish, Dec 21, 2004
    #3
  4. Andy Fish

    Andy Fish Guest

    "Andy Fish" wrote in message
    news:y2Fxd.3780$...
    > Hi,
    >
    > I have a form with some text boxes, and I'm trying to post the data to a
    > servlet in utf-8 format (which I would have thought would be the default
    > but it appears not?)
    >
    > the HTML file containing the form itself is definitely encoded in UTF-8,
    > and the form tag looks like this:
    >
    > <form action="http://localhost:8080/foo/servlet" method="post" id="form1"
    > charset="UTF-8" name="form1">
    > <INPUT type="text" name="foo">
    > </form>
    >
    > In the servlet I'm just calling request.getParameter("foo");
    >
    > If I type in an English pound sign £ (this is the English currency symbol,
    > not #), I get Â£ (which is A circumflex followed by the pound sign).
    >
    > I've been playing with various variations for 1/2 a day now and it's
    > starting to get me rather frustrated. Can anyone point me in the right
    > direction.
    >


    FWIW, my solution was to call request.setCharacterEncoding("UTF-8") at the
    top of the doPost method.

    I'm not entirely happy with hard-coding it this way, but hey it seems to
    work.
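
    For anyone wondering why the call has to sit at the top of doPost: the
    ServletRequest documentation says setCharacterEncoding must be called
    before any parameter is read, otherwise it has no effect. The sketch
    below imitates that behaviour with a small stand-in class so it runs
    without a servlet container. FakeFormRequest is hypothetical and is not
    the real HttpServletRequest; the body "foo=%C2%A3" is what a browser
    would typically post for a pound sign typed into a UTF-8 page.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a servlet request. It only mimics how a
// container decodes a urlencoded POST body; it is not the servlet API.
class FakeFormRequest {
    private final String rawBody;            // e.g. "foo=%C2%A3"
    private String encoding = "ISO-8859-1";  // servlet default when none is declared
    private Map<String, String> params;      // parsed lazily, then cached

    FakeFormRequest(String rawBody) {
        this.rawBody = rawBody;
    }

    // Plays the role of ServletRequest.setCharacterEncoding(String).
    void setCharacterEncoding(String encoding) {
        this.encoding = encoding;
    }

    // Plays the role of ServletRequest.getParameter(String): the body is
    // decoded once, on first access, with whatever encoding is set then.
    String getParameter(String name) {
        if (params == null) {
            params = new HashMap<>();
            for (String pair : rawBody.split("&")) {
                int eq = pair.indexOf('=');
                if (eq < 0) continue; // skip fields without a value
                params.put(decode(pair.substring(0, eq)),
                           decode(pair.substring(eq + 1)));
            }
        }
        return params.get(name);
    }

    private String decode(String s) {
        try {
            return URLDecoder.decode(s, encoding);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("unknown encoding: " + encoding, e);
        }
    }

    public static void main(String[] args) {
        FakeFormRequest early = new FakeFormRequest("foo=%C2%A3");
        early.setCharacterEncoding("UTF-8");                   // before first read
        System.out.println(early.getParameter("foo").length()); // 1

        FakeFormRequest late = new FakeFormRequest("foo=%C2%A3");
        late.getParameter("foo");                              // parsed as ISO-8859-1
        late.setCharacterEncoding("UTF-8");                    // too late, no effect
        System.out.println(late.getParameter("foo").length());  // 2
    }
}
```

    In a real servlet the equivalent is the single line
    request.setCharacterEncoding("UTF-8") as the first statement of doPost,
    before any getParameter call.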


    > TIA
    >
    > Andy
    >
    >
     
    Andy Fish, Dec 21, 2004
    #4