Fritz Bayer
Hello,
I have stumbled across something that seems to be ambiguous.
Recently I decoded the URI of a servlet request.
At first I could not get the expected result. The umlauts äüö would
not show up correctly, which made me wonder why.
I then tried URLDecoder.decode(uri, "UTF-8"), which also did not work.
After googling a bit I found out that Tomcat 5.0 (which I use) used to
decode URIs in the encoding of the transferred document, but now
always decodes them as ISO-8859-1. Another encoding can be
specified on the connector by setting the attribute URIEncoding="..".
So I set it to "utf8" and now I can decode the URIs correctly.
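To illustrate why the umlauts came out wrong, here is a minimal Java sketch. It assumes the client sent the umlauts as percent-encoded UTF-8 bytes and that the container had already decoded them as ISO-8859-1, which would also explain why a later URLDecoder.decode(uri, "UTF-8") could not help. The class and method names are hypothetical and not part of Tomcat or the servlet API.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class UriDecodingDemo {

    // Percent-decodes `raw` and interprets the resulting bytes in `charset`.
    static String decode(String raw, String charset) {
        try {
            return URLDecoder.decode(raw, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    // Repairs text whose UTF-8 bytes were mistakenly read as ISO-8859-1:
    // turn the characters back into the original bytes, then read them as UTF-8.
    static String repair(String misdecoded) {
        try {
            return new String(misdecoded.getBytes("ISO-8859-1"), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // both charsets are mandatory in Java
        }
    }

    public static void main(String[] args) {
        // UTF-8 bytes of the three umlauts, percent-encoded (assumed client behaviour).
        String raw = "%C3%A4%C3%BC%C3%B6";

        String asUtf8 = decode(raw, "UTF-8");        // three characters
        String asLatin1 = decode(raw, "ISO-8859-1"); // six characters of mojibake

        System.out.println(asUtf8.length());                  // 3
        System.out.println(asLatin1.length());                // 6
        System.out.println(repair(asLatin1).equals(asUtf8));  // true
    }
}
```

The repair step only works because ISO-8859-1 maps every byte to exactly one character, so no information is lost by the wrong decoding. Setting URIEncoding on the connector avoids the round trip altogether.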
However, I was wondering how it can be that this does not seem to be
specified.
I thought that the HTTP 1.1 protocol encoding is ASCII only. Of course
the documents transferred can have a different encoding. But the URI
is part of the start line of the message and therefore of the
protocol.
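A short Java sketch of how this fits together: the URI in the start line can stay pure ASCII because non-ASCII characters travel percent-encoded, but the bytes behind the percent signs depend on the charset the client picked. The class name is hypothetical. Note also that URLEncoder implements HTML form encoding (a space becomes "+"), so it only approximates what a browser does to a path.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class UriOnTheWireDemo {

    // Percent-encodes `text` using the bytes it has in `charset`.
    static String encode(String text, String charset) {
        try {
            return URLEncoder.encode(text, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    // True if every character is in the 7-bit ASCII range.
    static boolean isAscii(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 127) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String umlaut = "\u00e4"; // the letter a-umlaut

        String utf8 = encode(umlaut, "UTF-8");
        String latin1 = encode(umlaut, "ISO-8859-1");

        System.out.println(utf8);            // %C3%A4
        System.out.println(latin1);          // %E4
        System.out.println(isAscii(utf8));   // true
        System.out.println(isAscii(latin1)); // true
    }
}
```

Both results are valid ASCII, yet nothing in the encoded text says which charset produced it, so the receiving side has to assume one. That assumption is what the connector setting controls.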
Anyway, if somebody wants to elaborate a bit on this URI issue, I would
be interested in having a little conversation about the subject.
Fritz
BTW: So it seems that how URIs get treated depends on the
implementation of each servlet engine?!