Unicode in application/x-www-form-urlencoded?

Leif K-Brooks

What is the proper encoding for a browser to use to encode non-ASCII
characters? HTML 4.01 doesn't seem to say anything about it, but XForms
1.0 specifies that browsers should use UTF-8.

Of the browsers I've tested, Firefox 1.0 seems to use UTF-8 whereas
Konqueror 3.2.2 seems to use latin-1 or a question mark if the character
can't be represented that way. Does anyone know what IE does? Is there
anything besides user-agent sniffing that Web authors can do to accept
Unicode characters in our forms?
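
To make the difference concrete, here is a minimal sketch of what the two behaviours described above would put on the wire. It uses Python's standard urllib.parse rather than a real browser, and the sample string is made up for illustration; the latin-1 case with errors="replace" only approximates the "question mark" fallback reported for Konqueror.

```python
from urllib.parse import quote_plus

# Hypothetical sample value: one character that fits in latin-1 (é)
# and one that does not (☃).
text = "café ☃"

# Encode to UTF-8 bytes first, then percent-escape each non-ASCII byte.
utf8_form = quote_plus(text, encoding="utf-8")

# Encode to latin-1 first; unrepresentable characters become "?".
latin1_form = quote_plus(text, encoding="latin-1", errors="replace")

print(utf8_form)    # caf%C3%A9+%E2%98%83
print(latin1_form)  # caf%E9+%3F
```

The same character therefore arrives as different byte sequences, and nothing in the urlencoded body itself says which encoding was used.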
 
Toby Inkster

Leif said:
What is the proper encoding for a browser to use to encode non-ASCII
characters? HTML 4.01 doesn't seem to say anything about it, but XForms
1.0 specifies that browsers should use UTF-8.

The HTML 2.0 spec says:
| The form field names and values are escaped: space characters are
| replaced by `+', and then reserved characters are escaped as per http://www.w3.org/MarkUp/html-spec... for HTTP/1.1 is ISO-8859-1, so perhaps that?
 
Leif K-Brooks

Toby said:
Leif K-Brooks wrote:

What is the proper encoding for a browser to use to encode non-ASCII
characters?

RFC 1738 (the document referenced as ) says: | Octets must be encoded if... the Content-Type. Thanks a lot for the help.
 
Courtney

Leif K-Brooks said:
What is the proper encoding for a browser to use to encode non-ASCII
characters? HTML 4.01 doesn't seem to say anything about it, but XForms
1.0 specifies that browsers should use UTF-8.

Of the browsers I've tested, Firefox 1.0 seems to use UTF-8 whereas
Konqueror 3.2.2 seems to use latin-1 or a question mark if the character
can't be represented that way. Does anyone know what IE does? Is there
anything besides user-agent sniffing that Web authors can do to accept
Unicode characters in our forms?

Why not specify the encoding when you create the page?

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Untitled Document</title>
</head>
<body>
</body>
</html>

Browsers generally submit form data in the encoding of the page that
contains the form, so this should cause the browser to use UTF-8
regardless of what its own default is.
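
On the receiving side, a script can avoid user-agent sniffing by checking the bytes themselves. The following is only a sketch under stated assumptions: decode_form_value is a hypothetical helper, not part of any framework, and it assumes the only encodings a client might send are UTF-8 and latin-1. It tries a strict UTF-8 decode first and falls back to latin-1 when the bytes are not valid UTF-8.

```python
from urllib.parse import unquote_to_bytes

def decode_form_value(raw: str) -> str:
    """Decode one urlencoded form value (hypothetical helper).

    Assumes the client used either UTF-8 or latin-1.
    """
    # "+" stands for a space in form data; a literal plus arrives as %2B,
    # so this replacement is safe before percent-decoding.
    data = unquote_to_bytes(raw.replace("+", " "))
    try:
        # Strict decode: invalid UTF-8 sequences raise UnicodeDecodeError.
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # latin-1 maps every byte to a character, so this cannot fail.
        return data.decode("latin-1")

print(decode_form_value("caf%C3%A9+%E2%98%83"))  # café ☃
print(decode_form_value("caf%E9"))               # café
```

The fallback is a heuristic: a latin-1 byte sequence that happens to be valid UTF-8 would be misread, so serving the page as UTF-8 remains the first line of defence.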

courtney sends...
 
