Unicode in application/x-www-form-urlencoded?

Discussion in 'HTML' started by Leif K-Brooks, Nov 28, 2004.

  1. What is the proper encoding for a browser to use to encode non-ASCII
    characters? HTML 4.01 doesn't seem to say anything about it, but XForms
    1.0 specifies that browsers should use UTF-8.

    Of the browsers I've tested, Firefox 1.0 seems to use UTF-8 whereas
    Konqueror 3.2.2 seems to use latin-1 or a question mark if the character
    can't be represented that way. Does anyone know what IE does? Is there
    anything besides user-agent sniffing that Web authors can do to accept
    Unicode characters in our forums?
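
To make the difference concrete, here is a minimal sketch of how the same form value ends up on the wire under the two encodings mentioned above. It only illustrates the byte-level result using Python's standard `urllib.parse.urlencode`; the helper name `encode_form` and the sample field `name` are hypothetical, and it does not claim to reproduce any browser's actual code path.

```python
from urllib.parse import urlencode

def encode_form(fields, charset):
    """Percent-encode form fields after converting them to `charset`.

    errors="replace" turns characters the charset cannot represent into
    "?", which resembles the Konqueror behaviour described in the post.
    """
    return urlencode(fields, encoding=charset, errors="replace")

# The same character yields different octets depending on the charset.
utf8_body = encode_form({"name": "café"}, "utf-8")      # name=caf%C3%A9
latin1_body = encode_form({"name": "café"}, "latin-1")  # name=caf%E9

# The euro sign does not exist in Latin-1, so it degrades to "?" (%3F).
fallback_body = encode_form({"name": "€"}, "latin-1")   # name=%3F

print(utf8_body, latin1_body, fallback_body)
```

The encoded body carries no label saying which charset produced it, which is why the receiving side cannot tell `%E9` (Latin-1) from a malformed UTF-8 sequence without extra information.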
    Leif K-Brooks, Nov 28, 2004
    #1

  2. Leif K-Brooks

    Toby Inkster Guest

    Leif K-Brooks wrote:

    > What is the proper encoding for a browser to use to encode non-ASCII
    > characters? HTML 4.01 doesn't seem to say anything about it, but XForms
    > 1.0 specifies that browsers should use UTF-8.


    The HTML 2.0 spec says:
    | The form field names and values are escaped: space characters are
    | replaced by `+', and then reserved characters are escaped as per http://www.w3.org/MarkUp/html-spec...ontact Me ~ http://tobyinkster.co.uk/contact
    Toby Inkster, Nov 28, 2004
    #2

  3. Toby Inkster wrote:
    > Leif K-Brooks wrote:
    >
    >> What is the proper encoding for a browser to use to encode non-ASCII
    >> characters
    >
    > RFC 1738 (the document referenced as ) says: > | Octets must be encoded ...the Content-Type. Thanks a lot for the help.
    Leif K-Brooks, Nov 28, 2004
    #3
  4. Leif K-Brooks

    Courtney Guest

    "Leif K-Brooks" <> wrote in message
    news:...
    > What is the proper encoding for a browser to use to encode non-ASCII
    > characters? HTML 4.01 doesn't seem to say anything about it, but XForms
    > 1.0 specifies that browsers should use UTF-8.
    >
    > Of the browsers I've tested, Firefox 1.0 seems to use UTF-8 whereas
    > Konqueror 3.2.2 seems to use latin-1 or a question mark if the character
    > can't be represented that way. Does anyone know what IE does? Is there
    > anything besides user-agent sniffing that Web authors can do to accept
    > Unicode characters in our forums?


    Why not specify when you create the page?

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd">
    <html>
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
    <title>Untitled Document</title>
    </head>
    <body>
    </body>
    </html>

    This will cause the browser to use UTF-8 regardless of what the default for
    the browser is.
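
On the receiving side, one way to cope with browsers that ignore the page's declared charset is to decode strictly as UTF-8 first and fall back to a legacy charset only if that fails. The following is a hedged sketch of that idea, not something proposed in this thread: the function name `decode_field` and the choice of Latin-1 as the fallback are assumptions, and the heuristic can misread Latin-1 input that happens to form valid UTF-8.

```python
from urllib.parse import unquote_to_bytes

def decode_field(raw_value, fallback="latin-1"):
    """Decode one urlencoded form value, preferring UTF-8.

    raw_value is the still-escaped text of a single field value.
    """
    # In form encoding "+" stands for a space; a literal plus arrives as
    # %2B, so the swap has to happen before percent-decoding.
    data = unquote_to_bytes(raw_value.replace("+", " "))
    try:
        # Strict decoding rejects byte sequences that are not valid UTF-8.
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 is Latin-1.
        return data.decode(fallback)

from_utf8_browser = decode_field("caf%C3%A9")
from_latin1_browser = decode_field("caf%E9")
with_plus = decode_field("a+b%2Bc")

print(from_utf8_browser, from_latin1_browser, with_plus)
```

Both submissions of "café" decode to the same string here, but a value already reduced to a question mark by the browser cannot be recovered by any server-side step.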

    courtney sends...
    Courtney, Nov 29, 2004
    #4