UNICODE input for CGI using C

Discussion in 'C Programming' started by puneet.p.shah@gmail.com, May 29, 2007.

  1. Guest

    Dear All,
    I'm trying to accept a multilingual (Unicode) string in a
    form and parse it. What I am getting is %XX (which is a
    single byte, not 2 bytes). So, is the data getting lost? If it is
    not getting lost, what format is it in?

    Thanks in advance,
    Punit.
     
    , May 29, 2007
    #1

  2. In article <>,
    <> wrote:

    > I'm trying to accept a multi-lingual string (UNICODE) in a
    >form and am trying to parse it. What i am getting is %XX (which is a
    >single byte, not 2 bytes). So, is the data getting lost? What format
    >is it, if it is not getting lost.


    You should be getting 2 or more successive %XXs. HTML form data sent
    using GET is part of the URL. Non-ASCII characters are represented in
    UTF-8, then each byte of the UTF-8 sequence is encoded in hex as %XX.
    For example, é (U+00E9) is the two UTF-8 bytes C3 A9, so it arrives
    as %C3%A9.

    See

    http://www.ietf.org/rfc/rfc3986.txt
    http://www.ietf.org/rfc/rfc2279.txt
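
    As a rough illustration of the scheme described above, here is a
    minimal sketch of turning the %XX escapes back into raw bytes. The
    names hex_value and url_decode are made up for this example and are
    not from any library. It assumes an ASCII-based execution character
    set, and it also maps '+' to a space, which is how form data (as
    opposed to a general URL) represents spaces. The output is a byte
    string; if the browser sent UTF-8, those bytes are the UTF-8 text.

```c
#include <stddef.h>

/* Return the value of one hex digit, or -1 if c is not a hex digit.
   Assumes ASCII, where 'a'..'f' and 'A'..'F' are contiguous. */
static int hex_value(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Hypothetical helper: decode a form-urlencoded value from src into dst.
 * dst must have room for strlen(src) + 1 bytes (decoding never grows
 * the string). Returns the number of decoded bytes, not counting the
 * terminating '\0'. A '%' not followed by two hex digits is copied as-is.
 */
size_t url_decode(char *dst, const char *src)
{
    size_t n = 0;

    while (*src != '\0') {
        int hi, lo;

        if (*src == '%' &&
            (hi = hex_value((unsigned char)src[1])) >= 0 &&
            (lo = hex_value((unsigned char)src[2])) >= 0) {
            /* One %XX escape is exactly one byte of the encoded text. */
            dst[n++] = (char)(hi * 16 + lo);
            src += 3;
        } else if (*src == '+') {
            dst[n++] = ' ';
            src++;
        } else {
            dst[n++] = *src++;
        }
    }
    dst[n] = '\0';
    return n;
}
```

    Each %XX yields one byte, so a non-ASCII character shows up as two
    or more escapes in a row rather than as a single 2-byte unit. Note
    that a decoded %00 would put a '\0' in the middle of the buffer,
    which is why the function returns the length.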

    For POST data, I can't find up-to-date documentation. The very old
    http://www.w3.org/TR/html4/interact/forms.html describes the
    application/x-www-form-urlencoded mime type, but it does not mention
    non-ASCII characters. I think you'll find that it uses the same
    method as GET, but it's possible that it might use the encoding
    specified by the HTTP charset declaration rather than UTF-8. You'll
    need to ask about that somewhere other than comp.lang.c.
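
    If POST does use the same %XX method, the only difference for a CGI
    program is where the encoded string comes from. The sketch below is
    one way that might look, assuming the usual CGI conventions: GET
    data in the QUERY_STRING environment variable, POST data on standard
    input with its length in CONTENT_LENGTH. The function name
    read_form_data and the MAX_FORM_BYTES limit are invented for this
    example. It does not look at the charset at all; it only fetches the
    still-encoded bytes.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_FORM_BYTES 65536UL  /* arbitrary cap chosen for this sketch */

/*
 * Hypothetical helper: fetch the raw, still percent-encoded form data.
 * For POST, reads exactly content_length bytes from in; for anything
 * else, copies the query string (NULL is treated as empty).
 * Returns a malloc'd, '\0'-terminated string that the caller must
 * free, or NULL on error.
 */
char *read_form_data(const char *method, const char *query,
                     const char *content_length, FILE *in)
{
    char *buf;

    if (method != NULL && strcmp(method, "POST") == 0) {
        char *end;
        unsigned long len;

        if (content_length == NULL || in == NULL)
            return NULL;
        len = strtoul(content_length, &end, 10);
        /* Reject non-numeric, trailing junk, and oversized lengths. */
        if (end == content_length || *end != '\0' || len > MAX_FORM_BYTES)
            return NULL;
        buf = malloc(len + 1);
        if (buf == NULL)
            return NULL;
        if (fread(buf, 1, len, in) != len) {
            free(buf);
            return NULL;
        }
        buf[len] = '\0';
        return buf;
    }

    if (query == NULL)
        query = "";
    buf = malloc(strlen(query) + 1);
    if (buf != NULL)
        strcpy(buf, query);
    return buf;
}
```

    In a real CGI program the call would be something like
    read_form_data(getenv("REQUEST_METHOD"), getenv("QUERY_STRING"),
    getenv("CONTENT_LENGTH"), stdin), after which the result can be
    split on '&' and '=' and each piece percent-decoded.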

    -- Richard
    --
    "Consideration shall be given to the need for as many as 32 characters
    in some alphabets" - X3.4, 1963.
     
    Richard Tobin, May 29, 2007
    #2
