Valid characters in GET data

Discussion in 'HTML' started by David Segall, Dec 20, 2007.

  1. David Segall

    David Segall Guest

    I want to encode a string that will be used as a GET data parameter
    but the algorithm I have can produce the characters "/", "+" and "="
    in addition to alphanumeric characters. Those characters don't have a
    named entity in my HTML text book so I believe they can be used
    without further encoding. Can they? In other words, is the URL
    <http://example.com?myparam=four/two+3=5> valid? For extra credit :),
    where should I have looked to find the definitive answer to this
    question?
     
    David Segall, Dec 20, 2007
    #1

  2. David Segall wrote:
    > I want to encode a string that will be used as a GET data parameter
    > but the algorithm I have can produce the characters "/", "+" and "="
    > in addition to alphanumeric characters. Those characters don't have a
    > named entity in my HTML text book so I believe they can be used
    > without further encoding. Can they? In other words, is the URL
    > <http://example.com?myparam=four/two+3=5> valid? For extra credit :),
    > where should I have looked to find the definitive answer to this
    > question?


    URLs are not HTML. They have their own syntax. In particular, the plus
    and equals signs have special meanings in a query string: the plus sign
    is interpreted as a space character and the equals sign is used to
    create a key/value association as you did in your own example,
    associating the key "myparam" with the value "four/two+3=5".

    The characters that have special meaning in a query string or that
    delimit the query string from other parts of the URL are the ones in the
    set {=?&;#%+}. When you want to use any of these as an ordinary
    character, encode it as %nn where nn is the hexadecimal ASCII code for
    the character. An embedded space can be encoded as either %20 or a plus
    sign.

    See http://en.wikipedia.org/wiki/Percent-encoding.
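
    As a rough illustration of the %nn rule above, here is a minimal
    sketch using Python's standard urllib.parse module. The variable
    names are only for illustration, and Python is an assumption; the
    original poster did not say which language he is using.

```python
from urllib.parse import quote, quote_plus

raw_value = "four/two+3=5"

# safe="" makes quote() escape "/" too; by default quote() leaves "/" alone.
encoded = quote(raw_value, safe="")
url = "http://example.com?myparam=" + encoded
print(url)  # http://example.com?myparam=four%2Ftwo%2B3%3D5

# A space can be written either way in a query string:
print(quote("four two"))       # four%20two
print(quote_plus("four two"))  # four+two
```

    Most languages ship an equivalent routine, so there is rarely a need
    to write the %nn substitution by hand.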
     
    Harlan Messinger, Dec 20, 2007
    #2

  3. David Segall wrote:

    > I want to encode a string that will be used as a GET data parameter but
    > the algorithm I have can produce the characters "/", "+" and "=" in
    > addition to alphanumeric characters. Those characters don't have a named
    > entity in my HTML text book so I believe they can be used without
    > further encoding.


    Wrong. All three have special meanings in form data, so although a URL
    like "http://example.com?myparam=four/two+3=5" is perfectly valid, your
    webserver and/or scripting language may interpret it differently to how
    you might expect.

    For example, the '+' may be decoded to a space character before you get
    your hands on it.
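
    To see this happen, here is a small sketch that parses the URL from
    the question with Python's standard urllib.parse module. Python is
    only an assumption here; other servers and languages follow the same
    form-data convention but may differ in details.

```python
from urllib.parse import urlsplit, parse_qs

# The unencoded URL from the question.
url = "http://example.com?myparam=four/two+3=5"
params = parse_qs(urlsplit(url).query)
print(params)  # {'myparam': ['four/two 3=5']}  <- the "+" became a space

# The same value with "/", "+" and "=" percent-encoded survives intact.
encoded_url = "http://example.com?myparam=four%2Ftwo%2B3%3D5"
encoded_params = parse_qs(urlsplit(encoded_url).query)
print(encoded_params)  # {'myparam': ['four/two+3=5']}
```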

    A URL such as this would typically need the special characters in its
    value hex-encoded (percent-encoded) to avoid being interpreted:

    http://example.com?myparam=four%2Ftwo%2B3%3D5

    - and * would probably be safer than + and /. < or > would be safer than =.

    --
    Toby A Inkster BSc (Hons) ARCS
    [Geek of HTML/SQL/Perl/PHP/Python/Apache/Linux]
    [OS: Linux 2.6.17.14-mm-desktop-9mdvsmp, up 13 days, 3:12.]

    Sharing Music with Apple iTunes
    http://tobyinkster.co.uk/blog/2007/11/28/itunes-sharing/
     
    Toby A Inkster, Dec 20, 2007
    #3
  4. David Segall wrote:

    > where should I have looked to find the definitive answer to this
    > question?


    Mostly:

    HTML 4.01 Specification: 17.13 Form submission
    http://www.w3.org/TR/html401/interact/forms.html#h-17.13

    Supporting documents:

    RFC 2616: Hypertext Transfer Protocol -- HTTP/1.1
    http://www.ietf.org/rfc/rfc2616.txt

    RFC 1738: Uniform Resource Locators (URL)
    http://www.ietf.org/rfc/rfc1738.txt

     
    Toby A Inkster, Dec 20, 2007
    #4
  5. Harlan Messinger wrote:

    > URLs are not HTML. They have their own syntax.


    But of course, URLs represented in HTML have to both conform to the URL
    syntax rules, *plus* HTML syntax rules. (The trick is to apply the URL
    rules first, such as percent-encoding, and *then* apply HTML rules, like
    representing ampersands as "&amp;".)
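
    A minimal sketch of that two-step order, using Python's standard
    library. The second parameter "page" and the link text are
    hypothetical, added only so that the URL contains an ampersand to
    escape.

```python
from html import escape
from urllib.parse import urlencode

# Step 1: URL rules. urlencode() percent-encodes each value and joins
# the key/value pairs with "&".
params = {"myparam": "four/two+3=5", "page": "2"}  # "page" is hypothetical
url = "http://example.com?" + urlencode(params)
print(url)  # http://example.com?myparam=four%2Ftwo%2B3%3D5&page=2

# Step 2: HTML rules. escape() turns "&" into "&amp;" (and escapes
# quotes) so the URL can sit safely inside an attribute value.
href = escape(url, quote=True)
link = '<a href="' + href + '">results</a>'
print(link)
```

    Doing the steps in the opposite order would percent-encode the "&" and
    ";" of "&amp;" itself and corrupt the link.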

     
    Toby A Inkster, Dec 21, 2007
    #5