% characters in html code

Discussion in 'HTML' started by jrefactors@hotmail.com, Feb 21, 2005.

  1. Guest

    In html code, I always see % characters such as following:
    <a
    href="mailto:?subject=Business%20Applications%20Consultant%20%2d%2d%20Learn%20Oracle%20ERP%20%20%28san%20jose%20north%29">

    But the outlook can understand it and interpret it as following when
    i click the hyperlink:

    Business Applications Consultant -- Learn Oracle ERP (san jose north)

    My question is what are those % characters?
    Please advise. thanks!!
    , Feb 21, 2005
    #1

  2. Jan Faerber Guest

    ::::::Monday 21 February 2005
    18:06:::<>:::alt.html
    .... output:

    > In html code, I always see % characters such as following:
    > <a
    > href="mailto:?subject=Business%20Applications%20Consultant%20%2d%2d%20Learn%20Oracle%20ERP%20%20%28san%20jose%20north%29">
    >
    > But the outlook can understand it and interpret it as following when
    > i click the hyperlink:
    >
    > Business Applications Consultant -- Learn Oracle ERP (san jose north)
    >
    > My question is what are those % characters?
    > Please advise. thanks!!


    We can look that up in a good book:

    That is encoding of URL data.

    %HH is an encoded character, where HH is the hexadecimal ASCII value of
    that character.
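
    To make the %HH rule concrete, here is a minimal sketch in Python (chosen
    only for illustration; the thread itself mentions PHP). It decodes the
    subject string from the original question with the standard library's
    urllib.parse.unquote. The variable names are made up for this example.

```python
from urllib.parse import unquote

# Subject string taken from the mailto: link in the original question.
encoded = (
    "Business%20Applications%20Consultant%20%2d%2d%20"
    "Learn%20Oracle%20ERP%20%20%28san%20jose%20north%29"
)

# Each %HH triplet is replaced by the character whose code is hex HH:
# %20 -> space, %2d -> "-", %28 -> "(", %29 -> ")".
decoded = unquote(encoded)
print(decoded)

# The same mapping for a single character, done by hand.
assert chr(int("2d", 16)) == "-"
```

    Note that the link contains two %20 in a row after "ERP", so the decoded
    text has two spaces there; a mail client or browser may display them as
    one.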

    The basics for this transformation can be found in the RFC 1738. It means
    that all characters are encoded except alphanumeric characters, the
    full stop ".", the underscore "_" and the minus "-".
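
    The encoding direction can be sketched the same way. This is an
    illustration using Python's urllib.parse.quote, not the PHP functions
    named in this post; the sample strings are made up. Passing safe=""
    is a choice made here so that no character gets special treatment
    beyond the always-unescaped letters, digits, ".", "_" and "-".

```python
from urllib.parse import quote

# Letters, digits, ".", "_" and "-" pass through unchanged.
unchanged = quote("Oracle_ERP-v1.0", safe="")

# Spaces and parentheses are replaced by %HH escapes. Python emits
# upper-case hex digits; lower-case ones such as %2d decode the same way.
escaped = quote("ERP (san jose north)", safe="")

print(unchanged)
print(escaped)
```

    This also suggests why the %2d in the original link is harmless but not
    required: "-" may appear unencoded, yet an encoder is free to escape it
    anyway.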

    In historical ways the "+" is used for an empty space sign which does not
    conform to the RFC 1738.

    This relates to the functions rawurldecode / rawurlencode / urldecode /
    urlencode in PHP.


    --
    Jan

    http://html.janfaerber.com
    Jan Faerber, Feb 21, 2005
    #2

  3. Jukka K. Korpela

    Jan Faerber <> wrote:

    > The basics for this transformation can be found in the RFC 1738


    Haven't you heard that the said RFC was obsoleted, as far as generic URL
    syntax is considered (and this _is_ about generic URL syntax), by RFC 2396
    about six and a half years ago? This particular thing wasn't changed, but
    it's still odd to avoid consulting the current specification.

    > In historical ways the "+" is used for an empty space sign which does
    > not conform to the RFC 1738.


    No, encoding a space as "+" was and is prescribed for encoding _form data_
    according to HTML specifications, before applying URL encoding.
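
    The difference between the two conventions can be seen in a short
    sketch. Python is used here only as an illustration: its urllib.parse
    module has separate functions for generic percent-encoding (quote,
    unquote) and for the form-data variant (quote_plus, unquote_plus,
    urlencode). The field name "subject" and the sample text are
    assumptions for this example.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus, urlencode

text = "san jose north"

percent_style = quote(text)      # generic URL escaping: space -> %20
form_style = quote_plus(text)    # form-data escaping: space -> +

# unquote() leaves "+" alone; only unquote_plus() turns it back into a space.
plain_decode = unquote(form_style)
form_decode = unquote_plus(form_style)

# urlencode() builds a whole form-encoded string from name/value pairs.
form_body = urlencode({"subject": "Learn Oracle ERP"})

print(percent_style, form_style, plain_decode, form_decode, form_body)
```

    So a decoder has to know which convention produced the string: treating
    "+" as a space is only right for form data.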

    --
    Yucca, http://www.cs.tut.fi/~jkorpela/
    Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html
    Jukka K. Korpela, Feb 21, 2005
    #3
  4. Guest

    Jukka K. Korpela wrote:
    > Jan Faerber <> wrote:
    >
    > > The basics for this transformation can be found in the RFC 1738
    >
    > Haven't you heard that the said RFC was obsoleted, as far as generic URL
    > syntax is considered (and this _is_ about generic URL syntax), by RFC 2396
    > about six and a half years ago? This particular thing wasn't changed, but
    > it's still odd to avoid consulting the current specification.
    >
    > > In historical ways the "+" is used for an empty space sign which does
    > > not conform to the RFC 1738.
    >
    > No, encoding a space as "+" was and is prescribed for encoding _form data_
    > according to HTML specifications, before applying URL encoding.
    >
    > --
    > Yucca, http://www.cs.tut.fi/~jkorpela/
    > Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html


    so how do we decode it? any algorithms out there? any tutorials i can
    look up?

    please advise. thanks!!
    , Feb 22, 2005
    #4
  5. Jukka K. Korpela

    wrote:

    > so how do we decode it? any algorithms out there?


    The URL encoding is publicly defined in RFC 2396, as mentioned, and it is
    fairly trivial to decode it on the basis of the definition. But unless you
    need a simple programming exercise, find a library routine for the purpose.
    It's probably named something like "decode" or "URLdecode". But if you are
    using a high-level package, such as CGI.pm for processing form data (in a
    format that includes URL encoding), you will find that the package's tools
    automagically give the decoded data for you, unless you specifically want
    to read the encoded version using lower-level routines.
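
    As a programming exercise, a decoder along these lines might look like
    the sketch below. It is written in Python purely for illustration, the
    function name percent_decode is hypothetical, and it assumes the decoded
    bytes are UTF-8 text. It is compared against the library routine
    urllib.parse.unquote, which is what you would normally use instead.

```python
from urllib.parse import unquote

HEX_DIGITS = "0123456789abcdefABCDEF"


def percent_decode(text: str) -> str:
    """Replace every valid %HH escape with the byte it stands for."""
    raw = text.encode("utf-8")
    out = bytearray()
    i = 0
    while i < len(raw):
        # Candidate hex pair following the current byte (may be shorter
        # than two characters near the end of the input).
        pair = raw[i + 1:i + 3].decode("ascii", "ignore")
        if raw[i] == 0x25 and len(pair) == 2 and all(c in HEX_DIGITS for c in pair):
            out.append(int(pair, 16))  # "%" + two hex digits -> one byte
            i += 3
        else:
            out.append(raw[i])         # anything else is copied unchanged
            i += 1
    # Assumption: the decoded bytes form UTF-8 text.
    return out.decode("utf-8", errors="replace")


sample = "Learn%20Oracle%20ERP%20%28san%20jose%20north%29"
print(percent_decode(sample))
print(percent_decode(sample) == unquote(sample))
```

    Malformed input such as a lone "%" is passed through untouched here;
    other decoders may prefer to report an error.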

    > any tutorials i can look up?


    Use the tutorial for the server-side technology you are using, and if
    problems remain, consider asking in a group that is most closely devoted to
    that technology.

    --
    Yucca, http://www.cs.tut.fi/~jkorpela/
    Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html
    Jukka K. Korpela, Feb 22, 2005
    #5
  6. Jan Faerber Guest

    Jukka K. Korpela ... output:

    > Jan Faerber <> wrote:
    >
    >> The basics for this transformation can be found in the RFC 1738

    >
    > Haven't you heard that the said RFC was obsoleted, as far as generic URL
    > syntax is considered (and this _is_ about generic URL syntax), by RFC 2396
    > about six and a half years ago? This particular thing wasn't changed, but
    > it's still odd to avoid consulting the current specification.


    That is interesting. It is completely new to me. RFC 2396 is derived from
    RFC 1738 and RFC 1808: http://www.faqs.org/rfcs/rfc2396.html

    The book that I cited was "PHP 4 - Webserverprogrammierung unter Linux und
    Windows" (web server programming under Linux and Windows)
    author: Jörg Krause
    (c) 2003 Carl Hanser Verlag München Wien (Munich, Vienna)

    http://www.php.comzept.de
    http://www.phparchiv.de


    >> In historical ways the "+" is used for an empty space sign which does
    >> not conform to the RFC 1738.

    >
    > No, encoding a space as "+" was and is prescribed for encoding _form data_
    > according to HTML specifications, before applying URL encoding.


    Yes, for (... http://at.php.net/manual/en/function.urlencode.php ...)
    application/x-www-form-urlencoded media type.
    But here they use it for URLs in a "convenient way to pass variables to the
    next page".

    So you want to say that you will never find this in Outlook Express.


    --
    Jan

    http://html.janfaerber.com
    Jan Faerber, Feb 22, 2005
    #6
  7. Jukka K. Korpela

    Jan Faerber <> wrote:

    >>> In historical ways the "+" is used for an empty space sign which does
    >>> not conform to the RFC 1738.

    >>
    >> No, encoding a space as "+" was and is prescribed for encoding _form
    >> data_ according to HTML specifications, before applying URL encoding.

    >
    > Yes, for (... http://at.php.net/manual/en/function.urlencode.php ...)
    > application/x-www-form-urlencoded media type.
    > But here they use it for URLs in a "convenient way to pass variables to
    > the next page".


    "It"? You mean the plus sign? Well they can use it the way they like, but
    it's odd - the common convention is to use "&", and the age-old
    recommendation is to use ";", but they probably wanted to apply the
    NIH principle (Not Invented Here).
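
    To keep the two roles apart: "&" and ";" separate name=value pairs,
    while "+" stands for a space inside a form-encoded value. A small
    sketch in Python (an illustration only; the query strings are made up,
    and the separator argument of urllib.parse.parse_qs is assumed to be
    available, which is the case in recent Python 3 releases):

```python
from urllib.parse import parse_qs

# Pairs separated by "&", the common convention.
with_amp = parse_qs("subject=san+jose&body=hello")

# The same pairs separated by ";" instead.
with_semicolon = parse_qs("subject=san+jose;body=hello", separator=";")

print(with_amp)
print(with_semicolon)
```

    In both cases the "+" inside the value is decoded to a space; only the
    character that splits the pairs differs.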

    > So you want to say that you will never find this in Outlook Express.


    I have no idea what made you think so. I didn't say a word about Outlook
    Express, and I don't know how OE would relate to anything in this issue.

    --
    Yucca, http://www.cs.tut.fi/~jkorpela/
    Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html
    Jukka K. Korpela, Feb 22, 2005
    #7
  8. Jan Faerber Guest

    Jukka K. Korpela ... output:

    > Jan Faerber <> wrote:
    >
    >>>> In historical ways the "+" is used for an empty space sign which does
    >>>> not conform to the RFC 1738.
    >>>
    >>> No, encoding a space as "+" was and is prescribed for encoding _form
    >>> data_ according to HTML specifications, before applying URL encoding.

    >>
    >> Yes, for (... http://at.php.net/manual/en/function.urlencode.php ...)
    >> application/x-www-form-urlencoded media type.
    >> But here they use it for URLs in a "convenient way to pass variables to
    >> the next page".

    >
    > "It"? You mean the plus sign? Well they can use it the way they like, but
    > it's odd - the common convention is to use "&", and the age-old
    > recommendation is to use ";", but they probably wanted to apply the
    > NIH principle (Not Invented Here).


    Yes, the "+" sign. Isn't the empty space "%20" in %HH form? But it is good
    that you mention more than one possibility!

    >> So you want to say that you will never find this in Outlook Express.

    >
    > I have no idea what made you think so. I didn't say a word about Outlook
    > Express, and I don't know how OE would relate to anything in this issue.


    ..oO(Because "+" is 'only' for forms...)
    ..oO(... and the "+" does not conform to the RFC 1738.)

    Therefore I thought that OE won't use it ... after your informative posting.



    --
    Jan

    http://html.janfaerber.com
    Jan Faerber, Feb 22, 2005
    #8
