Illegal characters in variables passed via a fat url?

Fernie · Dec 19, 2004

Hello again,

Is there a list of characters that may cause problems in a fat url? I can
think of characters such as (? = & and blank spaces) that may cause problems
with a url.

The reason for this is that I am considering using fat urls to pass
protected variables (encrypted on the server using the BlowFish algorithm).

Thanks very much,

Fernie

Jukka K. Korpela · Dec 19, 2004

Fernie said:
Is there a list of characters that may cause problems in a fat url?

Yes. See RFC 2396, which applies to URLs irrespectively of their color,
sex, and fatness.

I can think of characters such as (? = & and blank spaces) that may
cause problems with a url.

The right question is "which characters are safe?".

The reason for this is that I am considering using fat urls to pass
protected variables (encrypted on the server using the BlowFish
algorithm).

I'm pretty sure you are solving the wrong problem, since you asked such
an elementary question and yet describe a fairly complex setting.

Richard · Dec 20, 2004

Fernie said:
Hello again,

Is there a list of characters that may cause problems in a fat url? I can
think of characters such as (? = & and blank spaces) that may cause
problems with a url.

The reason for this is that I am considering using fat urls to pass
protected variables (encrypted on the server using the BlowFish
algorithm).

Thanks very much,

Fernie

In a url in coding, the proper form is to use the "special" character set.
That is, for & use &amp.
&nbsp is for a blank space.

Dylan Parry · Dec 20, 2004

Richard wrote:
[url encoding]

&nbsp is for a blank space.

No it isn't. %20 is for space in URLs.

Fernie · Dec 20, 2004

Richard said:
In a url in coding, the proper form is to use the "special" character set.
That is, for & use &amp.
&nbsp is for a blank space.

Richard & Dylan,

Thanks for your suggestions.

Regards,

Fernie

Fernie · Dec 20, 2004

Jukka K. Korpela said:
Yes. See RFC 2396, which applies to URLs irrespectively of their color,
sex, and fatness.

Jukka, I really appreciate the info. I found a copy of the specification
you provided. It was posted at the following site:
http://asg.web.cmu.edu/rfc/

The right question is "which characters are safe?".

Not these according to the document:

";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" | "$" | ","

If I understood correctly, the above reserved characters must be encoded if
passed in a URL.

Regards,

Fernie
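
For reference, RFC 2396 section 2 also defines an "unreserved" set, which is the closest thing to a list of characters that are always safe. A minimal C++ sketch of both classes follows; the function names are illustrative and not taken from any library.

```cpp
#include <string>

// Character classes from RFC 2396, section 2.
// is_reserved and is_unreserved are illustrative names.

// reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" | "$" | ","
bool is_reserved(char c) {
    return std::string(";/?:@&=+$,").find(c) != std::string::npos;
}

// unreserved = alphanum | mark
// mark = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"
bool is_unreserved(char c) {
    bool alnum = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
                 (c >= '0' && c <= '9');
    return alnum || std::string("-_.!~*'()").find(c) != std::string::npos;
}
```

A space or a "%" belongs to neither set, so it can only appear in a URL in escaped form.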

Richard · Dec 20, 2004

Dylan said:
Richard wrote:
[url encoding]

&nbsp is for a blank space.

No it isn't. %20 is for space in URLs.

That's for the output.
Within a tag inside the html, the correct format is &nbsp.

Steve Pugh · Dec 20, 2004

Richard said:
Dylan said:
Richard wrote:
[url encoding]

&nbsp is for a blank space.

No it isn't. %20 is for space in URLs.

That's for the output.
Within a tag inside the html, the correct format is &nbsp.

No. For starters a space and a non-breaking space are different
characters. Positions 32 and 160 respectively in ISO-8859-1 and many
other character encodings.

In a URL, wherever it is written, a space is %20.
A non-breaking space would be &nbsp; in HTML or %A0 in URLs.

Steve
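
The %20 and %A0 forms are simply the byte values 32 and 160 written as two hexadecimal digits after a "%". A small C++ sketch, where escape_byte is a hypothetical helper name:

```cpp
#include <string>

// Return the "%XX" escape for one byte.
// escape_byte is an illustrative name, not a library function.
std::string escape_byte(unsigned char byte) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out = "%";
    out += hex[byte >> 4];    // high four bits
    out += hex[byte & 0x0F];  // low four bits
    return out;
}
```

Called with 32 it yields "%20", and with 160 it yields "%A0". The second value assumes the non-breaking space is represented as the single byte 160, as in ISO-8859-1.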

Jukka K. Korpela · Dec 20, 2004

Fernie said:
Not these according to the document:

";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" | "$" | ","

It depends. RFC 2396 is tough reading. I have read it about three times
and converted it to hypertext, and still don't understand all the
pieces (partly because the terminology is horrendously confused and not
compatible with character code standard terminology).

If I understood correctly, the above reserved characters must be
encoded if passed in a URL.

It's much more complex than that. Some of them _must not_ be encoded
when used in a _specific meaning_.

I repeat: you are probably solving the wrong problem. Why create
"fat URLs" in the first place? You won't need them if you use the POST
method.
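
The "specific meaning" point can be illustrated with a query string: the "&" and "=" that separate fields stay literal, while the same characters inside a name or value are escaped. This is only a sketch; encode_component and build_query are made-up names, and the name=value&name=value layout is the usual form convention rather than something RFC 2396 itself requires.

```cpp
#include <string>
#include <utility>
#include <vector>

// Escape one name or value: keep RFC 2396 unreserved characters and
// write every other byte as %XX. encode_component is a hypothetical name.
std::string encode_component(const std::string& in) {
    static const char hex[] = "0123456789ABCDEF";
    const std::string marks = "-_.!~*'()";
    std::string out;
    for (unsigned char c : in) {
        bool alnum = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
                     (c >= '0' && c <= '9');
        if (alnum || marks.find(static_cast<char>(c)) != std::string::npos) {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}

// Join pairs as name=value&name=value. The "=" and "&" added here are
// delimiters and stay literal; the same characters inside data are escaped.
std::string build_query(
        const std::vector<std::pair<std::string, std::string>>& fields) {
    std::string query;
    for (const auto& field : fields) {
        if (!query.empty()) query += '&';
        query += encode_component(field.first) + '=' +
                 encode_component(field.second);
    }
    return query;
}
```

With the pairs ("sid", "a=b&c") and ("page", "my page"), build_query returns sid=a%3Db%26c&page=my%20page.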

Fernie · Dec 20, 2004

Jukka K. Korpela said:
I repeat: you are probably solving the wrong problem. Why create
"fat URLs" in the first place? You won't need them if you use the POST
method.

Hyperlinks are my stumbling block.

I am trying to achieve redundant session management and tamper prevention.
At this time, I am successfully using:

* Cookies. These work unless the client rejects them, so I cannot rely
solely on them.

* Hidden fields. The session data is posted to new pages whenever a post is
performed. Sessions are maintained. The drawback is that I have hyperlinks
to other pages and if cookies are off, the session is lost when the
hyperlinks are clicked.

The cookies and hidden fields are to be encrypted, so tampering attempts can
be discovered immediately and ignored by returning a blank page if the
format is in any way invalid. Processing stops there instead of going on
to, say, a database lookup routine.

I've heard that Perl, PHP, and other interpreters already have built-in
functionality to handle this, but I am writing my application in C++ since I
don't know how to use any of the mentioned tools specific to web
development. I have found two different urlencode/urldecode functions
written for C++ that I will try today. Here is a documentation snippet:

URLEncode is a String class function that converts a US-ASCII string to its
representation in the URL Encoding scheme. URLEncode is based on the URL
character encoding rules as described in the Internet Standards document RFC
1738 - Uniform Resource Locators (URL)
(http://www.rfc-editor.org/rfc/rfc1738.txt).

Regards,

Fernie
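
The "stop as soon as the format is invalid" idea can be applied at the decoding step too. The following is a rough C++ sketch of a strict decoder, not the code of either library mentioned above; percent_decode and hex_value are hypothetical names. It does not treat "+" as a space, which form submissions do.

```cpp
#include <cstddef>
#include <string>

// Convert one hex digit to its value, or -1 if it is not a hex digit.
static int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Decode %XX escapes from in into out. Returns false for a truncated or
// non-hex escape, so the caller can stop before any further processing.
bool percent_decode(const std::string& in, std::string& out) {
    out.clear();
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] != '%') {
            out += in[i];
            continue;
        }
        if (i + 2 >= in.size()) return false;  // fewer than two digits left
        int hi = hex_value(in[i + 1]);
        int lo = hex_value(in[i + 2]);
        if (hi < 0 || lo < 0) return false;    // not a valid escape
        out += static_cast<char>(hi * 16 + lo);
        i += 2;                                // skip the two digits
    }
    return true;
}
```

A caller would return the blank page whenever percent_decode reports false, and only then move on to decryption or a database lookup.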

Jukka K. Korpela · Dec 21, 2004

Fernie said:
At this time, I am successfully using: - -
* Hidden fields. The session data is posted to new pages whenever
a post is performed. Sessions are maintained.

That's what I've been suggesting, more or less.

The drawback is
that I have hyperlinks to other pages and if cookies are off, the
session is lost when the hyperlinks are clicked.

So don't do that. This is a place for using buttons instead of links.
Note that you could use CSS to make a button (in a form containing just
the button and a collection of hidden fields) much like a link. Not
quite, but quite a lot. But maybe a button would be more suitable and
"honest", since it is a submission of a kind rather than a normal link.
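
A rough C++ sketch of generating such a form is given here, since the application is written in C++. The function names, the field names and the inline style values are all assumptions made for illustration, and the styling is only an approximation of a link's appearance.

```cpp
#include <string>
#include <utility>
#include <vector>

// Escape the characters that are special inside an HTML attribute value.
std::string html_escape(const std::string& in) {
    std::string out;
    for (char c : in) {
        switch (c) {
            case '&': out += "&amp;"; break;
            case '<': out += "&lt;"; break;
            case '>': out += "&gt;"; break;
            case '"': out += "&quot;"; break;
            default: out += c;
        }
    }
    return out;
}

// Build a POST form holding the hidden fields and a single submit button.
// link_like_form is a hypothetical name; the inline style is a guess at
// values that make the button look roughly like a link.
std::string link_like_form(
        const std::string& action, const std::string& label,
        const std::vector<std::pair<std::string, std::string>>& hidden) {
    std::string html =
        "<form method=\"post\" action=\"" + html_escape(action) + "\">\n";
    for (const auto& field : hidden) {
        html += "<input type=\"hidden\" name=\"" + html_escape(field.first) +
                "\" value=\"" + html_escape(field.second) + "\">\n";
    }
    html += "<input type=\"submit\" value=\"" + html_escape(label) +
            "\" style=\"border:0; background:none; color:blue; "
            "text-decoration:underline; cursor:pointer\">\n</form>\n";
    return html;
}
```

Because the session data travels in the POST body, nothing has to be escaped for use in a URL; only HTML escaping of the attribute values is needed.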

I've heard that Perl, PHP, and other interpreters already have
built-in functionality to handle this but I am writing my
application in C++ since I don't know how to use any of the
mentioned tools specific to web development.

Sounds like the hard way of doing things. Perl takes more than an
eternity to learn properly, but the basics aren't really rocket
science. PHP is much easier.

I have found two
different urlencode/urldecode functions written for C++ that I will
try today. Here is a documentation snippet:

URLEncode is a String class function that converts a US-ASCII
string to its representation in the URL Encoding scheme. URLEncode
is based on the URL character encoding rules as described in the
Internet Standards document RFC 1738 - Uniform Resource Locators
(URL) (http://www.rfc-editor.org/rfc/rfc1738.txt).

Sounds pretty old. As far as generic URL syntax is concerned, RFC 1738
was superseded _in 1998_. That's more than six years ago. Moreover, the
reference is incorrect; RFC 1738 wasn't an Internet Standard (which is
a status specifically given to fairly few RFCs).

In practice the changes aren't that big, but it's still a very old
specification, and the new one should be used instead. However it's a
major effort to implement it properly in all details.
