url encoding string (Code Worth Recommending Project)

Peter Michaux

encodeURIComponent is commonly used to serialize forms for use with
XMLHttpRequest requests. For a perfect simulation of browser form
requests, the goal is to serialize form data in the standardized
application/x-www-form-urlencoded format
<URL: http://www.w3.org/TR/1999/REC-html401-19991224/interact/forms.html#h-17.13.4.1>.

The handling of whitespace by encodeURIComponent() is different from
the x-www-form-urlencoded standard. According to the
x-www-form-urlencoded standard, a space should be encoded as a "+" and
a newline should be encoded as a "%0D%0A".

The JavaScript encodeURIComponent function encodes a space as "%20".
On Windows operating system a new line is encoded as "%0D%0A" but on
Mac OS X a new line is encoded as just "%0A". I don't know about
Linux. Can someone check using a textarea in a form?
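
A quick check of what encodeURIComponent itself does with whitespace may help here. It simply encodes whichever line-break characters it is handed, so the platform differences presumably come from the string the textarea supplies rather than from the function. The sample strings are made up for illustration.

```javascript
// Illustrative inputs only; none of these strings come from a real form.
var samples = {
  space: encodeURIComponent('a b'),     // space
  lf:    encodeURIComponent('a\nb'),    // line feed only
  crlf:  encodeURIComponent('a\r\nb'),  // carriage return + line feed
  cr:    encodeURIComponent('a\rb')     // carriage return only
};

console.log(samples.space); // a%20b
console.log(samples.lf);    // a%0Ab
console.log(samples.crlf);  // a%0D%0Ab
console.log(samples.cr);    // a%0Db
```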

Some developers may need to have the application/x-www-form-urlencoded
format. We can wrap encodeURIComponent.

if (typeof encodeURIComponent != 'undefined' &&
    String.prototype.replace) {

  var urlEncode = function(s) {
    // every space becomes '+'; a '%0A' without a preceding '%0D' gets one
    return encodeURIComponent(s).replace(/%20/g, '+').replace(
      /(.{0,3})(%0A)/g,
      function(m, p1, p2) {return p1+(p1=='%0D'?'':'%0D')+p2;});
  };

}

The feature detection for String.prototype.replace is not sufficient
because we really need to know if the second argument to replace can
be a function. I imagine we can test this by using try-catch and
actually using replace with a function as the second argument. Does
anyone know a way to check without using try-catch?


For developers with server-side toolkits capable of dealing with the
output of encodeURIComponent, the regexp work is not necessary and
this simple little function is enough.

if (typeof encodeURIComponent != 'undefined') {

var urlEncode = function(s) {
return encodeURIComponent(s);
};

}


There could even be a version for forms without textareas so only the
spaces need replacement.

if (typeof encodeURIComponent != 'undefined' &&
    String.prototype.replace) {

  var urlEncode = function(s) {
    // the g flag is needed so that every space is replaced
    return encodeURIComponent(s).replace(/%20/g, '+');
  };

}



--------------

BTW, I set up svn/trac for the code and posted the getOptionValue code
that resulted from the previous thread on form serialization.

<URL: http://cljs.michaux.ca/trac/browser/trunk/src/getOptionValue>

Please email me if you would like wiki editing and ticket creation/
commenting permissions. Let me know if you have a username/password
preference.

Peter
 
pr

Peter said:
On Windows operating system a new line is encoded as "%0D%0A" but on
Mac OS X a new line is encoded as just "%0A". I don't know about
Linux. Can someone check using a textarea in a form?

Same as Windows: before%0D%0Aafter

Linux xxxx 2.6.17-12-generic #2 SMP Sun Sep 23 22:56:28 UTC 2007 i686
GNU/Linux
 
Bart Van der Donck

Peter said:
The JavaScript encodeURIComponent function encodes a space as "%20".
On Windows operating system a new line is encoded as "%0D%0A" but on
Mac OS X a new line is encoded as just "%0A". I don't know about
Linux. Can someone check using a textarea in a form?

Yes but %0A is only because Mac OS X is a UNIX derivative (which
historically always used %0A).

Mac until OS9 should return %0D.

http://en.wikipedia.org/wiki/Line_Feed

%0A: Multics, Unix and Unix-like systems (GNU/Linux, AIX, Xenix, Mac
OS X, etc.), BeOS, Amiga, RISC OS, and others
%0D%0A: DEC RT-11 and most other early non-Unix, non-IBM OSes, CP/M,
MP/M, MS-DOS, OS/2, Microsoft Windows
%0D: Commodore machines, Apple II family and Mac OS up to version 9
 
Peter Michaux

Bart said:
Yes but %0A is only because Mac OS X is a UNIX derivative (which
historically always used %0A).

Mac until OS9 should return %0D.

http://en.wikipedia.org/wiki/Line_Feed

%0A: Multics, Unix and Unix-like systems (GNU/Linux, AIX, Xenix, Mac
OS X, etc.), BeOS, Amiga, RISC OS, and others
%0D%0A: DEC RT-11 and most other early non-Unix, non-IBM OSes, CP/M,
MP/M, MS-DOS, OS/2, Microsoft Windows
%0D: Commodore machines, Apple II family and Mac OS up to version 9

Thanks pr and Bart. It seems the two options ('%0D' and '%0A') need to
be corrected. How about this?

var urlencode = (function() {

  var f = function(s) {
    return encodeURIComponent(s).replace(/%20/g, '+'
      // add a missing '%0D' in front of a '%0A'
      ).replace(/(.{0,3})(%0A)/g,
        function(m, p1, p2) {return p1+(p1=='%0D'?'':'%0D')+p2;}
      // add a missing '%0A' after a '%0D'
      ).replace(/(%0D)(.{0,3})/g,
        function(m, p1, p2) {return p1+(p2=='%0A'?'':'%0A')+p2;});
  };

  // only expose f if the replacements behave as expected
  try {
    if (f('\n \r') == '%0D%0A+%0D%0A') {
      return f;
    }
  }
  catch(e) {}

})();
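
One case worth checking with the paired regexps is two line breaks in a row: the first pattern appears to consume the first '%0A' as the three-character prefix of the second, so '\n\n' comes out as '%0A%0D%0A'. A possible alternative, sketched here under the assumption that it is acceptable to normalize the raw string before encoding, is to convert every line break to CRLF first. The name urlencodeNormalized is made up for this example.

```javascript
// Hypothetical alternative: normalize line breaks before encoding.
var urlencodeNormalized = function(s) {
  // '\r\n' is listed first so an existing CRLF pair is matched as one unit
  var normalized = String(s).replace(/\r\n|\r|\n/g, '\r\n');
  // encode, then turn every encoded space into '+'
  return encodeURIComponent(normalized).replace(/%20/g, '+');
};

console.log(urlencodeNormalized('\n \r')); // %0D%0A+%0D%0A
console.log(urlencodeNormalized('\n\n'));  // %0D%0A%0D%0A
```

This version uses only string replacement values, so it does not depend on replace accepting a function as its second argument.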


Using try-catch is a nice way to avoid host object feature testing and
ensure that all the necessary parts of the regexp are working as
expected. This makes a full test of the new feature in a very small
space.

However, try-catch doesn't appeal to some programmers. Any other
option to test that replace can take a function for the second
argument?

Peter
 
pr

Peter said:
However, try-catch doesn't appeal to some programmers. Any other
option to test that replace can take a function for the second
argument?

If it's backwards-compatibility you mainly want to test for, then AFAIK
try...catch arrived in the language about the same time as the ability
to use a function in String.replace() (JS 1.3 & 1.4, IE 5 & 5.5). So if
the feature isn't present then a try...catch feature test is almost
equally likely to fall over.

Unless you expect somebody with a seven-year-old browser to show up or
you know of a recent implementation that can't successfully use
functions in String.replace(), it would seem reasonable not to test at
all. I couldn't think of a test for this that would work pre-Netscape
6.0 without browser sniffing.
 
Peter Michaux

pr said:
If it's backwards-compatibility you mainly want to test for, then AFAIK
try...catch arrived in the language about the same time as the ability
to use a function in String.replace() (JS 1.3 & 1.4, IE 5 & 5.5). So if
the feature isn't present then a try...catch feature test is almost
equally likely to fall over.

Unless you expect somebody with a seven-year-old browser to show up or
you know of a recent implementation that can't successfully use
functions in String.replace(), it would seem reasonable not to test at
all. I couldn't think of a test for this that would work pre-Netscape
6.0 without browser sniffing.

Randy Webb frequently mentions his dislike for try-catch and says his
cell phone doesn't support it. I really don't have a problem with try-
catch other than I don't like using features when they aren't
necessary.

It seems that if a browser doesn't support a function second argument
then this would be sufficient.

var urlencode = (function() {

  var f = function(s) {
    return encodeURIComponent(s).replace(/%20/g, '+'
      ).replace(/(.{0,3})(%0A)/g,
        function(m, p1, p2) {return p1+(p1=='%0D'?'':'%0D')+p2;}
      ).replace(/(%0D)(.{0,3})/g,
        function(m, p1, p2) {return p1+(p2=='%0A'?'':'%0A')+p2;});
  };

  // no try-catch: an unsupported function argument makes this test false
  if (f('\n \r') == '%0D%0A+%0D%0A') {
    return f;
  }

})();

The reason this is ok is because the if conditional will be false (not
throw an error). JavaScript will automatically type convert the
function second argument of replace to a string.

"asdf".replace(/s/, function(){});

outputs something like this

afunction(){}df

The Function.prototype.toString function is standard but its behavior
is implementation dependent.

So I think try-catch may not be necessary after all.
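
For comparison, here is a small sketch of the two behaviours side by side: the implementation-dependent text a function converts to, and what an engine that does support a function second argument returns for the same call. The name noop is used only for this illustration.

```javascript
// noop is a hypothetical name used only for this illustration.
var noop = function(){};

// Implementation-dependent source text of the function.
var asText = String(noop);

// In an ECMAScript 3 or later engine the function is called, and its
// return value (undefined) is converted to the string 'undefined'.
var replaced = 'asdf'.replace(/s/, noop);

console.log(asText);
console.log(replaced); // 'aundefineddf' when the function is called
```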

Peter
 
Thomas 'PointedEars' Lahn

Peter said:
Nice page

Thanks. It is an HTML document, though. You can see that it is _not_ a
page because you can press the "Page Down" key to get to the next "page".

Peter said:
but I don't see anything about when String.prototype.replace
could take a function as the second argument.

Implicitly, it does. That feature could not have been available before
function expressions were introduced. (I might emphasize that by stating
it in a row below explicitly, but then I would have to do it for several
methods.)


PointedEars
 
Thomas 'PointedEars' Lahn

Thomas said:
Peter said:
[...] I don't see anything about when String.prototype.replace
could take a function as the second argument.

Implicitly, it does. That feature could not have been available before
function expressions were introduced. (I might emphasize that by stating
it in a row below explicitly, but then I would have to do it for several
methods.)

I have spotted a bug in the Matrix ;-)

String.prototype.replace() was only described with a string as second
argument, which probably has contributed not only to your confusion.
Sorry about that.

I have done some tests now and updated the ES Matrix accordingly. I have
also fixed an error (of omission) at MDC: the method was not part of
ECMAScript specifications before Edition *3*.

http://developer.mozilla.org/en/doc...ference:Global_Objects:String:replace#Summary


PointedEars
 
Peter Michaux

Thomas 'PointedEars' Lahn said the following on 11/23/2007 1:23 PM:





You are a serious piece of work.

:)

I had a response partly typed but thought "You know what? Randy will
probably handle this one better."
Ranks right up there with "That's nice[1]".

A classic!

Peter
 
Richard Cornford

Thomas said:
Peter Michaux wrote:

Implicitly, it does. That feature could not have been
available before function expressions were introduced.
<snip>

The second argument to - String.prototype.replace - does not have to be
a function expression. It can be (and very often should be) a reference
to a function via a named property of a Variable object (an Identifier
in the actual code). You can do that as soon as you have first class
functions (as function objects) that can be passed by reference, which
is pretty much from day one.
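
A minimal sketch of that point: the second argument is just a reference to a function, however the function was created. The name toUpper and the sample string are made up for this example.

```javascript
// A function declaration; no function expression is involved.
function toUpper(match) {
  return match.toUpperCase();
}

// The Identifier toUpper is passed as a reference to the function object.
var shout = 'abc'.replace(/b/, toUpper);

console.log(shout); // aBc
```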

Richard.
 
Richard Cornford

Peter Michaux wrote:
The reason this is ok is because the if conditional will be
false (not throw an error). JavaScript will automatically
type convert the function second argument of replace to a
string.

"asdf".replace(/s/, function(){});

outputs something like this

afunction(){}df

The Function.prototype.toString function is standard but
its behavior is implementation dependent.

So I think try-catch may not be necessary after all.

Given the number of times I have seen you referring people to my page in
the FAQ notes about feature detecting it is disappointing that you have
forgotten that the example of pure javascript feature detection that I
used on that page was a test for - String.prototype.replace - accepting
a function reference as its second argument or not (and it did not use
try-catch).

Richard.
 
Thomas 'PointedEars' Lahn

Richard said:
<snip>

The second argument to - String.prototype.replace - does not have to be
a function expression. It can be (and very often should be) a reference
to a function via a named property of a Variable object (an Identifier
in the actual code).

"Very often should be" -- why? That would pollute the namespace and, given
the current support of runtime environments, it would pollute it needlessly.
I would have concurred if you had said "there are occasions where it is
advisable" but not with this.

Richard said:
You can do that as soon as you have first class functions (as function
objects) that can be passed by reference, which is pretty much from day one.

Nevertheless, String.prototype.replace() was not introduced before
JavaScript 1.2, JScript 1.0, and ECMAScript 3, and function expressions
were introduced with JavaScript 1.2, JScript 3.0, and ECMAScript 3. So
yes, it looks like there is a correlation.


PointedEars
 
Thomas 'PointedEars' Lahn

Randy said:
Thomas 'PointedEars' Lahn said the following on 11/23/2007 1:23 PM:

You are a serious piece of work.

Why, thank you.


PointedEars
 
Peter Michaux

Richard said:
Given the number of times I have seen you referring people to my page in
the FAQ notes about feature detecting it is disappointing that you have
forgotten that the example of pure javascript feature detection that I
used on that page was a test for - String.prototype.replace - accepting
a function reference as its second argument or not (and it did not use
try-catch).

At least I had the joy of rediscovery :)

Peter
 
Peter Michaux

Richard said:
Given the number of times I have seen you referring people to my page in
the FAQ notes about feature detecting it is disappointing that you have
forgotten that the example of pure javascript feature detection that I
used on that page was a test for - String.prototype.replace - accepting
a function reference as its second argument or not (and it did not use
try-catch).

Quoted from that page of the FAQ notes:
/* The original string is the one letter string literal "a". The
Regular Expression /a/ identifies that entire string, so it is the
entire original string that will be replaced. The second argument is
the function expression - function(){return '';} -, so the entire
original string will be replaced with an empty string if the
function expression is executed. If it is instead type-converted
into a string that string will not be an empty string.

I don't agree with the above statement as a necessary truth. A non-
standard implementation may type-convert a function to an empty
string. This may seem like unnecessary paranoia but there is some
serious general confusion in the implementations of replace before a
function second argument works as it does in ECMAScript 3rd edition.

For example, in Opera 6.0

'a'.replace(/a/, function(){return '';});

returns 'a' which is far from the expected result especially since
O6.0 does type convert the function expression to a string properly
(ie 'function () {\nreturn "";\n}'). For the above replace statement,
IE4 returns "function(){return '';}" and NN4.0 returns "[object
Closure]".

This feeds my paranoia that some implementation somewhere may
type-convert the function expression second argument above into an
empty string and evaluate the above statement to ''. An implementation
is more likely to type-convert the function expression to an empty
string than to the string "asdfewr".

The test from the quoted page:

if(!('a'.replace(/a/, (function(){return '';})))){
    ... //function references OK.
}else{
    ... //no function references with replace.
}

So in the code above this would be a more reliable test

if ('a'.replace(/a/, function(){return 'asdfewr';}) == 'asdfewr')

Even better would be using two replaces with different substitution
values because if the implementation type converts any function
expression argument to a fixed string then it wouldn't matter if it
just so happened to be the 'asdfewr' string.

'ab'.replace(/a/, function(){return '';}
  ).replace(/b/, function(){return 'z';});

should return 'z'
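
The two-replace idea can be wrapped up as a flag computed once. This is only a sketch; the name replaceAcceptsFunction is made up here and does not come from the FAQ notes or from the code in this thread.

```javascript
// Hypothetical flag: true only if both functions were actually called.
var replaceAcceptsFunction = (function() {
  var result = 'ab'.replace(/a/, function(){return '';}
    ).replace(/b/, function(){return 'z';});
  // A fixed type-conversion of the function arguments could not produce
  // '' for the first replace and 'z' for the second.
  return result == 'z';
})();

console.log(replaceAcceptsFunction);
```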

It just so happens, somewhat by luck, that in the code I wrote, I used
multiple replaces and tested they all work

var urlencode = (function() {

  var f = function(s) {
    return encodeURIComponent(s).replace(/%20/g, '+'
      ).replace(/(.{0,3})(%0A)/g,
        function(m, a, b) {return a+(a=='%0D'?'':'%0D')+b;}
      ).replace(/(%0D)(.{0,3})/g,
        function(m, a, b) {return a+(b=='%0A'?'':'%0A')+b;});
  };

  // the last condition exercises every replace used in f
  if (typeof encodeURIComponent != 'undefined' &&
      String.prototype.replace &&
      f('\n \r') == '%0D%0A+%0D%0A') {
    return f;
  }

})();
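
Because urlencode is left undefined when the test fails, calling code has to check for it before use. A rough usage sketch follows; serializePairs and the sample field names and values are hypothetical, and the encoder is repeated (with the g flag on the space pattern) only so the example runs on its own.

```javascript
var urlencode = (function() {
  var f = function(s) {
    return encodeURIComponent(s).replace(/%20/g, '+'
      ).replace(/(.{0,3})(%0A)/g,
        function(m, a, b) {return a+(a=='%0D'?'':'%0D')+b;}
      ).replace(/(%0D)(.{0,3})/g,
        function(m, a, b) {return a+(b=='%0A'?'':'%0A')+b;});
  };
  if (f('\n \r') == '%0D%0A+%0D%0A') {
    return f;
  }
})();

// Hypothetical helper: joins [name, value] pairs into a request body.
var serializePairs = function(pairs) {
  var parts = [];
  for (var i = 0; i < pairs.length; i++) {
    parts.push(urlencode(pairs[i][0]) + '=' + urlencode(pairs[i][1]));
  }
  return parts.join('&');
};

var body;
if (typeof urlencode == 'function') { // undefined if the test failed
  body = serializePairs([['name', 'Peter Michaux'],
                         ['note', 'line1\nline2']]);
}

console.log(body); // name=Peter+Michaux&note=line1%0D%0Aline2
```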

Peter
 
