form serialization (Code Worth Recommending Project)

Peter Michaux

If all goes well, this is part one of what will hopefully result in a
body of code the group can recommend as a resource for JavaScript
programmers and library writers. At the very least it's something
constructive we can bicker over rather than just ridiculing the
popular JavaScript libraries.

I'm not yet sure how this code will be aggregated but it seems that
Jim Ley will be making some changes on the jibbering server soon and
perhaps this code will end up in a wiki or svn repository on the
jibbering site or elsewhere.

This first installment is about form serialization. This is a common
task these days and something not covered in the group's FAQ notes.
Below is an example of form serialization that works with forms with
text-like inputs (eg text, hidden, textarea, etc). This follows
Richard Cornford's idea of having multiple implementations of a single
interface where each implementation is for a specific circumstance.
Other implementations for forms with select, radio and checkbox inputs
can be written (and I have written some.) Perhaps first it would be a
good idea to criticize my implementation and documentation below for
the stated circumstance much like the daily FAQ posts are heavily
scrutinized.

// ----------------------------------------------------------------------------

/**
@object serializeForm<function>

@param f<form> The form element to be serialized.

@returns <string> The serialized form data in "foo=bar&asdf=1234"
format

serializeForm will serialize data contained in forms with text-like
inputs (eg. text, password, hidden, textarea.) The returned string
is suitable for a URL query string or POST request body. Test that
this function is defined before calling it. This function has
no side effects.

Features Not Tested
(ECMAScript v2, DOM 1 and earlier features are assumed to reduce
code size)
JavaScript 1.0
form.elements
form.elements.length
input.name
input.value
ECMAScript v1
array.join
DOM Level 1
input.disabled
*/

// ECMAScript v3 feature tests
if (Array.prototype.push && encodeURIComponent) {

  function serializeForm(f) {
    var i, // elements loop index
        l, // elements loop length
        e, // element
        es = f.elements,
        c = []; // the serialization data parts

    for (i=0, l=es.length; i<l; i++) {
      e = es[i];
      if (!e.disabled) {
        c.push(encodeURIComponent(e.name) + "=" +
               encodeURIComponent(e.value));
      }
    }
    return c.join('&');
  }

}
 
RobG

The above function seems a little simplistic. It should exclude form
controls that don't have a name, it should also deal with radio
buttons, selects, checkboxes, etc.
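
To make these points concrete, here is a rough sketch of a serializer that skips unnamed controls and treats checkboxes, radio buttons and selects differently from text-like inputs. It is not the version Peter or Matt refer to in this thread: the name serializeFormSketch and the plain objects standing in for a form are assumptions for illustration, and the IE option-value quirk is not handled.

```javascript
// Hypothetical sketch: serialize named, enabled controls according to type.
function serializeFormSketch(form) {
  var parts = [];
  var els = form.elements;

  function add(name, value) {
    parts.push(encodeURIComponent(name) + '=' + encodeURIComponent(value));
  }

  for (var c = 0, len = els.length; c < len; c++) {
    var el = els[c];
    if (!el.name || el.disabled) {
      continue; // unnamed and disabled controls are left out
    }
    switch (el.type) {
      case 'checkbox':
      case 'radio':
        if (el.checked) { add(el.name, el.value); }
        break;
      case 'select-one':
      case 'select-multiple':
        for (var d = 0; d < el.options.length; d++) {
          if (el.options[d].selected) { add(el.name, el.options[d].value); }
        }
        break;
      case 'submit': case 'button': case 'reset': case 'image': case 'file':
        break; // deliberately ignored in this sketch
      default:
        add(el.name, el.value); // text, password, hidden, textarea, ...
    }
  }
  return parts.join('&');
}

// Plain objects standing in for a real form (an assumption for testing).
var fakeForm = { elements: [
  { type: 'text', name: 'q', value: 'a b' },
  { type: 'text', name: '', value: 'ignored' },
  { type: 'checkbox', name: 'news', value: 'yes', checked: false },
  { type: 'radio', name: 'size', value: 'L', checked: true },
  { type: 'select-one', name: 'lang', options: [
    { value: 'en', selected: false },
    { value: 'fr', selected: true }
  ] }
] };

var sketchResult = serializeFormSketch(fakeForm); // "q=a%20b&size=L&lang=fr"
```

The extra branches are short, which is the trade-off the rest of the thread argues about: generality against size and speed.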

You might want to check with Matt Kruse's Ajax Toolbox which has a
form serialisation function:

<URL: http://www.ajaxtoolbox.com/request/documentation.php#serializeForm>

There is also IE's problem with reporting the value of an option
element (the text value should be reported if there is no value
attribute but IE doesn't).

<URL: http://groups.google.com.au/group/c...=en&lnk=gst&q=serialize+form#5618177bf1658a30>

and

<URL: http://groups.google.com/group/comp.lang.javascript/browse_frm/thread/3b0ff7ccb462f76a?tvc=1>
 
Peter Michaux

RobG said:
The above function seems a little simplistic. It should exclude form
controls that don't have a name,


Yes it should. I will add this.

it should also deal with radio
buttons, selects, checkboxes, etc.

I have a version that does this to post later. It is bigger, and I wanted
to post something smaller to look at the format of how one of the multiple
implementations might be presented. Each is simply an implementation
for a different circumstance but with the same interface.

Thanks,
Peter
 
Matt Kruse

Peter said:
I have a version that does this to post later. It is bigger, and I wanted
to post something smaller to look at the format of how one of the multiple
implementations might be presented. Each is simply an implementation
for a different circumstance but with the same interface.

IMO, this example is taking that philosophy a bit too far. The extra
code required to handle all input types is minimal. Why would someone
want a stripped-down version that only handles a very limited case,
when they could have a fully reusable version that could be used in
every case? This problem is very "solvable" in that a single
general-purpose solution can be created. For different problems, that's
not so true and the "many implementation" approach makes more sense.

In any case, my stab at it was posted by Rob. It does need to be
improved to handle the special select case in IE and maybe there are
some more quirks that exist that I didn't handle. You could surely
take the serialization code from a few different libs and combine them
into a single solution.

Matt Kruse
 
Martin Honnen

Peter said:
// ECMAScript v3 feature tests
if (Array.prototype.push && encodeURIComponent) {

Shouldn't that check be
if (Array.prototype.push && typeof encodeURIComponent != 'undefined')
? If you simply use the identifier encodeURIComponent then your code
throws an error in implementations not implementing encodeURIComponent.

function serializeForm(f) {

A function declaration inside of the if block? That is not even allowed
syntax according to ECMAScript edition 3, as you can only put a statement
in the if block but function declarations are not statements. And
different implementations handle that case differently as it is not
specified. I think Mozilla's SpiderMonkey indeed processes the function
declaration conditionally, but with Microsoft's JScript the function
declaration is processed unconditionally.
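
Both points can be checked with a small sketch. The identifier missingHostFunction is deliberately undeclared and hypothetical, and encodePair is just an illustrative name; neither comes from the posted code.

```javascript
// 'missingHostFunction' is a deliberately undeclared, hypothetical name.
function probeMissingIdentifier() {
  var result = { typeofSaysDefined: null, bareReferenceThrew: false };

  // typeof never throws, even for an identifier that was never declared.
  result.typeofSaysDefined = (typeof missingHostFunction != 'undefined');

  // Reading the bare identifier throws a ReferenceError instead.
  try {
    if (missingHostFunction) { /* never reached */ }
  } catch (err) {
    result.bareReferenceThrew = (err instanceof ReferenceError);
  }
  return result;
}

// Assigning a function expression is an ordinary statement, so it is
// only evaluated when the surrounding test passes.
var encodePair;
if (typeof encodeURIComponent != 'undefined') {
  encodePair = function (name, value) {
    return encodeURIComponent(name) + '=' + encodeURIComponent(value);
  };
}

var probe = probeMissingIdentifier();
```

Where the feature test fails, encodePair simply stays undefined, so calling code can test for it before use.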
 
Peter Michaux

Matt said:
IMO, this example is taking that philosophy a bit too far. The extra
code required to handle all input types is minimal. Why would someone
want a stripped-down version that only handles a very limited case,
when they could have a fully reusable version that could be used in
every case? This problem is very "solvable" in that a single
general-purpose solution can be created. For different problems, that's
not so true and the "many implementation" approach makes more sense.

I agree this problem is quite solvable in general and is why I chose
it first. It is a chance to see how code could be aggregated with a
philosophy that is inclusive to encourage participation. Acknowledging
multiple implementations where people have different priorities
(download size versus generality is a legitimate argument) will make
the code base more reusable both educationally and by various people.
By having an inclusive approach established, when the trickier
problems are presented, I think we will have more input to tackle
them.

In any case, my stab at it was posted by Rob. It does need to be
improved to handle the special select case in IE and maybe there are
some more quirks that exist that I didn't handle. You could surely
take the serialization code from a few different libs and combine them
into a single solution.

The option wrinkle is not as simple as it seems. In fact, I think
there is no general JavaScript-only solution to that problem and the
multiple implementations approach is the only way. I didn't know that
until I started making examples for this. Various solutions require
either supported browser restrictions or HTML authoring restrictions.
I will post what I have when this simpler version is sorted.

Thanks,
Peter
 
Peter Michaux

Martin said:
Shouldn't that check be
if (Array.prototype.push && typeof encodeURIComponent != 'undefined')
? If you simply use the identifier encodeURIComponent then your code
throws an error in implementations not implementing encodeURIComponent.

Indeed. Silly mistake that I've encountered before.

A function declaration inside of the if block? That is not even allowed
syntax according to ECMAScript edition 3 as you can only put a statement
in the if block but function declarations are not statements. And
different implementations handle that case differently as it is not
specified, I think Mozilla's Spidermonkey indeed processes the function
declaration conditionally but with Microsoft's JScript the function
declaration is processed unconditionally.

Thanks. I didn't know this. It seems that JavaScript(TM) does allow a
FunctionDeclaration inside a Block but the ECMAScript v3 specification
does not. It is quite an ordeal to verify that an IfStatement can
contain a FunctionExpression through this series of productions:
IfStatement, Block, StatementList, Statement, VariableStatement,
VariableDeclarationList, VariableDeclaration, Initialiser,
AssignmentExpression, ConditionalExpression, LogicalORExpression,
LogicalANDExpression, BitwiseORExpression, BitwiseXORExpression,
BitwiseANDExpression, EqualityExpression, RelationalExpression,
ShiftExpression, AdditiveExpression, MultiplicativeExpression,
UnaryExpression, PostfixExpression, LeftHandSideExpression,
NewExpression, MemberExpression, FunctionExpression.


Revised code...

if (Array.prototype.push && typeof encodeURIComponent != 'undefined') {

  var serializeForm = function(f) {
    var i, // elements loop index
        l, // elements loop length
        e, // element
        es = f.elements,
        c = []; // the serialization data parts

    for (i=0, l=es.length; i<l; i++) {
      e = es[i];
      if (e.name && !e.disabled) {
        c.push(encodeURIComponent(e.name) + "=" +
               encodeURIComponent(e.value));
      }
    }
    return c.join('&');
  };

}
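
A usage sketch may help when reviewing the interface. The function is restated from the revised code above (with different local names) so that the example runs on its own; mockForm is a hypothetical plain object standing in for a real form element.

```javascript
// Restated from the revised code above so the sketch is self-contained.
var serializeForm;
if (Array.prototype.push && typeof encodeURIComponent != 'undefined') {
  serializeForm = function (f) {
    var es = f.elements, c = [];
    for (var n = 0, len = es.length; n < len; n++) {
      var e = es[n];
      if (e.name && !e.disabled) {
        c.push(encodeURIComponent(e.name) + '=' + encodeURIComponent(e.value));
      }
    }
    return c.join('&');
  };
}

// Hypothetical stand-in for something like document.forms[0].
var mockForm = { elements: [
  { name: 'q', value: 'a b&c' },
  { name: '', value: 'no name, skipped' },
  { name: 'h', value: '1', disabled: true },
  { name: 'lang', value: 'en' }
] };

// "Test that this function is defined before calling it."
var query = null;
if (typeof serializeForm == 'function') {
  query = serializeForm(mockForm); // "q=a%20b%26c&lang=en"
}
```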


Thanks again,
Peter
 
Matt Kruse

Peter said:
The option wrinkle is not as simple as it seems. In fact, I think
there is no general JavaScript-only solution to that problem and the
multiple implementations approach is the only way.

This is my solution. I haven't encountered a situation where it fails
(yet). If you can find one, I'd like to try to improve the code to
handle it:

function selectValue(sel) {
  if (sel.options && sel.options.length) {
    var selected = [];
    for (var i=0; i<sel.options.length; i++) {
      var opt = sel.options[i];
      if (sel.options[i].selected) {
        var val = null;
        // This mess is here because an option can have no value
        // attribute, in which case the text property is used.
        // But IE messes up and gives a blank string as .value, even
        // when value doesn't exist. Yuck.
        if (opt.value!="") { val = opt.value; }
        else if (!('value' in opt)) { val = opt.text; }
        else if (opt.outerHTML && /<[^>]+value\s*=/i.test(opt.outerHTML))
          { val = opt.value; }
        else { val = opt.text; }
        if (sel.type=="select-one") {
          return val;
        }
        selected.push(val);
      }
    }
    return selected;
  }
}

Forgive me if it errors out, I actually slightly modified my code
before posting because it was part of a bigger method. But the logic
at least should be there.

Matt Kruse
 
Peter Michaux

RobG said:
There is also IE's problem with reporting the value of an option
element (the text value should be reported if there is no value
attribute but IE doesn't).

<URL:http://groups.google.com.au/group/comp.lang.javascript/browse_frm/thread/a32139cf2fe20f60>

and

<URL:http://groups.google.com/group/comp.lang.javascript/browse_frm/thread/3b0ff7ccb462f76a>

Two getOptionValue functions were proposed in those threads. Neither
is bullet proof.

For both versions, if the value attribute is dynamically set to the
empty string, then Firefox will report the option's text property as
the option's value. An HTML page is appended below if anyone would
like to verify this problem with the two versions.

Matt's version will also be tricked if the option element has another
attribute which has a value containing the string fragment "value=" as
part of its value. Elegie's version collapses the attribute values to
avoid this problem.
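
The false positive can be reproduced on a plain string, without a browser. In the sketch below the outerHTML strings are made up, and hasValueAttribute is a simplified stand-in for the collapsing idea (one global replace rather than Elegie's recursive function), so treat it as an illustration only.

```javascript
// Matt's pattern, applied to a made-up outerHTML string.
var valueAttrPattern = /<[^>]+value\s*=/i;
var trickyHtml = '<OPTION title="value=1">hello</OPTION>';
var fooled = valueAttrPattern.test(trickyHtml); // true, yet there is no value attribute

// Simplified stand-in for the collapsing idea: replace every quoted
// attribute value with "_" before looking for a value attribute.
function hasValueAttribute(outerHtml) {
  var collapsed = outerHtml.replace(/"[^"]*"|'[^']*'/g, '_');
  return /<[^>]+\svalue\s*=/i.test(collapsed);
}

var trickyResult = hasValueAttribute(trickyHtml);                    // false
var realResult = hasValueAttribute('<OPTION value="">hello</OPTION>'); // true
```

This only addresses the quoting problem; it does nothing for the dynamically assigned empty value described above.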

Peter



<html>
<head>
<title>getOptionValue</title>

<script type="text/javascript">

// Matt Kruse
// <URL: http://groups.google.com/group/comp.lang.javascript/msg/3f5de3dcacef20d2>
String.prototype.test = function(regex) {
return regex.test(this);
}
function getOptionValue(opt) {
if (opt.value!="") {
return opt.value;
}
if (!('value' in opt)) {
return opt.text;
}
if (opt.outerHTML && opt.outerHTML.test(/<[^>]+value\s*=/i)) {
return opt.value;
}
return opt.text;
}


// Elegie
// <URL: http://groups.google.com/group/comp.lang.javascript/msg/a5029b734629d5c5>
function getOptionValue(opt) {
var v = opt.value,
t = opt.text;

return (v || attributeExists(opt, "value")) ? v : t;

function attributeExists(obj, attrName) {
var oHtml = obj.outerHTML;
alert(oHtml)
var found = false;

if (oHtml)
found = /value\s*=/i.test(collapseQuotedValues(oHtml));

return found;

function collapseQuotedValues(txt){
var sQuote = txt.indexOf("'");
var dQuote = txt.indexOf("\"");
var q = "";

if (sQuote==-1 && dQuote!=-1) {
q ="\"";
} else if (sQuote!=-1 && dQuote==-1) {
q ="'"
} else if (sQuote!=-1 && dQuote!=-1) {
if (sQuote<dQuote) q = "'";
if (dQuote<sQuote) q = "\"";
}

if (q) txt = arguments.callee(
txt.replace(new RegExp(q+"[^"+q
+"]*"+q),"_")
);

return txt;
} // collapseQuotedValues

} // attributeExists

} // getOptionValue

</script>

<script type="text/javascript">

window.onload = function() {
var o = document.forms.foo.elements.bar.options[0];
o.value = '';
alert(getOptionValue(o)); // "hello" but should be empty string
};

</script>

</head>
<body>

<form action="#" name="foo" method="get" accept-charset="utf-8">
<p>
<select name="bar">
<option>hello</option>
</select>
</p>
</form>

</body>
</html>
 
Peter Michaux

RobG said:
There is also IE's problem with reporting the value of an option
element (the text value should be reported if there is no value
attribute but IE doesn't).


This getOptionValue function is inspired by the form serialization
function in YUI's connection.js file.

function getOptionValue(o) {
return ((o.hasAttribute && o.hasAttribute('value')) ||
(o.attributes.value && o.attributes.value.specified)) ?
o.value : o.text;
}


The o.hasAttribute() function works in FF2 and S2 but not IE6. The
o.attributes.value.specified works in FF2 and IE6 but not S2. So both
checks are needed but what if neither works for a particular browser?
This situation exists in some older browsers and they will report
o.text when perhaps they should be reporting o.value.

One thing I don't understand is the direct access to the
o.attributes.value property. It seems to me that the spec says for a
NamedNodeMap it would instead need to be
o.attributes.getNamedItem('value'). Does anyone know about the
legality of the direct access use?

---------

The only "cross browser" solution I can think of is just to develop so
that this never matters. If the option element's value attribute is always
included in the HTML then we can write just the following, which works
back to IE4/NN4.

function getOptionValue(o) {
return o.value;
}

The only argument against always including the value attribute is
space. For example, a list of countries is long and omitting the value
attribute would save quite a bit of space. On a page where countries
are used and the value attributes are omitted for just that select
element, we could write something like this.

function getOptionValue(o) {
return o.parentNode && o.parentNode.name == 'country' ? o.text :
o.value;
}

If we know that on the page there are some options that don't have a
value attribute specified but should never have an empty string as the
value, then we can write

function getOptionValue(o) {
return o.value || o.text;
}

Right now, it doesn't seem to me that there is a single bulletproof,
non-sniffing solution to getting the value of an option element when
the page can contain any combination of options and value attributes
and any browser may be used. With the multiple implementations we can
always choose one that works for a particular page or site, given the
rules by which the HTML authors are playing.

---------

Some links and notes I collected when looking into the YUI function...

Node interface
<URL: http://www.w3.org/TR/2000/WD-DOM-Level-1-20000929/level-one-core.html#ID-1780488922>
<URL: http://www.w3.org/TR/DOM-Level-2-Core/core.html#ID-1950641247>

The Node interface specifies an "attributes" property which is a
NamedNodeMap.

HTMLOptionElement interface
<URL: http://www.w3.org/TR/DOM-Level-2-HTML/html.html#ID-70901257>

The HTMLOptionElement interface extends the HTMLElement interface
which extends the Element interface which extends the Node interface.

The HTMLOptionElement interface specifies a read-write "value"
property. The initial value of the "value" property is set by the
value attribute in the HTML. If that attribute is not present then the
text of the element is used.

The HTMLOptionElement interface specifies a read-only "text" property.

Attr interface
<URL: http://www.w3.org/TR/DOM-Level-2-Core/core.html#ID-637646024>

The Attr interface specifies a read-only boolean "specified".

MS attributes collection
<URL: http://msdn2.microsoft.com/en-us/library/ms537438.aspx>

MS getNamedItem function (IE6+?)
<URL: http://msdn2.microsoft.com/en-us/library/ms536441.aspx>
 
Peter Michaux

This getOptionValue function is inspired by the form serialization
function in YUI's connection.js file.

function getOptionValue(o) {
return ((o.hasAttribute && o.hasAttribute('value')) ||
(o.attributes.value && o.attributes.value.specified)) ?
o.value : o.text;

}

The o.hasAttribute() function works in FF2 and S2 but not IE6. The
o.attributes.value.specified works in FF2 and IE6 but not S2. So both
checks are needed but what if neither works for a particular browser?
This situation exists in some older browsers and they will report
o.text when perhaps they should be reporting o.value.

The following code seems to be working. The getOptionValue function
will only be defined in browsers where the function will work
properly.

It uses only DOM standard features (no outerHTML) if it is true that
the direct option.attributes.value is really an allowed access into a
NamedNodeMap.

In the feature detection it assumes that, in a given browser, the
documentElement and the option element implement the same attribute
reporting interface. I think that is about as close to a direct test
as is possible.

// NN4 syntax error

// NN4.5 getOptionValue undefined
// O6 getOptionValue undefined
// IE4 getOptionValue undefined

// NN6 works with hasAttribute branch
// O7 works with hasAttribute branch
// S1.9 works with hasAttribute branch
// Mac/icab3.0.3 works with hasAttribute branch
// IE5 works with attributes branch
// Mac/IE5.2 works with attributes branch

var getOptionValue = (function() {
  if (document.documentElement) {
    if (document.documentElement.hasAttribute) {
      return function(o) {
        return o.hasAttribute('value') ? o.value : o.text;
      };
    }
    if (document.documentElement.attributes) {
      return function(o) {
        return (o.attributes.value &&
                o.attributes.value.specified) ?
               o.value : o.text;
      };
    }
  }
})();
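
To exercise the three outcomes outside a browser, the same detection can be written as a function of the document. This is a hypothetical variation, not the code above: makeGetOptionValue and all of the stand-in objects are assumptions made purely for testing.

```javascript
// Hypothetical variation: take the document as a parameter so the
// feature detection can be exercised with stand-in objects.
function makeGetOptionValue(doc) {
  var root = doc && doc.documentElement;
  if (!root) { return undefined; }
  if (root.hasAttribute) {
    return function (o) {
      return o.hasAttribute('value') ? o.value : o.text;
    };
  }
  if (root.attributes) {
    return function (o) {
      var attr = o.attributes.value;
      return (attr && attr.specified) ? o.value : o.text;
    };
  }
  return undefined;
}

// Stand-ins for three kinds of environment (not real DOM objects).
var hasAttributeDoc = { documentElement: { hasAttribute: function () {} } };
var attributesDoc = { documentElement: { attributes: {} } };
var bareDoc = {};

var viaHasAttribute = makeGetOptionValue(hasAttributeDoc);
var viaAttributes = makeGetOptionValue(attributesDoc);
var unsupported = makeGetOptionValue(bareDoc); // undefined

// One option with value="" in the markup, one with no value attribute.
var emptyValueOpt = {
  value: '', text: 'hello',
  hasAttribute: function (name) { return name == 'value'; },
  attributes: { value: { specified: true } }
};
var noValueOpt = {
  value: '', text: 'hello',
  hasAttribute: function () { return false; },
  attributes: { value: { specified: false } }
};
```

Calling code would then test typeof getOptionValue == 'function' before relying on it, as with serializeForm.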
 
Richard Cornford

Peter Michaux wrote:
This first installment is about form serialization.
This is a common task these days and something not
covered in the group's FAQ notes. Below is an example
of form serialization that works with forms with
text-like inputs (eg text, hidden, textarea, etc).
This follows Richard Cornford's idea of having multiple
implementations of a single interface where each
implementation is for a specific circumstance.
<snip>

So in this case we have an interface that consists of a global Identifier
named 'serializeForm' that refers to a function object, which takes a
form element as its single argument and returns a string that represents
the form control's values in a (near) application/x-www-form-urlencoded
form.

My first quibble would be the name. Where global Identifiers are to be
used (and I have no objection to their use here (better than the
pointless runtime overheads of one of those silly 'namespace' schemes))
the Identifier should be as unambiguous as possible, and certainly be
fairly explicit about what the object identified is for.

Form serialisation may take many forms, including (increasingly these
days) JSON and XML serialisation. It may even include (given a suitable
Intranet application with very restricted browser/browser configuration
support and local file system access) multipart/form-data serialization
for contexts involving file upload.

It would be better if the interface name stated how it was going to
serialize the form. Something like - urlSerializeForm - or -
serializeFormUrlEncoded -.

------------------------------------------

/**
@object serializeForm<function>

@param f<form> The form element to be serialized.

@returns <string> The serialized form data in "foo=bar&asdf=1234"
format

This description should be more explicit/precise about what this
implementation actually does. It returns a string consisting of a '&'
separated sequence of '=' separated name/value pairs where the names and
values have been encoded using javascript's - encodeURIComponent -. The
result is an approximation of application/x-www-form-urlencoded, but
not including the transformation of line breaks into "CR LF" pairs
(which is only likely to be an issue with TEXTAREA fields on non-Windows
OSs, and then only if the receiving software cares or where consistency
is expected in the storage medium).
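
If the line-break transformation were wanted, it could be applied to each value before encoding. The helper below is a minimal sketch under that assumption; encodeFormValue is a hypothetical name and is not part of the posted function.

```javascript
// Hypothetical helper: normalise every line break to CR LF, then encode.
function encodeFormValue(value) {
  // "\r\n" is listed first so an existing CR LF pair is matched as a
  // unit and is not turned into two pairs.
  return encodeURIComponent(String(value).replace(/\r\n|\r|\n/g, '\r\n'));
}

var unixStyle = encodeFormValue('line one\nline two');
var windowsStyle = encodeFormValue('line one\r\nline two');
var oldMacStyle = encodeFormValue('line one\rline two');
// all three give "line%20one%0D%0Aline%20two"
```

Spaces still come out as %20 rather than '+', which is another way in which the result remains only near application/x-www-form-urlencoded.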

There is a text wrapping issue in this presentation of the code. If this
code is to be critiqued on Usenet (as it always should given the
intention) it would be a good idea to apply/impose the formatting rule
that all code and comments should be manually wrapped at (or before),
say, 72 characters.

serializeForm will serialize data contained in forms with text-like
inputs (eg. text, password, hidden, textarea.)

This is a statement of intended use masquerading as a statement of
actual behaviour. This particular version will also serialize <input
type="button">, <input type="submit">, and so on, in a way that would be
unexpected in any submitted form data. In practice this version is only
really intended for use with forms that consist _only_ of the elements
listed (and that all such controls _must_ have name attributes), and
that restriction should be very clearly stated.

The returned string
is suitable for a URL query string or POST request body. Test that
this function is defined before calling it. This function has
no side effects.

The conditions determining the creation of the function object are the
existence of a - push - method of arrays (so JScript 5.5+, or emulated
on IE (<=5.0)) and the ECMAScript 3rd Ed. encodeURIComponent function
(or an emulation). When these conditions can be known to be met then
there is no need for testing prior to use. Thus the true conditions
should be stated here so that the test/don't test decision can be made
on an informed basis.

There is also the possibility of defining a 'default' function in an
else branch for the creation test that just returns - null - and having
the calling code test the return value for the success of the serialize
function call (no string, including the empty string, equals - null - by
type-converting (or, obviously, strict) equality). That design would
also allow error states within the serialization function to be
signalled with a - null - return value.
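
A rough sketch of that design follows. To show both branches in one environment it is written as a factory taking the encoding function as an argument; makeSerializer, describe and the stand-in form object are hypothetical names used only for illustration.

```javascript
// Hypothetical factory so both branches can be shown; 'encode' stands in
// for the global encodeURIComponent (or an emulation of it).
function makeSerializer(encode) {
  if (typeof encode == 'function') {
    return function (f) {
      var es = f.elements, c = [], n = 0;
      for (var x = 0, len = es.length; x < len; x++) {
        if (es[x].name && !es[x].disabled) {
          c[n++] = encode(es[x].name) + '=' + encode(es[x].value);
        }
      }
      return c.join('&');
    };
  }
  // Default: same interface, but signals "cannot serialize" with null.
  return function () { return null; };
}

var emptyForm = { elements: [] }; // stand-in for a form with no controls
var working = makeSerializer(encodeURIComponent);
var fallback = makeSerializer(undefined);

function describe(result) {
  // The empty string is a valid result; only null signals failure.
  return (result == null) ? 'not serialized' : 'serialized: "' + result + '"';
}
```

With this shape the calling code never needs to test for the function's existence, only for a null return.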

Features Not Tested

This would be better expressed as "Features assumed to exist".

(ECMAScript v2, DOM 1 and earlier features are assumed to reduce
code size)
JavaScript 1.0
form.elements
form.elements.length
input.name
input.value

There has got to be a better way of expressing this. While
JavaScript(tm) 1.0 did have a notion of 'client side' and 'core'
JavaScript(tm), that notion has long gone in practice, and the echoes of
that idea do no more now than introduce misconceptions. We (as a group)
should do nothing to promote any ongoing perception of the browser
object model as being a part of the javascript language.

ECMAScript v1
array.join
DOM Level 1
input.disabled
*/

// ECMAScript v3 feature tests
if (Array.prototype.push && encodeURIComponent) {

Martin Honnen has already corrected that second test, though it could
still be a type-converting test rather than - typeof - (-
window.encodeURIComponent - , or some form of -
global.encodeURIComponent - with - var global - explicitly set (perhaps
elsewhere)).

The use of - push - is avoidable with another local variable initialised
to zero and post incremented at assignment:-

var n = 0;

....

c[n++] = (
encodeURIComponent(e.name) +
"=" +
encodeURIComponent(e.value)
);

- as the output array is initialised to an empty array.

function serializeForm(f) {
var i, // elements loop index
l, // elements loop length
e, // element
es = f.elements,
c = []; // the serialization data parts


Where the intention is to publish a large(ish) code repository (of any
sort) there must be a great deal to be said for the code in that
repository having a consistent form. That would effectively mean all the
code being written to conform to (or modified to conform to) some single
set of 'code style' guidelines. Consistency would be the primary goal so
the precise set of style rules applied would not matter beyond their
being reasonably rational/workable and clearly stated. And inevitably such
rules would need to be agreed by anyone (at least initially) intending
to participate.

I can anticipate the above code block attracting criticism under many
people's preferred 'code style' rules. The idea of having a comma
separated set of local variable declarations/initialisations spread
across a number of lines would fall foul of some for a start. Others may
quibble about the degree to which the Identifiers used were meaningful,
while others would not see that as important in such a short section of
code.

I would not object to a single letter Identifier being used as a loop
counter, but I would object to that letter being either of lower case
'i' or 'l', or upper case 'O'. It has been observed that in many
type-faces these characters are easily mistaken for other characters:
'i' and 'l' for each other and for '1', and 'O' for zero. In other
languages (most notably Java) 'i' may be an acceptable loop counter
Identifier, but there it has meaning beyond being an arbitrary
Identifier (the type of the counter is invariably - int -). In
javascript the type of the counter is always just number (as there is
only one number type) so there are no reasons for using any particular
character in that context, but there are reasons for not using 'i', 'l',
or 'O'. Personally I use 'c' (for counter) exclusively (and 'd', 'e',
'f', etc. when nesting loops), and use 'len' as the Identifier for
length constraints.

(There are also inherent advantages in not using Java style conventions,
such as - int i - for loop counters, when writing javascript because
there are contexts (e.g. JSP) where the two types of code can appear
alongside each other, and a quick visual discrimination between the two
(otherwise syntactically similar) languages is useful).

for (i=0, l=es.length; i<l; i++) {
e = es[i];
if (!e.disabled) {
c.push(encodeURIComponent(e.name) + "=" +
encodeURIComponent(e.value));
}
}
return c.join('&');
}
}


On the general interface design question, may I suggest an optional
second argument that would be the name of a submit button (of one sort
or another). It would not be used in this implementation (where submit
buttons are not being handled at all, and so should be absent from the
form) but in more elaborate implementations, while filtering out <input
type=submit>, <input type=image>, <button type=submit> and <button>
elements, the name/value pair for any whose name corresponded with that
second argument (if provided) could be inserted into the serialization.
That could be useful for emulating types of form submission where the
name/value pair of the submit button clicked determined the actions of
the server script.

Maybe not a complete solution as people often use many like-named submit
buttons with differing values. And maybe not a necessary addition as the
code retrieving the serialized form could always append such a name
value pair to the serialized form.
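
A minimal sketch of that optional second argument, for illustration
only. The parameter name submitName, the choice of which type values
count as submit controls, and the plain object standing in for a form
element are all assumptions, not part of the posted implementation.

```javascript
// Sketch only: submitName and fakeForm are illustrative assumptions.
function serializeFormUrl(f, submitName) {
  var es = f.elements,
      pairs = [],
      e, c, len, type;
  for (c = 0, len = es.length; c < len; c++) {
    e = es[c];
    if (e.disabled || !e.name) {
      continue;
    }
    type = String(e.type).toLowerCase();
    // Submit-like controls are filtered out unless their name matches
    // the optional second argument.
    if ((type == 'submit' || type == 'image') &&
        e.name != submitName) {
      continue;
    }
    pairs[pairs.length] = encodeURIComponent(e.name) + '=' +
                          encodeURIComponent(e.value);
  }
  return pairs.join('&');
}

// A plain object standing in for a form element.
var fakeForm = {
  elements: [
    {name: 'q', type: 'text', value: 'tea'},
    {name: 'save', type: 'submit', value: 'Save'},
    {name: 'remove', type: 'submit', value: 'Remove'}
  ]
};

var withoutButton = serializeFormUrl(fakeForm);
var withButton = serializeFormUrl(fakeForm, 'remove');
```

As noted above, this does not distinguish between like-named submit
buttons; every enabled submit control with a matching name contributes
a pair. It also does not model the coordinate pairs that a real
<input type=image> submits.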

Generally these interface designs deserve some kicking about before
fixing on the final design so that they can be flexible enough to
support a good range of possible/likely underlying implementations
beyond the one situation that first prompts their creation.

Richard.
 
R

Richard Cornford

Matt said:
IMO, this example is taking that philosophy a bit too far. The
extra code required to handle all input types is minimal. Why
would someone want a stripped-down version that only handles a
very limited case, when they could have a fully reusable version
that could be used in every case?

Speed would be one very good reason. The odds are that when you want to
serialize a form you are reacting to user input and planning on getting
the server to decide how to react to that input. Responding to user
input is something that you always want to do quickly, and if you know
up front that a set of possible conditions will never occur in the
context there is no need to be wasting time executing code that is
designed to accommodate them.
This problem is very "solvable" in that a single
general-purpose solution can be created.

Well, no. Your own attempts have never addressed the possibility of a
form including OBJECT elements in its submissions. Granted that is an
extremely unexpected condition (especially given the extremely poor
specification for OBJECT elements (generally and in the context of form
controls), combined with extremely inconsistent browser support for
OBJECT elements), but they are still part of the general problem by
specification. And your handling of line breaks in TEXTAREA elements
does not quite come up to the specification for
application/x-www-form-urlencoded.
For different problems, that's not so true and the
"many implementation" approach makes more sense.

I think this is a very reasonable candidate for a "many implementations"
approach. The web application that I am working on at the moment imposes
conditions where I can be certain that no SELECT element will ever not
have a VALUE attribute (as a direct consequence of the need to be
multilingual) and the set of 'supported' browsers is such that I do not
need to go into the - options - collection for - type == 'select-one' -
and can treat such elements just like <input type=text>. Thus the
specific code for the task would be smaller, simpler and faster. Other
contexts may present alternative sets of knowns and unknowns and so be
optimally handled by different implementations, while retaining the
flexibility to react to changing conditions by doing no more than
changing the underlying implementation.
In any case, my stab at it was posted by Rob. It does need
to be improved to handle the special select case in IE and
maybe there are some more quirks that exist that I didn't
handle. You could surely take the serialization code from a
few different libs and combine them into a single solution.

A strategy where the resulting code gets bigger, more complex and slower
with the passage of time, but only in order to cover ever more obscure
conditions (some of which are likely to be mutually exclusive in
reality).

Richard.
 
R

Richard Cornford

var getOptionValue = (function() {
if (document.documentElement) {
if (document.documentElement.hasAttribute) {
return function(o) {
return o.hasAttribute('value') ? o.value : o.text;
<snip>

Whenever a - value - attribute has trueness there is no need for any
other testing to be applied (and that will be the most common case by a
large margin). I would be inclined to test for that first, and use a
true result to short-circuit the rest of the testing. I.e.:-

return (o.value || (o.hasAttribute('value') ? o.value : o.text));
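
To make the short-circuit concrete, here is a small sketch that can be
run outside a browser. The fakeOption helper is an assumption standing
in for real OPTION elements; only the return expression comes from the
suggestion above.

```javascript
// Sketch: fakeOption is a hypothetical stand-in for an OPTION element.
function getOptionValue(o) {
  // A truthy value short-circuits; hasAttribute is consulted only
  // when the value is the empty string.
  return o.value || (o.hasAttribute('value') ? o.value : o.text);
}

function fakeOption(value, text, hasValueAttribute) {
  return {
    value: value,
    text: text,
    hasAttribute: function(name) {
      return name == 'value' && hasValueAttribute;
    }
  };
}

var common = getOptionValue(fakeOption('x', 'Label', true));
var emptyAttribute = getOptionValue(fakeOption('', 'Label', true));
var noAttribute = getOptionValue(fakeOption('', 'Label', false));
```

Here common is the value, emptyAttribute is the empty string taken from
an explicit value="" attribute, and noAttribute falls back to the text.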

Richard.
 
P

Peter Michaux

<snip>

Whenever a - value - attribute has trueness there is no need for any
other testing to be applied (and that will be the most common case by a
large margin). I would be inclined to test for that first, and use a
true result to short-circuit the rest of the testing. I.e.:-

return (o.value || (o.hasAttribute('value') ? o.value : o.text));

This is one of the balance issues upon which programmers notoriously
have a difficult time agreeing. Both versions will produce correct
results. Is the extra code bulk worth the extra download time? For
some forms it would be and for others it wouldn't. Unfortunately there
is no clear answer.

I would say that I worry more about download time for serialization. I
know the code will need to be downloaded. I don't know if the code you
have posted will be faster or slower. Form serialization is usually so
fast that slight algorithmic inefficiencies can be included and still
produce a sufficiently fast algorithm. By sufficiently fast I mean
the user doesn't notice a difference. If the serialized data is sent
to the server, the communication with the server will be far longer
than the serialization. What generally seems slow in the browser
scripting world are screen redraws in animations and server
communication (including initial script downloads.) In terms of
responsiveness, those two activities eclipse most run of the mill
browser scripting activities.

Peter
 
P

Peter Michaux

Peter Michaux wrote:



<snip>

So in this case we have an interface that consists of a global Identifier
named 'serializeForm' that refers to a function object, which takes a
form element as its single argument and returns a string that represents
the form control's values in a (near) application/x-www-form-urlencoded
form.

Indeed it is only "near".

http://www.w3.org/TR/html4/interact/forms.html#h-17.13.4.1

"a b" should be encoded as "a+b"

but

encodeURIComponent('a b') // "a%20b"


"a\nb" should be encoded as "a%0D%0Ab"

but

encodeURIComponent('\n') // "%0A"


In practice, it seems developers don't complain about this so their
server-side toolkits that decode a request must know to do the right
thing.

I wonder why encodeURIComponent doesn't encode the whitespace
correctly.

Is replicating the spec for spaces and new lines necessary or should
it simply be documented that there is a difference?
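
If replicating the spec turned out to be wanted, a sketch of one way to
get closer might look like the following. The name formUrlEncode is
hypothetical, and this only addresses the two differences listed above
(spaces and line breaks); it is not claimed to match browser form
submission in every other respect.

```javascript
// Sketch: formUrlEncode is a hypothetical helper name.
function formUrlEncode(s) {
  // Normalize every line break (CR LF, lone CR, lone LF) to CR LF.
  var normalized = String(s).replace(/\r\n|\r|\n/g, '\r\n');
  // encodeURIComponent escapes a literal "+" as %2B, so replacing
  // %20 with "+" afterwards is unambiguous.
  return encodeURIComponent(normalized).replace(/%20/g, '+');
}

var spaceResult = formUrlEncode('a b');
var newlineResult = formUrlEncode('a\nb');
```

With this helper spaceResult is "a+b" and newlineResult is
"a%0D%0Ab", at the cost of two extra replace calls per name and value.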

My first quibble would be the name.

I agree.

(To a certain degree, over time I've been losing energy to fight for
this aesthetic quality simply because I work with a server-side legacy
system that has a hodgepodge of names. Also, the browser scripting
world is not known for a clean API. I just get used to looking up the
correct name. We should try for a nice naming scheme, however.)

Where global Identifiers are to be
used (and I have no objection to their use here (better than the
pointless runtime overheads of one of those silly 'namespace' schemes))

I don't think one level of namespace is silly but would like to focus
on correct algorithms. If an automatic build system converts a
collection of these multiple implementations into a library, then the
build system could introduce the namespacing. Correct algorithms are
far more time consuming to develop than repackaging the code for use
with a namespace.

the Identifier should be as unambiguous as possible, and certainly be
fairly explicit about what the object identified is for.

Form serialisation may take many forms, including (increasingly these
days) JSON and XML serialisation. It may even include (given a suitable
Intranet application with very restricted browser/browser configuration
support and local file system access) multipart/form-data serialization
for contexts involving file upload.

It would be better if the interface name stated how it was going to
serialize the form. Something like - urlSerializeForm - or -
serializeFormUrlEncoded -.

To me, descriptive but still short is better. I like the idea that
serializing the form is the primary concern and the format is
secondary. "serializeFormUrl" and "serializeFormJson" have the feel of
the ES4 type annotations that are coming down the pipe.

This description should be more explicit/precise about what this
implementation actually does. It returns a string consisting of a '&'
separated sequence of '=' separated name/value pairs where the names and
values have been encoded using javascript's - encodeURIComponent -. The
result being an approximation of application/x-www-form-urlencoded, but
not including the transformation of line breaks into "CR LF" pairs
(which is only likely to be an issue with TEXTAREA fields on non-Windows
OSs, and then only if the receiving software cares or where consistency
is expected in the storage medium).

The documentation does need improvement.
There is a text wrapping issue in this presentation of the code. If this
code is to be critiqued on Usenet (as it always should given the
intention) it would be a good idea to apply/impose the formatting rule
that all code and comments should be manually wrapped at (or before),
say, 72 characters.

I noticed this also. I only looked briefly and thought it was the
google groups software wrapping after 71 characters. If Usenet wraps
after 72 characters then perhaps we should set the limit to 70 so that
at least one reply can be made without causing wrapping in the reply.

TEST

123456789 123456789 123456789 123456789 123456789 123456789 1 3 5 7 9
1 3 5 7 9


123456789 123456789 123456789 123456789 123456789 1234567890 2 4 6 8 0
2 4 6 8 0


(I'm posting through the google groups interface.)
This is a statement of intended use masquerading as a statement of
actual behaviour. This particular version will also serialize <input
type="button">, <input type="submit">, and so on, in a way that would be
unexpected in any submitted form data. In practice this version is only
really intended for use with forms that consist _only_ of the elements
listed (and that all such controls _must_ have name attributes), and
that restriction should be very clearly stated.

Good point.

The conditions determining the creation of the function object are the
existence of a - push - method of arrays (so JScript 5.5+, or emulated
on IE (<=5.0)) and the ECMAScript 3rd Ed. encodeURIComponent function
(or an emulation). When these conditions can be known to be met then
there is no need for testing prior to use. Thus the true conditions
should be stated here so that the test/don't test decision can be made
on an informed basis.

So you are in favor of stating the requirements in documentation
rather than including the feature tests in the code? I would think
that a version with and a version without the tests could account for
two of the multiple implementations. The one with tests being
applicable to the general web.

There is also the possibility of defining a 'default' function in an
else branch for the creation test that just returns - null - and having
the calling code test the return value for the success of the serialize
function call (no string, including the empty string, equals - null - by
type-converting (or, obviously, strict) equality). That design would
also allow error states within the serialization function to be
signalled with a - null - return value.
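
A sketch of that design, as I understand it, follows. The loop body is
a simplified stand-in for the real implementation, and fakeForm is a
plain object used only so the example runs outside a browser.

```javascript
// Sketch of the 'default function returns null' design. The else
// branch guarantees serializeFormUrl always exists, so callers test
// the return value rather than the function's existence.
var serializeFormUrl;
if (typeof encodeURIComponent != 'undefined') {
  serializeFormUrl = function(f) {
    var es = f.elements,
        pairs = [],
        e, c, len;
    for (c = 0, len = es.length; c < len; c++) {
      e = es[c];
      if (!e.disabled) {
        pairs[pairs.length] = encodeURIComponent(e.name) + '=' +
                              encodeURIComponent(e.value);
      }
    }
    return pairs.join('&');
  };
} else {
  serializeFormUrl = function() {
    return null;
  };
}

// Calling code: an empty string is a valid result and is not
// mistaken for failure, because '' != null.
var fakeForm = {elements: [{name: 'q', value: 'tea'}]};
var data = serializeFormUrl(fakeForm);
var succeeded = (data != null);
```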

I've been thinking about this lately as a result of some other posts
about feature detection.

I like the idea of not defining a function if feature detection
determines the function cannot be supported. It would be nice if
browser features that exist were bug free and always worked. This allows
for a nice programming pattern where functions are only defined if the
dependencies exist. For example, a tabbed pane widget creation
function wouldn't even exist if the browser's events module is not
present and functioning adequately. This seems like a safer way to
program. The situation where a section of JavaScript starts executing
and manipulating the DOM but fails half way through execution seems to
be a concern that occupies my mind frequently. This technique of only
defining a function when the dependencies are available is not
possible in general. Scroll reporting is an example as the body
element may need to be present. In scroll reporting checking the
return value is important. I'd like to try to define functions only
when dependencies are available and fall back to checking return
values only when necessary. I'm not completely committed one way or the
other, however.

Another technique I tried for a while was to have an isSupported
function

function serializeForm() {
// ...
}
serializeForm.isSupported = function() {
// ...
};

// use
if (serializeForm.isSupported()) {
serializeForm();
}

This would be better expressed as "Features assumed to exist"
Yes.


There has got to be a better way of expressing this.

Indeed. Those were very rough notes.
While
JavaScript(tm) 1.0 did have a notion of 'client side' and 'core'
JavaScript(tm) that notion has long gone in practice, and the echoes of
the idea do no more now than introduce misconceptions. We (as a group)
should do nothing to promote any ongoing perception of the browser
object model as being a part of the javascript language.
Agreed.



Martin Honnen has already corrected that second test, though it could
still be a type-converting test rather than - typeof - (-
window.encodeURIComponent - , or some form of -
global.encodeURIComponent - with - var global - explicitly set (perhaps
elsewhere)).

One thing that really bothers me about the mainstream libraries is
there is some sort of foundation file upon which all other files
exist. I would like to avoid this and so would rather not have a file
with "var global=this;". It is a slippery slope and that foundation
file has always grown in size in the libraries I've watched.

In ES4 there will be a global identifier called "global" that is
equivalent to "window" so avoiding "global" as a global identifier may
be a good idea although the clobbering may not make any difference.

I am not opposed to having a "global" identifier at times. For example
this pattern seems handy...

var foo = (function() {
var global = this;
return function() {
// some use of global
}
})();

I don't think we should use "window" as a global reference in case the
code could be used outside a browser.

Regardless of those nit picks, there is a problem changing the test
from "typeof encodeURIComponent != 'undefined'" to
"window.encodeURIComponent" or "global.encodeURIComponent". The
problem is that the test is not the same as the use in the function.
If the use was also changed to "global.encodeURIComponent" it would be
ok. I'm thinking about an automatic build system that inserts the
serializeFormUrl source code into a scope (perhaps using some
templating system to make the build JavaScript files)

(function() {

  // local emulation supplied by the build
  function encodeURIComponent(){}

  // begin template inserted code
  if (window.encodeURIComponent) {
    var serializeFormUrl = function() {
      encodeURIComponent();
    };
  }
  // end template inserted code

})();

If the above code runs in a browser without a global
encodeURIComponent then it will not define serializeFormUrl even
though it would have been possible to do so, because the locally
scoped encodeURIComponent is the one the function would actually
call. For this reason I think Martin Honnen's suggestion is a good
one to stick with.


The use of - push - is avoidable with another local variable initialised
to zero and post incremented at assignment:-

var n = 0;

...

c[n++] = (
encodeURIComponent(e.name) +
"=" +
encodeURIComponent(e.value)
);

- as the output array is initialised to an empty array.


In principle keeping the dependencies low is a good idea. I'd feel
very compelled to make this change if there exists a browser that
supports encodeURIComponent and does not have Array.prototype.push.
Otherwise I really don't have much opinion. If you or others think
removing the "push" dependency is that important it is fine with me.

[snip about code conventions: whitespace, single letter identifiers
and loop identifiers]

I agree these issues are important and should be addressed. It is very
difficult to achieve agreement. Because agreement is so difficult to
achieve, I will state that my primary concern is that local variables
in short functions are short. It doesn't really bother me that lower
case "ell" and uppercase "Oh" are a problem in some fonts. Programmers
really should have picked a good font for their editor :). I'm not so
concerned if loop identifiers are "i", "j", "k" with lengths "ilen",
"jlen", "klen" or if they are "c", "d", "e". I'd like to stick with
"i", "j", "k" because they are traditional and I've used them for a
long time. I do use "e" for element and event and "f" for form and
function. However these functions are short so any identifiers really
are ok for me. And after the functions are written my plan is not to
look inside them frequently so it is not a big concern for me.
Download size does matter, however. Perhaps we can otherwise finish
the first function and then make a decision about this just for some
momentum.

----

I've always seen the "trick" to avoid Array.prototype.push written as

c[c.length] = ...
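
For comparison, a small sketch showing that the three ways of appending
discussed in this thread build the same array. The variable names are
mine and purely illustrative.

```javascript
// Three ways of appending to an initially empty array.
var viaPush = [],
    viaCounter = [],
    viaLength = [],
    n = 0;
var pair = encodeURIComponent('name') + '=' +
           encodeURIComponent('a b');

viaPush.push(pair);                  // requires Array.prototype.push
viaCounter[n++] = pair;              // needs an extra counter variable
viaLength[viaLength.length] = pair;  // no method call, no extra variable
```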

On the general interface design question, may I suggest an optional
second argument that would be the name of a submit button (of one sort
or another). It would not be used in this implementation (where submit
buttons are not being handled at all, and so should be absent from the
form) but in more elaborate implementations, while filtering out <input
type=submit>, <input type=image>, <button type=submit> and <button>
elements, the name value pair for any whose name corresponded with that
second argument (if provided) could be inserted into the serialization.
That could be useful for emulating types of form submission where the
name/value pair of the submit button clicked determined the actions of
the server script.

Maybe not a complete solution as people often use many like-named submit
buttons with differing values. And maybe not a necessary addition as the
code retrieving the serialized form could always append such a name
value pair to the serialized form.

In a previous XHR library I wrote I did accommodate the button name/
value problem. For these many implementations I was thinking it would
be better to let the calling function append.
Generally these interface designs deserve some kicking about before
fixing on the final design so that they can be flexible enough to
support a good range of possible/likely underlying implementations
beyond the one situation that first prompts their creation.

I think that when designing an interface that is intended to grow,
having the last argument be an object (to be used as a hash) is a good
idea. That way any implementation could require extra arguments

function serializeFormUrl(form, options) {}

for one implementation options could be

{
button: {name:'foo', value:'bar'}
}

or another format like the following (which could suffer from the
for-in DontEnum enumeration problem in IE.)

{
extraParams: {foo:'bar'}
}
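
A rough sketch of how one implementation might read the first of those
option formats. The handling of options.button shown here is an
assumption about one possible implementation, and fakeForm is a plain
object standing in for a form element.

```javascript
// Sketch: an implementation that accepts an optional options object
// and appends the pair for options.button, if one is given.
function serializeFormUrl(f, options) {
  var es = f.elements,
      pairs = [],
      e, c, len, button;
  for (c = 0, len = es.length; c < len; c++) {
    e = es[c];
    if (!e.disabled) {
      pairs[pairs.length] = encodeURIComponent(e.name) + '=' +
                            encodeURIComponent(e.value);
    }
  }
  // options is optional, so guard before reading from it.
  button = options && options.button;
  if (button) {
    pairs[pairs.length] = encodeURIComponent(button.name) + '=' +
                          encodeURIComponent(button.value);
  }
  return pairs.join('&');
}

var fakeForm = {elements: [{name: 'q', value: 'tea'}]};
var plain = serializeFormUrl(fakeForm);
var withButton = serializeFormUrl(fakeForm, {
  button: {name: 'foo', value: 'bar'}
});
```

Implementations that do not need any options can ignore the second
argument entirely, so the calling code stays the same across them.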

Thanks for the thoughts, Richard.

Peter
 
P

Peter Michaux

Peter Michaux wrote:



<snip>

So in this case we have an interface that consist of a global Identifier
named 'serializeForm' that refers to a function object, which takes a
form element as its single argument and returns a string that represents
the form control's values in a (near) application/x-www-form-urlencoded
form.

Indeed it is only "near".

http://www.w3.org/TR/html4/interact/forms.html#h-17.13.4.1

"a b" should be encoded as "a+b"

but

encodeURIComponent('a b') // "a%20b"


"a\nb" should be encoded as "a%0D%0Ab"

but

encodeURIComponent('\n') // "%0A"


In practice, it seems developers don't complain about this so their
server-side toolkits that decode a response must know to do the right
thing.

I wonder why encodeURIComponent doesn't encode the whitespace
correctly.

Is replicating the spec for spaces and new lines necessary or should
it simply be documented that there is a difference?

My first quibble would be the name.

I agree.

(To a certain degree, over time I've been loosing energy to fight for
this aesthetic quality simply because I work with a server-side legacy
system that has a hodgepodge of names. Also, the browser scripting
world is not known for a clean API. I just get used to looking up the
correct name. We should try for a nice naming scheme, however.)

Where global Identifiers are to be
used (and I have no objection to their use here (better than the
pointless runtime overheads of one of those silly 'namespace' schemes))

I don't think one level of namespace is silly but would like to focus
on correct algorithms. If an automatic build system converts a
collection of these multiple implementations into a library, then the
build system could introduce the namespacing. Correct algorithms are
far more time consuming to develop then repackaging the code for use
with a namespace.

the Identifier should be as unambiguous as possible, and certainly be
fairly explicit about what the object identified is for.

Form serialisation may take many forms, including (increasingly these
days) JSON and XML serialisation. It may even include (given a suitable
Intranet application with very restricted browser/browser configuration
support and local file system access) multipart/form-data serialization
for context involving file upload.

It would be better if the interface name stated how it was going to
serialize the form. Something like - urlSerializeForm - or -
serializeFormUrlEncodeed -.

To me, descriptive but still short is better. I like the idea that
serializing the form is the primary concern and the format is
secondary. "serializeFormUrl" and "serializeFormJson" have the feel of
the ES4 type annotations that are coming down the pipe.

This description should be more explicit/precise about what this
implementation actually does. It returns a string consisting of a '&'
separated sequence of '=' separated name/value pairs where the names and
values have been encoded using javascript's - encodeURIComponent -. The
result being an approximation of application/x-www-form-urlencoded, but
not including the transfformation of line breaks into "CR LF" pairs
(which is only likely to be an issue with TEXTAREA fileds on non-Windows
OSs, and then only if the recivenign software cares or where consitency
is expectd in the storage medium).

THe documentation does need improvement.
There is a text wrapping issue in this presentation of the code. If this
code is to be critiqued on Usenet (as it always should given the
intention) it would be a good idea to apply/impose the formatting rule
that all code and comments should be manually wrapped at (or before),
say, 72 characters.

I noticed this also. I only looked briefly and thought it was the
google groups software wrapping after 71 characters. If Usenet wraps
after 72 characters then perhaps we should set the limit to 70 so that
at least one reply can be made without causing wrapping in the reply.

TEST

123456789 123456789 123456789 123456789 123456789 123456789 1 3 5 7 9
1 3 5 7 9


123456789 123456789 123456789 123456789 123456789 1234567890 2 4 6 8 0
2 4 6 8 0


(I'm posting through the google groups interface.)
This is a statement of intended use masquerading as a statement of
actual behaviour. This particular version will also serialize <input
type="button">, <input type="submit">, and so on, in a way that would be
unexpected in any submitted form data. In practice this version is only
really intended for use with forms that consist _only_ of the elements
listed (and that all such controls _must_ have name attributes), and
that restriction should be very clearly stated.

Good point.

The conditions determining the creation of the function object are the
existence of a - push - method of arrays (so JScript 5.5+. or emulated
on IE (<=5.0)) and the ECMAScript 3rd Ed. encodeURIComponent function
(or an emulation). When these conditions can be known to be met then
there is no need for testing prior to use. Thus the true conditions
should be stated here so that the test/don't test decision can be made
on an informed basis.

So you are in favor of stating the requirements in documentation
rather than including the feature tests in the code? I would think
that a version with and a version without the tests could account for
two of the multiple implementations. The one with tests being
applicable to the general web.

There is also the possibility of defining a 'default' function in an
else branch for the creation test that just returns - null - and having
the calling code test the return value for the success of the sterilize
function call (no string, including the empty string, equals - null - by
type-converting (or, obviously, strict) equality). That design would
allow for the signalling of error states within the serialization
function to also be signalled with a - null - return value).

I've been thinking about this lately as a result of some other posts
about feature detection.

I like the idea of not defining a function if feature detection
determines the function cannot be supported. It would be nice if
browser features that exist were bug free always worked. This allows
for a nice programming pattern where functions are only defined if the
dependencies exists. For example, a tabbed pane widget creation
function wouldn't even exist if the browser's event's module is not
present and functioning adequately. This seems like a safer way to
program. The situation where a section of JavaScript starts executing
and manipulating the DOM but fails half way through execution seems to
be a concern that occupies my mind frequently. This technique of only
defining a function when the dependencies are available is not
possible in general. Scroll reporting is an example as the body
element may need to be present. In scroll reporting checking the
return value is important. I'd like to try to define functions only
when dependencies are available and fall back to checking return
values only when necessary. I'm not completely committed on way or the
other, however.

Another technique I tried for a while was to have an isSupported
function

function serializeForm() {
// ...
}
serializeForm.isSupported = function() {
// ...
};

// use
if (serializeForm.isSupported()) {
serializeForm();
}

This would be better expressed as "Features assumed to exist"
Yes.


There has got to be as better way of expressing this.

Indeed. Those were very rough notes.
While
JavaScript(tm) 1.0 did have a notion of 'client side' and 'core'
JavaScript(tm) that notion has long gone in practice, and the echoes of
idea do no more now than introduce misconceptions. We (as a group)
should do nothing to promote any ongoing perception of the browser
object model as being a part of the javascript language.
Agreed.



Martin Honnen has already corrected that second test, though it could
still be a type-converting test rather than - typeof - (-
window.encodeURIComponent - , or some form of -
global.encodeURIComponent - with - var global - explicitly set (perhaps
elsewhere)).

One thing that really bothers me about the mainstream libraries is
there is some sort of foundation file upon which all other files
exist. I would like to avoid this and so would rather not have a file
with "var global=this;". It is a slippery slope and that foundation
file has always grown in size in the libraries I've watched.

In ES4 there will be a global identifier called "global" that is
equivalent to "window" so avoiding "global" as a global identifier may
be a good idea although the clobbering may not make any difference.

I am not opposed to having a "global" identifier at times. For example
this pattern seems handy...

var foo = (function() {
var global = this;
return function() {
// some use of global
}
})();

I don't think we should use "window" as a global reference in case the
code could be used outside a browser

Regardless of those nit picks, there is a problem changing the test
from "typeof encodeURIComponent != 'undefined'" to
"window.encodeURIComponent" or "global.encodeURIComponent". The
problem is that the test is not the same as the use in the function.
If the use was also changed to "global.encodeURIComponent" it would be
ok. I'm thinking about an automatic build system that inserts the
serializeFormUrl source code into a scope (perhaps using some
templating system to make the build JavaScript files)

function() {

function encodeUrlComponent(){}

// begin template inserted code
if (window.encodeURIComponent) {
var serializeFormUrl = function() {
encodeURIComponent();
}
}
// end template inserted code

}

If the above code runs in a browser without a global
encodeUrlComponent then it will not define serializeFormUrl even
though it would have been possible to do so. For these reason I think
Martin Honnen's suggestion is a good one to stick with.


The use of - push - is avoidable with another local variable initialised
to zero and post incremented at assignment:-

var n = 0;

...

c[n++] = (
encodeURIComponent(e.name) +
"=" +
encodeURIComponent(e.value)
);

- as the output array is initialised to an empty array.


In principle keeping the dependencies low is a good idea. I'd feel
very compelled to make this change if there exists a browser that
supports encodeURIComponent and does not have Array.prototype.push.
Otherwise I really don't have much opinion. If you or others think
removing the "push" dependency is that important it is fine with me.

[snip about code conventions: whitespace, single letter identifiers
and loop identifiers]

I agree these issues are important and should be addressed. It is very
difficult to achieve agreement. Because this is so difficult to
achieve agreement, I will state my primary concern is local variables
in short functions are short. It doesn't really bother me that lower
case "ell" and uppercase "Oh" are a problem in some fonts. Programmers
really should have picked a good font for their editor :). I'm not so
concerned if loop identifiers are "i", "j", "k" with lengths "ilen",
"jlen", "klen" or if they are "c", "d", "e". I'd like to stick with
"i", "j", "k" because they are traditional and I've used them for a
long time. I do use "e" for element and event and "f" for form and
function. However these functions are short so any identifiers really
are ok for me. And after the functions are written my plan is not to
look inside them frequently so it is not a big concern for me.
Download size does matter, however. Perhaps we can otherwise finish
the first function and then make a decision about this just for some
momentum.

----

I've always seen the "trick" to avoid Array.prototype.push written as

c[c.length] = ...

On the general interface design question, may I suggest an optional
second argument that would be the name of a submit button (of one sort
or another). It would not be used in this implementation (where submit
buttons are not being handled at all, and so should be absent form the
form) but in more elaborate implementations, while filtering out <input
type=submit>, <input type=image>, <button type=submit> and <button>
elements, the name value pair for any who's name corresponded with that
second argument (if provided) could be inserted into the serialization.
That could be useful for emulating types of form submission where the
name/value pair of the submit button clicked determined the actions of
the server script).

Maybe not a complete solution as people often use many like-named submit
buttons with differing values. And maybe not a necessary addition as the
code retrieving the serialized form could always append such a name
value pair to the serialized form.

In a previous XHR library I wrote I did accommodate the button name/
value problem. For these many implementations I was thinking it would
be better to let the calling function append.
Generally these interface designs deserve some kicking about before
fixing on the final design so that they can be flexible enough to
support a good range of possible/likely underlying implementations
beyond the one situation that first prompts their creation.

I think that when designing an interface that is intended to grow,
having the last argument be an object (to be used as a hash) is a good
idea. That way any implementation could require extra arguments

function serializeFormUrl(form, options) {}

for one implementation options could be

{
button: {name:'foo', value:'bar'}
}

or another format like the following (which could suffer from the
for-in DontEnum enumeration problem in IE.)

{
extraParams: {foo:'bar'}
}
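
A minimal sketch of how an implementation might consume such an options
object. The option names "button" and "extraParams" are the ones
suggested above; the function name serializeFormUrlOpts and the order in
which the extra pairs are appended are assumptions made only for this
example.

```javascript
// Sketch only: a serializer that accepts an optional options hash.
function serializeFormUrlOpts(f, options) {
  var i, ilen, e, p,
      es = f.elements,
      c = [];

  for (i = 0, ilen = es.length; i < ilen; i++) {
    e = es[i];
    if (e.name && !e.disabled) {
      c[c.length] = encodeURIComponent(e.name) + '=' +
                    encodeURIComponent(e.value);
    }
  }

  options = options || {};  // the second argument is optional
  if (options.button) {
    c[c.length] = encodeURIComponent(options.button.name) + '=' +
                  encodeURIComponent(options.button.value);
  }
  if (options.extraParams) {
    for (p in options.extraParams) {
      // Skip inherited properties. This does not cure the IE problem:
      // there, own properties named like "toString" are skipped by
      // for-in altogether.
      if (Object.prototype.hasOwnProperty.call(options.extraParams, p)) {
        c[c.length] = encodeURIComponent(p) + '=' +
                      encodeURIComponent(options.extraParams[p]);
      }
    }
  }
  return c.join('&');
}
```

Note that hasOwnProperty is an ES3 feature, so a real implementation
following the dependency scheme in this thread would have to list or
test for it.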

Thanks for the thoughts, Richard.

Peter
 
P

Peter Michaux

Below is my current version of the simple serializeFormUrl function.

I've changed the documentation format. In the dependency section "a"
means a required and assumed feature, "r" means a required and tested
feature. I don't mind if the documentation system evolves over the
implementation of the first few interfaces. If this is on the right
track that is fine with me for now.

I removed the use of c.push() in favor of c[c.length]. This change
reduces dependencies, is faster (at least in FF2), reduces code size
(no need for feature test), and does not require any extra variables.
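
For readers who have not seen the idiom, a minimal illustration (the
sample strings are made up):

```javascript
var c = [];
c[c.length] = 'a=1';  // same effect as c.push('a=1')
c[c.length] = 'b=2';  // c.length grows by one after each assignment
var s = c.join('&');  // the pairs joined into one string
```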

I reduced the indent to two spaces since usenet has such short line
lengths (approx 70 chars). I personally prefer four spaces but I think
two will be better for this project.

If there are any remaining offenses in this implementation of the
interface that make this implementation inadequate for recommendation,
please post an edited version.


/**
* @object serializeFormUrl [function]
* All elements of the form that have a non-empty name
* attribute and are not disabled will be serialized.
*
* Serialize form data in a format similar to the
* <a href="http://www.w3.org/TR/1999/REC-html401-19991224/interact/forms.html#h-17.13.4.1">
* application/x-www-form-urlencoded</a> standardized format.
*
* The handling of whitespace is different from the standard.
* According to the x-www-form-urlencoded standard,
* a space should be encoded as a "+" and a newline
* should be encoded as a "%0D%0A".
*
* The JavaScript encodeURIComponent function is used
* to encode the form data. A space is encoded as "%20".
* On Windows operating systems a new line is encoded
* as "%0D%0A". On Mac OS X a new line is encoded as
* just "%0A".
*
* @param f [form]
* The form element to be serialized.
*
* @returns [string]
* The serialized form.
*
* @dependencies
* r - ES3 - encodeURIComponent
* a - ES1 - Array.prototype.join
* a - DOM1 - HTMLFormElement.elements
* a - DOM1 - HTMLFormElement.elements.length
* a - DOM1 - HTMLFormElement.elements.name
* a - DOM1 - HTMLFormElement.elements.value
* a - DOM1 - HTMLFormElement.elements.disabled
*/

// test for required feature
if (typeof encodeURIComponent != 'undefined') {

  var serializeFormUrl = function(f) {
    var i, // elements loop index
        ilen, // elements loop length
        e, // element
        es = f.elements,
        c = []; // the serialized name=value pairs

    for (i=0, ilen=es.length; i<ilen; i++) {
      e = es[i];
      if (e.name && !e.disabled) {
        c[c.length] = encodeURIComponent(e.name) + "=" +
                      encodeURIComponent(e.value);
      }
    }
    return c.join('&');
  };

}
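
If output closer to the x-www-form-urlencoded rules quoted in the
comment is wanted, one possible approach is sketched below. The helper
name encodeFormComponent is hypothetical and is not part of the function
above; it normalizes every newline to CRLF before encoding and then
turns each encoded space into "+".

```javascript
// Hypothetical helper, not part of serializeFormUrl as posted.
function encodeFormComponent(s) {
  return encodeURIComponent(
    // normalize lone CR or LF (and existing CRLF) to CRLF
    String(s).replace(/\r\n|\r|\n/g, '\r\n')
  ).replace(/%20/g, '+');  // a space becomes "+"
}
```

A literal "+" in the data is not affected by the final replacement,
because encodeURIComponent has already turned it into "%2B".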
 
dhtmlkitchen

I'm glad this problem is finally getting the attention it deserves!
I've proposed Form Serialization to the WHAT WG. The browsers could
make this really easy for us with something like:

aForm.getEncodedURIString
aForm.getJSONString
aForm.getJSONData

Encoding could be based on the form, or could be a param.

<snip>

Whenever a - value - attribute has trueness there is no need for any
other testing to be applied (and that will be the most common case by a
large margin). I would be inclined to test for that first, and use a
true result to short-circuit the rest of the testing. I.E.:-

return (o.value || (o.hasAttribute('value') ? o.value : o.text));

The problem is when the value attribute is intentionally blank.

<select>
<option value="">select a country</option>
<option value="number1">USA</option>
</select>

In that case, there will be an error in IE (doesn't support
Element.prototype.hasAttribute).

The code Peter posted is attempting to parse the outerHTML in lieu of
Internet Explorer's broken attribute support. This is painful. I
would like to offer an IE 6+ solution for hasAttribute:

// IE does not support Element.prototype.hasAttribute
function hasAttribute(el, name) {
  var att = el.getAttributeNode(name);
  return Boolean(att) && att.specified;
}
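
One way the helper might be combined with the short-circuit suggested
above is sketched here. The function name optionValue is an assumption
for illustration; the helper is repeated only so the sketch is
self-contained.

```javascript
// The helper from the post, repeated for completeness.
function hasAttribute(el, name) {
  var att = el.getAttributeNode(name);
  return Boolean(att) && att.specified;
}

// Hypothetical combination: use a truthy value immediately, and only
// consult the attribute node when the value is empty.
function optionValue(o) {
  return o.value || (hasAttribute(o, 'value') ? o.value : o.text);
}
```

For an option with value="" this returns the empty string, and for an
option with no value attribute and an empty value property it falls back
to the option's text.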

What if the form has a file? Do you use FileList, or do you use an
iframe, just to be on the lowest common denominator for the other
browsers?

Garrett
 
Peter Michaux

On Nov 11, 11:10 pm, dhtmlkitchen wrote:

[snip]
The problem is when the value attribute is intentionally blank.

<select>
<option value="">select a country</option>
<option value="number1">USA</option>
</select>

In that case, there will be an error in IE (doesn't support
Element.prototype.hasAttribute).

The code Peter posted is attempting to parse the outerHTML in lieu of
Internet Explorer's broken attribute support.

I did repost some other people's outerHTML solutions but don't think
they are the right way to go.

This is painful. I
would like to offer an IE 6+ solution for hasAttribute:

// IE does not support Element.prototype.hasAttribute
function hasAttribute(el, name) {
  var att = el.getAttributeNode(name);
  return Boolean(att) && att.specified;

}


el.getAttributeNode is undefined in IE 5.5. The following works in IE
5.5+ and so better suits the "lowest common denominator" approach.

var getOptionValue = (function() {
  if (document.documentElement) {
    if (document.documentElement.hasAttribute) {
      return function(o) {
        return o.hasAttribute('value') ? o.value : o.text;
      };
    }
    if (document.documentElement.attributes) {
      return function(o) {
        return (o.attributes.value &&
                o.attributes.value.specified) ?
               o.value : o.text;
      };
    }
  }

})();
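
A sketch of how such a function might be used when serializing a select
element. The name serializeSelect is hypothetical, and the option-value
function is passed in as an argument so that the sketch does not depend
on which branch above was chosen.

```javascript
// Hypothetical sketch: serialize the selected options of one select.
function serializeSelect(select, getOptionValue) {
  var i, ilen, o,
      os = select.options,
      c = [];

  if (!select.name || select.disabled) {
    return '';  // same rule as for text-like inputs
  }
  for (i = 0, ilen = os.length; i < ilen; i++) {
    o = os[i];
    if (o.selected) {
      c[c.length] = encodeURIComponent(select.name) + '=' +
                    encodeURIComponent(getOptionValue(o));
    }
  }
  return c.join('&');
}
```

A select with several selected options yields one name=value pair per
selected option.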

What if the form has a file? Do you use FileList, or do you use an
iframe, just to be on the lowest common denominator for the other
browsers?

Use an iframe. Form serialization wouldn't likely be used in the case
of iframe submission.

Peter
 
