My Library TaskSpeed tests updated

David Mark

I've updated the TaskSpeed test functions to improve performance. This
necessitated some minor additions (and one change) to the OO interface
as well. I am pretty happy with the interface at this point, so will
set about properly documenting it in the near future.

http://groups.google.com/group/my-library-general-discussion/browse_thread/thread/94bcbd95caa03991

I had been thinking about what to do with cloneNode in the OO interface.
All of the talk about PureDom's non-use of it gave me some ideas on how
it could fit in and speed up these tests as well. The change is that
E.prototype.clone no longer returns a wrapped node (not sure why I had
it doing that in the first place as it has been a while since I slapped
these objects together).

Object design is considerably less slap-dash now too. Almost all
methods are on the prototypes now. There were a dozen or so that were
being added at construction. It wasn't a big issue, but required that
inheriting constructors call the "super" constructor on construction
(which isn't always desirable). For example, F (form) inherits from E,
but implements custom element and load methods. C (control) inherits
from E, but uses the stock methods.

F = function(i, docNode) {
  var el;

  if (this == global) {
    return new F(i, docNode);
  }

  // Custom element/load methods get and set the element

  function element() {
    return el;
  }

  this.load = function(name, docNode) {
    el = typeof name == 'object' ? name : getForm(name, docNode);
    this.element = (el) ? element : null;
    return this;
  };

  this.load(i, docNode);
};

C = function(i, docNode) {

  // Called without - new - operator

  if (this == global) {
    return new C(i, docNode);
  }

  // Uses stock E element/load methods

  E.call(this, i, docNode);
};
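
For anyone curious how F and C end up sharing E's methods, here is a
rough sketch of one common way to wire the prototype chain (illustrative
only, not the actual library source):

function inherit(Child, Parent) {
  function Temp() {}                    // temporary constructor avoids
  Temp.prototype = Parent.prototype;    // running Parent during setup
  Child.prototype = new Temp();
  Child.prototype.constructor = Child;
}

inherit(F, E);   // F overrides element/load per instance (see above)
inherit(C, E);   // C calls E and uses the stock prototype methods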

I'm no OO guru (or even fan), but I am happy with the way the interface has
turned out. One nagging incongruity is that (as noted in the
documentation), you _must_ use the - new - operator when calling from
outside the frame containing the My Library script (sort of like dialing
the area code for a long-distance call).
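
To illustrate the point (the form name and frame name here are just
placeholders):

// In the frame that loads My Library, the guard shown above makes
// the - new - operator optional:

var f1 = new F('myform');   // explicit construction
var f2 = F('myform');       // guard sees this == global and news one up

// From another frame, construct explicitly, e.g.:
//
//   var f3 = new parent.frames['lib'].F('myform');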

Functions are a bit more concise now too. The method names are still
relatively verbose (which is not central to the issue IMO), but there
aren't as many calls to them now (which is what I consider the
determining factor for conciseness). Fewer calls make for a lower
bill. ;)

Your move, Andrea. Do your worst. :)
 

David Mark

David said:
I've updated the TaskSpeed test functions to improve performance. This
necessitated some minor additions (and one change) to the OO interface
as well. I am pretty happy with the interface at this point, so will
set about properly documenting it in the near future.

http://groups.google.com/group/my-library-general-discussion/browse_thread/thread/94bcbd95caa03991

I had been thinking about what to do with cloneNode in the OO interface.
All of the talk about PureDom's non-use of it gave me some ideas on how
it could fit in and speed up these tests as well. The change is that
E.prototype.clone no longer returns a wrapped node (not sure why I had
it doing that in the first place as it has been a while since I slapped
these objects together).

Object design is considerably less slap-dash now too. Almost all
methods are on the prototypes now. There were a dozen or so that were
being added at construction. It wasn't a big issue, but required that
inheriting constructors call the "super" constructor on construction
(which isn't always desirable). For example, F (form) inherits from E,
but implements custom element and load methods. C (control) inherits
from E, but uses the stock methods.

F = function(i, docNode) {
  var el;

  if (this == global) {
    return new F(i, docNode);
  }

  // Custom element/load methods get and set the element

  function element() {
    return el;
  }

  this.load = function(name, docNode) {
    el = typeof name == 'object' ? name : getForm(name, docNode);
    this.element = (el) ? element : null;
    return this;
  };

  this.load(i, docNode);
};

C = function(i, docNode) {

  // Called without - new - operator

  if (this == global) {
    return new C(i, docNode);
  }

  // Uses stock E element/load methods

  E.call(this, i, docNode);
};

I'm no OO guru (or even fan), but I am happy with the way the interface has
turned out. One nagging incongruity is that (as noted in the
documentation), you _must_ use the - new - operator when calling from
outside the frame containing the My Library script (sort of like dialing
the area code for a long-distance call).

Functions are a bit more concise now too. The method names are still
relatively verbose (which is not central to the issue IMO), but there
aren't as many calls to them now (which is what I consider the
determining factor for conciseness). Fewer calls make for a lower
bill. ;)

Your move, Andrea. Do your worst. :)

Opera 10.10, Windows XP on a very busy and older PC:-

2121 18624 9000 5172 22248 4846 4360 1109 1266 1189
6140 1876 843* 798*


I ran it a few times. This is representative. The two versions flip
flop randomly. Usually around a third of the purer tests. :)

The worst is always Prototype. Here roughly 30x slower than My Library.
And it failed to return anything for one of the tests (undefined
result). I remember that disqualification from the previous Operas as
well (among other things). jQuery is roughly 8X slower.

The other three respectable scores were the two Dojos and Qooxdoo.
YUI3 came close to respectability.

Of course, if you disallow all the libraries that sniff the UA string, the field
shrinks considerably (down to four IIRC), retaining none of the noted
respectable showings. If you do something wrong really fast, what are
you accomplishing? I plan to add a column style to highlight that.
Looking at results for the latest libraries in older (or odd) browsers
shows without a doubt that the sniffing "strategies" failed miserably
(and so can be predicted to fail miserably in the future).
 

Michael Wojcik

Andrew said:
I'm not meaning to be offensive, I'm just wondering how one person can
appear to achieve so much.

Surely if studies of software development have taught us anything,
they've taught us that there is no correlation between the size of a
team and the quality of the code.

I've been a professional developer for nearly a quarter of a century,
and I'm not surprised at all when one person delivers a better
solution than what a large team delivers. Even if the large team's
total effort is much greater - which may or may not be the case here.
Even if the large team has had ample opportunity to solicit feedback
and review and improve their design and implementation.

What matters more, in my opinion, are factors like understanding the
problem space, employing a robust and consistent design, rigorous
attention to quality, and preferring fewer well-implemented features
to many poorly-implemented ones. An individual can often do better in
those areas than a diverse and loosely-organized team can.

That doesn't mean that the open-source "bazaar" development model
necessarily produces bad code - just that it is vulnerable to
sloppiness, uneven quality, and conflicting approaches. Sometimes that
doesn't matter, if the project isn't meant to produce robust code.
(Some projects are prototypes or toys.) Sometimes a strong editorial
hand can enforce discipline on released code. But sometimes too many
cooks simply spoil the soup.
 

Scott Sauyet

So you've learned that test-driven development is not an
oxymoron? :)

I actually am a fan of test-driven design, but I don't do it with
performance tests; that scares me.
Opera 10.10, Windows XP on a very busy and older PC:-

2121     18624   9000    5172    22248   4846    4360   1109    1266    1189  
6140     1876    843*    798*

I ran it a few times.  This is representative.  The two versions flip
flop randomly.  Usually around a third of the purer tests.  :)

I can confirm similar rankings (with generally faster speeds, of
course) on my modern machine in most recent browsers, with two
exceptions: First, in Firefox and IE, PureDOM was faster than My
Library. Second, in IE6, several tests fail in My Library ("setHTML",
"insertBefore", "insertAfter".) Also note that the flip-flopping of
the two versions might have to do with the fact that they are pointing
at the same exact version of My Library (the one with QSA) and the
same test code. You're running the same infrastructure twice! :)

This is great work, David. I'm very impressed.

But I do have some significant caveats. Some of the tests seem to me
to be cheating, especially when it comes to the loops. For instance,
here is one of the functions specifications:

"append" : function(){
// in a 500 iteration loop:
// create a new <div> with the same critera as 'create'
// - NOTE: rel needs to be == "foo2"
// then append to body element (no caching)
//
// return then length of the matches from the selector
"div[rel^='foo2']"
},

My Library's implementation looks like this:

"append" : function() {
var myEl = E().loadNew('div', { 'rel':'foo2' }), body =
document.body;
for (var i = 500; i--;) {
myEl.loadClone().appendTo(body);
}
return $("div[rel^=foo2]").length;
},

This definitely involves caching some objects outside the loop. There
are a number of such instances of this among the test cases. My
Library is not alone in this, but most of the other libraries mainly
just have variable declarations outside the loop, not initialization.
In part, this is a problem with the Task Speed design and
specification. It would have been much better to have the testing
loop run each library's tests the appropriate number of times rather
than including the loop count in the specification. But that error
should not be an invitation to abuse. I ran a version with such
initializations moved inside the loop, and my tests average about a
15% performance drop for My Library, in all browsers but Opera, where
it made no significant difference.
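
For reference, the variant I timed looked roughly like this (an
illustrative sketch, not code from the suite; it simply moves the
construction inside the loop, using the same calls shown above):

"append" : function() {
  for (var i = 500; i--;) {
    // build a fresh wrapper and element on every iteration
    E().loadNew('div', { 'rel': 'foo2' }).appendTo(document.body);
  }
  return $("div[rel^=foo2]").length;
},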

But that is just one instance of a general problem. My Library is not
alone in coding its tests to the performance metrics. The spec has
this:

"bind" : function(){
// connect onclick to every first child li of ever ul
(suggested: "ul > li")
//
// return the length of the connected nodes
},

but the YUI3 tests perform this with event delegation:

"bind" : function(){
Y.one('body').delegate('click', function() {}, 'ul > li');
return Y.all('ul > li').size();
},

This might well be the suggested way to attach a behavior to a number
of elements in YUI. There's much to be said for doing it in this
manner. And yet it is pretty clearly not what was intended in the
specification; if nothing else, it's an avoidance of what was
presumably intended to be a loop. There's a real question of doing
things the appropriate way.

To test this, I limited myself to 15 minutes of trying to optimize the
jQuery tests in the same manner. I moved initialization outside the
loop and switched to event delegation. After this brief attempt, I
achieved speed gains between 54% and 169% in the various browsers.
And I did this without any changes to the underlying library. I'm
sure I could gain reasonable amounts of speed in some of the other
libraries as well, but this sort of manipulation is wrong-headed.
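
To give a flavour of it, the delegated jQuery "bind" looked something
like this (a sketch only; it assumes a jQuery build that provides
.delegate()):

"bind" : function(){
  // one delegated handler on the body instead of one handler per li
  $(document.body).delegate('ul > li', 'click', function() {});
  return $("ul > li").length;
},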

Perhaps an updated version of TaskSpeed is in order, but it's hard to
design a system that can't be gamed in this manner.

Does your host have PHP? I would suggest it would be better to host
a dynamic version of this, and not rely on static files. It's easy to
set up, and that also makes it almost trivial to add and remove
libraries from your tests.

Finally, the matter of IE6 is disappointing. This is still a widely
used browser; I'm surprised you didn't test there before releasing the
code. You've often pointed out how well My Library performed without
change when IE8 came out. Well, the flip side is that it needs to keep
doing well at least in environments that are widely used, even as you
make changes. All the other libraries except Qooxdoo did fine in IE6,
even if all of them were ungodly slow.

So, overall, I'd give this a B+. It's solid work, but there are some
significant issues that still need to be resolved.

Cheers,

-- Scott
 

Scott Sauyet

Scott Sauyet wrote:

| // connect onclick to every first child li of ever ul
| // (suggested: "ul > li")
<snip>            ^^^^^^^

Surely that is not the CSS selector for "every first child li of ever
ul". The specs for these tests are really bad, so it is not surprising
that the implementations of the tests are all over the place.

And of course, if you do what's suggested by the text, you'll look
wrong, because all the other libraries are taking the "ul > li"
literally. I meant to mention this too in my last posting. There is
definitely some room for improvement to the test specs as well as to
the infrastructure.
You see, what is "the length of the connected nodes"? Is that "length"
in terms of something like the pixel widths/height of the displayed
nodes, or is "length" intended to imply the length of some sort of
'collected' array of nodes (i.e. some sort of 'query', and if so why are
we timing that in a context that is not a test of 'selector'
performance), or does the spec just call for *number* of nodes modified
by the task to be returned?

This one does not bother me too much. In the context of the
collection of tests [1], it's fairly clear that they want the number
of nodes modified. Presumably they assume that the tools will collect
the nodes in some array-like structure with a "length" property.


-- Scott
___________________
[1] http://dante.dojotoolkit.org/taskspeed/tests/sample-tests.js
 

Scott Sauyet

This one does not bother me too much.  In the context of the
collection of tests [1], it's fairly clear that they want the
number of nodes modified.

Is that all they want, rather than, say, in some cases the number
added/modified nodes as retrieved from the DOM post-modification (as a
verification that the process has occurred as expected)?

I do think that in all of the tests, the result returned is supposed
to have to do with some count of elements after a certain
manipulation. But in some of them, such as bind, the manipulation
doesn't actually change the number of elements manipulated. As this
was based upon SlickSpeed, it inherits one of the main problems of
that system, namely that it tries to do one thing too many. It tries
to verify accuracy and compare speeds at the same time. This is
problematic, in my opinion.

The various formulations used include:-

| return the result of the selector ul.fromcode

| return the length of the query "tr td"

| return the lenght of the odd found divs

| return the length of the destroyed nodes

(without any definition of what "the selector" or "the query" mean).

If you are saying that it is just the number of nodes that needs to be
returned then in, for example, the YUI3 "bind" example you cited:-

|  "bind" : function(){
|        Y.one('body').delegate('click', function() {}, 'ul > li');
|        return Y.all('ul > li').size();
|  },

- the last line could be written:-

return 845; //or whatever number is correct for the document

- and for that function (especially in a non-QSA environment) the bulk
of the work carried out in that function has been removed.

I've actually thought of doing that, and loudly trumpeting that my
library is unbeatable at TaskSpeed! :)
Maybe they do, but that is not what the spec is actually asking for. And
if it was what is being asked for, why should that process be included
in the timing for the tasks, as it is not really part of any realistic
task.

Yes, if they want verification of counts, perhaps the test harness
itself could provide that.

-- Scott
 

Richard Cornford

Scott Sauyet wrote:
This one does not bother me too much. In the context of the
collection of tests [1], it's fairly clear that they want the
number of nodes modified.
Is that all they want, rather than, say, in some cases the number
added/modified nodes as retrieved from the DOM post-modification
(as a verification that the process has occurred as expected)?

I do think that in all of the tests, the result returned is
supposed to have to do with some count of elements after a
certain manipulation.

Which is not a very realistic test 'task' as the number of nodes
modified in operations on real DOMs is seldom of any actual interest.
But in some of them, such as bind, the manipulation
doesn't actually change the number of elements manipulated.

And for others, such as the list creation, the number of 'modified'
nodes is pre-determined by the number you have just created.
As this was based upon SlickSpeed, it inherits one of the main
problems of that system, namely that it tries to do one thing
too many.

That is exactly where I attribute the cause of this flaw.
It tries to verify accuracy and compare speeds at the same
time. This is problematic, in my opinion.

Especially when it is timing the verification process with the task,
and applying different verification code to each 'library'. There you
have the potential for miscounting library code to combine with
misbehaving DOM modification code to give the impression that the
whole thing is working correctly, or for a correctly carried out task
to be labelled as failing because the counting process is off for
some reason.

I've actually thought of doing that, and loudly trumpeting
that my library is unbeatable at TaskSpeed! :)

It wouldn't do the pure DOM code any harm either.
Yes, if they want verification of counts, perhaps the test
harness itself could provide that.

Not just "perhaps". It should, and it should use the same verification
code for each test, and outside of any timing recording.
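
Something along these lines would do (a rough sketch, with made-up
names):

// Sketch only: time the task by itself, then run one shared
// verification function against the resulting document.

function runTest(libraryName, task, verify) {
  var start = new Date().getTime();
  task();                                  // the timed part: the task only
  var time = new Date().getTime() - start;

  var passed = verify(document);           // same check for every library,
                                           // outside the timed region
  return { library: libraryName, time: time, passed: passed };
}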

Richard.
 

Scott Sauyet

Not just "perhaps". It should, and it should use the same verification
code for each test, and outside of any timing recording.

I think that testing the selector engine is part of testing the
library. Although this is not the same as the SlickSpeed selectors
test, it should subsume that one. So I don't object to testing
selector speed. The verification, though, is a different story. It's
quite easy to switch testing documents, but it is presumably not so
easy to verify all the results of all the manipulations. The
compromise that TaskSpeed inherits from SlickSpeed is, I think, fairly
reasonable. Make all the libraries report their results, and note if
there is any disagreement. They could, of course, all be wrong and
yet all have the same values, but that seems relatively unlikely.

There is an approach that I doubt I'd bother trying, but which is
quite interesting: Add a url query parameter, which would serve as a
seed for a randomizing function. If the server does not get one, it
chooses a random value and redirects to a page with that random seed.
Then, based upon random numbers derived from that seed, a document is
generated with some flexible structure, and a test script is generated
that runs some random sequence of the predefined test cases against
each library. Verification might be tricky, but should be doable.
This might make it more difficult for libraries to design their tests
around the particulars of the document and/or the ordering of the
tests. While I think this would work, it sounds like more effort than
I'm willing to put in right now.
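
The skeleton would be something like this (sketch only, with a
throwaway PRNG; none of this is working infrastructure):

// A seeded PRNG so the same query-string seed always produces the
// same "random" document.

function makeRandom(seed) {
  var state = seed % 2147483647;          // simple Park-Miller LCG
  if (state <= 0) { state += 2147483646; }
  return function() {
    state = (state * 16807) % 2147483647;
    return (state - 1) / 2147483646;      // in [0, 1)
  };
}

function buildTestDocument(seed, parent) {
  var rand = makeRandom(seed);
  var ulCount = Math.floor(rand() * 20);        // how many ULs appear
  for (var i = 0; i < ulCount; i++) {
    var ul = document.createElement('ul');
    if (rand() < 0.8) { ul.id = 'ul' + i; }     // some ULs lack ids
    var liCount = 1 + Math.floor(rand() * 10);
    for (var j = 0; j < liCount; j++) {
      ul.appendChild(document.createElement('li'));
    }
    parent.appendChild(ul);
  }
}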

-- Scott
 

Richard Cornford

I think that testing the selector engine is part of testing
the library.

Obviously it is, if the 'library' has a selector engine, but that is a
separate activity from testing the library's ability to carry out
tasks as real world tasks don't necessitate any selector engine.
(Remember that common hardware and browser performance was not
sufficient for any sort of selector engine even to look like a viable
idea before about the middle of 2005, but (even quite extreme) DOM
manipulation was long established by that time.)

The 'pure DOM' tests, as a baseline for comparison, don't necessarily
need a selector engine to perform any given task (beyond the fact that
the tasks themselves have been designed around a notion of
'selectors'). So making selector engine testing part of the 'task'
tests acts to impose arbitrary restrictions on the possible code used,
biases the results, and ultimately negates the significance of the
entire exercise.
Although this is not the same as the SlickSpeed
selectors test,

Comparing the selector engines in libraries that have selector engines
seems like a fairly reasonable thing to do. Suggesting that a selector
engine is an inevitable prerequisite for carrying out DOM manipulation
tasks is self evident BS.
it should subsume that one. So I don't object
to testing selector speed. The verification, though, is a
different story. It's quite easy to switch testing documents,
but it is presumably not so easy to verify all the results of
all the manipulations.

Why not (at least in most cases)? Code could be written to record the
changes to a DOM that resulted from running a test function. You know
what you expect the test function to do, so verifying that it did do it
shouldn't be too hard.
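
For instance (a crude sketch):

// Snapshot a few facts about the DOM before and after a test function
// runs and compare the difference with what the task was supposed to do.

function snapshot(doc) {
  return {
    divs: doc.getElementsByTagName('div').length,
    listItems: doc.getElementsByTagName('li').length,
    bodyChildren: doc.body.childNodes.length
  };
}

var before = snapshot(document);
testFunction();                  // the library's task under test
var after = snapshot(document);

// e.g. the "append" task is supposed to add exactly 500 DIVs
var appendOk = (after.divs - before.divs) == 500;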

Granted there are cases like the use of - addEventListener - where
positive verification becomes a lot more difficult, but as it is the
existing tests aren't actually verifying that listeners were added.
The compromise that TaskSpeed inherits
from SlickSpeed is, I think, fairly reasonable.

I don't. TaskSpeed's validity is compromised in the process.
Make all the libraries report their results, and note
if there is any disagreement.

But reporting results is not part of any genuinely representative task,
and so it should not be timed along with any given task. The task
itself should be timed in isolation, and any verification employed
separately.

Whether some 'library' should be allowed to do its own verification is
another matter, but the verification definitely should not be timed
along with the task that it is attempting to verify.
They could, of course, all be wrong and
yet all have the same values, but that seems
relatively unlikely.

Unlikely, but not impossible, and an issue that can easily be entirely
avoided.
There is an approach that I doubt I'd bother trying, but
which is quite interesting: Add a url query parameter,
which would serve as a seed for a randomizing function.
If the server does not get one, it chooses a random value
and redirects to a page with that random seed. Then, based
upon random numbers derived from that seed, a document is
generated with some flexible structure, and a test script
is generated that runs some random sequence of the
predefined test cases against each library.

I can see how this might make sense in selector speed testing (though
presumably you would run up against many cases where the reported
duration of the test would be zero millisecond, despite our knowing
that nothing happens in zero time) but for task testing randomly
generating the document acted upon would be totally the wrong
approach. If you did that you would bias against the baseline pure DOM
tests as then they would have to handle issues arising from the
general case, which are not issues inherent in DOM scripting because
websites are not randomly generated.

In any real web site/web application employment of scripting,
somewhere between something and everything is known about the
documents that are being scripted. Thus DOM scripts do not need to
deal with general issues in browser scripting, but rather only need to
deal with the issues that are known to exist in their specific
context.

In contrast, it is an inherent problem in general purpose library code
that they must address (or attempt to address) all the issues that
occur in a wide range of context (at minimum, all the common
contexts). There are inevitably overheads in doing this, with those
overheads increasing as the number of contexts accommodated increases.

With random documents and comparing libraries against some supposed
'pure DOM' baseline, you will be burdening the baseline with the
overheads that are only inherent in general purpose code. The result
would not be a representative comparison.
Verification might be tricky, but should be doable.
This might make it more difficult for libraries to
design their tests around the particulars of the
document and/or the ordering of the tests. While
I think this would work, it sounds like more
effort than I'm willing to put in right now.

Given that javascript source is available if anyone want to look at
it, any library author attempting to optimise for a specific test
(rather than, say, optimising for a common case) is likely to be
spotted doing so, and see their reputation suffer as a result.

Richard.
 

Scott Sauyet

Richard Cornford wrote:

Obviously it is, if the 'library' has a selector engine, but that is a
separate activity from testing the library's ability to carry out
tasks as real world tasks don't necessitate any selector engine.

Perhaps it's only because the test framework was built testing against
libraries that had both DOM manipulation and selector engines, but
these seem a natural fit. I don't believe this was meant to be a DOM
manipulation test in particular. My understanding (and I was not
involved in any of the original design, so take this with a grain of
salt) is that this was meant to be a more general test of how the
libraries were used, which involved DOM manipulation and selector-
based querying. If it seemed at all feasible, the framework would
probably have included event handler manipulation tests, as well. If
the libraries had all offered classical OO infrastructures the way
MooTools and Prototype do, that would probably also be tested.

Why the scare quotes around "library"? Is there a better term --
"toolkit"? -- that describes the systems being tested?
(Remember that common hardware and browser performance was not
sufficient for any sort of selector engine even to look like a viable
idea before about the middle of 2005, but (even quite extreme) DOM
manipulation was long established by that time.)

Really? Very interesting. I didn't realize that it was a system
performance issue. I just thought it was a new way of doing things
that people started trying around then.

The 'pure DOM' tests, as a baseline for comparison, don't necessarily
need a selector engine to perform any given task (beyond the fact that
the tasks themselves have been designed around a notion of
'selectors'). So making selector engine testing part of the 'task'
tests acts to impose arbitrary restrictions on the possible code used,

Absolutely. A pure selector engine would also not be testable, nor
would a drag-and-drop toolkit. We are restricted to systems that can
manipulate the DOM and find the size of certain collections of
elements.

biases the results,

In what way?

and ultimately negates the significance of the entire exercise.

I just don't see it. There is clearly much room for improvement, but
I think the tests as they stand have significant value.


Comparing the selector engines in libraries that have selector engines
seems like a fairly reasonable thing to do. Suggesting that a selector
engine is an inevitable prerequisite for carrying out DOM manipulation
tasks is self evident BS.

Note that these results don't require that the library actually use a
CSS-style selector engine, only that it can for instance find the
number of elements of a certain type, the set of which is often most
easily described via a CSS selector. When the "table" function is
defined to return "the length of the query 'tr td'," we can interpret
that as counting the results of running the selector "tr td" in the
context of the document if we have a selector engine, but as "the
number of distinct TD elements in the document which descend from TR
elements" if not. Being able to find such elements has been an
important part of most of the DOM manipulation I've done.

PureDOM does all this without any particular CSS selector engine, so
it's clear that one is not required to pass the tests.

Why not (at least in most cases)? Code could be written to record the
changes to a DOM that resulted from running a test function. You know
what you expect the test function to do, so verifying that it did do it
shouldn't be too hard.

The document to test has been fairly static, and I suppose one could
go through it, analyzing its structure, and calculating the expected
results. But the document is included as a stand-alone file, used
with this PHP:

<?php include('../template.html');?>

Another file could easily be substituted, and it might well be
worthwhile doing. Adding this sort of analysis would make it much
more time-consuming to test against a different document.

Granted there are cases like the use of - addEventListener - where
positive verification becomes a lot more difficult, but as it is the
existing tests aren't actually verifying that listeners were added.

Are there any good techniques you know of that would make it
straightforward to actually test this from within the browser's script
engine? It would be great to be able to test this.
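
The only thing I've come up with so far (just a sketch, and older IE
would need fireEvent/createEventObject in place of the W3C calls) is to
fire synthetic events and count how many of the attached handlers
actually run:

// Have the "bind" task attach a handler that does something
// observable, e.g.  function() { fired++; }  -- then click everything
// and see how many handlers fired.

var fired = 0;

function clickAll(nodes) {
  for (var i = 0; i < nodes.length; i++) {
    var evt = document.createEvent('MouseEvents');
    evt.initMouseEvent('click', true, true, window, 0, 0, 0, 0, 0,
                       false, false, false, false, 0, null);
    nodes[i].dispatchEvent(evt);
  }
}

// After the bind task runs:
//   clickAll(document.getElementsByTagName('li'));
//   fired should then match the number of LIs that were wired up.
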
The compromise that TaskSpeed inherits
from SlickSpeed is, I think, fairly reasonable.

I don't. TaskSpeed's validity is compromised in the process.
Make all the libraries report their results, and note
if there is any disagreement.

But reporting results is not part of any genuinely representative task,
and so it should not be timed along with any given task. The task
itself should be timed in isolation, and any verification employed
separately. [ ... ]

I think this critique is valid only if you assume that the
infrastructure is designed only to test DOM Manipulation. I don't buy
that assumption.

Unlikely, but not impossible, and an issue that can easily be entirely
avoided.

Easily for a single document, and even then only with some real work
in finding the expected results and devising a way to test them.

I can see how this might make sense in selector speed testing (though
presumably you would run up against many cases where the reported
duration of the test would be zero milliseconds, despite our knowing
that nothing happens in zero time)

In another thread [1], I discuss an updated version of slickspeed,
which counts repeated tests over a 250ms span to more accurately time
the selectors.
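
The gist of that timing approach (sketch only):

// Repeat the test until at least 250ms have elapsed and report an
// average, instead of trusting a single near-zero reading.

function timeIt(run) {
  var start = new Date().getTime();
  var runs = 0;
  while (new Date().getTime() - start < 250) {
    run();
    runs++;
  }
  var elapsed = new Date().getTime() - start;
  return elapsed / runs;   // average milliseconds per run
}
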
but for task testing randomly
generating the document acted upon would be totally the wrong
approach. If you did that you would bias against the baseline pure DOM
tests as then they would have to handle issues arising from the
general case, which are not issues inherent in DOM scripting because
websites are not randomly generated.

I was not expecting entirely random documents. Instead, I would
expect to generate one in which the supplied tests generally have
meaningful results. So for this test

"attr" : function(){
// find all ul elements in the page.
// generate an array of their id's
// return the length of that array
},

I might want to randomly determine the level of nesting at which ULs
appear, randomly determine how many are included in the document, and
perhaps randomly choose whether some of them do not actually have
ids. There would probably be some small chance that there were no ULs
at all.
In any real web site/web application employment of scripting,
somewhere between something and everything is known about the
documents that are being scripted. Thus DOM scripts do not need to
deal with general issues in browser scripting, but rather only need to
deal with the issues that are known to exist in their specific
context.

Absolutely. I definitely wouldn't try to build entirely random
documents, only documents for which the results of the tests should be
meaningful. The reason I said I probably wouldn't do this is that,
while it is by no means impossible, it is also a far from trivial
exercise.

In contrast, it is an inherent problem in general purpose library code
that they must address (or attempt to address) all the issues that
occur in a wide range of context (at minimum, all the common
contexts). There are inevitably overheads in doing this, with those
overheads increasing as the number of contexts accommodated increases.

Yes, this is true. But it is precisely these general purpose
libraries that are under comparison in these tests. Being able to
compare their performance and the code each one uses are the only
reasons these tests exist.
[ ... ]
Verification might be tricky, but should be doable.
This might make it more difficult for libraries to
design their tests around the particulars of the
document and/or the ordering of the tests.  While
I think this would work, it sounds like more
effort than I'm willing to put in right now.

Given that javascript source is available if anyone want to look at
it, any library author attempting to optimise for a specific test
(rather than, say, optimising for a common case)  is likely to be
spotted doing so, and see their reputation suffer as a result.

I would hope so, but as I said in the post to which you initially
responded, I see a fair bit of what could reasonably be considered
optimising for the test, and I only really looked at jQuery's, YUI's,
and My Library's test code. I wouldn't be surprised to find more in
the others.

-- Scott
____________________
[1] http://groups.google.com/group/comp.lang.javascript/msg/f333d40588ae2ff0
 

David Mark

Andrew said:
I'm not sure whether the fact that one person can write a library that's
as good as or better than libraries on which (I believe) teams of people
have worked says a lot about one person's ability or not much about the
others'.

Thanks. But, as mentioned, I am not the only one who could do this.
The basic theory that has been put forth for years is that those who
really know cross-browser scripting refuse to work on GP libraries
because they break the first three rules of cross-browser scripting
(context, context, context). The three rules bit is mine, but the basic
theory about context-specific scripts has been put forth by many others.
I'm not meaning to be offensive, I'm just wondering how one person can
appear to achieve so much.

Thanks! It's because this stuff is not that complicated. If groups of
developers spent years fumbling and bumbling their way through basic
tasks (e.g. using browser sniffing for everything and still failing),
then it wouldn't be too hard to show them up. And it wasn't. Took
about a month to clean up what was originally a two-month project. I
think it is really shaping up as a world-beater now. :)

And all without browser sniffing. Who could have predicted such a
thing? Lots of people, that's who. ;)
 

David Mark

Michael said:
Surely if studies of software development have taught us anything,
they've taught us that there is no correlation between the size of a
team and the quality of the code.

I've been a professional developer for nearly a quarter of a century,
and I'm not surprised at all when one person delivers a better
solution than what a large team delivers. Even if the large team's
total effort is much greater - which may or may not be the case here.

I think I agree with all of this, but was confused by this statement.
What may or may not be the case? I'm one guy who took a two-year hiatus
from the library. Meanwhile hundreds (if not thousands) of people have
been filing tickets, arguing patches, actually applying some patches,
un-applying patches, maintaining repositories, testing browsers
(somewhat ineffectually), arguing about blog comments, etc. Make no
mistake that I did none of that. It's been me and Notepad and a few
weekends and evenings over the last month. Thanks to the handful of
people who gave feedback too. :)
 

David Mark

Scott said:
So you've learned that test-driven development is not an
oxymoron? :)

I think you are misquoting me. The term is "test-driven" design (a la
John Resig). The quotes indicate that he isn't designing anything but
treating empirical observations like they are specifications. It's the
crystal ball approach. Search the archive for "test swarm".
I actually am a fan of test-driven design, but I don't do it with
performance tests; that scares me.

I am sure you are _not_ talking about what I am talking about. At least
I hope not. And what makes you think that these performance tests had
anything to do with the changes? FYI, they didn't. It was Richard's
solid suggestion to use cloneNode as there are no event listeners or
custom attributes to deal with in these test functions. I knew it would
be faster _before_ I re-ran the tests. That's the difference. I don't
take test results at face value. You have to understand what you are
looking at before you can react to them.

In contrast, I see the various "major" efforts resorting to all sorts of
unexplicable voodoo based solely on "proof" provided by test results
with no understanding at all going into the process. That's what is
wrong with "test-driven" design/development.

And as for the GP clone method that I added, it will be stipulated that
listeners (and custom attributes if you are into those) _must_ be added
_after_ cloning. That takes care of that. ;)
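
The reason for the stipulation, in plain DOM terms (sketch; attachEvent
applies for older IE, but the principle is the same):

// cloneNode copies attributes, but not listeners added with
// addEventListener.

var original = document.createElement('div');
original.addEventListener('click', function() {}, false);  // not copied

var copy = original.cloneNode(true);  // attributes yes, listeners no

// So: clone first, then attach whatever listeners the copy needs.
copy.addEventListener('click', function() {}, false);
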
Opera 10.10, Windows XP on a very busy and older PC:-

2121 18624 9000 5172 22248 4846 4360 1109 1266 1189
6140 1876 843* 798*

I ran it a few times. This is representative. The two versions flip
flop randomly. Usually around a third of the purer tests. :)

I can confirm similar rankings (with generally faster speeds, of
course) on my modern machine in most recent browsers, with two
exceptions: First, in Firefox and IE, PureDOM was faster than My
Library. Second, in IE6, several tests fail in My Library ("setHTML",
"insertBefore", "insertAfter".)

Well, that's no good at all. :) Likely a recent development. I will
look into that. If you have specifics, that would be helpful as I don't
have IE6 handy right this minute.
Also note that the flip-flopping of
the two versions might have to do with the fact that they are pointing
at the same exact version of My Library (the one with QSA) and the
same test code. You're running the same infrastructure twice! :)

Huh? One is supposed to be pointing to the non-QSA version. I'll fix that.
This is great work, David. I'm very impressed.

Thanks! I'll respond to the rest after I track down this IE6 problem.
I can't believe I broke IE6 (of all things). That's what I get for not
re-testing. :(
 

Richard Cornford

David Mark wrote:

I am sure you are _not_ talking about what I am talking about.
... . That's the difference. I don't
take test results at face value. You have to understand what you
are looking at before you can react to them.

In contrast, I see the various "major" efforts resorting to all
sorts of unexplicable voodoo based solely on "proof" provided
by test results with no understanding at all going into the
process. That's what is wrong with "test-driven"
design/development.

Another significant issue (beyond understanding the results) is the
question of designing the right test(s) to apply. Get the test design
wrong and the results will be meaningless, so not understanding them
isn't making anything worse.

To illustrate; the conclusions drawn on this page:-

<URL: http://ejohn.org/blog/most-bizarre-ie-quirk/ >

- are totally incorrect because the test used (predictably) interfered
with the process that was being examined.

(So is the next step going to be "test-driven test design"? ;-)

Richard.
 

David Mark

Richard said:
Another significant issue (beyond understanding the results) is the
question of designing the right test(s) to apply. Get the test design
wrong and the results will be meaningless, so not understanding them
isn't making anything worse.

Agreed. :) And I meant "inexplicable" of course. And speaking of
embarrassments (my own), I seem to have broken IE6 for some of the
TaskSpeed tests. I can't get to IE6 at the moment and am having trouble
getting a multi-IE tester installed on my work box. I know if there is
one guy on the planet who will know what I botched (likely recently), it
is you. Any clues while I scramble to get a testbed set up? I glanced
at the three tests mentioned (setHTML, insertBefore and insertAfter),
but nothing jumped out at me as IE6-incompatible.
To illustrate; the conclusions drawn on this page:-

<URL: http://ejohn.org/blog/most-bizarre-ie-quirk/ >

- are totally incorrect because the test used (predictably) interfered
with the process that was being examined.

No question. That domain is just full of misconceptions. :)
(So is the next step going to be "test-driven test design"? ;-)

It seems like Resig is already there.
 

Michael Wojcik

David said:
I think I agree with all of this, but was confused by this statement.
What may or may not be the case? I'm one guy who took a two-year hiatus
from the library. Meanwhile hundreds (if not thousands) of people have
been filing tickets, arguing patches, actually applying some patches,
un-applying patches, maintaining repositories, testing browsers
(somewhat ineffectually), arguing about blog comments, etc. Make no
mistake that I did none of that. It's been me and Notepad and a few
weekends and evenings over the last month. Thanks to the handful of
people who gave feedback too. :)

Now that you ask, I'm not sure why I added that final clause. It may
be the result of overly-aggressive editing. Or I might have been
thinking in general terms, but appended the "here" out of habit.
 

David Mark

Scott said:
So you've learned that test-driven development is not an
oxymoron? :)

I actually am a fan of test-driven design, but I don't do it with
performance tests; that scares me.
Opera 10.10, Windows XP on a very busy and older PC:-

2121 18624 9000 5172 22248 4846 4360 1109 1266 1189
6140 1876 843* 798*

I ran it a few times. This is representative. The two versions flip
flop randomly. Usually around a third of the purer tests. :)

I can confirm similar rankings (with generally faster speeds, of
course) on my modern machine in most recent browsers, with two
exceptions: First, in Firefox and IE, PureDOM was faster than My
Library. Second, in IE6, several tests fail in My Library ("setHTML",
"insertBefore", "insertAfter".) Also note that the flip-flopping of
the two versions might have to do with the fact that they are pointing
at the same exact version of My Library (the one with QSA) and the
same test code. You're running the same infrastructure twice! :)

This is great work, David. I'm very impressed.

But I do have some significant caveats. Some of the tests seem to me
to be cheating, especially when it comes to the loops. For instance,
here is one of the functions specifications:

There's definitely no cheating going on.
"append" : function(){
// in a 500 iteration loop:
// create a new <div> with the same critera as 'create'
// - NOTE: rel needs to be == "foo2"
// then append to body element (no caching)
//
// return then length of the matches from the selector
"div[rel^='foo2']"
},

My Library's implementation looks like this:

"append" : function() {
var myEl = E().loadNew('div', { 'rel':'foo2' }), body =
document.body;
for (var i = 500; i--;) {
myEl.loadClone().appendTo(body);
}
return $("div[rel^=foo2]").length;
},

This definitely involves caching some objects outside the loop.

There is a new element cloned each time. So what if it is a clone and
not a freshly created one? I saw where one of the other libraries' tests
was doing the same thing with some sort of template object. Who says
you can't clone?
There
are a number of such instances of this among the test cases. My
Library is not alone in this, but most of the other libraries mainly
just have variable declarations outside the loop, not initialization.
In part, this is a problem with the Task Speed design and
specification. It would have been much better to have the testing
loop run each library's tests the appropriate number of times rather
than including the loop count in the specification. But that error
should not be an invitation to abuse. I ran a version with such
initializations moved inside the loop, and my tests average about a
15% performance drop for My Library, in all browsers but Opera, where
it made no significant difference.

But that is just one instance of a general problem. My Library is not
alone in coding its tests to the performance metrics. The spec has
this:

"bind" : function(){
// connect onclick to every first child li of ever ul
(suggested: "ul > li")
//
// return the length of the connected nodes
},

but the YUI3 tests perform this with event delegation:

"bind" : function(){
Y.one('body').delegate('click', function() {}, 'ul > li');
return Y.all('ul > li').size();
},

This might well be the suggested way to attach a behavior to a number
of elements in YUI. There's much to be said for doing it in this
manner. And yet it is pretty clearly not what was intended in the
specification; if nothing else, it's an avoidance of what was
presumably intended to be a loop. There's a real question of doing
things the appropriate way.

Yes, I've mentioned this specific issue numerous times. Using
delegation when the test is trying to measure attaching multiple
listeners is bullshit (and I wouldn't expect anything less from Yahoo).
To test this, I limited myself to 15 minutes of trying to optimize the
jQuery tests in the same manner. I moved initialization outside the
loop and switched to event delegation. After this brief attempt, I
achieved speed gains between 54% and 169% in the various browsers.
And I did this without any changes to the underlying library. I'm
sure I could gain reasonable amounts of speed in some of the other
libraries as well, but this sort of manipulation is wrong-headed.

You still can't make jQuery touch mine, no matter what you do (unless
you really cheat, like returning a number without any DOM manipulation!)
Perhaps an updated version of TaskSpeed is in order, but it's hard to
design a system that can't be gamed in this manner.

Does your host have PHP? I would suggest it would be better to host
a dynamic version of this, and not rely on static files. It's easy to
set up, and that also makes it almost trivial to add and remove
libraries from your tests.

I have ASP and I think it has enough libraries as it is.
Finally, the matter of IE6 is disappointing. This is still a widely
used browser; I'm surprised you didn't test there before releasing the
code.

It's not so much a release as a periodically updated page. I broke
something and due to happenstance (my multi-IE box went down recently),
I didn't get to test and find out that I broke it. No big deal, but
certainly an embarrassment. If I can get this #$@% IETester toolbar
working (or even find where it went after installation), I'll fix it
instantly. At the moment, I can't see anything obvious that I did in
the code to break IE6, but then it is a lot of code. :)
You've often pointed out how well My Library performed without
change when IE8 came out. Well, the flip side is that it needs to keep
doing well at least in environments that are widely used, even as you
make changes. All the other libraries except Qooxdoo did fine in IE6,
even if all of them were ungodly slow.

Obviously, I broke something recently. It's not indicative of some
major shift that has invalidated IE6 as a viable browser. :)

It has always been a rock with IE6. I tested the builder stuff to death
in IE <= 6, just a week or two ago. Granted, these "concise" OO tests
are using interfaces that were added afterward, so perhaps I crossed
some wires. Make no mistake, I will fix whatever I broke in IE6. It's
a fail until that time.

BTW, I *hate* this IETester toolbar. Doesn't appear to do _anything_ in
IE8 on XP. Literally nothing. Installs and then vanishes without a
trace, never to be heard from or seen again. :(

So, if you want to help, give me some reports on _exactly_ what happened
to you in IE6. Was there an error? If so, the TaskSpeed thing creates
sort of a quasi-tooltip to display it.
 

David Mark

Scott said:
So you've learned that test-driven development is not an
oxymoron? :)

I actually am a fan of test-driven design, but I don't do it with
performance tests; that scares me.
Opera 10.10, Windows XP on a very busy and older PC:-

2121 18624 9000 5172 22248 4846 4360 1109 1266 1189
6140 1876 843* 798*

I ran it a few times. This is representative. The two versions flip
flop randomly. Usually around a third of the purer tests. :)

I can confirm similar rankings (with generally faster speeds, of
course) on my modern machine in most recent browsers, with two
exceptions: First, in Firefox and IE, PureDOM was faster than My
Library. Second, in IE6, several tests fail in My Library ("setHTML",
"insertBefore", "insertAfter".) Also note that the flip-flopping of
the two versions might have to do with the fact that they are pointing
at the same exact version of My Library (the one with QSA) and the
same test code. You're running the same infrastructure twice! :)

Not locally I wasn't (which is where I do most of my testing). I
apparently forgot to update one of the files online. It's updated now.
I don't think you'll see any big difference as these aren't
query-intensive tests.
 

David Mark

David said:
Scott said:
David Mark wrote:
I've updated the TaskSpeed test functions to improve performance. This
necessitated some minor additions (and one change) to the OO interface
as well. I am pretty happy with the interface at this point, so will
set about properly documenting it in the near future.
So you've learned that test-driven development is not an
oxymoron? :)

I actually am a fan of test-driven design, but I don't do it with
performance tests; that scares me.
[http://www.cinsoft.net/taskspeed.html]
Opera 10.10, Windows XP on a very busy and older PC:-

2121 18624 9000 5172 22248 4846 4360 1109 1266 1189
6140 1876 843* 798*

I ran it a few times. This is representative. The two versions flip
flop randomly. Usually around a third of the purer tests. :)
I can confirm similar rankings (with generally faster speeds, of
course) on my modern machine in most recent browsers, with two
exceptions: First, in Firefox and IE, PureDOM was faster than My
Library. Second, in IE6, several tests fail in My Library ("setHTML",
"insertBefore", "insertAfter".) Also note that the flip-flopping of
the two versions might have to do with the fact that they are pointing
at the same exact version of My Library (the one with QSA) and the
same test code. You're running the same infrastructure twice! :)

This is great work, David. I'm very impressed.

But I do have some significant caveats. Some of the tests seem to me
to be cheating, especially when it comes to the loops. For instance,
here is one of the functions specifications:

There's definitely no cheating going on.
"append" : function(){
// in a 500 iteration loop:
// create a new <div> with the same critera as 'create'
// - NOTE: rel needs to be == "foo2"
// then append to body element (no caching)
//
// return then length of the matches from the selector
"div[rel^='foo2']"
},

My Library's implementation looks like this:

"append" : function() {
var myEl = E().loadNew('div', { 'rel':'foo2' }), body =
document.body;
for (var i = 500; i--;) {
myEl.loadClone().appendTo(body);
}
return $("div[rel^=foo2]").length;
},

This definitely involves caching some objects outside the loop.

There is a new element cloned each time. So what if it is a clone and
not a freshly created one? I saw where one of the other libraries' tests
was doing the same thing with some sort of template object. Who says
you can't clone?
There
are a number of such instances of this among the test cases. My
Library is not alone in this, but most of the other libraries mainly
just have variable declarations outside the loop, not initialization.
In part, this is a problem with the Task Speed design and
specification. It would have been much better to have the testing
loop run each library's tests the appropriate number of times rather
than including the loop count in the specification. But that error
should not be an invitation to abuse. I ran a version with such
initializations moved inside the loop, and my tests average about a
15% performance drop for My Library, in all browsers but Opera, where
it made no significant difference.

But that is just one instance of a general problem. My Library is not
alone in coding its tests to the performance metrics. The spec has
this:

"bind" : function(){
// connect onclick to every first child li of ever ul
(suggested: "ul > li")
//
// return the length of the connected nodes
},

but the YUI3 tests perform this with event delegation:

"bind" : function(){
Y.one('body').delegate('click', function() {}, 'ul > li');
return Y.all('ul > li').size();
},

This might well be the suggested way to attach a behavior to a number
of elements in YUI. There's much to be said for doing it in this
manner. And yet it is pretty clearly not what was intended in the
specification; if nothing else, it's an avoidance of what was
presumably intended to be a loop. There's a real question of doing
things the appropriate way.

Yes, I've mentioned this specific issue numerous times. Using
delegation when the test is trying to measure attaching multiple
listeners is bullshit (and I wouldn't expect anything less from Yahoo).
To test this, I limited myself to 15 minutes of trying to optimize the
jQuery tests in the same manner. I moved initialization outside the
loop and switched to event delegation. After this brief attempt, I
achieved speed gains between 54% and 169% in the various browsers.
And I did this without any changes to the underlying library. I'm
sure I could gain reasonable amounts of speed in some of the other
libraries as well, but this sort of manipulation is wrong-headed.

You still can't make jQuery touch mine, no matter what you do (unless
you really cheat, like returning a number without any DOM manipulation!)
Perhaps an updated version of TaskSpeed is in order, but it's hard to
design a system that can't be gamed in this manner.

Does your host have PHP? I would suggest it would be better to host
a dynamic version of this, and not rely on static files. It's easy to
set up, and that also makes it almost trivial to add and remove
libraries from your tests.

I have ASP and I think it has enough libraries as it is.
Finally, the matter of IE6 is disappointing. This is still a widely
used browser; I'm surprised you didn't test there before releasing the
code.

It's not so much a release as a periodically updated page. I broke
something and due to happenstance (my multi-IE box went down recently),
I didn't get to test and find out that I broke it. No big deal, but
certainly an embarrassment. If I can get this #$@% IETester toolbar
working (or even find where it went after installation), I'll fix it
instantly. At the moment, I can't see anything obvious that I did in
the code to break IE6, but then it is a lot of code. :)
You've often pointed out how well My Library performed without
change when IE8 came out. Well, the flip side is that it needs to keep
doing well at least in environments that are widely used, even as you
make changes. All the other libraries except Qooxdoo did fine in IE6,
even if all of them were ungodly slow.

Obviously, I broke something recently. It's not indicative of some
major shift that has invalidated IE6 as a viable browser. :)

It has always been a rock with IE6. I tested the builder stuff to death
in IE <= 6, just a week or two ago. Granted, these "concise" OO tests
are using interfaces that were added afterward, so perhaps I crossed
some wires. Make no mistake, I will fix whatever I broke in IE6. It's
a fail until that time.

BTW, I *hate* this IETester toolbar. Doesn't appear to do _anything_ in
IE8 on XP. Literally nothing. Installs and then vanishes without a
trace, never to be heard from or seen again. :(

So, if you want to help, give me some reports on _exactly_ what happened
to you in IE6. Was there an error? If so, the TaskSpeed thing creates
sort of a quasi-tooltip to display it.

1. Got the IETester thing going. Problem was in my set.
2. Tested TaskSpeed in what it considers IE6
3. No issues, but that doesn't prove anything for sure

I did do some object inferencing with the address bar and it sure
appears to be IE6. There's no browser sniffing involved, so ISTM that
it should also work in true-blue IE6. Let me know if that is still not
the case (and give me the error messages or at least the _exact_ list of
tests that are failing).

I wonder if you ran the thing while I had a bad build up there. That
has happened a couple of times in the last week.

Also, running IE5.5 on SlickSpeed at the moment. All of the other
libraries are crashing and burning. Some refused to even _load_. My
Library looks perfect (and pretty speedy) so far.

Anyone else having TaskSpeed issues in IE < 7? I'd be shocked if I
actually broke something (unless it was one of the aforementioned goofs
that was fixed instantly). I have added quite a bit in the last few
weeks, of course. I had previously tested My Library in IE5-8, doing
lots more than queries and had no issues. I wouldn't expect that the
selector engine improvements broke TaskSpeed tests in IE6.

IE5.5 (in tester) just finished SlickSpeed. Perfect and comparatively
fast (as expected) to the couple of others that managed to not throw
exceptions on every test. Running TaskSpeed on that next...
 

Scott Sauyet

David said:

I am sure you are _not_ talking about what I am talking about.  At least
I hope not.  And what makes you think that these performance tests had
anything to do with the changes? FYI, they didn't.

Well, this is how your post that started this thread began:

| I've updated the TaskSpeed test functions to improve performance.
This
| necessitated some minor additions (and one change) to the OO
interface
| as well.

:)

-- Scott
 
