Search DOM elements with XPath (getElementById too slow)


Claudio Calboni

Hello folks,
I'm having some performance issues with the client-side part of my
application.
Basically, it renders a huge HTML table (about 20,000 cells in my
testing scenario), without content. Content is "pushed" from the back
end via some JS, for only the displayed portion of the table. Once
the user scrolls, JS updates the visible cells with data. It's much
the same philosophy as GMaps and similar applications.

So, the server tells the JS "update this group of cells with this
data". JS iterates (this is fast) through these instructions and
pushes data into the cells. To find the object it has to update, JS
uses getElementById(), and this is slow. With a 20k-cell table and
300 displayed cells it takes 5 to 10 seconds to update. I suppose,
but I'm not a veteran JS developer (mainly I develop server-side with
.NET but I'm finding "the client side of the force" very interesting
and powerful :D), this is because getElementById actually *searches*
through the DOM for every element my JS is looking for, because it
doesn't have a sort of index of elements. I've tried caching found
elements and it works great, but the loading time is only moved from
one place to another.

I'm thinking about getting all the elements (obviously only those I
have to update) via XPath, if possible (I've never used this tech
before). My script is able to say "I require cells from--to", so it
would be great if I could extract just a snapshot of elements from
the DOM, only those required to be updated, and then iterate through
them.

My cells (TD) are named c_R_C, with R and C for row number and column
number. With a 100x100 table and a 10x10 viewable area, say I'm
almost in the center of the table (first visible cell, top-left
corner, with ID c_40_50, and last visible cell, bottom-right corner,
with ID c_50_60), I have to extract from the DOM the cells with row
from 40 to 50 and col from 50 to 60 (c_40_50, c_40_51, c_40_52 ...
c_50_58, c_50_59, c_50_60).

If, AFAIK, XPath extracts items into an iterable collection, and if
this extraction can be done with a sort of regular expression, I
think this is feasible.
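
(For reference, a minimal sketch of what such a lookup could look
like with document.evaluate, assuming the c_R_C naming described
above. XPath 1.0 in browsers has no real regular expressions, so a
starts-with() test per row plus a column check in JS stands in for
one; note also that IE6 exposes no XPath API on the HTML DOM, so this
only runs in browsers such as Firefox. getRowCells is a hypothetical
helper name.)

// Hypothetical sketch: snapshot all cells of one row via XPath,
// then filter the snapshot down to the wanted column range.
function getRowCells(row, col0, col1) {
  var xpath = '//td[starts-with(@id, "c_' + row + '_")]';
  var snap = document.evaluate(xpath, document, null,
      XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null);
  var cells = [];
  for (var i = 0; i < snap.snapshotLength; i++) {
    var td = snap.snapshotItem(i);
    var col = parseInt(td.id.split('_')[2], 10);
    if (col >= col0 && col <= col1) cells.push(td);
  }
  return cells;
}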

Of course if any of you have other suggestions, that would be greatly
appreciated.

Thanks in advance,
tK
 

RobG

> Hello folks,
> I'm having some performance issues with the client-side part of my
> application.
> Basically, it renders a huge HTML table (about 20,000 cells in my
> testing scenario), without content. Content is "pushed" from the
> back end via some JS, for only the displayed portion of the table.
> Once the user scrolls, JS updates the visible cells with data. It's
> much the same philosophy as GMaps and similar applications.

The thought of a page with a 20,000 cell table boggles the mind.
However, if you *really* want to do that...
> So, the server tells the JS "update this group of cells with this
> data". JS iterates (this is fast) through these instructions and
> pushes data into the cells. To find the object it has to update, JS
> uses getElementById(), and this is slow. With a 20k-cell table and
> 300 displayed cells it takes 5 to 10 seconds to update.

I seriously doubt that getElementById is your problem. My ancient
400MHz G3 can find a cell by getElementById in a 20,000-cell table in
much less than half a second (manually timed). Try the following:

<button onclick="alert(getElementById('td101').innerHTML);">Click</
button>
<script type="text/javascript">
var s = [];
var i = 20000;
do {
if (!(i%10)) s.push('<tr>');
s.push('<td id="td' + i + '">' + 'td' + i);
} while (--i)
document.write('<table border=1>' + s.join('') + '<\/table>')
</script>

My PC takes 6 seconds with Safari and 15 seconds with Firefox to
render the table. Every cell has an ID and content, yet both browsers
manage to find a cell near the bottom of the table and return its
content in a very short time.

> I suppose, but I'm not a veteran JS developer (mainly I develop
> server-side with .NET but I'm finding "the client side of the force"
> very interesting and powerful :D), this is because getElementById
> actually *searches* through the DOM for every element my JS is
> looking for, because it doesn't have a sort of index of elements.

How do you know that? Have you looked at the source code for your
browser to see how it does it?
> I've tried caching found elements and it works great, but the
> loading time is only moved from one place to another.

> I'm thinking about getting all the elements (obviously only those I
> have to update) via XPath, if possible (I've never used this tech
> before). My script is able to say "I require cells from--to", so it
> would be great if I could extract just a snapshot of elements from
> the DOM, only those required to be updated, and then iterate through
> them.

The following function gets 100 cells from the above table in an
imperceptibly longer time than getting a single cell:

<button onclick="alert(getCellRange(1000, 1100).length);">Get one
hundred</button>
<script type="text/javascript">
function getCellRange(id0, id1) {
var obj = [];
for (var i=id0; i<id1; i++) {
obj.push(document.getElementById('td'+i));
}
return obj;
}
</script>

The primary lag is the initial loading and rendering of the HTML,
something that XPath can't help you with. Now XPath can certainly
help with some things, such as finding elements with CSS-style
selectors rather than sifting through them with JavaScript, but I
doubt that it provides a useful replacement for getElementById.

If you show how you are using getElementById, better help can probably
be provided.
 

Claudio Calboni

> The thought of a page with a 20,000 cell table boggles the mind.
> However, if you *really* want to do that...

Hello Rob, and thank you for your support.
It's not a matter of how big the table is. The browser renders it
seamlessly even at that size.
> I seriously doubt that getElementById is your problem. My ancient
> 400MHz G3 can find a cell by getElementById in a 20,000-cell table
> in much less than half a second (manually timed). Try the following:

It's not a single getElementById that pushes my processor to 100% for
some seconds, but an iteration with about 300 calls to getElementById
AND a big table (20K cells). 300 getElementById calls against a DOM
with only 300 cells run very fast. And of course a single call is
always fast for me too.
> How do you know that? Have you looked at the source code for your
> browser to see how it does it?

As I said "I suppose". Because if document's DOM is small, it's a lot
faster. But of course I can be terribly wrong :)!
> Now XPath can certainly help with some things, such as finding
> elements with CSS-style selectors rather than sifting through them
> with JavaScript, but I doubt that it provides a useful replacement
> for getElementById.

Probably the XPath way is not the right way. I've made some tests and
it's not right for me. I've tried your scripts on both IE6 and
Firefox and, unfortunately for me, FF does a lot better (of course,
it's my primary browser). It's fast on the first call and seems to
cache the lookup, running even faster from the second call on. IE is
slower and it stays slower. The customer's chosen browser, anyway, is
IE (intranet)..
> If you show how you are using getElementById, better help can
> probably be provided.

I've made a sort of cache, as said before, but it takes an
unacceptably long time at startup:

var elementCache = {};  // id -> element map, filled lazily

function cacheElementRef(id) {
  var res = elementCache[id];
  if (res === void 0)
    res = elementCache[id] = document.getElementById(id);
  return res;
}

and I'm investigating other possibilities, but after a lot of
searching I doubt that there is anything faster than getElementById..
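
(One possibility, offered only as an untested sketch: fill the whole
cache in a single DOM walk instead of lazily, so the thousands of
individual getElementById searches disappear; fillElementCache is a
hypothetical helper name.)

// Hypothetical one-pass cache fill: one getElementsByTagName call
// and a single loop, instead of one getElementById search per cell.
function fillElementCache() {
  var tds = document.getElementsByTagName('td');
  for (var i = 0; i < tds.length; i++) {
    if (tds[i].id) elementCache[tds[i].id] = tds[i];
  }
}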

Thanks, anyway!

tK
 

sam.partington

Yikes, that scares me a bit... still, if that's the best solution...

(snipped quite a lot about getElementById being too slow when updating
300+ cells at once)
> If you show how you are using getElementById, better help can
> probably be provided.

> I've made a sort of cache, as said before, but it takes an
> unacceptably long time at startup:
>
> function cacheElementRef(id) {
>   var res = elementCache[id];
>   if (res === void 0)
>     res = elementCache[id] = document.getElementById(id);
>   return res;
> }
>
> and I'm investigating other possibilities, but after a lot of
> searching I doubt that there is anything faster than getElementById..

I seriously doubt that a cache like that will be much quicker than
the browser's built-in lookup.

But forget getElementById, it's meant for one-off lookups. You have a
table, so it has a very regular form that is a piece of cake to
navigate using the DOM. Throw away the ids as well, because they're a
lazy way to do this sort of thing.

Get your row by finding the TBODY, and find your cell using
childNodes[row].childNodes[col].

Your only problem might be if there is whitespace between the
elements.
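
(As a side note, a sketch of an alternative that sidesteps the
whitespace issue entirely, assuming you hold a reference to the
table: the rows and cells collections contain only TR and TD
elements, so stray text nodes cannot shift the indexes. getCell is a
hypothetical helper name.)

// Hypothetical helper using the table's rows/cells collections,
// which skip whitespace text nodes, unlike childNodes.
function getCell(table, r, c) {
  return table.rows[r].cells[c];
}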

A demo is attached. Using getElementById on IE takes 23 seconds,
compared to 0.3 seconds using childNodes. In FF the difference is
closer: 3s vs 0.3 seconds.

Click the table once to see the slow method, twice to see the quick
method.

HTH

Sam

<script type="text/javascript">
var rows = 200;
var cols = 200;

// init table
var s = [];
for (var r=0; r < rows; ++r)
{
s.push('<tr>');
for (var c= 0; c < cols; ++c)
{
s.push('<td id="c_' + r + '_' + c + '">' + 'c_' + r + '_'
+ c);
}
}
document.write('<table border=1 id=\'table\'>' + s.join('') + '<\/
table>')

var table = document.getElementById('table');
var clicked = 0;
table.onclick = function()
{
var start_r = 10; var end_r = 40;
var start_c = 10; var end_c = 40;
var t1 = new Date();
var method = '';
++clicked;
if ((clicked%2) == 1)
{
for (var r = start_r; r < end_r; ++r)
{
for (var c = start_c; c < end_c; ++c)
{
document.getElementById('c_' + r + '_' + c).innerHTML =
"Slow!";
}
}
method = "Using getDocumentById";
}
else
{
var elems = table.getElementsByTagName('TBODY');
var tbody = elems[0];
for (var r = start_r; r < end_r; ++r)
{
var row = tbody.childNodes[r];
for (var c = start_c; c < end_c; ++c)
{
row.childNodes[start_c].innerHTML = "Fast!";
}
}
method = "Using childNodes";
}
var t2 = new Date();
alert((t2.getTime() - t1.getTime())/1000 + "s");
}
</script>
 

sam.partington

> I seriously doubt that a cache like that will be much quicker than
> the browser's built-in lookup.
>
> But forget getElementById, it's meant for one-off lookups. You have
> a table, so it has a very regular form that is a piece of cake to
> navigate using the DOM. Throw away the ids as well, because they're
> a lazy way to do this sort of thing.
>
> Get your row by finding the TBODY, and find your cell using
> childNodes[row].childNodes[col].
>
> Your only problem might be if there is whitespace between the
> elements.
>
> A demo is attached. Using getElementById on IE takes 23 seconds,
> compared to 0.3 seconds using childNodes. In FF the difference is
> closer: 3s vs 0.3 seconds.

Sorry, there was a typo in what I posted. This line:

row.childNodes[start_c].innerHTML = "Fast!";

should read:

row.childNodes[c].innerHTML = "Fast!";

Sorry about that.

Sam
 
