Search DOM elements with XPath (getElementById too slow)

Discussion in 'Javascript' started by Claudio Calboni, Mar 29, 2007.

  1. Hello folks,
    I'm having some performance issues with the client-side part of my
    application.
    Basically, it renders a huge HTML table (about 20'000 cells in my
    testing scenario), without content. Content is "pushed" from the back
    via some JS, for only the displayed portion of the table. Once the user
    scrolls, JS updates the visible cells with data. It's much the same
    idea as GMaps and similar applications.

    So, the server tells JS "update this group of cells with these
    data". JS iterates (this is fast) through these instructions and pushes
    data into cells. To find which object it has to update, JS uses
    getElementById(), and this is slow. With a 20k cell table and 300
    displayed cells it takes 5 to 10 seconds to update. I suppose, but
    I'm not a veteran JS developer (mainly I develop server-side with .NET
    but I'm finding very interesting and powerful "the client side of the
    force" :D), this is because getElementById actually
    *searches* through the DOM for every element my JS is looking for, since it
    doesn't have a sort of index of elements. I've tried caching found
    elements and it works great, but the loading time is only moved from one
    place to another.

    I'm thinking about getting all the elements (obviously only those I have
    to update) via XPath, if possible (never used this tech yet). My
    script is able to say "I require cells from--to", so it would be great
    if I could extract just a snapshot of elements from the DOM, only those
    required to be updated, and then iterate through them.

    My cells (TD) are named c_R_C with R and C for row number and col
    number. With a 100x100 table and a 10x10 viewable area, say I'm
    almost in the center of the table (first visible cell, top-left
    corner with ID c_40_50 and last visible cell, bottom-right corner with
    ID c_50_60): I have to extract from the DOM the cells with row from 40 to 50
    and col from 50 to 60 (c_40_50, c_40_51, c_40_52 ... c_50_58, c_50_59,
    c_50_60).

    If, AFAIK, XPath extracts items into an iterable collection, and if
    this extraction can be done with a sort of regular expression, I think
    this is feasible.
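
    A rough sketch of what I have in mind (untested, and I gather IE6 has
    no document.evaluate, so this would be Firefox-only; XPath 1.0 also
    has no real regular expressions, so a range has to be approximated
    with string functions like starts-with):

    <script type="text/javascript">
    // untested sketch: grab all TDs whose id starts with "c_4" as a
    // snapshot, then iterate over it; this is a crude prefix match,
    // not a true from--to range
    var result = document.evaluate(
        '//td[starts-with(@id, "c_4")]',
        document,
        null,
        XPathResult.ORDERED_NODE_SNAPSHOT_TYPE,
        null);
    for (var i = 0; i < result.snapshotLength; i++) {
        var cell = result.snapshotItem(i);
        // ... push data into cell ...
    }
    </script>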

    Of course if any of you have other suggestions, that would be greatly
    appreciated.

    Thanks in advance,
    tK
     
    Claudio Calboni, Mar 29, 2007
    #1

  2. RobG

    Guest

    On Mar 29, 8:59 pm, "Claudio Calboni" <> wrote:
    > Hello folks,
    > I'm having some performance issues with the client-side part of my
    > application.
    > Basically, it renders a huge HTML table (about 20'000 cells in my
    > testing scenario), without content. Content is "pushed" from the back
    > via some JS, for only the displayed portion of the table. Once the user
    > scrolls, JS updates the visible cells with data. It's much the same
    > idea as GMaps and similar applications.


    The thought of a page with a 20,000 cell table boggles the mind.
    However, if you *really* want to do that...

    >
    > So, the server tells JS "update this group of cells with these
    > data". JS iterates (this is fast) through these instructions and pushes
    > data into cells. To find which object it has to update, JS uses
    > getElementById(), and this is slow. With a 20k cell table and 300
    > displayed cells it takes 5 to 10 seconds to update.


    I seriously doubt that getElementById is your problem. My ancient
    400MHz G3 can find a cell using getElementById in a 20,000 cell table
    in much less than half a second (manually timed). Try the following:

    <button onclick="alert(document.getElementById('td101').innerHTML);">Click</button>
    <script type="text/javascript">
    // build a 20,000 cell table as one string, ten cells per row,
    // then write it into the document in a single call
    var s = [];
    var i = 20000;
    do {
      if (!(i % 10)) s.push('<tr>');
      s.push('<td id="td' + i + '">' + 'td' + i);
    } while (--i);
    document.write('<table border=1>' + s.join('') + '<\/table>');
    </script>

    My PC takes 6 seconds with Safari and 15 seconds with Firefox to
    render the table. Every cell has an ID and content, yet both browsers
    manage to find a cell near the bottom of the table and return its
    content in a very short time.
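
    If you want a number instead of a stopwatch, a quick loop makes the
    lookup time measurable (a rough sketch, reusing the table above):

    <script type="text/javascript">
    // repeat the lookup 1,000 times so the elapsed time is large
    // enough to measure with Date()
    var t0 = new Date().getTime();
    for (var j = 0; j < 1000; j++) {
      document.getElementById('td101');
    }
    alert('1000 lookups took ' + (new Date().getTime() - t0) + ' ms');
    </script>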


    > I suppose, but
    > I'm not a veteran JS developer (mainly I develop server-side with .NET
    > but I'm finding very interesting and powerful "the client side of the
    > force" :D), this is because getElementById actually
    > *searches* through the DOM for every element my JS is looking for, since it
    > doesn't have a sort of index of elements.


    How do you know that? Have you looked at the source code for your
    browser to see how it does it?

    > I've tried caching found
    > elements and it works great, but the loading time is only moved from one
    > place to another.
    >
    > I'm thinking about getting all the elements (obviously only those I have
    > to update) via XPath, if possible (never used this tech yet). My
    > script is able to say "I require cells from--to", so it would be great
    > if I could extract just a snapshot of elements from the DOM, only those
    > required to be updated, and then iterate through them.


    The following function gets 100 cells from the above table in an
    imperceptibly longer time than getting a single cell:

    <button onclick="alert(getCellRange(1000, 1100).length);">Get one hundred</button>
    <script type="text/javascript">
    // collect a contiguous range of cells by id into an array
    function getCellRange(id0, id1) {
      var obj = [];
      for (var i = id0; i < id1; i++) {
        obj.push(document.getElementById('td' + i));
      }
      return obj;
    }
    </script>

    The primary lag is the initial loading and rendering of the HTML,
    something that XPath can't help you with. Now XPath can certainly
    help with some things, such as selecting elements by expression
    rather than sifting through them with javascript, but I doubt that
    it provides a useful replacement for getElementById.

    If you show how you are using getElementById, better help can probably
    be provided.


    --
    Rob
     
    RobG, Mar 29, 2007
    #2

  3. On 29 Mar, 17:29, "RobG" <> wrote:
    > On Mar 29, 8:59 pm, "Claudio Calboni" <> wrote:
    >
    > > Hello folks,
    > > I'm having some performance issues with the client-side part of my
    > > application.
    > > Basically, it renders a huge HTML table (about 20'000 cells in my
    > > testing scenario), without content. Content is "pushed" from the back
    > > via some JS, for only the displayed portion of the table. Once the user
    > > scrolls, JS updates the visible cells with data. It's much the same
    > > idea as GMaps and similar applications.

    >
    > The thought of a page with a 20,000 cell table boggles the mind.
    > However, if you *really* want to do that...


    Hello Rob and thank you for your support.
    It's not a matter of how big the table is. The browser renders it
    seamlessly even at that size.

    > > So, the server tells JS "update this group of cells with these
    > > data". JS iterates (this is fast) through these instructions and pushes
    > > data into cells. To find which object it has to update, JS uses
    > > getElementById(), and this is slow. With a 20k cell table and 300
    > > displayed cells it takes 5 to 10 seconds to update.

    >
    > I seriously doubt that getElementById is your problem. My ancient
    > 400MHz G3 can find a cell using getElementById in a 20,000 cell table
    > in much less than half a second (manually timed). Try the following:
    >


    It's not a single getElementById call that pushes my processor to 100% for
    some seconds, but an iteration with about 300 calls to getElementById
    AND a big table (20K cells). 300 getElementById calls against a DOM with
    only 300 cells runs very fast. And of course a single call is always fast
    for me too.

    > > I suppose, but
    > > I'm not a veteran JS developer (mainly I develop server-side with .NET
    > > but I'm finding very interesting and powerful "the client side of the
    > > force" :D), this is because getElementById actually
    > > *searches* through the DOM for every element my JS is looking for, since it
    > > doesn't have a sort of index of elements.

    >
    > How do you know that? Have you looked at the source code for your
    > browser to see how it does it?


    As I said "I suppose". Because if document's DOM is small, it's a lot
    faster. But of course I can be terribly wrong :)!

    > > I've tried caching found
    > > elements and it works great, but the loading time is only moved from one
    > > place to another.

    >
    > > I'm thinking about getting all the elements (obviously only those I have
    > > to update) via XPath, if possible (never used this tech yet). My
    > > script is able to say "I require cells from--to", so it would be great
    > > if I could extract just a snapshot of elements from the DOM, only those
    > > required to be updated, and then iterate through them.

    >
    > Now XPath can certainly
    > help with some things, such as selecting elements by expression
    > rather than sifting through them with javascript, but I doubt that
    > it provides a useful replacement for getElementById.


    Probably the XPath way is not the right way. I've made some tests and
    it is not right for me. I've tried your scripts on both IE6 and Firefox
    and, unfortunately for me, FF does a lot better (of course it is my
    primary browser). It's fast on the first call and seems to cache the
    lookup, getting even faster from the second call onwards. IE is slower
    and it stays slower. The customer's chosen browser, anyway, is IE (intranet).

    >
    > If you show how you are using getElementById, better help can probably
    > be provided.


    I've made a sort of cache, as said before, but it takes an unacceptably
    long time at startup:

    function cacheElementRef(id) {
      // look each element up once, then serve it from the cache
      var res = elementCache[id];
      if (res === void 0)
        res = elementCache[id] = document.getElementById(id);
      return res;
    }

    and I'm investigating other possibilities, but after a lot of searching
    I doubt that there is anything faster than getElementById..
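
    One thing I still want to try is filling the cache in a single pass
    over the table using the DOM 0 rows and cells collections, instead of
    one getElementById call per cell. A sketch (untested, and it assumes
    the row/column numbers in my IDs match the DOM order):

    function buildCellCache(table) {
      // one walk over the whole table; rows and cells are collections
      // available in both IE6 and Firefox
      for (var r = 0; r < table.rows.length; r++) {
        var row = table.rows[r];
        for (var c = 0; c < row.cells.length; c++) {
          elementCache['c_' + r + '_' + c] = row.cells[c];
        }
      }
    }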

    Thanks, anyway!

    tK
     
    Claudio Calboni, Mar 29, 2007
    #3
  4. Sam

    Guest

    On Mar 29, 5:11 pm, "Claudio Calboni" <> wrote:

    > > > Hello folks,
    > > > I'm having some performance issues with the client-side part of my
    > > > application.
    > > > Basically, it renders a huge HTML table (about 20'000 cells in my
    > > > testing scenario), without content. Content is "pushed" from the back
    > > > via some JS, for only the displayed portion of the table. Once the user
    > > > scrolls, JS updates the visible cells with data. It's much the same
    > > > idea as GMaps and similar applications.


    Yikes, that scares me a bit; still, if that's the best solution...

    (snipped quite a lot about getElementById being too slow when updating
    300+ cells at once)

    > > If you show how you are using getElementById, better help can probably
    > > be provided.

    >
    > I've made a sort of cache, as said before, but it takes an unacceptably
    > long time at startup:
    >
    > function cacheElementRef(id) {
    >   // look each element up once, then serve it from the cache
    >   var res = elementCache[id];
    >   if (res === void 0)
    >     res = elementCache[id] = document.getElementById(id);
    >   return res;
    > }
    >
    > and I'm investigating other possibilities, but after a lot of searching
    > I doubt that there is anything faster than getElementById..


    I seriously doubt that a cache like that will be much quicker than
    the built-in lookup.

    But forget getElementById; it's meant for one-off lookups. You have a
    table, so it has a very regular form that is a piece of cake to
    navigate using the DOM. Throw away the ids as well, because they're
    a lazy way to do this sort of thing.

    Get your row by finding the TBODY, then find your cell using
    childNodes[row].childNodes[col].

    Your only problem might be if there is whitespace between the
    elements.
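
    If whitespace text nodes do show up, a variant using the table's rows
    and cells collections (which contain only elements) should sidestep
    that; a one-line sketch:

    // rows and cells skip text nodes, unlike childNodes, so stray
    // whitespace between tags can't shift the indexes
    var cell = tbody.rows[r].cells[c];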

    A demo is attached. Using getElementById on IE takes 23 seconds,
    compared to 0.3 seconds using childNodes. In FF the difference is
    closer: 3s vs 0.3 seconds.

    Click the table once to see the slow method, twice to see the quick
    method.

    HTH

    Sam

    <script type="text/javascript">
    var rows = 200;
    var cols = 200;

    // init table: build 200x200 cells as one string and write it out
    var s = [];
    for (var r = 0; r < rows; ++r)
    {
      s.push('<tr>');
      for (var c = 0; c < cols; ++c)
      {
        s.push('<td id="c_' + r + '_' + c + '">' + 'c_' + r + '_' + c);
      }
    }
    document.write('<table border=1 id=\'table\'>' + s.join('') + '<\/table>');

    var table = document.getElementById('table');
    var clicked = 0;
    table.onclick = function()
    {
      var start_r = 10; var end_r = 40;
      var start_c = 10; var end_c = 40;
      var t1 = new Date();
      var method = '';
      ++clicked;
      if ((clicked % 2) == 1)
      {
        // slow method: one getElementById lookup per cell
        for (var r = start_r; r < end_r; ++r)
        {
          for (var c = start_c; c < end_c; ++c)
          {
            document.getElementById('c_' + r + '_' + c).innerHTML = "Slow!";
          }
        }
        method = "Using getElementById";
      }
      else
      {
        // fast method: walk down from the TBODY via childNodes
        var elems = table.getElementsByTagName('TBODY');
        var tbody = elems[0];
        for (var r = start_r; r < end_r; ++r)
        {
          var row = tbody.childNodes[r];
          for (var c = start_c; c < end_c; ++c)
          {
            row.childNodes[start_c].innerHTML = "Fast!";
          }
        }
        method = "Using childNodes";
      }
      var t2 = new Date();
      alert(method + ': ' + (t2.getTime() - t1.getTime())/1000 + "s");
    };
    </script>
     
    , Mar 29, 2007
    #4
  5. Sam

    Guest

    On Mar 29, 6:51 pm, wrote:
    > I seriously doubt that a cache like that will be much quicker than
    > the built-in lookup.
    >
    > But forget getElementById; it's meant for one-off lookups. You have a
    > table, so it has a very regular form that is a piece of cake to
    > navigate using the DOM. Throw away the ids as well, because they're
    > a lazy way to do this sort of thing.
    >
    > Get your row by finding the TBODY, then find your cell using
    > childNodes[row].childNodes[col].
    >
    > Your only problem might be if there is whitespace between the
    > elements.
    >
    > A demo is attached. Using getElementById on IE takes 23 seconds,
    > compared to 0.3 seconds using childNodes. In FF the difference is
    > closer: 3s vs 0.3 seconds.


    Sorry, there was a typo in what I posted; this line:

    > row.childNodes[start_c].innerHTML = "Fast!";


    should read:

    > row.childNodes[c].innerHTML = "Fast!";


    Sorry about that.

    Sam
     
    , Mar 30, 2007
    #5
