Xerces 2.7 vs 1.6 performance problem

Discussion in 'XML' started by Bryan, Apr 10, 2007.

  1. Hi,

    We have an application that we just upgraded to xerces-c-2_7_0-win32.
    This same application used to use xerces-c1_6_0-win32.

    We didn't change any code in our app other than the Xerces libs and
    DLLs that it uses.

    We are loading large (>20 MB) XML files using DOM (we should use SAX,
    I know). In 1.6 we can parse through the file and pull data in roughly
    10 seconds. With 2.7, this same parsing takes more than 10 _minutes_!

    I have been scouring the net looking for info, but I am not an XML
    expert, nor am I particularly familiar with Xerces.

    Can anyone offer suggestions as to where I might look, or clues as to
    what might be going on? I found some info on deferred node expansion,
    but I really don't know if this can explain the difference.

    Thanks,
    Bryan
    Bryan, Apr 10, 2007
    #1

  2. Have you tried asking on Xerces' own mailing list? That's where you're
    most likely to find folks who have current understanding of the
    internals of the parser and where possible bottlenecks might be. (My own
    best guess is that you're having a swapping problem, but it's been years
    since I looked at the Xerces-C code so I really can't advise you.)
    Joseph Kesselman, Apr 10, 2007
    #2

  3. Joseph Kesselman wrote:
    > Have you tried asking on Xerces' own mailing list? That's where you're
    > most likely to find folks who have current understanding of the
    > internals of the parser and where possible bottlenecks might be. (My own
    > best guess is that you're having a swapping problem, but it's been years
    > since I looked at the Xerces-C code so I really can't advise you.)


    Didn't try the mailing list yet. I hate those things: you get spammed
    with a load of emails and they are a pain to subscribe to.

    But I think I will have no choice but to give it a go soon...
    Bryan, Apr 10, 2007
    #3
  4. > Didn't try the mailing list yet. I hate those things: you get spammed
    > with a load of emails and they are a pain to subscribe to.


    Apache's mailing lists are almost completely spam-free, in my
    experience. If you need expertise specifically about Apache code, they
    really are the best place to find it.


    --
    () ASCII Ribbon Campaign | Joe Kesselman
    /\ Stamp out HTML e-mail! | System architexture and kinetic poetry
    Joe Kesselman, Apr 11, 2007
    #4
  5. Hi,

    Bryan writes:

    > Can anyone offer suggestions as to where I might look, or clues as to
    > what might be going on?


    It is hard to say what exactly is causing this without seeing the
    code. My guess is that in order to support requirements of future
    DOM versions (e.g., DOM level 3), the implementation has changed
    and become less efficient. Here is a blog post about two DOM API
    functions that can slow things down significantly:

    http://www.codesynthesis.com/~boris/blog/2006/11/28/xerces-c-dom-potholes/


    Also, the Xerces-C++ mailing list is a better place for this kind of
    question.


    hth,
    -boris


    --
    Boris Kolpackov
    Code Synthesis Tools CC
    http://www.codesynthesis.com
    Open-Source, Cross-Platform C++ XML Data Binding
    Boris Kolpackov, Apr 11, 2007
    #5
  6. Boris Kolpackov wrote:
    > It is hard to say what exactly is causing this without seeing the
    > code. My guess is that in order to support requirements of future
    > DOM versions (e.g., DOM level 3), the implementation has changed
    > and become less efficient.


    If you can supply samples to the Xerces developers, I'm sure they'll be
    interested in investigating what has changed and improving it if they can.

    Apropos of
    http://www.codesynthesis.com/~boris/blog/2006/11/28/xerces-c-dom-potholes/
    ... For years, I've been telling people that the semantics of NodeLists,
    specifically the "live view" behavior, are a set of bugs and performance
    disasters waiting to happen. The DOM Level 2 Traversal chapter provides
    alternatives that can be implemented much more efficiently... or, as
    suggested on the website, you can switch to explicit traversal.
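
    To illustrate why indexed access through a "live" list can cost so much
    more than explicit traversal, here is a minimal sketch. It is a toy
    model and not Xerces code: Node, LiveNodeList, Tree, makeTree,
    sumIndexed and sumSiblings are all hypothetical names, and the list is
    assumed to keep no cache, so each call re-walks the sibling chain. In
    Xerces-C++ the corresponding calls would be DOMNodeList::item() and
    getLength() versus DOMNode::getFirstChild() and getNextSibling(); a
    real implementation may cache some of this, so treat the step counts as
    a worst case.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy node, NOT a Xerces class: children form a singly linked sibling chain.
struct Node {
    int value = 0;
    Node* firstChild = nullptr;
    Node* nextSibling = nullptr;
};

// Hypothetical "live" list: it keeps no snapshot or cursor, so every call
// walks the sibling chain again from the first child.
struct LiveNodeList {
    const Node* parent;

    std::size_t length(std::size_t& steps) const {
        std::size_t len = 0;
        for (const Node* n = parent->firstChild; n; n = n->nextSibling) {
            ++len;
            ++steps;
        }
        return len;
    }

    const Node* item(std::size_t index, std::size_t& steps) const {
        const Node* n = parent->firstChild;
        for (std::size_t k = 0; n && k < index; ++k) {
            n = n->nextSibling;
            ++steps;
        }
        return n;
    }
};

// Owns a parent and its children. Copying is disabled because the nodes
// point into 'storage'; moving keeps those pointers valid.
struct Tree {
    Node parent;
    std::vector<Node> storage;
    Tree() = default;
    Tree(const Tree&) = delete;
    Tree(Tree&&) = default;
};

// Builds a parent with 'count' children holding the values 1..count.
Tree makeTree(std::size_t count) {
    Tree t;
    t.storage.resize(count);
    for (std::size_t i = 0; i < count; ++i) {
        t.storage[i].value = static_cast<int>(i + 1);
        t.storage[i].nextSibling = (i + 1 < count) ? &t.storage[i + 1] : nullptr;
    }
    t.parent.firstChild = count ? &t.storage[0] : nullptr;
    return t;
}

// Indexed loop over the live list: quadratic in the number of children,
// because length() and item(i) each restart from the first child.
long sumIndexed(const Node& parent, std::size_t& steps) {
    LiveNodeList list{&parent};
    long sum = 0;
    for (std::size_t i = 0; i < list.length(steps); ++i) {
        const Node* child = list.item(i, steps);
        assert(child != nullptr);
        sum += child->value;
    }
    return sum;
}

// Explicit traversal: one step per child, linear overall.
long sumSiblings(const Node& parent, std::size_t& steps) {
    long sum = 0;
    for (const Node* n = parent.firstChild; n; n = n->nextSibling) {
        sum += n->value;
        ++steps;
    }
    return sum;
}
```

    Calling both functions on makeTree(1000).parent with separate step
    counters gives the same sum, but the indexed loop follows on the order
    of a million sibling links where the explicit loop follows a thousand.
    With tens of megabytes of XML, that kind of gap is the sort of thing
    that could turn seconds into minutes.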


    --
    Joe Kesselman / Beware the fury of a patient man. -- John Dryden
    Joseph Kesselman, Apr 11, 2007
    #6
