searching and storing large quantities of xml!

Discussion in 'C Programming' started by dads, Jan 16, 2010.

  1. dads

    dads Guest

    I work in 1st line support and Python is one of my hobbies. We get
    quite a few requests for XML from our website and it's a long,
    drawn-out process. So I thought I'd try to create a system that deals
    with it for fun.

    I've been tidying up the archived XML and have been wondering what the
    best approach is, as it takes a long time to deal with big quantities
    of XML. Say you have 5 or 6 years' worth, at 26000+ XML files of 5-20k
    each per year. The archived stuff is zipped, but which is better: 26000
    files in one big zip file, 26000 files in one big zip file but in
    folders for months and days, or zip files inside zip files?

    I created an app in wxPython that searches the unzipped XML files by
    modified date, opens each one, and checks it with something like
    l.find('>%s<' % fiveDigitNumber) != -1: is this quicker than parsing
    the xml?

    Generally the requests are less than 3 months old, so that got me
    thinking: should I create a script that finds all the file names and
    corresponding web numbers of the old XML and bungs them into a db
    table, one for each year, plus another script that archives the XML at
    the end of every day and, after 3 months, zips it up, bungs the info
    into the table, etc.? Sorry for the ramble, I just want other people's
    opinions on the matter. =)
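
    As a rough illustration of the "file names and web numbers in a db
    table" idea, here is a minimal sketch using sqlite3 from the standard
    library. The table name xml_index, the functions build_index and
    find_files, and the assumption that a web number is any five-digit
    value sitting directly between two tags are all made up for this
    example; it uses a single table rather than one per year.

```python
import os
import re
import sqlite3

# Assumption: a web number is five digits directly between tags, e.g. >12345<.
NUMBER_RE = re.compile(r">(\d{5})<")


def build_index(conn, xml_dir):
    """Record (web number, file path, modified time) for every .xml file in xml_dir."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS xml_index ("
        "web_number TEXT, path TEXT, modified REAL)"
    )
    for name in sorted(os.listdir(xml_dir)):
        if not name.lower().endswith(".xml"):
            continue
        path = os.path.join(xml_dir, name)
        with open(path, encoding="utf-8", errors="replace") as handle:
            text = handle.read()
        modified = os.path.getmtime(path)
        # One row per distinct five-digit number found in the file.
        rows = [(number, path, modified) for number in set(NUMBER_RE.findall(text))]
        conn.executemany("INSERT INTO xml_index VALUES (?, ?, ?)", rows)
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_web_number ON xml_index (web_number)"
    )
    conn.commit()


def find_files(conn, web_number):
    """Return the paths of all indexed files that contain web_number."""
    cursor = conn.execute(
        "SELECT path FROM xml_index WHERE web_number = ? ORDER BY path",
        (web_number,),
    )
    return [row[0] for row in cursor]
```

    With an index like this, a request becomes one lookup by web number
    followed by opening only the matching files, instead of scanning every
    file each time.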
     
    dads, Jan 16, 2010
    #1

  2. On 01/15/2010 04:30 PM, dads wrote:
    > l.find('>%s<' % fiveDigitNumber) != -1: is this quicker than parsing
    > the xml?


    I would bet it is quicker, since searching for a substring is an easier
    task than parsing XML. But you could try it both ways and find out.
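
    One way to "try it both ways" is to time the two approaches with the
    standard timeit module. This is only a sketch: the SAMPLE document and
    the function names are invented for the example, since the thread does
    not show what the real files look like, and the numbers will vary with
    file size and machine.

```python
import timeit
import xml.etree.ElementTree as ET

# Hypothetical sample document; the real file layout is not shown in the thread.
SAMPLE = "<order><customer>Someone</customer><webnumber>12345</webnumber></order>"
NUMBER = "12345"


def found_by_substring(text, number):
    """Plain substring test, as in the original post."""
    return text.find(">%s<" % number) != -1


def found_by_parsing(text, number):
    """Parse the document and compare the text of every element."""
    root = ET.fromstring(text)
    return any((el.text or "").strip() == number for el in root.iter())


if __name__ == "__main__":
    for func in (found_by_substring, found_by_parsing):
        seconds = timeit.timeit(lambda: func(SAMPLE, NUMBER), number=10000)
        print("%s: %.4f s for 10000 runs" % (func.__name__, seconds))
```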

    For followups, try comp.lang.python or comp.programming.

    -Beej
     
    Beej Jorgensen, Jan 16, 2010
    #2

  3. dads

    dads Guest

    On Jan 16, 6:28 am, Beej Jorgensen <> wrote:
    > On 01/15/2010 04:30 PM, dads wrote:
    >
    > > l.find('>%s<' % fiveDigitNumber) != -1: is this quicker than parsing
    > > the xml?

    >
    > I would bet it is quicker, since searching for a substring is an easier
    > task than parsing XML.  But you could try it both ways and find out.
    >
    > For followups, try comp.lang.python or comp.programming.
    >
    > -Beej


    Sorry folks - just realised where I've posted. Thx Beej!
     
    dads, Jan 16, 2010
    #3
  4. In article <hirmao$5e6$-september.org>,
    Beej Jorgensen <> wrote:

    >I would bet it is quicker, since searching for a substring is an easier
    >task than parsing XML.


    Yes, provided you're sure that the XML file will have exactly that
    string. But from an XML point of view, "abc" and "&#97;bc" are
    the same, so you'd better be sure that there isn't going to be
    any unexpected escaping of that kind.
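
    A small sketch of the escaping problem, using a numeric character
    reference in a made-up <ref> element (the element name and the sample
    strings are assumptions for illustration only). The substring test
    misses the escaped form, while a parser decodes it before comparing:

```python
import xml.etree.ElementTree as ET

# Two documents that are equivalent to an XML parser.
plain = "<ref>12345</ref>"
escaped = "<ref>1&#50;345</ref>"  # &#50; is the character reference for '2'


def substring_match(text, number):
    """Raw text search: only sees the literal characters in the file."""
    return ">%s<" % number in text


def parsed_match(text, number):
    """Parse first, so character references are decoded before comparing."""
    root = ET.fromstring(text)
    return any((el.text or "").strip() == number for el in root.iter())


if __name__ == "__main__":
    for label, doc in (("plain", plain), ("escaped", escaped)):
        print(label, substring_match(doc, "12345"), parsed_match(doc, "12345"))
```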

    -- Richard
    --
    Please remember to mention me / in tapes you leave behind.
     
    Richard Tobin, Jan 16, 2010
    #4
