RE: Program inefficiency?

Discussion in 'Python' started by Michael.Coll-Barth@VerizonWireless.com, Sep 29, 2007.

  1. Guest


    > -----Original Message-----
    > From:
    > the program works great except for one thing. It's significantly
    > slower through the later files in the search than through the early
    > ones... Before anyone criticizes, I recognize that that middle section
    > could be simplified with a for loop... I just haven't cleaned it
    > up...
    >
    > The problem is that the first 300 files take about 10-15 seconds and
    > the last 300 take about 2 minutes... If we do more than about 1500
    > files in one run, it just hangs up and never finishes...
    >
    > Is there a solution here that I'm missing? What am I doing that is so
    > inefficient?


    You did not mention the OS, but because you are using
    "pathname\editfile.txt", it sounds like you are using an MS OS. From
    past experience with various MS OSes, I found that as the number of
    files in a directory increases, your process runs slower for each
    file. You might try putting these files into multiple sub-directories.
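
    Not the original code, but just to illustrate the "split into
    sub-directories" idea, here is a minimal sketch (the source path and
    the bucket count below are made up):

    import os
    import shutil

    SRC_DIR = r"C:\projects\helpfiles"   # hypothetical source directory
    NUM_BUCKETS = 16                     # arbitrary number of buckets

    def bucket_files(src_dir, num_buckets):
        """Spread the files in src_dir across num_buckets sub-directories."""
        for name in os.listdir(src_dir):
            path = os.path.join(src_dir, name)
            if not os.path.isfile(path):
                continue
            # Pick a bucket from a hash of the file name (any even spread works).
            bucket = "bucket_%02d" % (hash(name) % num_buckets)
            dest_dir = os.path.join(src_dir, bucket)
            if not os.path.isdir(dest_dir):
                os.makedirs(dest_dir)
            shutil.move(path, os.path.join(dest_dir, name))

    if __name__ == "__main__":
        bucket_files(SRC_DIR, NUM_BUCKETS)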


    , Sep 29, 2007
    #1

  2. Guest

    XP is the OS... the files are split across a ton of subdirectories
    already...

    I'm actually starting to think there's a problem with certain files,
    however...

    We create help files for clients using RoboHelp... RoboHelp has Source
    HTML and then "webhelp" HTML, which is what actually goes to the
    client... I'm trying to do mass maintenance on the "source" files...
    Right now, my program works, but you've got to delete the webhelp
    files first... I figured (based on the exponential growth in
    processing time) that it was the additional number of files...
    However, after streamlining the code, I got the following results:

    done 300
    4.1904767226e-006
    done 600
    7.97062280262
    done 900
    22.3963802662
    done 1200
    29.9211888662
    done 1375
    35.3465962853

    with the webhelp deleted and

    done 300
    4.1904767226e-006
    done 600
    7.6259175398
    done 900
    13.3994678095
    still processing 10 minutes later

    with the webhelp intact

    Since the system didn't hang at some point after 1375 (and in fact, it
    still hasn't made it there), I can only assume that it hit one of the
    webhelp files and freaked out...

    The thing that's really weird is that the files it's hanging on appear
    to be some of the most basic files in the whole system (small, not a
    lot going on... no hits on the RE search)... So I may just tell the
    users to delete the webhelp and have RoboHelp recreate it after
    they've run the program...
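
    (An alternative to deleting the webhelp might be to skip it during the
    directory walk, if it all lives under a recognizably named folder.
    Just a sketch -- the "WebHelp" / "!SSL!" folder names below are
    guesses, not something I've verified against RoboHelp:)

    import os

    ROOT = r"C:\projects\helpfiles"        # hypothetical project root
    SKIP_DIRS = set(["WebHelp", "!SSL!"])  # assumed names of generated output

    def source_html_files(root, skip_dirs):
        """Yield .htm/.html files under root, skipping generated output dirs."""
        for dirpath, dirnames, filenames in os.walk(root):
            # Pruning dirnames in place stops os.walk from descending into them.
            dirnames[:] = [d for d in dirnames if d not in skip_dirs]
            for name in filenames:
                if name.lower().endswith((".htm", ".html")):
                    yield os.path.join(dirpath, name)

    if __name__ == "__main__":
        for path in source_html_files(ROOT, SKIP_DIRS):
            print(path)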
    , Sep 29, 2007
    #2

  3. stdazi Guest

    On Sep 29, 6:07 pm, wrote:

    > You did not mention the OS, but because you are using
    > "pathname\editfile.txt", it sounds like you are using an MS OS. From
    > past experience with various MS OSes, I found that as the number of
    > files in a directory increases, your process runs slower for each
    > file.


    how so?
    stdazi, Sep 29, 2007
    #3
  4. thebjorn Guest

    On Sep 29, 9:32 pm, stdazi <> wrote:
    > On Sep 29, 6:07 pm, wrote:
    >
    > > You did not mention the OS, but because you are using
    > > "pathname\editfile.txt", it sounds like you are using an MS OS. From
    > > past experience with various MS OSes, I found that as the number of
    > > files in a directory increases, your process runs slower for each
    > > file.

    >
    > how so?


    Not entirely sure why, but some of the MS docs allude to the fact that
    there is a linked list involved (at least for FAT-style disks).
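
    A crude way to see whether directory size matters on a given machine is
    to time a fixed lookup while the directory fills up. Rough sketch only
    (the scratch path is made up, and OS caching can easily mask the
    effect):

    import os
    import time

    SCRATCH = r"C:\temp\dirtest"    # hypothetical scratch directory
    STEPS = [100, 1000, 5000]       # directory sizes to try

    def time_lookups(scratch, steps, probes=200):
        """Time os.stat() on one file as its directory grows."""
        if not os.path.isdir(scratch):
            os.makedirs(scratch)
        target = os.path.join(scratch, "target.txt")
        open(target, "w").close()
        created = 0
        for total in steps:
            # Pad the directory out to 'total' entries.
            while created < total:
                name = "filler_%06d.txt" % created
                open(os.path.join(scratch, name), "w").close()
                created += 1
            start = time.time()
            for _ in range(probes):
                os.stat(target)
            elapsed = time.time() - start
            print("%6d files: %.6f s for %d stats" % (total, elapsed, probes))

    if __name__ == "__main__":
        time_lookups(SCRATCH, STEPS)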

    -- bjorn
    thebjorn, Sep 29, 2007
    #4
  5. Guest


    > -----Original Message-----
    > From: stdazi
    > On Sep 29, 6:07 pm, wrote:
    >
    > > You did not mention the OS, but because you are using
    > > "pathname\editfile.txt", it sounds like you are using an MS OS. From
    > > past experience with various MS OSes, I found that as the number of
    > > files in a directory increases, your process runs slower for each
    > > file.

    >
    > how so?
    >


    I said "sounds like", which means I was guessing. In *nix ( the ones I
    know ), it would have been "pathname/editfile.txt". But, it was the
    file extension that caught my eye. I had a project a while back that
    required a mass conversion of jpeg files and ran into what sounded like
    a similar problem.


    , Sep 29, 2007
    #5
  6. Guest


    > -----Original Message-----
    > From: thebjorn
    >
    > On Sep 29, 9:32 pm, stdazi <> wrote:
    > > On Sep 29, 6:07 pm, wrote:
    > >
    > > > You did not mention the OS, but because you are using
    > > > "pathname\editfile.txt", it sounds like you are using an MS OS. From
    > > > past experience with various MS OSes, I found that as the number of
    > > > files in a directory increases, your process runs slower for each
    > > > file.

    > >
    > > how so?

    >
    > Not entirely sure why, but some of the MS docs allude to the fact that
    > there is a linked list involved (at least for FAT-style disks).
    >


    Wow! Talk about defending oneself (me). I took stdazi's question
    to mean why I thought it was an MS OS issue. And yes, you are correct
    about the linked list. Although I do not know if that was the reason
    for my problem.


    , Sep 29, 2007
    #6
  7. Dimiter "malkia" Stanev Guest

    Sorry for intruding here,

    But one inefficiency of Windows XP (not of NTFS in general) is that
    NTFS must generate 8.3 names (for old DOS/Win3.1/Win95/Win98/WinME
    compatibility).

    Generating such a name can slow down the system, as the name must be
    unique, and finding a unique name can be slow.

    Of course, this should only affect CREATION, not SEARCH (although with
    8.3 support enabled you might end up with twice as many directory
    entries, if all your names are longer than 8.3).

    Here's more:

    http://www.windowsdevcenter.com/pub/a/windows/2005/02/08/NTFS_Hacks.html
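
    For what it's worth, the 8.3-name-creation switch is just a registry
    value and can be read harmlessly. A minimal read-only sketch with the
    standard _winreg module ("winreg" on Python 3):

    import _winreg

    KEY_PATH = r"SYSTEM\CurrentControlSet\Control\FileSystem"
    VALUE = "NtfsDisable8dot3NameCreation"

    key = _winreg.OpenKey(_winreg.HKEY_LOCAL_MACHINE, KEY_PATH)
    try:
        disabled, _type = _winreg.QueryValueEx(key, VALUE)
    finally:
        _winreg.CloseKey(key)

    if disabled:
        print("8.3 short-name creation is disabled")
    else:
        print("8.3 short-name creation is enabled (the default)")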

    Thanks,
    Dimiter "malkia" Stanev.

    wrote:
    >
    >
    >> -----Original Message-----
    >> From:
    >> the program works great except for one thing. It's significantly
    >> slower through the later files in the search than through the early
    >> ones... Before anyone criticizes, I recognize that that middle section
    >> could be simplified with a for loop... I just haven't cleaned it
    >> up...
    >>
    >> The problem is that the first 300 files take about 10-15 seconds and
    >> the last 300 take about 2 minutes... If we do more than about 1500
    >> files in one run, it just hangs up and never finishes...
    >>
    >> Is there a solution here that I'm missing? What am I doing that is so
    >> inefficient?

    >
    > You did not mention the OS, but because you are using
    > "pathname\editfile.txt", it sounds like you are using an MS OS. From
    > past experience with various MS OSes, I found that as the number of
    > files in a directory increases, your process runs slower for each
    > file. You might try putting these files into multiple sub-directories.
    Dimiter \malkia\ Stanev, Oct 1, 2007
    #7
