Big speed boost in os.walk in Python 2.5

Discussion in 'Python' started by looping, Oct 13, 2006.

  1. looping

    Hi,
    I noticed a big speed improvement in some of my scripts that use os.walk,
    and I wrote a small script to check it:

    import os
    for path, dirs, files in os.walk('D:\\FILES\\'):
        pass

    Results on Windows XP after a few runs to fill the disk cache (with
    ~59000 files and ~3500 folders):
    Python 2.4.3 : 45s
    Python 2.5 : 10s
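
A minimal sketch of how timings like these could be collected, assuming a hypothetical helper named `time_walk` (not part of the original script). It wraps the same empty `os.walk` loop in a wall-clock timer and also counts entries, so the file and folder totals can be checked against the figures quoted above.

```python
import os
import time


def time_walk(root):
    """Walk `root` once and return (elapsed_seconds, n_dirs, n_files).

    Hypothetical helper for illustration; the original test script
    only ran the bare loop with `pass`.
    """
    n_dirs = 0
    n_files = 0
    start = time.time()
    for path, dirs, files in os.walk(root):
        # Count entries so the size of the tree is reported with the time.
        n_dirs += len(dirs)
        n_files += len(files)
    elapsed = time.time() - start
    return elapsed, n_dirs, n_files
```

Calling `time_walk('D:\\FILES\\')` a few times under each interpreter, and discarding the first run, would roughly reproduce the warm-cache comparison described here.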

    Very nice, but somewhat strange...
    Is Python 2.4.3 os.walk buggy?
    Are these results only valid on Windows, or do *nix systems show the same
    difference?
    The profiler shows that most of the time is spent in ntpath.isdir, and this
    function is *a lot* faster in Python 2.5.
    Maybe this improvement could be backported to the Python 2.4 branch for the
    next release?
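
For readers who want to repeat the measurement, here is a rough sketch of profiling a single `os.walk` pass. It assumes a modern Python 3 interpreter and uses `cProfile` with `pstats` rather than the older `profile` module and `execfile` seen in the output below; the function name `profile_walk` is made up for this example.

```python
import cProfile
import io
import os
import pstats


def profile_walk(root):
    """Profile one full os.walk() pass over `root` and return the report text.

    Hypothetical helper for illustration only.
    """
    profiler = cProfile.Profile()
    profiler.enable()
    for path, dirs, files in os.walk(root):
        pass
    profiler.disable()

    # Render the 15 most expensive entries (by cumulative time) to a string.
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(15)
    return out.getvalue()
```

On Python 3.5 and later, `os.walk` is built on `os.scandir`, so the report will not show the per-entry `isdir` and `stat` calls that dominate the 2.4 and 2.5 profiles.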


    Python 2.4.3:
    604295 function calls (587634 primitive calls) in 48.629 CPU seconds

    Ordered by: standard name

    ncalls      tottime  percall  cumtime  percall  filename:lineno(function)
    62554         0.264    0.000    0.264    0.000  :0(append)
    1             0.001    0.001   48.593   48.593  :0(execfile)
    66074         0.197    0.000    0.197    0.000  :0(len)
    3521          5.219    0.001    5.219    0.001  :0(listdir)
    1             0.036    0.036    0.036    0.036  :0(setprofile)
    62554        38.812    0.001   38.812    0.001  :0(stat)
    1             0.000    0.000   48.593   48.593  <string>:1(?)
    66074         0.218    0.000    0.218    0.000  ntpath.py:116(splitdrive)
    3520          0.009    0.000    0.009    0.000  ntpath.py:246(islink)
    62554         0.767    0.000   40.137    0.001  ntpath.py:268(isdir)
    66074         0.433    0.000    0.650    0.000  ntpath.py:51(isabs)
    66074         0.880    0.000    1.726    0.000  ntpath.py:59(join)
    20183/3522    1.217    0.000   48.573    0.014  os.py:211(walk)
    1             0.000    0.000   48.629   48.629  profile:0(execfile('test.py'))
    0             0.000             0.000           profile:0(profiler)
    62554         0.174    0.000    0.174    0.000  stat.py:29(S_IFMT)
    62554         0.385    0.000    0.559    0.000  stat.py:45(S_ISDIR)
    1             0.019    0.019   48.592   48.592  test.py:1(?)


    Python 2.5:
    604295 function calls (587634 primitive calls) in 17.386 CPU seconds

    Ordered by: standard name

    ncalls      tottime  percall  cumtime  percall  filename:lineno(function)
    62554         0.247    0.000    0.247    0.000  :0(append)
    1             0.001    0.001   17.315   17.315  :0(execfile)
    66074         0.168    0.000    0.168    0.000  :0(len)
    3521          5.287    0.002    5.287    0.002  :0(listdir)
    1             0.071    0.071    0.071    0.071  :0(setprofile)
    62554         7.812    0.000    7.812    0.000  :0(stat)
    1             0.000    0.000   17.315   17.315  <string>:1(<module>)
    66074         0.186    0.000    0.186    0.000  ntpath.py:116(splitdrive)
    3520          0.009    0.000    0.009    0.000  ntpath.py:245(islink)
    62554         0.712    0.000    9.013    0.000  ntpath.py:267(isdir)
    66074         0.394    0.000    0.581    0.000  ntpath.py:51(isabs)
    66074         0.815    0.000    1.564    0.000  ntpath.py:59(join)
    20183/3522    1.176    0.000   17.296    0.005  os.py:218(walk)
    1             0.000    0.000   17.386   17.386  profile:0(execfile('test.py'))
    0             0.000             0.000           profile:0(profiler)
    62554         0.159    0.000    0.159    0.000  stat.py:29(S_IFMT)
    62554         0.331    0.000    0.489    0.000  stat.py:45(S_ISDIR)
    1             0.018    0.018   17.314   17.314  test.py:1(<module>)
    looping, Oct 13, 2006
    #1

  2. looping wrote:

    > Results on Windows XP after a few runs to fill the disk cache (with
    > ~59000 files and ~3500 folders):
    > Python 2.4.3 : 45s
    > Python 2.5 : 10s
    >
    > Very nice, but somewhat strange...
    > Is Python 2.4.3 os.walk buggy?


    No. A few "os" functions are now implemented in terms of Windows APIs,
    instead of using Microsoft C's POSIX compatibility layer. This includes
    os.stat(), which is what isdir() uses to check if something is a
    directory. The code was rewritten to work around problems with
    timestamps, so the speedup is purely a side effect.
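
To make the connection between `isdir()` and `os.stat()` concrete, here is a simplified sketch of a stat-based directory check. It mirrors the call pattern visible in the profiles (one `stat` call and one `S_ISDIR` call per path), but it is an illustration written for this thread, not a copy of the `ntpath` source; the name `isdir_via_stat` is hypothetical.

```python
import os
import stat


def isdir_via_stat(path):
    """Return True if `path` is an existing directory.

    Illustrative only: one os.stat() call per path, then a mode check.
    """
    try:
        st = os.stat(path)
    except OSError:
        # Missing or inaccessible paths are simply "not a directory".
        return False
    return stat.S_ISDIR(st.st_mode)
```

Because every directory entry goes through one such `os.stat()` call, any change in the cost of `os.stat()` shows up almost directly in the total walk time.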

    > Are these results only valid on Windows, or do *nix systems show the same
    > difference?


    On Unix systems, Python uses POSIX APIs, not Windows APIs.

    > The profiler shows that most of the time is spent in ntpath.isdir, and this
    > function is *a lot* faster in Python 2.5.


    Why are you asking if something's buggy when you've already figured out
    what's been improved?

    > Maybe this improvement could be backported to the Python 2.4 branch for the
    > next release?


    It's not really broken, so that's not very likely.

    </F>
    Fredrik Lundh, Oct 13, 2006
    #2

  3. looping

    Fredrik Lundh wrote:
    > looping wrote:
    > >
    > > Very nice, but somewhat strange...
    > > Is Python 2.4.3 os.walk buggy?
    >
    > Why are you asking if something's buggy when you've already figured out
    > what's been improved?

    You're right, buggy isn't the right word...

    Anyway, thanks for your detailed information. I'm very pleased with
    the performance improvement, even if it's only a side effect and only on
    Windows.
    looping, Oct 13, 2006
    #3
  4. looping wrote:
    > Maybe this improvement could be backported to the Python 2.4 branch for the
    > next release?


    As Fredrik explains, this is probably the side-effect of a from-scratch
    rewrite of the relevant functions. Another (undesirable) side-effect is
    that the resulting binary won't work on Windows 95 anymore. So
    backporting it as-is is out of the question.

    However, even if the patch was improved to still work on W9x, and to not
    introduce the other behavioral changes that came with the rewrite, it
    still couldn't go into 2.4.x. Likely, 2.4.4 is the final 2.4 release,
    and the release candidate for that was already produced.

    Regards,
    Martin
    Martin v. Löwis, Oct 13, 2006
    #4
