Bengt Richter said:
You ought to be able to gain a little by hoisting the os.path.xxx
attribute lookups for join and isdir out of the loop. E.g., (not tested)
opj=os.path.join; oisd=os.path.isdir
[opj(path, p) for p in os.listdir(path) if oisd(opj(path, p))]
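For what it's worth, here is a minimal sketch of the same hoisting idea written as a function. The name subdirs is made up for illustration, and building the joined path once per entry (instead of twice) is an added tweak, not part of the one-liner above.

```python
import os

def subdirs(path):
    """Return the full paths of the immediate subdirectories of path."""
    # Bind the attribute lookups to local names once, outside the loop.
    join = os.path.join
    isdir = os.path.isdir
    result = []
    for name in os.listdir(path):
        full = join(path, name)  # joined once, reused for the test and the result
        if isdir(full):
            result.append(full)
    return result
```

Whether the saved lookups matter in practice depends on how large the directory is; the stat call behind isdir is likely to dominate.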
But it seems like you are asking the os to chase through full paths at
every isdir operation, rather than just telling it to make its current working
directory the directory you are interested in and doing it there. E.g., (untested)
savedir = os.getcwd()
os.chdir(path)
dirs = [opj(path, p) for p in os.listdir('.') if oisd(p)]
os.chdir(savedir)
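A slightly more defensive sketch of the chdir approach, assuming you want the original working directory restored even if listdir or isdir raises. The function name subdirs_via_chdir is hypothetical. Note that chdir changes process-wide state, so this is not safe to use from multiple threads.

```python
import os

def subdirs_via_chdir(path):
    """List subdirectories of path by testing bare names from inside it."""
    join = os.path.join
    isdir = os.path.isdir
    savedir = os.getcwd()
    os.chdir(path)
    try:
        # isdir() now gets short relative names, not full paths.
        return [join(path, p) for p in os.listdir('.') if isdir(p)]
    finally:
        # Always go back to where we started, even on error.
        os.chdir(savedir)
```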
With 1000 files in the directory:
# 1) Original using '.'
/usr/lib/python2.3/timeit.py -s 'import os; path="."' \
'[os.path.join(path, p) for p in os.listdir(path) if os.path.isdir(os.path.join(path, p))]'
10 loops, best of 3: 2.69e+04 usec per loop
# 2) Original with long path
/usr/lib/python2.3/timeit.py -s 'import os;
path="/tmp/a/b/c/d/e/f/g/h/i/j/k/l/m/n/o/p/q/r/s/t/u/v/w/x/y/z"' \
'[os.path.join(path, p) for p in os.listdir(path) if os.path.isdir(os.path.join(path, p))]'
10 loops, best of 3: 4.16e+04 usec per loop
# 3) Using cwd
/usr/lib/python2.3/timeit.py -s 'import os; path="."' \
'[os.path.join(path, p) for p in os.listdir(".") if os.path.isdir(p)]'
100 loops, best of 3: 1.85e+04 usec per loop
Even if so, I doubt the os finds it via a hash of the full path instead
of checking that every element of the path exists and is a subdirectory.
IWT that could be a dangerous short cut.
It is. Linux will look through each path entry. However, they will be
hot in the dcache, so it doesn't take much time, hence the relatively
small difference between 1) and 2).
I expect the main difference between 1) and 3) is the fact that 3) contains
one fewer os.path.join() call per entry:
/usr/lib/python2.3/timeit.py -s 'import os;' 'os.path.join("a", "b")'
100000 loops, best of 3: 7.34 usec per loop
It's executed 1000 times above, which is 7340 usec. The difference
between 1) and 3) is 8400 usec - pretty close!
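The arithmetic behind that comparison, spelled out as a small sketch. The numbers are the timings reported above; the variable names are just for illustration.

```python
# Timings reported above, in microseconds per loop.
loop1_usec = 2.69e4   # 1) original, path="."
loop3_usec = 1.85e4   # 3) using cwd
join_usec = 7.34      # one os.path.join("a", "b") call
entries = 1000        # directory entries, one saved join per entry

measured_gap = loop1_usec - loop3_usec   # observed difference between 1) and 3)
predicted_gap = entries * join_usec      # cost of the saved join calls

print("measured gap:  %.0f usec" % measured_gap)
print("predicted gap: %.0f usec" % predicted_gap)
print("unexplained:   %.0f usec" % (measured_gap - predicted_gap))
```

So the saved join calls would account for most of the gap, with roughly 1000 usec left over for everything else.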