os.listdir

Hari

Hi,

I have a question regarding os.listdir().


I wanted to know if this is a reliable way to check for files added to
a directory:

files = os.listdir(path)

currentlen = len(files)

# directory modified

files = os.listdir(path)

newlen = len(files)

Then the files at indices currentlen to (newlen - 1) are all the
files added to the directory.

What I wanted to know was: is it guaranteed that, between two calls to
os.listdir, any files added to the directory are appended and the
earlier order is maintained?

thanks,

Hari
 
Graham Fawcett

Hari said:
Hi,

I have a question regarding os.listdir().


I wanted to know if this is a reliable way to check for files added to
a directory:

files = os.listdir(path)

currentlen = len(files)

# directory modified

files = os.listdir(path)

newlen = len(files)

Then the files at indices currentlen to (newlen - 1) are all the
files added to the directory.

What I wanted to know was: is it guaranteed that, between two calls to
os.listdir, any files added to the directory are appended and the
earlier order is maintained?

Unless the documentation for os.listdir specifies such behaviour
explicitly, you should assume the answer is /no/. There are many
Pythons, running on many platforms; don't assume that they all share
identical unspecified behaviour.

There's no need for such sequence guarantees, though. It's probably not
the most efficient but here's a way to do it:

import os
import time

somedir = '/tmp'
snapshot1 = os.listdir(somedir)
time.sleep(...)
snapshot2 = os.listdir(somedir)

newfiles = [f for f in snapshot2 if f not in snapshot1]

Pretty efficient, really: that's an O(n) comparison if I remember my
Python internals correctly. Which I don't, so don't trust my word for
it. ;-)
-- Graham
 
Jeremy Jones

* Hari ([email protected]) wrote:
What I wanted to know was: is it guaranteed that, between two calls to
os.listdir, any files added to the directory are appended and the
earlier order is maintained?

Even if the module does guarantee this to you, you probably don't want to
rely on just the length of the list (or on its order). You probably
want to build yourself a dictionary and keep track of the files that way. One
reason is that you may not be able to guarantee that a file didn't get
deleted from the directory. Even if you think you are in total control
of that filesystem, I still wouldn't trust that no files would be deleted.
I dunno, maybe I'm just paranoid. Further, I'd let the filename serve as the
dictionary key and, as the value, I'd call os.stat and store the
last-modified time. Again, maybe I'm just paranoid.

Jeremy Jones
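
For anyone reading this thread later, the dictionary idea above could be sketched roughly as follows. This is only an illustration, assuming Python 3; the helper names stat_snapshot and compare are made up for this example and do not come from any post in the thread.

```python
import os

def stat_snapshot(path):
    """Map each entry name in `path` to its last-modified time (hypothetical helper)."""
    result = {}
    for name in os.listdir(path):
        try:
            result[name] = os.stat(os.path.join(path, name)).st_mtime
        except OSError:
            # The entry vanished between listdir() and stat(); skip it.
            pass
    return result

def compare(old, new):
    """Return sorted lists of added, removed, and modified names (hypothetical helper)."""
    added = sorted(name for name in new if name not in old)
    removed = sorted(name for name in old if name not in new)
    modified = sorted(name for name in new
                      if name in old and new[name] != old[name])
    return added, removed, modified
```

Taking one snapshot, waiting, taking a second, and passing both to compare reports deletions and rewrites as well as additions, without relying on list order or length.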
 
Christos TZOTZIOY Georgiou

What I wanted to know was: is it guaranteed that, between two calls to
os.listdir, any files added to the directory are appended and the
earlier order is maintained?

No, the order of filenames returned by os.listdir depends on the underlying
(file|operating) system.

The easiest way to find differences would be to use sets. Example:
check the /tmp directory for changes over a 30-second interval:
Set(['new-file'])

i.e., it shows only the file I created in another session for the purpose of
this example.
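
The code for that example did not survive in this copy of the thread, so here is a minimal sketch of the same idea rather than the original session. It assumes Python 3, where the built-in set type replaces sets.Set; the function names snapshot and added_between are hypothetical.

```python
import os

def snapshot(path):
    """Return the set of entry names currently in `path` (hypothetical helper)."""
    return set(os.listdir(path))

def added_between(before, after):
    """Names present in `after` but not in `before`; order plays no role."""
    return after - before
```

To reproduce the check described above, call snapshot('/tmp'), wait with time.sleep(30), call snapshot('/tmp') again, and pass both results to added_between. Swapping the arguments gives the names that disappeared.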
 
Peter Otten

Hari said:
I wanted to know if this is a reliable way to check for files added to
a directory??

Short answer: NO!

How about:
from sets import Set
s1 = Set("abc")
s2 = Set("abd")
print s2 - s1
Set(['d'])

Just replace "abc" and "abd" with os.listdir(path), and Python will show you
all new files, or rather files with new names. This is so simple that
there's no need to be paranoid (and I don't think Jeremy Jones is) to use
this approach, which relies neither on OS specifics (stable file name
order) nor on user behaviour (I promise I won't delete anything).

Peter
 
Michael Peuser

Graham Fawcett said:
Hari wrote:

As others pointed out, this is highly improbable, especially under Windows.

There's no need for such sequence guarantees, though. It's probably not
the most efficient but here's a way to do it:

import os
import time

somedir = '/tmp'
snapshot1 = os.listdir(somedir)
time.sleep(...)
snapshot2 = os.listdir(somedir)

newfiles = [f for f in snapshot2 if f not in snapshot1]

Pretty efficient, really: that's an O(n) comparison if I remember my
Python internals correctly. Which I don't, so don't trust my word for
it. ;-)


You are right to mistrust it ;-)
Though 'f in list' looks harmless, it is itself O(n) because it performs a
linear search (this is different with 'f in dict', of course). So your
list comprehension is O(n*n).

Kindly
Michael P
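
A small sketch of how the comprehension could avoid the quadratic cost, assuming Python 3 and its built-in set type: build a set from the first snapshot once, then test membership against it. The function name new_entries is hypothetical and only serves this illustration.

```python
def new_entries(old_names, new_names):
    """Return names from `new_names` that are absent from `old_names`.

    Building the set is a single O(n) pass, and each membership test on
    a set is O(1) on average, so the whole comparison is roughly linear
    instead of O(n*n).
    """
    old = set(old_names)
    return [name for name in new_names if name not in old]
```

The result keeps the order of the second listing, but nothing depends on the two listings sharing an order.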
 
Graham Fawcett

Michael Peuser said:
Graham Fawcett said:
Hari wrote:

As others pointed out, this is highly improbable, especially under Windows.

There's no need for such sequence guarantees, though. It's probably not
the most efficient but here's a way to do it:

import os
import time

somedir = '/tmp'
snapshot1 = os.listdir(somedir)
time.sleep(...)
snapshot2 = os.listdir(somedir)

newfiles = [f for f in snapshot2 if f not in snapshot1]

Pretty efficient, really: that's an O(n) comparison if I remember my
Python internals correctly. Which I don't, so don't trust my word for
it. ;-)


You are right to mistrust it ;-)
Though 'f in list' looks harmless, it is itself O(n) because it performs a
linear search (this is different with 'f in dict', of course). So your
list comprehension is O(n*n).

Kindly
Michael P

D'oh! Thanks, Michael, I *knew* I'd messed that up...

I'll repeat "list searches linear, dict lookups constant..." a hundred
times before bed tonight.

-- Graham
 
