os.walk() and threads problems (is os.walk thread safe?)

  • Thread starter Marcus Alves Grando

Marcus Alves Grando

Hello list,

I have a strange problem with os.walk and threads in a Python script. I
have a script that creates some threads which consume a Queue. For every
value in the Queue, the script runs os.walk() and prints the root dir. But if I
increase the number of threads, the results are inconsistent compared with one
thread.

For example, run this code with one thread and sort the output, then run it
again with ten threads and compare the two outputs with diff(1).

--code--
#!/usr/local/bin/python

import os, time, glob
import Queue
import threading

EXIT=False
POOL=Queue.Queue(0)
NRO_THREADS=1
#NRO_THREADS=10

class Worker(threading.Thread):
    def run(self):
        global POOL, EXIT
        while True:
            try:
                mydir=POOL.get(timeout=1)
                if mydir == None:
                    continue

                for root, dirs, files in os.walk(mydir):
                    print root

            except Queue.Empty:
                if EXIT:
                    break
                else:
                    continue
            except KeyboardInterrupt:
                break
            except Exception:
                raise

for x in xrange(NRO_THREADS):
    Worker().start()
try:
    for i in glob.glob('/usr/ports/*'):
        POOL.put(i)

    while not POOL.empty():
        time.sleep(1)
    EXIT = True

    while (threading.activeCount() > 1):
        time.sleep(1)
except KeyboardInterrupt:
    EXIT=True
--code--

If someone can help with this, I would appreciate it.

Regards
 
Diez B. Roggisch

I don't see any difference. I ran it with 1 and 10 workers + sorted the
output. No diff whatsoever.

And I don't know what you mean by diff(1) - was that supposed to be some
output?

Diez
 
Marcus Alves Grando

Diez said:
I don't see any difference. I ran it with 1 and 10 workers + sorted the
output. No diff whatsoever.

Did you test in a directory with many subdirectories, like /usr or
/usr/ports (on FreeBSD), for example?

Diez said:
And I don't know what you mean by diff(1) - was that supposed to be some
output?

No. One thread produces one result, and ten threads produce another result
with fewer lines.

See the example below:

@@ -13774,8 +13782,6 @@
/usr/compat/linux/proc/44
/usr/compat/linux/proc/45
/usr/compat/linux/proc/45318
-/usr/compat/linux/proc/45484
-/usr/compat/linux/proc/45532
/usr/compat/linux/proc/45857
/usr/compat/linux/proc/45903
/usr/compat/linux/proc/46

Regards
 
Diez B. Roggisch

Marcus said:
Did you test in a directory with many subdirectories, like /usr or
/usr/ports (on FreeBSD), for example?

Yes, over 1000 subdirs/files.

Marcus said:
No. One thread produces one result, and ten threads produce another result
with fewer lines.

See the example below:

@@ -13774,8 +13782,6 @@
/usr/compat/linux/proc/44
/usr/compat/linux/proc/45
/usr/compat/linux/proc/45318
-/usr/compat/linux/proc/45484
-/usr/compat/linux/proc/45532
/usr/compat/linux/proc/45857
/usr/compat/linux/proc/45903
/usr/compat/linux/proc/46

I'm not sure what that directory is, but to me that looks like the
linux /proc dir, containing process ids. Which incidentally changes
between the two runs, as more threads will have process id aliases.

Try your script on another directory.
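
If the tree you need to walk does contain a volatile directory such as a mounted procfs, one option is to prune it during the walk so it cannot affect the comparison. A minimal sketch, written for Python 3 rather than the Python 2 of the script above; `walk_roots` and its `skip` parameter are names made up for illustration. It relies on the documented behaviour that, with the default top-down walk, editing `dirs` in place stops os.walk() from descending into the removed entries.

```python
import os

def walk_roots(top, skip=("proc",)):
    """Return the sorted directories under top, never descending into skipped names."""
    roots = []
    for root, dirs, files in os.walk(top):
        # In a top-down walk, editing dirs in place prevents descent into those entries.
        dirs[:] = [d for d in dirs if d not in skip]
        roots.append(root)
    return sorted(roots)
```

Two runs of `walk_roots('/usr/compat/linux')` should then differ only if something outside the skipped directories changed in between.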

Diez
 
Marcus Alves Grando

Diez said:
Yes, over 1000 subdirs/files.

Strange, because for me it occurs every time.

Diez said:
I'm not sure what that directory is, but to me that looks like the
linux /proc dir, containing process ids. Which incidentally changes
between the two runs, as more threads will have process id aliases.

My example was not good enough. I ran this script on the ports directory of
FreeBSD and on the IMAP folders on my Linux server; same thing.

@@ -182,7 +220,6 @@
/usr/ports/archivers/p5-POE-Filter-Bzip2
/usr/ports/archivers/p5-POE-Filter-LZF
/usr/ports/archivers/p5-POE-Filter-LZO
-/usr/ports/archivers/p5-POE-Filter-LZW
/usr/ports/archivers/p5-POE-Filter-Zlib
/usr/ports/archivers/p5-PerlIO-gzip
/usr/ports/archivers/p5-PerlIO-via-Bzip2
@@ -234,7 +271,6 @@
/usr/ports/archivers/star-devel
/usr/ports/archivers/star-devel/files
/usr/ports/archivers/star/files
-/usr/ports/archivers/stuffit
/usr/ports/archivers/szip
/usr/ports/archivers/tardy
/usr/ports/archivers/tardy/files

Regards
 
Chris Mellon

Are you just diffing the output? There's no guarantee that
os.walk() will always have the same order, or that your different
worker threads will produce the same output in the same order. On my
system, for example, I get a different order of subdirectory output
when I run with 10 threads than with 1.

walk() requires that stat() works for the next directory that will be
walked. It might be remotely possible that stat() is failing for some
reason and some directories are being lost (this is probably not going
to be reproducible). If you can reproduce it, try using pdb to see
what's going on inside walk().
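
Both points can be checked without a debugger: compare the two runs as sets instead of line by line, and pass an `onerror` callback so that listing failures inside walk() are recorded rather than silently ignored (which is the default). A rough sketch for Python 3; `collect_roots` and `compare_runs` are hypothetical helper names, not part of the script in this thread.

```python
import os

def collect_roots(top):
    """Walk top; return (set of directories visited, list of OSErrors hit while listing)."""
    errors = []
    # By default os.walk() ignores listing errors; onerror receives the OSError instead.
    roots = {root for root, dirs, files in os.walk(top, onerror=errors.append)}
    return roots, errors

def compare_runs(first, second):
    """Return the directories missing from each run, ignoring output order."""
    return sorted(first - second), sorted(second - first)
```

If `compare_runs` returns two empty lists, the runs differ only in ordering. A non-empty `errors` list would point at directories that walk() could not read.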
 
Marcus Alves Grando

I made a new version that is closer to my original script:

--code--
#!/usr/bin/python

import os, sys, time
import glob, random, Queue
import threading

EXIT = False
BRANDS = {}
LOCK=threading.Lock()
EV=threading.Event()
POOL=Queue.Queue(0)
NRO_THREADS=20

def walkerr(err):
    print err

class Worker(threading.Thread):
    def run(self):
        EV.wait()
        while True:
            try:
                mydir=POOL.get(timeout=1)
                if mydir == None:
                    continue

                for root, dirs, files in os.walk(mydir, onerror=walkerr):
                    if EXIT:
                        break

                    terra_user = 'test'
                    terra_brand = 'test'
                    user_du = '0 a'
                    user_total_files = 0

                    LOCK.acquire()
                    if not BRANDS.has_key(terra_brand):
                        BRANDS[terra_brand] = {}
                        BRANDS[terra_brand]['COUNT'] = 1
                        BRANDS[terra_brand]['SIZE'] = int(user_du.split()[0])
                        BRANDS[terra_brand]['FILES'] = user_total_files
                    else:
                        BRANDS[terra_brand]['COUNT'] = BRANDS[terra_brand]['COUNT'] + 1
                        BRANDS[terra_brand]['SIZE'] = BRANDS[terra_brand]['SIZE'] + int(user_du.split()[0])
                        BRANDS[terra_brand]['FILES'] = BRANDS[terra_brand]['FILES'] + user_total_files
                    LOCK.release()

            except Queue.Empty:
                if EXIT:
                    break
                else:
                    continue
            except KeyboardInterrupt:
                break
            except Exception:
                print mydir
                raise

if len(sys.argv) < 2:
    print 'Usage: %s dir...' % sys.argv[0]
    sys.exit(1)

glob_dirs = []
for i in sys.argv[1:]:
    glob_dirs = glob_dirs + glob.glob(i+'/[a-z_]*')
random.shuffle(glob_dirs)

for x in xrange(NRO_THREADS):
    Worker().start()

try:
    for i in glob_dirs:
        POOL.put(i)

    EV.set()
    while not POOL.empty():
        time.sleep(1)
    EXIT = True

    while (threading.activeCount() > 1):
        time.sleep(1)
except KeyboardInterrupt:
    EXIT=True

for b in BRANDS:
    print '%s:%i:%i:%i' % (b, BRANDS[b]['SIZE'], BRANDS[b]['COUNT'],
                           BRANDS[b]['FILES'])
--code--

And I ran it on several servers:

# uname -r
2.6.18-8.1.15.el5
# python test.py /usr
test:0:2267:0
# python test.py /usr
test:0:2224:0
# python test.py /usr
test:0:2380:0
# python -V
Python 2.4.3

# uname -r
7.0-BETA2
# python test.py /usr
test:0:1706:0
# python test.py /usr
test:0:1492:0
# python test.py /usr
test:0:1524:0
# python -V
Python 2.5.1

# uname -r
2.6.9-42.0.8.ELsmp
# python test.py /usr
test:0:1311:0
# python test.py /usr
test:0:1486:0
# python test.py /usr
test:0:1520:0
# python -V
Python 2.3.4

I really don't know what's happening.

Any other ideas?

Regards
 
Marcus Alves Grando

Ok. I found the problem.

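Inside the for loop over os.walk() I test "if EXIT" and break out of the loop
if it is true. In the main part I wait for the Queue to be empty and set EXIT
right after that. An empty Queue only means every directory has been taken by
a worker, not that the walks have finished, so the subdirectories still pending
in the for loops are never processed and the program exits.

That is why the output is not the same between runs.

Removing the "if EXIT" check makes everything work again.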
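
Another way to avoid this kind of race is to wait for the work itself to finish rather than for the Queue to look empty, using Queue.join() together with task_done(). The following is only a sketch under some assumptions: it targets Python 3 (where the module is named `queue`), and `walk_all` with its parameters is an illustrative name, not the script from this thread.

```python
import os
import queue
import threading

def walk_all(top_dirs, n_threads=10):
    """Walk every directory in top_dirs with n_threads workers; return all roots seen."""
    pool = queue.Queue()
    roots = []
    lock = threading.Lock()

    def worker():
        while True:
            top = pool.get()
            if top is None:          # sentinel: no more work for this thread
                pool.task_done()
                return
            try:
                found = [root for root, dirs, files in os.walk(top)]
                with lock:
                    roots.extend(found)
            finally:
                pool.task_done()     # signalled only after this walk has finished

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for top in top_dirs:
        pool.put(top)
    for _ in threads:
        pool.put(None)               # one sentinel per worker
    pool.join()                      # returns once every queued item is fully processed
    for t in threads:
        t.join()
    return sorted(roots)
```

With this structure no shared EXIT flag is needed, and `walk_all(dirs, 1)` and `walk_all(dirs, 10)` should return the same list as long as the tree does not change between runs.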

Thanks all

Regards
