How To Do It Faster?!?

andrea.gavana

Hello NG,

in my application, I use os.walk() to walk a BIG directory tree. In each
sub-directory, I need to retrieve the files that are owned by a
particular user. Since I am on Windows (2000 or XP), this is what I
do:

    import os

    for root, dirs, files in os.walk(MyBIGDirectory):

        a = os.popen("dir /q /-c /a-d " + root).read().split()

        # Retrieve all files owners
        user = a[18::20]

        # Retrieve all the last modification dates & hours
        date = a[15::20]
        hours = a[16::20]

        # Retrieve all the filenames
        name = a[19::20]

        # Retrieve all the files sizes
        size = a[17::20]

        # Loop through all files owners to see if they belong
        # to that particular owner (a string)
        for util in user:
            if util.find(owner) >= 0:
                pass  # DO SOME PROCESSING

Does anyone know if there is a faster way to do this job?

Thanks to you all.

Andrea.
 
Aquila Deus

You could use "dir /s", which lists everything recursively in a single call, instead of spawning "dir" once per directory.
 
Max Erickson

I don't quite understand what your program is doing. The user=a[18::20]
slice looks really fragile and specific to one directory to me: it only
works if every line of dir output splits into the same number of fields,
so a single file name containing a space would throw it off. Try adding
'/s' to the dir command and splitting the output into lines instead of
words.

That should give you the dir output as a list of lines, a, covering every
file below root. There will be some extra lines in a that aren't about
specific files, such as

    ' Volume in drive C has no label.'

but the files should be there. In my test, len(a) came to 232.

To get a list containing files owned by a specific user, do something
like:

    >>> files = [line.split()[-1] for line in a if owner in line]
    >>> len(files)
    118

This throws away the directory information, but if you need it, using
os.walk() instead of the /s switch to dir should still work.
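
For illustration, the approach described in this post might look roughly like the sketch below. The exact command that was run is not shown in the thread, so the command string is an assumption built from the switches in the original question plus /s, and the function names dir_lines and filter_by_owner are hypothetical.

```python
import os

def dir_lines(root):
    """Run dir recursively on root and return its output as a list of lines.

    Windows only. The command string is an assumption, not the exact
    command from the thread. The path is quoted so that directories
    containing spaces still work.
    """
    command = 'dir /s /q /-c /a-d "%s"' % root
    return os.popen(command).read().splitlines()

def filter_by_owner(lines, owner):
    """Return the last whitespace-separated field of every line that mentions owner."""
    return [line.split()[-1] for line in lines if owner in line]
```

A call such as filter_by_owner(dir_lines(root), owner) would then give the matching names. Two caveats apply: a file name containing spaces is cut down to its last word, and a line also matches when the owner string happens to appear in the file name rather than in the owner column.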

max
 
Jeremy Bowers

You should *try* directly retrieving the relevant information from the OS,
instead of spawning a "dir" process. I have no idea how to do that and it
will probably require the win32 extensions for Python.
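
One possible way to do this, assuming the pywin32 package is installed, is to read each file's owner from its security descriptor with win32security. This is only a sketch: the helper names win32_owner and find_owned_files are hypothetical, and it has not been measured against the "dir" approach, so it may or may not be faster.

```python
import os

def win32_owner(path):
    """Return the account name that owns path (Windows, requires pywin32)."""
    # Imported lazily so this module still loads where pywin32 is missing.
    import win32security
    descriptor = win32security.GetFileSecurity(
        path, win32security.OWNER_SECURITY_INFORMATION)
    sid = descriptor.GetSecurityDescriptorOwner()
    name, domain, account_type = win32security.LookupAccountSid(None, sid)
    return name

def find_owned_files(top, owner, get_owner=win32_owner):
    """Walk top and return (path, size, mtime) for files whose owner name contains owner."""
    matches = []
    for root, dirs, files in os.walk(top):
        for filename in files:
            path = os.path.join(root, filename)
            if owner in get_owner(path):
                info = os.stat(path)
                matches.append((path, info.st_size, info.st_mtime))
    return matches
```

The get_owner parameter is there so the lookup can be swapped out, for example for a stub on a machine without pywin32.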

After that, you're done: odds are you'll be disk-bound. In fact, you may
see no gain at all if Windows is optimized well enough that the process
you describe is *still* disk-bound.

Your only hope then comes down to two things:

* Poke around in the Windows API for a function that does what you want,
and hope it can do it faster due to being in the kernel.

* Somehow work this out to be lazy so it tries to grab what the user is
looking at, instead of absolutely everything. Whether or not this will
work depends on your application. If you post more information about how
you are using this data, I can try to help you. (I've had some experience
in this domain, but what is good heavily depends on what you are doing.
For instance, if you're batch processing a whole bunch of records after
the user gave a bulk command, there's not much you can do. But if they're
looking at something in a Windows Explorer-like tree view, there's a lot
you can do to improve responsiveness, even if you can't speed up the
process overall.)
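
As a rough sketch of the lazy idea in the second point, the class below scans a directory only when it is first asked for (say, when a tree-view node is expanded) and caches the result. The class name LazyOwnerIndex is hypothetical, and get_owner is an assumed callable that maps a path to its owner's name; the thread does not say how the data is actually displayed.

```python
import os

class LazyOwnerIndex:
    """Scan each directory only when it is first requested, then cache the result."""

    def __init__(self, owner, get_owner):
        self.owner = owner
        self.get_owner = get_owner  # assumed callable: path -> owner name
        self._cache = {}

    def files_in(self, directory):
        """Return matching file names directly inside directory (not recursive)."""
        if directory not in self._cache:
            matches = []
            with os.scandir(directory) as entries:
                for entry in entries:
                    # Only regular files are checked; sub-directories are
                    # left alone until they are requested themselves.
                    if entry.is_file() and self.owner in self.get_owner(entry.path):
                        matches.append(entry.name)
            self._cache[directory] = sorted(matches)
        return self._cache[directory]
```

The total work is the same if every directory ends up being visited; the point is only that nothing is scanned before it is needed, which can make an interactive view feel more responsive.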
 
