Python glob and raw string

Xaxa Urtiz

Hello everybody, I have a small problem. I wrote a script that looks for files in some directories. Typically my folders are organized like this:

[share]
folder1
->20131201
-->file1.xml
-->file2.txt
->20131202
-->file9696009.tmp
-->file421378932.xml
etc....
So basically, the share contains some folders (folder1, folder2, ...), and each of them contains folders named after a date (20131201, 20131202, 20131203, etc.). Inside those I want to find all the XML files.
So what I have done is iterate over all the folders (folder1/2/3) that I want and look in each one for the XML files, like this:


for f in glob.glob(dir + r"\20140115\*.xml"):
    yield f

Here dir is one of folder1/2/3. Everything works, but I want to do something like this:


for i in range(10,16):
    for f in glob.glob(dir + r"\201401{0}\*.xml".format(i)):
        yield f

But glob does not find any files (and of course there are some XML files, and the old way found them).
Any help would be appreciated :)
 
Xaxa Urtiz

On Thursday, January 16, 2014 at 17:49:57 UTC+1, Xaxa Urtiz wrote:
for i in range(10,16):
    for f in glob.glob(dir + r"\201401{0}\*.xml".format(i)):
        yield f

but the glob does not find any files....

I feel stupid, it was my mistake. This works:

for i in range(1,16):
    for f in glob.glob(dir + r"\201401{0:02}\*.xml".format(i)):
        yield f
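
A short sketch of why the format spec matters: without a width, a single-digit day produces a seven-digit folder name that matches nothing on disk. The helper name `day_patterns`, its inclusive day range, and the `folder1` argument are made up for illustration; `ntpath.join` is used only so the patterns show Windows-style separators on any platform.

```python
import ntpath  # Windows flavour of os.path, importable on any platform


def day_patterns(folder, first_day, last_day, spec="{0:02}"):
    """Build one glob pattern per day of January 2014 (hypothetical helper)."""
    patterns = []
    for day in range(first_day, last_day + 1):
        date_dir = "201401" + spec.format(day)
        patterns.append(ntpath.join(folder, date_dir, "*.xml"))
    return patterns


# Without a width, day 1 becomes "2014011", which is not a real folder name.
unpadded = day_patterns("folder1", 1, 2, spec="{0}")
# With "{0:02}" the day is zero-padded to two digits, giving "20140101".
padded = day_patterns("folder1", 1, 2)
print(unpadded[0])
print(padded[0])
```

Each pattern can then be passed to `glob.glob` exactly as in the loop above.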
 
Neil Cerutti

I've done this two different ways. The simple way is very similar
to what you are now doing. It sucks because I have to manually
maintain the list of subdirectories to traverse every time I
create a new subdir.

Here's the other way, using glob and isdir from os.path, adapted
from actual production code.

import glob
import os

class Miner:
    def __init__(self, archive):
        # setup goes here; prepare to acquire the data
        self.descend(os.path.join(archive, '*'))

    def descend(self, path):
        # Recurse into subdirectories; hand every other entry to process().
        for fname in glob.glob(os.path.join(path, '*')):
            if os.path.isdir(fname):
                self.descend(fname)
            else:
                self.process(fname)

    def process(self, path):
        # Do what I want done with an actual file path.
        # This is where I add to the data.
        pass

In your case you might not want to process unless the path also
looks like an xml file.

mine = Miner('myxmldir')

Hmmm... I might be doing too much in __init__. ;)
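
One way to add that XML filter, sketched with the standard library's `os.walk` and `fnmatch` rather than the recursive `glob` calls above (a swapped-in technique, not the production code). The function name `iter_xml` and its `pattern` parameter are invented for this example.

```python
import fnmatch
import os


def iter_xml(archive, pattern="*.xml"):
    """Yield every file under `archive` whose name matches `pattern`."""
    for dirpath, dirnames, filenames in os.walk(archive):
        # fnmatch.filter keeps only the file names that match the pattern.
        for name in fnmatch.filter(filenames, pattern):
            yield os.path.join(dirpath, name)


if __name__ == "__main__":
    for path in iter_xml("myxmldir"):
        print(path)
```

Because `os.walk` visits every level, this needs no hand-written recursion, at the cost of also descending into directories you may not care about.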
 
Chris Angelico

Hmm, why is it even a class? :) I guess you elided all the stuff that
makes it impractical to just use a non-class function.

ChrisA
 
Neil Cerutti

I didn't remove anything that makes it obviously class-worthy,
just timestamp checking, and several dicts and sets to store
data.

The original version of that code is just a set of three
functions, but the return result of that version was a single
dict. Once the return value got complicated enough to require
building up a class instance, it became a convenient place to
hang the functions.
 
Xaxa Urtiz

On Thursday, January 16, 2014 at 19:14:30 UTC+1, Neil Cerutti wrote:
Here's the other way, using glob and isdir from os.path, adapted from actual production code.

I only have one level of subdirectories. The date loop is just for the case where I don't want to process all the dates (otherwise I do a glob on '/*/*'), so there is no need to do any recursion.
Thanks for the answer!
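
For reference, the one-level layout described here can be covered by a single wildcard pattern, with the date list kept only for the case where a subset of days is wanted. This is a sketch under assumptions: the function name `find_xml`, its `days` parameter, and the `share` argument are invented for illustration.

```python
import glob
import os


def find_xml(share, days=None):
    """Yield XML files laid out as share/<folder>/<date>/<file>.xml.

    `days` is an optional iterable of date-folder names such as "20140115";
    when omitted, every date folder is searched.
    """
    date_dirs = ["*"] if days is None else days
    for date_dir in date_dirs:
        # The first "*" stands for folder1, folder2, ... directly under share.
        pattern = os.path.join(share, "*", date_dir, "*.xml")
        for path in glob.glob(pattern):
            yield path
```

Calling `find_xml(share)` searches every date folder, while `find_xml(share, ["201401{0:02}".format(d) for d in range(1, 16)])` restricts the search to the first fifteen days of January 2014.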
 
