python noob, multiple file i/o

hiro

Hi there,

I'm very new to Python. The problem I need to solve is: what's the "best/
simplest/cleanest" way to read in multiple files (ASCII), do stuff to
them, and write them out (ASCII)?

--
import os

filePath = 'O:/spam/eggs/'
for file in os.listdir(filePath):  # straight from docs
    # iterate the function through all the files in the directory
    # write results to separate files <- this is where I'm mostly stuck.
    pass

--
For clarity's sake, the file naming conventions for the files I'm
reading from are file.1.txt -> file.nth.txt

It's been a long day and I'm at my wits' end, so I apologize in advance if
I'm not making much sense here.
Syntax examples would also be great if you can share some recipes.

Thanks in advance.

h.
 
Jon Clements

I'd try the glob module.

Code:
import glob

# Get a list of filenames matching wildcard criteria
# (note that a relative pattern is resolved against the program's
# working directory)
matching_file_list = glob.glob('O:/spam/eggs/*.txt')

# For each file that matches, open it and process it in some way...
for filename in matching_file_list:
    # file() is a Python 2 builtin; Python 3 only has open()
    infile = file(filename)
    outfile = file(filename + '.out', 'w')
    # Process the input file line by line...
    for line in infile:
        pass  # Do something more useful here: change line and write it to outfile
    # Be explicit with file closures
    outfile.close()
    infile.close()

Of course, you can change the wildcard criteria in the glob
statement, and then filter further using regular expressions to
choose only files matching more specific criteria. This should be
enough to get you started though.
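
A minimal sketch of the glob-then-regex filtering mentioned above, written for Python 3. The helper name `numbered_files` and the exact pattern are assumptions based on the `file.1.txt` naming in the original question; they are not part of Jon's code.

```python
import glob
import os
import re

# Assumed naming scheme from the original question: file.1.txt, file.2.txt, ...
NUMBERED = re.compile(r'^file\.(\d+)\.txt$')

def numbered_files(directory):
    """Return paths named file.<number>.txt in directory, sorted by number."""
    found = []
    for path in glob.glob(os.path.join(directory, '*.txt')):
        match = NUMBERED.match(os.path.basename(path))
        if match:
            found.append((int(match.group(1)), path))
    # Sort numerically so that file.10.txt comes after file.2.txt
    return [path for _, path in sorted(found)]
```

Sorting on the captured number rather than on the filename avoids the usual surprise where `file.10.txt` sorts before `file.2.txt`.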

hth

Jon.
 
Jon Clements

Okies; postcoding before finishing your early morning coffee is not
the greatest of ideas!

I forgot to mention that glob returns every path that matches the
pattern, directories included. You'll need to check that
os.path.isfile(filename) returns True before processing it...
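
As a rough sketch (Python 3), the check can be folded into the step that builds the list. The function name `matching_files` is made up for illustration.

```python
import glob
import os

def matching_files(pattern):
    """Return only the regular files among the paths matching a glob pattern."""
    return [path for path in glob.glob(pattern) if os.path.isfile(path)]
```

With this, a directory that happens to be called something like `backup.txt` is skipped instead of causing an error when it is opened.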

Jon.
 
Laurent Rahuel

Maybe the walk function in the os module is what you need:
http://docs.python.org/lib/os-file-dir.html
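
A small sketch of what os.walk offers, assuming the files may also sit in subdirectories (the original question does not say so). Unlike os.listdir, it descends into every subdirectory of the starting point. The generator name `text_files_under` is hypothetical.

```python
import os

def text_files_under(top):
    """Yield the path of every .txt file in top and its subdirectories."""
    for dirpath, dirnames, filenames in os.walk(top):
        # filenames holds only files, so no os.path.isfile check is needed
        for name in filenames:
            if name.endswith('.txt'):
                yield os.path.join(dirpath, name)
```

If everything lives in one flat directory, os.listdir or glob is probably simpler.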

Regards

 
Jordan

Also, leaving the format as .out is not necessarily convenient. You
had glob do a search for .txt, so how about writing to .out.txt instead?

Also, Python advises using open() over file() (although I admit to
using file() myself more often than not):

for filename in matching_file_list:
    infile = open(filename, 'r')  # add 'r' for clarity if nothing else
    # assumes file ext of original file is .txt
    outfile = open(filename[:-4] + '.out.txt', 'w')
    # Process the input file line by line...
    for line in infile:
        pass  # do thing
    # Be explicit with file closures
    outfile.close()
    infile.close()

You don't have to iterate line by line; if you specify what you want
to do to each file, we could probably help out here if you need it.

Might also add some try/except statements to be safe ;).
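
One possible shape for that, as a Python 3 sketch: os.path.splitext builds the output name without assuming a four-character extension, and a with statement inside try/except closes both files even when something goes wrong. The function name `process_file` and its `transform` parameter are invented for this example.

```python
import os

def process_file(in_path, transform):
    """Write transform(line) for each line of in_path to <name>.out<ext>.

    Returns the output path, or None if the file could not be processed.
    """
    root, ext = os.path.splitext(in_path)  # 'file.1.txt' -> ('file.1', '.txt')
    out_path = root + '.out' + ext
    try:
        # The with statement closes both files even if transform raises
        with open(in_path, 'r') as infile, open(out_path, 'w') as outfile:
            for line in infile:
                outfile.write(transform(line))
    except (IOError, OSError) as err:
        print('Skipping %s: %s' % (in_path, err))
        return None
    return out_path
```

Calling `process_file(filename, str.upper)` inside the loop over `matching_file_list` would write an upper-cased copy of each file next to the original.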

Cheers,
Jordan
 
7stud

The general idiom for altering lines in a file is to open the original
file and write the alterations to a temp file. After you are done
writing to the temp file, delete the original file, and change the
temp file name to the original file name.

If instead you were to read the whole file into a variable, and then
start overwriting the original file with the altered data, if your
program should happen to crash after writing one line to the file, all
the data in your variable would disappear into the ether, and your
file would only contain one line of data. You can imagine what that
would be like if you had lots of important data in the file.

Here's my attempt that incorporates a temp file:

-----
import os

filepath = "./change_files"

li = os.listdir(filepath)
for name in li:
    fullpath = filepath + "/" + name
    if os.path.isfile(fullpath):
        infile = open(fullpath, 'r')

        lastDotPos = fullpath.rfind(".")
        fileName = fullpath[:lastDotPos]
        ext = fullpath[lastDotPos:]
        tempName = fileName + ext + ".temp"

        outfile = open(tempName, "w")
        for line in infile:
            outfile.write(line + "altered\n")
        outfile.close()
        infile.close()  # close first: Windows refuses to delete an open file

        os.remove(fullpath)
        os.rename(tempName, tempName[:-5])
 
7stud

I did some unnecessary name manipulation in there. Try this:

import os

filepath = "./change_files"

li = os.listdir(filepath)
for name in li:
    fullpath = filepath + "/" + name
    if os.path.isfile(fullpath):
        infile = open(fullpath, 'r')

        tempName = fullpath + ".temp"
        outfile = open(tempName, "w")
        for line in infile:
            outfile.write(line + "altered\n")
        outfile.close()
        infile.close()  # close first: Windows refuses to delete an open file

        os.remove(fullpath)
        os.rename(tempName, tempName[:-5])
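
For comparison, here is a sketch of the same temp-file idiom using the standard tempfile module and os.replace, which needs Python 3.3 or later. This is a swapped-in variant, not the code posted above, and the name `rewrite_in_place` and its `transform` parameter are hypothetical.

```python
import os
import tempfile

def rewrite_in_place(path, transform):
    """Replace the contents of path with transform(line) for each line."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the original so the final rename stays
    # on the same filesystem
    fd, temp_path = tempfile.mkstemp(dir=directory, suffix='.temp')
    try:
        with os.fdopen(fd, 'w') as outfile, open(path, 'r') as infile:
            for line in infile:
                outfile.write(transform(line))
        # os.replace overwrites the original in a single step, so there is
        # no moment at which the original is deleted but not yet replaced
        os.replace(temp_path, path)
    except Exception:
        os.remove(temp_path)  # leave the original untouched on failure
        raise
```

Because the original is only replaced after the temp file has been written and closed, a crash partway through leaves the original intact. Note that this sketch does not carry over the original file's permissions.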
 
hiro

Thanks a lot for the help guys, I'm at work right now and I will go
over your suggestions one by one this weekend. Being more alert now,
taking a look at the examples you posted, I now see how to approach
this problem. The thing with python that I'm starting to realize is
that there are a million different ways to approach a problem, so I
find it great for experimenting (when time allows) yet very
challenging to choose an approach.

Cheers,

h.
 
