Basic file operation questions

alex

Hi,

I am a beginner with python and here is my first question:
How can I read the contents of a file using a loop or something? I open
the file with file=open(filename, 'r') and what to do then? Can I use
something like

for xxx in file:
    ....


Thanks for help
Alex
 
Steve Holden

alex said:
Hi,

I am a beginner with python and here is my first question:
How can I read the contents of a file using a loop or something? I open
the file with file=open(filename, 'r') and what to do then? Can I use
something like

for xxx in file:
    ....

Yes, indeed you can. That's by no means *all* you can do, but to iterate
over the lines of the file that will work as is. Note that the lines
will still have their terminating "\n" on the end, which is why the
print statement in the following example ends in a comma (this stops
print from putting out its own newline).
>>> for l in file:
...     print l,
...
import os.path

def getHomeDir():
    ''' Try to find user's home directory, otherwise return current
    directory.'''
    try:
        path1=os.path.expanduser("~")
    except:
        path1=""
    try:
        path2=os.environ["HOME"]
    except:
        path2=""
    try:
        path3=os.environ["USERPROFILE"]
    except:
        path3=""

    if not os.path.exists(path1):
        if not os.path.exists(path2):
            if not os.path.exists(path3):
                return os.getcwd()
            else: return path3
        else: return path2
    else: return path1

print getHomeDir()
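
On Python 3, where print is a function, the same loop could be sketched roughly as follows. The sample file and its contents are made up purely so the snippet runs on its own; end="" plays the role of the trailing comma, and rstrip("\n") is one way to drop the newline altogether.

```python
import os
import tempfile

# Hypothetical sample file, created only so the sketch is runnable.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as out:
    out.write("line 1\nline 2\n")

collected = []
with open(path) as f:
    for line in f:
        # Each line still carries its "\n"; end="" stops print adding another.
        print(line, end="")
        # Strip the newline when only the text of the line is wanted.
        collected.append(line.rstrip("\n"))

os.remove(path)
```
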
regards
Steve
 
David Douard

David said:
Or even have a look at the excellent Gnosis book on the subject (it goes
much further than that, but...):
http://gnosis.cx/TPiP/ which is freely available in text format.

David

Just to clarify (it's not clear at all in my message): the author of the book
and creator of Gnosis Software is David Mertz, and the book is published
by Addison-Wesley.
Sorry

David
 
Caleb Hattingh

Hi Alex

Assuming you have a file called "data.txt":

***
f = open('data.txt','r')
lines = f.readlines()
f.close()
for line in lines:
    print line
***
Will print each line of the file.

You can make a very worthwhile investment by setting aside 2 or 3 hours to
work through the Python tutorial, which gets installed as part of the
documentation. That tutorial covers much of what you will ever need to know
about Python.

thx
Caleb
 
Peter Nuttall

Hi Alex

Assuming you have a file called "data.txt":

***
f = open('data.txt','r')
lines = f.readlines()
f.close()
for line in lines:
    print line
***

Can you not write this:

f=open("data.txt", "r")
for line in f.readlines():
    #do stuff to line
f.close()

Pete
 
Michael.Lang

Can you not write this:

f=open("data.txt", "r")
for line in f.readlines():
    #do stuff to line
f.close()

sure you can

f = open("data.txt", "rb")
while [ 1 ]:
    line = f.readlines()
    if not line: break
    line = somethingelse ...
f.close()
 
Peter Otten

Peter said:
Can you not write this:

f=open("data.txt", "r")
for line in f.readlines():
    #do stuff to line
f.close()

Pete

Yes, you can even write

f = open("data.txt")
for line in f:
    # do stuff with line
f.close()

This has the additional benefit of not slurping in the entire file at once.
Be aware, though, that this (newer) style of using a file as an iterator
doesn't mix well with seek() operations.

Peter
 
Steve Holden

Can you not write this:

f=open("data.txt", "r")
for line in f.readlines():
    #do stuff to line
f.close()


sure you can

f = open("data.txt", "rb")
while [ 1 ]:
    line = f.readlines()
    if not line: break
    line = somethingelse ...
f.close()

Shall we charitably assume this was untested code? For a non-empty file
it executes the loop body twice, once to read the whole content of the
file and throw it away, the second time to do something else
unspecified. The intention wasn't, therefore, entirely clear, but
newbies should not be using it as any kind of model.
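
For comparison, here is a minimal sketch of what the loop may have been meant to do, assuming the intent was to read one line at a time with readline() rather than readlines(). It is written for Python 3; the sample file and the upper() call standing in for "somethingelse" are placeholders, not anything from the original post.

```python
import os
import tempfile

# Hypothetical sample file so the sketch is self-contained.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as out:
    out.write(b"first\nsecond\n")

processed = []
f = open(path, "rb")
try:
    while True:
        line = f.readline()   # one line per call, b"" at end of file
        if not line:
            break
        # Placeholder for the real per-line work.
        processed.append(line.upper())
finally:
    f.close()

os.remove(path)
```
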

regards
Steve
 
Steven Bethard

Caleb said:
Peter


Is there disk access on every iteration? I'm guessing yes? It
shouldn't be an issue in the vast majority of cases, but I'm naturally
curious :)

Short answer:
No, it's buffered.

Long answer:
This buffer is actually what causes the problems in interactions between
uses of the next method and readline, seek, etc:

py> f = file('temp.txt')
py> for line in f:
...     print line,
...     break
...
line 1
py> f.read()
''
py> for line in f:
...     print line,
...
line 2
line 3

Using the iteration protocol (specifically, when file.next is called)
causes the file object to read part of the file into a buffer for the
iterator. The read method doesn't access the same buffer, and sees that
(because the file is so small) we've already seeked to the end of the
file, so it returns '' to signal that the entire file has been read,
even though we have not finished iterating. The iterator, however, which
has access to the buffer, can still complete its iteration.

The moral of the story is that, in general, you should only use the file
as an iterator after you are done calling read, readline, etc. unless
you want to keep track of the file position and do an appropriate
file.seek() call after each use of the iterator.
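
A minimal sketch of that ordering, assuming a small text file whose first line is treated as a header: the explicit readline() call comes first, and the iterator takes over only afterwards. The function name and the sample file are hypothetical, and the code targets Python 3, whose file objects may not show the same buffering behaviour as the session above.

```python
import os
import tempfile


def read_header_then_body(path):
    """Return (header, body_lines) for the text file at `path`."""
    with open(path) as f:
        header = f.readline()          # explicit read call comes first
        body = [line for line in f]    # only then use the file as an iterator
    return header, body


# Hypothetical three-line sample file, mirroring the session above.
fd, sample_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as out:
    out.write("line 1\nline 2\nline 3\n")

header, body = read_header_then_body(sample_path)
print(repr(header))
print(body)

os.remove(sample_path)
```
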

Steve
 
Jeff Shannon

Caleb said:
Peter


Is there disk access on every iteration? I'm guessing yes? It
shouldn't be an issue in the vast majority of cases, but I'm naturally
curious :)

Disk access should be buffered, possibly both at the C-runtime level
and at the file-iterator level (though I couldn't swear to that). I'm
sure that the C-level buffering happens, though.

Jeff Shannon
Technician/Programmer
Credit International
 
Peter Otten

Caleb said:
Is there disk access on every iteration? I'm guessing yes? It shouldn't
be an issue in the vast majority of cases, but I'm naturally curious :)

Well, you will hardly find an OS that does no buffering of disk access --
but file.next() does some extra optimization as Steven already explained.
Here are some timings performed on the file that has the first-hand
information about Python's file buffering strategy :)

$ python2.4 -m timeit 'for line in file("fileobject.c"): pass'
1000 loops, best of 3: 528 usec per loop
$ python2.4 -m timeit 'for line in file("fileobject.c").readlines(): pass'
1000 loops, best of 3: 635 usec per loop
$ python2.4 -m timeit 'for line in iter(file("fileobject.c").readline, ""): pass'
1000 loops, best of 3: 1.59 msec per loop
$ python2.4 -m timeit 'f = file("fileobject.c")' 'while 1:' ' if not f.readline(): break'
100 loops, best of 3: 2.08 msec per loop

So not only is

for line in file(...):
    # do stuff

the most elegant, it is also the fastest. file.readlines() comes close, but
is only viable for "small" files.
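
To repeat a comparison like this on a current interpreter, something along these lines could serve as a starting point. It is a sketch for Python 3 that calls the timeit module from a script; the generated sample file, the function names, and the repeat counts are arbitrary choices, and the numbers it prints will not match the ones above.

```python
import os
import tempfile
import timeit

# Hypothetical sample file standing in for fileobject.c.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as out:
    out.writelines("line %d\n" % i for i in range(2000))


def iterate():
    # Use the file object itself as the iterator.
    with open(path) as f:
        for line in f:
            pass


def via_readlines():
    # Read every line into a list first, then loop over the list.
    with open(path) as f:
        for line in f.readlines():
            pass


def via_readline():
    # Call readline() by hand until it returns an empty string.
    with open(path) as f:
        while True:
            if not f.readline():
                break


timings = {}
for func in (iterate, via_readlines, via_readline):
    # Best of 3 repeats, 50 calls each.
    timings[func.__name__] = min(timeit.repeat(func, number=50, repeat=3))

for name, seconds in timings.items():
    print("%-15s %.6f s per 50 calls" % (name, seconds))

os.remove(path)
```
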

Peter
 
Caleb Hattingh

Peter
Yes, you can even write

f = open("data.txt")
for line in f:
    # do stuff with line
f.close()

This has the additional benefit of not slurping in the entire file at
once.

Is there disk access on every iteration? I'm guessing yes? It shouldn't
be an issue in the vast majority of cases, but I'm naturally curious :)

thx
Caleb
 
Jeff Shannon

Marc said:
When you read a file with that method, is there an implied close() call
on the file? I assume there is, but how is that handled?
[...]

As I understand it, the disk file will be closed when the file object
is garbage collected. In CPython, that will be as soon as there are
no active references to the file; i.e., in the above case, it should
happen as soon as the for loop finishes. Jython uses Java's garbage
collector, which is a bit less predictable, so the file may not be
closed immediately. It *will*, however, be closed during program
shutdown if it hasn't happened before then.

Jeff Shannon
Technician/Programmer
Credit International
 
Grant Edwards

When you read a file with that method, is there an implied close() call
on the file? I assume there is, but how is that handled?

The file will be closed when the file object is deleted by
the garbage collection algorithm. That will happen sometime
after the for loop exits and before the program exits. In
normal CPython I believe it happens immediately after the for
loop exits. However, that behavior is not guaranteed by the
language spec.
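
If the timing of the close matters, it can be made explicit rather than left to the garbage collector. The sketch below shows two ways of doing that, written for Python 3: a with block, and a plain try/finally. The sample file and variable names are made up for the example.

```python
import os
import tempfile

# Hypothetical sample file so the sketch runs on its own.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as out:
    out.write("a\nb\n")

# Option 1: the with block closes the file when the block is left,
# whether normally or through an exception.
with open(path) as f:
    lines = [line for line in f]
closed_after_with = f.closed

# Option 2: try/finally does the same job by hand.
g = open(path)
try:
    count = sum(1 for _ in g)
finally:
    g.close()

os.remove(path)
```
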
 
Caleb Hattingh

Marc

I don't know how it is handled, but I expect also that there is an implied
close().

thanks
Caleb
 
