linecache and glob

jo3c

Hi everyone, happy new year!
I'm a newbie to Python and I have a question.
Using linecache and glob, how do I read a specific line from each file
in a batch and then insert it into a database?

It doesn't work: I can't use a glob wildcard with linecache.

Is there a better method? Thank you very much in advance.

jo3c
 
Jeremy Dillworth

Hello,

Welcome to Python!

glob returns a list of filenames, but getline is made to work on just
one filename.
So you'll need to iterate over the list returned by glob.

Maybe you could explain more about what you are trying to do and we
could help more?

Hope this helps,

Jeremy
 
jo3c

I have 2000 files, each with a header and data.
I need to get the date information from the header
and then insert it into my database.
I am doing it in a batch, so I use glob.glob('/mydata/*/*/*.txt').
To get the date on line 4 of a txt file I use
linecache.getline('/mydata/myfile.txt', 4)

but if I use
linecache.getline(glob.glob('/mydata/*/*/*.txt'), 4)
it won't work.

I am running out of ideas.

Thanks in advance for any help.

jo3c
 
Shane Geiger

import linecache
import glob

# reading from one file
print(linecache.getline('notes/python.txt', 4))
# prints, for example: http://www.python.org/doc/current/lib/

# reading from many files
for filename in glob.glob('/etc/*'):
    print(linecache.getline(filename, 4))





--
Shane Geiger
IT Director
National Council on Economic Education
402-438-8958 | http://www.ncee.net

Leading the Campaign for Economic and Financial Literacy
 
Fredrik Lundh

glob.glob returns a list of filenames, so you need to call getline once
for each file in the list.

but using linecache is absolutely the wrong tool for this; it's designed
for *repeated* access to arbitrary lines in a file, so it keeps all the
data in memory. that is, all the lines, for all 2000 files.
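To make the memory point concrete, here is a small sketch. The helper name fourth_line_then_forget and the throwaway demo file are made up for illustration. linecache.getline reads and caches the whole file just to return one line, and linecache.clearcache() drops everything cached so far:

```python
import linecache
import os
import tempfile


def fourth_line_then_forget(filename):
    """Return line 4 of filename, then drop linecache's cached lines."""
    line = linecache.getline(filename, 4)  # reads and caches every line of the file
    linecache.clearcache()                 # discards the cached lines of all files
    return line


# throwaway demo file so the sketch runs on its own
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("header 1\nheader 2\nheader 3\ndate: 2008-01-01\ndata\n")

result = fourth_line_then_forget(path)
os.remove(path)
print(result)
```

Clearing the cache after each file keeps memory flat, but at that point linecache is doing nothing that a plain open() would not do.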

if the files are small, and you want to keep the code short, it's easier
to just grab the file's content and use indexing on the resulting list:

import glob

for filename in glob.glob('/mydata/*/*/*.txt'):
    line = list(open(filename))[4-1]
    # ... do something with line ...

(note that line numbers usually start with 1, but Python's list indexing
starts at 0).

if the files might be large, use something like this instead:

import glob

for filename in glob.glob('/mydata/*/*/*.txt'):
    f = open(filename)
    # skip first three lines
    f.readline(); f.readline(); f.readline()
    # grab the line we want
    line = f.readline()
    f.close()
    # ... do something with line ...
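
If a line other than the fourth is needed, the same skip-then-read idea can be sketched with itertools.islice. The helper name nth_line and the throwaway demo directory (standing in for /mydata) are assumptions for illustration, not part of the original advice:

```python
import glob
import os
import tempfile
from itertools import islice


def nth_line(filename, n):
    """Return line n (1-based) of filename, or '' if the file is shorter."""
    with open(filename) as f:
        # islice skips the first n-1 lines lazily; next() takes line n,
        # falling back to '' when the file ends early
        return next(islice(f, n - 1, None), "")


# demo data: a temporary directory standing in for /mydata/*/*/*.txt
demo_dir = tempfile.mkdtemp()
for name in ("a.txt", "b.txt"):
    with open(os.path.join(demo_dir, name), "w") as f:
        f.write("h1\nh2\nh3\ndate for %s\nrest\n" % name)

fourth_lines = {}
for filename in sorted(glob.glob(os.path.join(demo_dir, "*.txt"))):
    fourth_lines[os.path.basename(filename)] = nth_line(filename, 4)

print(fourth_lines)
```

Only the lines up to the requested one are read, and the with block closes each file before the next one is opened.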

</F>
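
The thread does not say which database is in use, so the insert step can only be sketched under assumptions. The example below assumes SQLite via the standard sqlite3 module; the table name header_dates, its columns, and the helper store_header_dates are hypothetical:

```python
import sqlite3


def store_header_dates(conn, rows):
    """Insert (filename, date_line) pairs into a hypothetical header_dates table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS header_dates (filename TEXT, date_line TEXT)"
    )
    # parameterized query: values are passed separately from the SQL text
    conn.executemany(
        "INSERT INTO header_dates (filename, date_line) VALUES (?, ?)",
        [(name, line.strip()) for name, line in rows],
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
store_header_dates(conn, [("/mydata/a/b/one.txt", "2008-01-01\n")])
stored = conn.execute("SELECT filename, date_line FROM header_dates").fetchall()
print(stored)
```

With another database the connection call and the placeholder style would differ, but collecting the (filename, line) pairs first and inserting them in one batch should carry over.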
 
jo3c

Thank you guys. I did hit a wall using linecache, due to large files
loading into memory. I think this last solution works well for me.
Thanks.
 
