linecache and glob

Discussion in 'Python' started by jo3c, Jan 4, 2008.

  1. jo3c

    jo3c Guest

    hi everyone happy new year!
    im a newbie to python
    i have a question
    by using linecache and glob
    how do i read a specific line from a file in a batch and then insert
    it into database?

    because it doesn't work! i can't use glob wildcard with linecache

    >>> import linecache, glob
    >>> linecache.getline(glob.glob('/etc/*'), 4)


    doesn't work

    are there any better methods? thank you very much in advance

    jo3c
     
    jo3c, Jan 4, 2008
    #1

  2. Hello,

    Welcome to Python!

    glob returns a list of filenames, but getline is made to work on just
    one filename.
    So you'll need to iterate over the list returned by glob.

    >>> import linecache, glob
    >>> for filename in glob.glob('/etc/*'):
    ...     print(linecache.getline(filename, 4))


    Maybe you could explain more about what you are trying to do and we
    could help more?

    Hope this helps,

    Jeremy



     
    Jeremy Dillworth, Jan 4, 2008
    #2

  3. jo3c

    jo3c Guest

    i have 2000 files with header and data
    i need to get the date information from the header
    then insert it into my database
    i am doing it in batch so i use glob.glob('/mydata/*/*/*.txt')
    to get the date on line 4 in the txt file i use
    linecache.getline('/mydata/myfile.txt', 4)

    but if i use
    linecache.getline(glob.glob('/mydata/*/*/*.txt'), 4) it won't work

    i am running out of ideas

    thanks in advance for any help

    jo3c
     
    jo3c, Jan 4, 2008
    #3
  4. Shane Geiger

    Shane Geiger Guest

    import linecache
    import glob

    # reading from one file
    print(linecache.getline('notes/python.txt', 4))
    # prints the fourth line, e.g. http://www.python.org/doc/current/lib/

    # reading from many files
    for filename in glob.glob('/etc/*'):
        print(linecache.getline(filename, 4))







    --
    Shane Geiger
    IT Director
    National Council on Economic Education
    | 402-438-8958 | http://www.ncee.net

    Leading the Campaign for Economic and Financial Literacy
     
    Shane Geiger, Jan 4, 2008
    #4
  5. jo3c wrote:

    > i have 2000 files with header and data
    > i need to get the date information from the header
    > then insert it into my database
    > i am doing it in batch so i use glob.glob('/mydata/*/*/*.txt')
    > to get the date on line 4 in the txt file i use
    > linecache.getline('/mydata/myfile.txt', 4)
    >
    > but if i use
    > linecache.getline(glob.glob('/mydata/*/*/*.txt'), 4) it won't work


    glob.glob returns a list of filenames, so you need to call getline once
    for each file in the list.

    but using linecache is absolutely the wrong tool for this; it's designed
    for *repeated* access to arbitrary lines in a file, so it keeps all the
    data in memory. that is, all the lines, for all 2000 files.
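The memory behaviour can be observed directly. The following is a minimal Python 3 sketch, not part of the original reply. It assumes three throwaway files with a made-up date on line 4, and it peeks at `linecache.cache`, which is a module-level dict and an implementation detail rather than documented API. `linecache.clearcache()` is the documented way to release the cached lines.

```python
import linecache
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    # Hypothetical stand-ins for the data files: three header lines,
    # an invented date on line 4, then data.
    paths = []
    for i in range(3):
        path = os.path.join(tmpdir, "file%d.txt" % i)
        with open(path, "w") as f:
            f.write("header 1\nheader 2\nheader 3\n2008-01-0%d\ndata\n" % (i + 1))
        paths.append(path)

    # Each getline call reads the whole file and keeps every line cached.
    dates = [linecache.getline(path, 4) for path in paths]

    # Implementation detail: the cache is a dict keyed by filename.
    cached_before = sum(1 for path in paths if path in linecache.cache)

    # clearcache() drops all cached files and their lines.
    linecache.clearcache()
    cached_after = sum(1 for path in paths if path in linecache.cache)
```

With 2000 large files and no call to `clearcache()`, every file's full contents would stay in that dict for the life of the process.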

    if the files are small, and you want to keep the code short, it's easier
    to just grab the file's content and use indexing on the resulting list:

    for filename in glob.glob('/mydata/*/*/*.txt'):
        line = list(open(filename))[4-1]
        # ... do something with line ...

    (note that line numbers usually start with 1, but Python's list indexing
    starts at 0).

    if the files might be large, use something like this instead:

    for filename in glob.glob('/mydata/*/*/*.txt'):
        f = open(filename)
        # skip first three lines
        f.readline(); f.readline(); f.readline()
        # grab the line we want
        line = f.readline()
        f.close()
        # ... do something with line ...
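The same skip-then-read idea can be written for any line number with `itertools.islice`. This is a Python 3 sketch offered as a variant, not the code from the reply. The helper name `read_line` is made up for illustration, and returning an empty string for files that are too short is an assumption that mirrors what `linecache.getline` does.

```python
from itertools import islice


def read_line(path, lineno):
    """Return line `lineno` (1-based) of `path`, or '' if the file is shorter."""
    with open(path) as f:
        # islice skips the first lineno - 1 lines and yields at most one line;
        # next() takes it, falling back to '' when the file ends early.
        return next(islice(f, lineno - 1, lineno), "")
```

Inside the loop over `glob.glob('/mydata/*/*/*.txt')` this would be called as `read_line(filename, 4)`. Only the first four lines of each file are read, and the `with` block closes the file before the next one is opened.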

    </F>
     
    Fredrik Lundh, Jan 4, 2008
    #5
  6. jo3c

    jo3c Guest



    thank you guys, i did hit a wall using linecache, due to large files
    being loaded into memory.. i think this last solution works well for me
    thanks
     
    jo3c, Jan 8, 2008
    #6
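
The thread stops before the "insert it into database" step. Putting the pieces together, the whole task might look roughly like the Python 3 sketch below. The database is never named in the thread, so `sqlite3` from the standard library is used here purely as an assumption, and the `headers` table, its columns, and the function names are all hypothetical.

```python
import glob
import sqlite3
from itertools import islice


def header_lines(pattern, lineno=4):
    """Yield (filename, stripped line `lineno`) for each file matching `pattern`."""
    for filename in sorted(glob.glob(pattern)):
        with open(filename) as f:
            # Read only up to the wanted line; '' if the file is too short.
            line = next(islice(f, lineno - 1, lineno), "")
        yield filename, line.strip()


def load_dates(conn, pattern):
    """Store one row per file in a hypothetical `headers` table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS headers (filename TEXT PRIMARY KEY, date TEXT)"
    )
    # Parameterized query: the values are passed separately from the SQL text.
    conn.executemany(
        "INSERT OR REPLACE INTO headers (filename, date) VALUES (?, ?)",
        header_lines(pattern),
    )
    conn.commit()
```

Usage would be along the lines of `load_dates(sqlite3.connect('headers.db'), '/mydata/*/*/*.txt')`, where `headers.db` is again a made-up name. The date is stored as the raw text of line 4; parsing it into a real date type depends on the header format, which the thread does not show.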