How should I use grep from python?

Discussion in 'Python' started by Matthew Wilson, May 7, 2009.

  1. I'm writing a command-line application and I want to search through lots
    of text files for a string. Instead of writing the python code to do
    this, I want to use grep.

    This is the command I want to run:

    $ grep -l foo dir

    In other words, I want to list all files in the directory dir that
    contain the string "foo".

    I'm looking for the "one obvious way to do it" and instead I found no
    consensus. I could os.popen, commands.getstatusoutput, the subprocess
    module, backticks, etc.

    As of May 2009, what is the recommended way to run an external process
    like grep and capture STDOUT and the error code?


    TIA

    Matt
     
    Matthew Wilson, May 7, 2009
    #1

  2. Matthew Wilson wrote:

    > I'm writing a command-line application and I want to search through lots
    > of text files for a string. Instead of writing the python code to do
    > this, I want to use grep.
    >
    > This is the command I want to run:
    >
    > $ grep -l foo dir
    >
    > In other words, I want to list all files in the directory dir that
    > contain the string "foo".
    >
    > I'm looking for the "one obvious way to do it" and instead I found no
    > consensus. I could os.popen, commands.getstatusoutput, the subprocess
    > module, backticks, etc.
    >
    > As of May 2009, what is the recommended way to run an external process
    > like grep and capture STDOUT and the error code?


    subprocess. Which becomes pretty clear when reading its docs:

    """
    The subprocess module allows you to spawn new processes, connect to their
    input/output/error pipes, and obtain their return codes. This module
    intends to replace several other, older modules and functions, such as:
    os.system
    os.spawn*
    os.popen*
    popen2.*
    commands.*
    """

    Diez
     
    Diez B. Roggisch, May 7, 2009
    #2

  3. Matthew Wilson wrote:

    > I'm writing a command-line application and I want to search through lots
    > of text files for a string. Instead of writing the python code to do
    > this, I want to use grep.
    >
    > This is the command I want to run:
    >
    > $ grep -l foo dir
    >
    > In other words, I want to list all files in the directory dir that
    > contain the string "foo".
    >
    > I'm looking for the "one obvious way to do it" and instead I found no
    > consensus. I could os.popen, commands.getstatusoutput, the subprocess
    > module, backticks, etc.


    While it doesn't use grep or external processes, I'd just do it
    in pure Python:

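    import os

    def files_containing(location, search_term):
        """Yield names of files directly under location that contain search_term."""
        for fname in os.listdir(location):
            fullpath = os.path.join(location, fname)
            if os.path.isfile(fullpath):  # skip directories
                with open(fullpath) as f:
                    for line in f:
                        if search_term in line:
                            yield fname
                            break  # one hit per file is enough

    for fname in files_containing('/tmp', 'term'):
        print(fname)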

    It's fairly readable, you can easily tweak the search methods
    (case sensitive, etc), change it to be recursive by using
    os.walk() instead of listdir(), it's cross-platform, and doesn't
    require the overhead of an external process (along with the
    "which call do I use to spawn the process" questions that come
    with it :)
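
    The tweaks mentioned above (recursion via os.walk() and a
    case-insensitive match) could look roughly like the sketch below. The
    name files_containing_recursive and the ignore_case parameter are
    made up for illustration, and the code assumes Python 3 (for the
    errors= argument to open()).

```python
import os


def files_containing_recursive(location, search_term, ignore_case=False):
    """Yield full paths of files under location that contain search_term."""
    if ignore_case:
        search_term = search_term.lower()
    # os.walk() visits location and every directory below it.
    for dirpath, dirnames, filenames in os.walk(location):
        for fname in filenames:
            fullpath = os.path.join(dirpath, fname)
            try:
                # errors="replace" keeps undecodable bytes from raising.
                with open(fullpath, errors="replace") as f:
                    for line in f:
                        if ignore_case:
                            line = line.lower()
                        if search_term in line:
                            yield fullpath
                            break  # one hit per file is enough
            except OSError:
                continue  # unreadable file: skip it
```

    Unlike the listdir() version, this yields full paths rather than bare
    file names, since the same name can occur in several directories.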

    However, to answer your original question, I'd use os.popen which
    is the one I see suggested most frequently.

    -tkc
     
    Tim Chase, May 7, 2009
    #3
  4. On Thu 07 May 2009 09:09:53 AM EDT, Diez B. Roggisch wrote:
    > Matthew Wilson wrote:
    >>
    >> As of May 2009, what is the recommended way to run an external process
    >> like grep and capture STDOUT and the error code?

    >
    > subprocess. Which becomes pretty clear when reading its docs:


    Yeah, that's what I figured, but I wondered if there was already
    something newer and shinier aiming to bump subprocess off its throne.

    I'll just stick with subprocess for now. Thanks for the feedback!
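
    For reference, a minimal sketch of what the subprocess approach might
    look like. The helper name grep_files and its grep_cmd parameter are
    hypothetical, not from this thread. It also assumes a grep that
    supports -r, since searching a directory needs a recursive flag, and
    it relies on grep's usual exit codes (0 = match found, 1 = no match,
    2 or more = error).

```python
import subprocess


def grep_files(pattern, directory, grep_cmd="grep"):
    """Return (exit_code, matching_files) for a recursive `grep -l`.

    Hypothetical helper. Exit code 0 means at least one file matched,
    1 means nothing matched; anything higher is treated as an error.
    """
    proc = subprocess.Popen(
        # Argument list, so no shell is involved and no quoting is needed.
        [grep_cmd, "-rl", "-e", pattern, directory],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # give us str instead of bytes
    )
    out, err = proc.communicate()  # waits for grep and collects both pipes
    if proc.returncode > 1:
        raise RuntimeError("grep failed (%d): %s" % (proc.returncode, err.strip()))
    return proc.returncode, out.splitlines()
```

    Something like grep_files("foo", "dir") would then hand back the exit
    code together with one path per matching file.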
     
    Matthew Wilson, May 7, 2009
    #4
  5. On Thu 07 May 2009 09:25:52 AM EDT, Tim Chase wrote:
    > While it doesn't use grep or external processes, I'd just do it
    > in pure Python:


    Thanks for the code!

    I'm reluctant to take that approach for a few reasons:

    1. Writing tests for that code seems like a fairly large amount of work.
    I think I'd need to either mock out lots of stuff or create a bunch
    of temporary directories and files for each test run.

    I don't intend to test that grep works like it says it does. I'll
    just test that my code calls a mocked-out grep with the right options
    and arguments, and that my code behaves nicely when my mocked-out
    grep returns errors.

    2. grep is crazy fast. For a search through just a few files, I doubt
    it would matter, but when searching through a thousand files (which is
    likely) I suspect that an all-python approach might lag behind. I'm
    speculating here, though.

    3. grep has lots and lots of cute options. I don't want to think about
    implementing stuff like --color, for example. If I just pass all the
    heavy lifting to grep, I'm already done.

    On the other hand, your solution is platform-independent and has no
    dependencies. Mine depends on an external grep command.
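
    The testing idea from point 1 could be sketched as below, assuming
    Python 3's unittest.mock. The function list_matching_files stands in
    for the application code and is purely hypothetical, as is the
    fake_popen helper. No real grep is run in either test.

```python
import subprocess
from unittest import mock


def list_matching_files(pattern, directory):
    """Hypothetical code under test: a thin wrapper around `grep -rl`."""
    proc = subprocess.Popen(
        ["grep", "-rl", "-e", pattern, directory],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    out, err = proc.communicate()
    if proc.returncode > 1:  # grep signals errors with 2 or more
        raise RuntimeError(err.strip())
    return out.splitlines()


def fake_popen(returncode, out="", err=""):
    """Build a stand-in for subprocess.Popen that never runs grep."""
    proc = mock.Mock()
    proc.communicate.return_value = (out, err)
    proc.returncode = returncode
    return mock.Mock(return_value=proc)


def test_calls_grep_with_expected_arguments():
    popen = fake_popen(0, out="dir/a.txt\ndir/b.txt\n")
    with mock.patch("subprocess.Popen", popen):
        assert list_matching_files("foo", "dir") == ["dir/a.txt", "dir/b.txt"]
    args, kwargs = popen.call_args
    assert args[0] == ["grep", "-rl", "-e", "foo", "dir"]


def test_grep_error_is_reported():
    popen = fake_popen(2, err="grep: dir: No such file or directory\n")
    with mock.patch("subprocess.Popen", popen):
        try:
            list_matching_files("foo", "dir")
        except RuntimeError as exc:
            assert "No such file" in str(exc)
        else:
            raise AssertionError("expected RuntimeError")
```

    The tests only check the argument list and the error handling, which
    matches the goal of not re-testing grep itself.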

    Thanks again for the feedback!

    Matt
     
    Matthew Wilson, May 7, 2009
    #5
  6. Matthew Wilson wrote:

    > consensus. I could os.popen, commands.getstatusoutput, the subprocess
    > module, backticks, etc.


    Backticks do _not_ do what you think they do. In Python 2 they are
    shorthand for repr(), not shell-style command substitution.

    And with py3k they're also as dead as a dead parrot.
     
    Marco Mariani, May 7, 2009
    #6