Any fancy grep utility replacements out there?

samslists

So I need to recursively grep a bunch of gzipped files. This can't be
easily done with grep, rgrep or zgrep. (I'm sure given the right
pipeline including using the find command it could be done....but
seems like a hassle).

So I figured I'd find a fancy next generation grep tool. Thirty
minutes of searching later I find a bunch in Perl, and even one in
Ruby. But I can't find anything that interesting or up to date for
Python. Does anyone know of something?

Thanks
 
George Sakkis

samslists said:
So I need to recursively grep a bunch of gzipped files. This can't be
easily done with grep, rgrep or zgrep. (I'm sure given the right
pipeline including using the find command it could be done....but
seems like a hassle).

If it's for something quick & dirty, you can't beat the pipeline, e.g.
something like:

find some_dir -name "*gz" | xargs -I {} sh -c "echo '== {} =='; zcat {} | grep some_pattern"
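
For anything beyond a one-off, roughly the same pipeline can be written in Python. This is only a minimal sketch, not code from any tool mentioned in this thread: the name grep_gzipped is made up for illustration, and it assumes the files are gzip-compressed UTF-8 text.

```python
import gzip
import os
import re


def grep_gzipped(root, pattern):
    """Yield (path, line_number, line) for matching lines in *.gz files under root."""
    regex = re.compile(pattern)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(".gz"):
                continue
            path = os.path.join(dirpath, name)
            # errors="replace" keeps undecodable bytes from aborting the search.
            with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, 1):
                    if regex.search(line):
                        yield path, number, line.rstrip("\n")
```

Looping over grep_gzipped("some_dir", "some_pattern") and printing each tuple gives output comparable to the shell pipeline, with line numbers added.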

George
 
Jeff Schwab

samslists said:
So I need to recursively grep a bunch of gzipped files. This can't be
easily done with grep, rgrep or zgrep. (I'm sure given the right
pipeline including using the find command it could be done....but
seems like a hassle).

So I figured I'd find a fancy next generation grep tool. Thirty
minutes of searching later I find a bunch in Perl, and even one in
Ruby. But I can't find anything that interesting or up to date for
Python. Does anyone know of something?

I don't know of anything in Python, but it should be straightforward to
write, and I'm betting somebody in this group can do it in one line.
(Did you see Arnaud's solution on the "Interesting math problem" thread?)

When you say "recursively," do you mean that you want to grep files in
nested subdirectories, or do you mean that archive elements should in
turn be expanded if they are themselves archives? If you encounter a
file that has been compressed twice (gzip|gzip), do you want to
uncompress it repeatedly until you get to the original file? For
example, given the following setup, what do you expect the output of
my_grep to be?

~$ mkdir sample && cd sample
sample$ for w in hello world; do echo $w |gzip -c >$w.gz; done
sample$ tar czf helloworld.tgz *.gz
sample$ my_grep hello -r .
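
One possible answer to the question above is "expand everything". The sketch below takes that interpretation: it peels gzip layers until none remain and descends into tar members. It is an illustration only; search_bytes and the "!" separator in labels are invented here, and it assumes that data starting with the gzip magic bytes really is valid gzip.

```python
import gzip
import io
import re
import tarfile

GZIP_MAGIC = b"\x1f\x8b"


def search_bytes(label, data, regex):
    """Yield (label, line) for matches, expanding gzip layers and tar members."""
    # Peel gzip layers until the payload no longer starts with the gzip magic number.
    while data[:2] == GZIP_MAGIC:
        data = gzip.decompress(data)
    try:
        with tarfile.open(fileobj=io.BytesIO(data), mode="r:") as archive:
            members = [(m.name, archive.extractfile(m).read())
                       for m in archive if m.isfile()]
    except tarfile.ReadError:
        members = None  # not a tar archive, so treat the data as plain content
    if members is not None:
        for name, payload in members:
            yield from search_bytes(f"{label}!{name}", payload, regex)
        return
    for line in data.splitlines():
        if regex.search(line):
            yield label, line.decode("utf-8", errors="replace")
```

The pattern has to be compiled from bytes, e.g. re.compile(rb"hello"). Under this interpretation, the sample setup would report a match both in hello.gz and in the hello.gz member inside helloworld.tgz.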
 
Robert Kern

samslists said:
So I need to recursively grep a bunch of gzipped files. This can't be
easily done with grep, rgrep or zgrep. (I'm sure given the right
pipeline including using the find command it could be done....but
seems like a hassle).

So I figured I'd find a fancy next generation grep tool. Thirty
minutes of searching later I find a bunch in Perl, and even one in
Ruby. But I can't find anything that interesting or up to date for
Python. Does anyone know of something?

I have a grep-like utility I call "grin". I wrote it mostly to recursively grep
SVN source trees while ignoring the garbage under the .svn/ directories and more
or less do exactly what I need most frequently without configuration. It could
easily be extended to open gzip files with GzipFile.

https://svn.enthought.com/svn/sandbox/grin/trunk/

Let me know if you have any requests.
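
To illustrate the two ideas above (skipping .svn directories and opening gzip files with GzipFile), here is a rough sketch. It is not grin's actual code; open_maybe_gzip, iter_files and SKIP_DIRS are hypothetical names, and deciding by the ".gz" suffix alone is an assumption.

```python
import gzip
import os

SKIP_DIRS = {".svn"}  # assumption: directory names to ignore during the walk


def open_maybe_gzip(path):
    """Return a binary file object, decompressing transparently for .gz names."""
    if path.endswith(".gz"):
        return gzip.GzipFile(path, "rb")
    return open(path, "rb")


def iter_files(root):
    """Yield file paths under root, never descending into SKIP_DIRS."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place tells os.walk not to descend into them.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            yield os.path.join(dirpath, name)
```

Because both branches of open_maybe_gzip return an object that yields bytes lines, the matching code that consumes it would not need to know whether the file was compressed.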

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco
 
Peter Wang

Robert Kern said:
I have a grep-like utility I call "grin". I wrote it mostly to recursively grep
SVN source trees while ignoring the garbage under the .svn/ directories and more
or less do exactly what I need most frequently without configuration. It could
easily be extended to open gzip files with GzipFile.

https://svn.enthought.com/svn/sandbox/grin/trunk/

Let me know if you have any requests.

And don't forget: Colorized output! :)


-Peter
 
samslists

Thanks to everyone who responded, and sorry for my late response.

Grin seems like the perfect solution for me. I finally had a chance
to download it and play with it today. It's great.

Robert...you were kind enough to ask if I had any requests. Just the
one right now of grepping through gzip files. If for some reason you
don't want to do it or don't have time to do it, I could probably do
it and send you a patch. But I imagine that since you wrote the code,
you could do it more elegantly than I could.

Thanks!

P.S. Robert....this program totally deserves a real web page, not
just being buried in an svn repository. I spent a lot of time looking
for a tool like this that was written in python. I imagine others
have as well, and have simply given up.
 
Floris Bruynooghe

Peter Wang said:
And don't forget: Colorized output! :)

I tried to find something similar a while ago and found ack[1]. I do
realise it's written in perl but it does the job nicely. Never needed
to search in zipfiles though, just unzipping them in /tmp would always
work...

I'll check out grin this afternoon!


Floris

[1] http://petdance.com/ack/
 
Robert Kern

Floris said:
I tried to find something similar a while ago and found ack[1]. I do
realise it's written in perl but it does the job nicely. Never needed
to search in zipfiles though, just unzipping them in /tmp would always
work...

Yup, I used ack for a little while before writing grin. Unfortunately, I got
hooked on context lines and these were unimplemented in ack.
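
For readers unfamiliar with context lines (in the style of grep -C), the idea can be sketched in a few lines of Python. This is an illustration under my own assumptions, not how grin or ack implement it; grep_with_context is a made-up name.

```python
import re
from collections import deque


def grep_with_context(lines, pattern, context=2):
    """Yield (line_number, line, is_match) for matches plus surrounding lines."""
    regex = re.compile(pattern)
    before = deque(maxlen=context)  # most recent lines not yet emitted
    remaining_after = 0             # trailing context lines still owed
    for number, line in enumerate(lines, 1):
        if regex.search(line):
            yield from before       # flush the leading context
            before.clear()
            yield number, line, True
            remaining_after = context
        elif remaining_after > 0:
            yield number, line, False
            remaining_after -= 1
        else:
            before.append((number, line, False))
```

The bounded deque keeps memory use constant, since only the last few unmatched lines are ever held.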

--
Robert Kern
 
Robert Kern

samslists said:
Thanks to everyone who responded, and sorry for my late response.

Grin seems like the perfect solution for me. I finally had a chance
to download it and play with it today. It's great.

Robert...you were kind enough to ask if I had any requests. Just the
one right now of grepping through gzip files. If for some reason you
don't want to do it or don't have time to do it, I could probably do
it and send you a patch. But I imagine that since you wrote the code,
you could do it more elegantly than I could.

I was hoping that the code would be understandable enough that it shouldn't
matter who's modifying it. But I have a plane trip tomorrow; I'll take a stab at it.

samslists said:
P.S. Robert....this program totally deserves a real web page, not
just being buried in an svn repository. I spent a lot of time looking
for a tool like this that was written in python. I imagine others
have as well, and have simply given up.

It will. Eventually.

--
Robert Kern
 
John J. Lee

samslists said:
So I need to recursively grep a bunch of gzipped files. This can't be
easily done with grep, rgrep or zgrep. (I'm sure given the right
pipeline including using the find command it could be done....but
seems like a hassle).

So I figured I'd find a fancy next generation grep tool. Thirty
minutes of searching later I find a bunch in Perl, and even one in
Ruby. But I can't find anything that interesting or up to date for
Python. Does anyone know of something?

Thanks

There must be a million of these scripts out there, maybe one per
programmer :) Here's mine:

http://codespeak.net/svn/user/jjlee/trunk/pygrep/


It doesn't do zip files. It has the usual file / dir blacklisting
feature (for avoiding backup files, etc.).

Oddities of this particular script are support for searching for
Python tokens in .py files, doctests, doctest files, and preppy 2
.prep template files. It also outputs in a format that allows you to
click on matches in emacs.

A few years back I was going to release it in the hope that other
people would write plugins for other templating systems, but then I
stopped doing lots of web stuff.

Actually, tokenizing based on a simple fixed "word boundary" rule
seems to work as well in many cases (pygrep doesn't do that) -- though
sometimes proper tokenization can be quite handy -- searching for a
particular Python name, Python string or number can be just what's
needed (pygrep does support that -- e.g. <no options>, -sep, -sebp,
-nep). Most of the time I just use the -t option though, which is
just substring match, just because it's fast and good enough for most
cases (most search strings are longish and so don't give lots of false
positives). The default is tokenized search for files it knows how to
tokenize (.py, .prep, etc.) and substring match for every other file
that's not blacklisted -- I find this good for small projects, but
too slow (there's no caching) for large projects.
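
The difference between tokenized search and substring match can be shown with the standard library's tokenize module. This is a small sketch of the general idea, not pygrep's implementation; find_name is a hypothetical helper.

```python
import io
import tokenize


def find_name(source, name):
    """Return (row, col) positions where `name` occurs as a Python NAME token."""
    hits = []
    readline = io.StringIO(source).readline
    for tok in tokenize.generate_tokens(readline):
        # Occurrences inside strings and comments arrive as STRING and COMMENT
        # tokens, so they are skipped here.
        if tok.type == tokenize.NAME and tok.string == name:
            hits.append(tok.start)
    return hits
```

A plain substring search for a name would also report occurrences inside string literals and comments, which is where the false positives mentioned above come from.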

Somebody at work has a nice little web-based tool that you can run as
a local server, and turns tokens (e.g. Python names -- but it's based
on some fast simple tokenizer that doesn't know about Python) into
links you can click on. The CSS is written so the link styling
doesn't show up until you hover the mouse over a token, IIRC. It
seems very efficient for exploring/reading and navigating source code
-- I only don't use it because it's not integrated with emacs. It
would be great if somebody could do the same in emacs, with back /
forward buttons :)


John
 
Robert Kern

samslists said:
Thanks to everyone who responded, and sorry for my late response.

Grin seems like the perfect solution for me. I finally had a chance
to download it and play with it today. It's great.

Robert...you were kind enough to ask if I had any requests. Just the
one right now of grepping through gzip files. If for some reason you
don't want to do it or don't have time to do it, I could probably do
it and send you a patch. But I imagine that since you wrote the code,
you could do it more elegantly than I could.

I just checked it in.

--
Robert Kern
 
