How to grep using an array of patterns?

Peng Yu

If I only have a small number of patterns (say 3), I can just spell out
the matching code as below.

grep {/$pattern1$/ or /$pattern2$/ or /$pattern3$/} @array;

But if I have @patterns with many patterns that I want to grep for, the
above way doesn't work. I'm wondering what the best way to grep for many
patterns is.
 
Willem

Peng Yu wrote:
) If I only have a small number of patterns(say 3), I can just spell out
) the matching code as below.
)
) grep {/$pattern1$/ or /$pattern2$/ or /$pattern3$/} @array;
)
) But if I have @patterns with many patterns that I want grep, the above
) way doesn't work. I'm wondering what is the best way to grep many
) patterns.

That depends on a number of things, such as how many items you will be
matching, how many times you will reuse the same set of patterns, how complex
these patterns are, and whether you value maintainability over execution speed.

Possibilities I can think of offhand:
- Nested greps (two possible ways to nest)
- Nested greps with precompiled regexes
- Combining search patterns into one big regex

And there are probably more.


SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
 
Peng Yu

> Peng Yu wrote:
> ) If I only have a small number of patterns(say 3), I can just spell out
> ) the matching code as below.
> )
> ) grep {/$pattern1$/ or /$pattern2$/ or /$pattern3$/} @array;
> )
> ) But if I have @patterns with many patterns that I want grep, the above
> ) way doesn't work. I'm wondering what is the best way to grep many
> ) patterns.
>
> That depends on a number of things, such as how many items and times you
> will be matching with the same set of patterns, how complex these patterns
> are, and if you value maintainability over execution speed.

Let's say I have only 10 simple patterns (containing only non-special
characters: a-z, A-Z, _, and \.) and they are mutually exclusive (if a
file matches one pattern it cannot match another).

> Possibilities I can think of offhand:
> - Nested greps (two possible ways to nest)

It seems that the above one is the simplest solution for this
particular problem. Would you please show me some code on how to use
nested greps?
 
Willem

Peng Yu wrote:
)> That depends on a number of things, such as how many items and times you
)> will be matching with the same set of patterns, how complex these patterns
)> are, and if you value maintainability over execution speed.
)
) Let's say I have only 10 simple patterns (just have non-special
) characters, a-z, A-Z, _, and \.) and they are mutually exclusive (if a
) file matches one pattern it cannot match another).

In that case you could probably join them into one regex quite easily:

my $patt = join('|', @patterns);
grep { /$patt/ } @array;
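
One thing to watch with a joined pattern: the original grep anchored every pattern with a trailing `$`, and alternation binds loosely, so the anchor only applies to all alternatives if they are wrapped in a group. A minimal sketch, assuming the patterns are already valid regex fragments (as described above) and using made-up sample data:

```perl
use strict;
use warnings;

# Hypothetical sample data, not from the original post.
my @patterns = ('\.txt', '\.pl', '_bak');
my @array    = ('notes.txt', 'script.pl', 'old_bak', 'txt.notes', 'image.png');

# Wrap the alternation in a non-capturing group so the trailing $
# anchors every alternative, not just the last one.
my $alt = join '|', @patterns;
my $re  = qr/(?:$alt)$/;

my @matches = grep { /$re/ } @array;
print "$_\n" for @matches;   # prints notes.txt, script.pl, old_bak
```

Without the `(?:...)` group, `\.txt|\.pl|_bak$` would anchor only `_bak`, and `txt.notes` would slip through.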

)> Possibilities I can think of offhand:
)> - Nested greps (two possible ways to nest)
)
) It seems that the above one is the simplest solution for this
) particular problem. Would you please show me some code on how to use
) nested greps?

grep { my $x = $_; grep { $x =~ /$_/ } @patterns } @array;
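
Spelled out as a runnable script, the nested grep might look like the sketch below. The sample data and the variable names `$item` and `@matches` are made up for illustration. The inner grep is evaluated in scalar context, so it yields the number of patterns that matched, which the outer grep treats as true or false.

```perl
use strict;
use warnings;

# Hypothetical sample data, not from the thread.
my @patterns = ('\.txt$', '\.pl$');
my @array    = ('notes.txt', 'script.pl', 'image.png');

my @matches = grep {
    my $item = $_;                       # save the outer $_ before the inner grep rebinds it
    grep { $item =~ /$_/ } @patterns;    # scalar context: count of patterns that matched
} @array;

print "$_\n" for @matches;   # prints notes.txt, script.pl
```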

)> - Nested greps with precompiled regexes
)> - Combining search patterns into one big regex
)>
)> And there are probably more.
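
For the precompiled variant in the list above, a minimal sketch could look like this. It assumes the patterns are valid regexes, and the data is hypothetical. Whether it is measurably faster than interpolating strings on every match depends on the Perl version and the data, so treat it as something to benchmark rather than a guaranteed win.

```perl
use strict;
use warnings;

# Hypothetical sample data, not from the thread.
my @patterns = ('^foo', 'bar$', '_baz_');
my @array    = ('foo_one', 'two_bar', 'x_baz_y', 'nothing');

# Compile each pattern once, up front, into a regex object.
my @compiled = map { qr/$_/ } @patterns;

my @matches = grep {
    my $item = $_;
    grep { $item =~ $_ } @compiled;   # reuse the compiled regexes for every item
} @array;

print "$_\n" for @matches;   # prints foo_one, two_bar, x_baz_y
```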

SaSW, Willem
 
John Bokma

Tad McClellan said:
> Please refrain from re-asking Frequently Asked Questions, it
> is getting tiresome...

So are ivory-tower replies.

> perldoc -q many
>
> How do I efficiently match many regular expressions at once?

Where is the example using grep as the OP asked? Oh, wait...

Again Tad, you DON'T HAVE TO POST. Control that smartass urge of yours,
thanks. It will make this group way more friendly.
 
Charlton Wilbur

JB> Again Tad, you DON'T HAVE TO POST. Control that smartass urge of
JB> yours, thanks. It will make this group way more friendly.

Likewise, NOBODY REQUIRES YOU TO READ EVERY POST.

I'd rather the group be expert-friendly than lazy-newbie-friendly,
myself.

Charlton
 
John Bokma

Charlton Wilbur said:
> JB> Again Tad, you DON'T HAVE TO POST. Control that smartass urge of
> JB> yours, thanks. It will make this group way more friendly.
>
> Likewise, NOBODY REQUIRES YOU TO READ EVERY POST.

Well, the problem is, one has to open a post to see what is in it. And
since Tad does now and then post something interesting, I still consider
it worth opening his posts. In short, while true, your statement is
somewhat pointless.

> I'd rather the group be expert-friendly than lazy-newbie-friendly,
> myself.

I prefer something in the middle [1]. Also, try to keep in mind that a
lazy-newbie question can result in an interesting discussion that people
who are not lazy and who want to learn might cherish. I am sure you are
able to recall several examples of this.

Personally I wouldn't mind if everybody stopped replying to lazy-newbies
and the FAQ bot posted a daily message:

If you wonder why you don't get a reply...

with the usual pointers.

I am sure that it would improve the readability and the friendliness of
this group. And since this sounds reasonable (at least to me), it will
never see the light of day.

[1] I hate it when regulars can get away with stuff they wouldn't let
any newbie get away with.
 
Uri Guttman

JB> [1] I hate it when regulars can get away with stuff they wouldn't let
JB> any newbie get away with.

that is why they are called regulars. they have (usually) earned some
reputation points or have external (like cpan and other perl community)
experience worth listening to. teaching a newbie to use the FAQ is a
good thing. how that is done may be your issue but i like tad's way as
it hits hard and that is usually needed for newbies to get off the track
of asking FAQs. perl's faq collection is massive and written and edited
(here) very well. it is the best way to leverage the learning curve. i
always recommend to newbies to scan/skim the entire faq as soon as they
can. then read it in more depth as needed.

uri
 
John Bokma

Uri Guttman said:
> JB> [1] I hate it when regulars can get away with stuff they wouldn't let
> JB> any newbie get away with.
>
> that is why they are called regulars. they have (usually) earned some
> reputation points or have external (like cpan and other perl community)
> experience worth listening to.

I am not talking about reputation etc. I am glad that you replied, to be
honest. Why do you think it's OK to write file::slurp when you mean
File::Slurp?

> good thing. how that is done may be your issue but i like tad's way as
> it hits hard

Yes, it's fucking annoying (since you like hard, there it is). As a
regular one should, IMO, learn to stand above it and not get pissed
off at every single newbie that shows up. It's not that much harder to
ask/reprimand nicely. I mean I can ask you nicely to press the shift key
now and then to make your postings a bit easier on the eye, and also not
to confuse newbies with non-existent pragmatic (!) modules like
file::slurp. Or I can slap you around every day. I prefer the first
because it hopefully has an effect.

I do like to keep reading posts by you, Tad and several others. But it
gets harder and harder to pick the fruit, since too many posts are just
bashing newbies. While I agree that this is not a help desk for people
who can't be bothered to scratch their own asses, I think there is a
better way to handle it. Like I wrote in an earlier post: have the
faq-bot post daily: "Why doesn't anyone answer my question?" with the
pointers to the FAQ, posting guidelines, etc. No more need to bash
newbies, and more time for fun discussions. And trust me, while they
will ignore stuff like "posting guidelines" they will read "Why doesn't"
because that's the question they have ;-).

If you have a better idea, I am all for it. But personally I am really
starting to dislike the negative energy a lot of posts here have. While it
might feel good to set someone straight, it might scare off others (like
me, and I am not easily scared away).

I also read comp.lang.python and the climate is very, very different
from comp.lang.perl.misc. And no: "move over there, and stay away
from here" is not a good answer. I hope you agree.

Thanks for reading,
 
John Bokma

Tad McClellan said:
> Hey! I agree with you about something!

It isn't the first time, and won't be the last, so why the surprise?

> I don't get pissed at every single newbie that shows up.

OK, my bad: at every single newbie that should have done a bit more
research in the eyes of the regulars.

> Usenet is a last-resort resource, not the first resort.

I agree, no question about that. I only think that in a group with
dwindling traffic it becomes annoying if quite a few posts just reiterate
what has been posted daily for the past years. I thought that
programming was also about seeing a problem and providing an automated
solution to it.

> Right, but it is easier to ignore when delivered nicely.

Again, you don't have to reply. If your advice is ignored, score the
poster low, and ignore his/her future posts.

> Ignoring accepted netiquette is certainly bad for our newsgroup.

So is behavior that most (I hope) wouldn't show when in public, actually
facing the newbie.

> I don't want bad for this newsgroup.

Me neither, hence my attempt to look for a solution that will reduce the
constant stream of correcting newbies. It rarely works.

> Uri's "style" annoys the hell out of me too.

So why don't you reply to his posts trying to correct him with harsh
language? OTOH, maybe my friendly request has worked. It doesn't matter
which happens since either will prove a point I made :).

> None, none!, of my followups to this thread's OP were "just" bashing.

I talk in general: too many posts IMO in this newsgroup are bashing
newbies. If you were just bashing we wouldn't have this discussion to
begin with.

> (there's that exaggeration thing yet again, your credibility with me
> approaches zero)

Aren't too many posts here just bashing? IMO there are. You can disagree
with that, but it has nothing to do with my credibility. Maybe we have a
different definition of bashing.

> Each and every one of my followups included help (along with the
> bashing).

Help that comes with a kick in the balls is just a kick in the balls IMO.

> It is a waste of your time to try and convince me otherwise, as I've
> been here far too long to believe that your approach is better, and
> I find it hard to believe what you say anyway.

OK, well, in that case this group just isn't for me.

> Are you waiting for someone else to implement your idea for you?

Did I write that? I have no problem providing the code. I also have no
problem running a bot daily *but* I think it's easier to have the current
FAQ bot post one additional message. On top of that I am not going to
run a bot without the approval of the majority of regulars here.

> I doubt that, but go ahead and try it and we'll find out.

I doubt it, since it only works if everybody in this group ignores each
question that doesn't use this group as the last resort and/or doesn't
follow the posting guidelines.

> If you are unwilling to spend 10 minutes grepping the std docs
> before asking your question on Usenet, then Usenet is better off
> if you _are_ scared away.

I thought you gave classes at Stonehenge but just read that you only
sell them. Figures.
 
John Bokma

Tad McClellan said:
> Ran out of other approaches,

To be honest, I knew from the start it would be pointless to discuss
things with asshats like you and Uri. So, good luck with your Perl
"community". I am sure the newbies will find greener pastures at Stack
Overflow or elsewhere.
 
sln

> If I only have a small number of patterns(say 3), I can just spell out
> the matching code as below.
>
> grep {/$pattern1$/ or /$pattern2$/ or /$pattern3$/} @array;
>
> But if I have @patterns with many patterns that I want grep, the above
> way doesn't work. I'm wondering what is the best way to grep many
> patterns.

Late to the thread, you have some good potential solutions by now.

I just want to make a correction to your statement.
If you have @patterns, it is indeed possible to spell it out and it
does work:

grep {/$patterns[0]$/ or /$patterns[1]$/ or /$patterns[2]$/} @array;

I don't buy into the propaganda that @patterns are useful.
Here's why.

When you split up a real regular expression, those with quantifiers,
groups, and modifiers, the potential for disaster increases exponentially.
Throwing them into "strings" to be dynamically compiled is another problem.
Then there's that dynamic string problem.
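
To make the dynamic-string concern concrete, here is a small sketch of two common guards. This is only an illustration with made-up inputs, not something proposed earlier in the thread: `quotemeta` makes a string match literally, and compiling inside `eval` turns a malformed pattern into a failure you can check instead of a fatal error.

```perl
use strict;
use warnings;

# Hypothetical inputs: one literal string and one malformed regex.
my $literal = 'a.b';
my $broken  = '(unclosed';

# quotemeta escapes regex metacharacters, so the dot matches only a dot.
my $safe = quotemeta $literal;
my @hits = grep { /$safe/ } ('a.b', 'axb');

# A pattern built at run time dies if it is malformed; eval catches that.
my $re = eval { qr/$broken/ };

print "literal hits: @hits\n";                         # prints literal hits: a.b
print defined $re ? "compiled\n" : "rejected: $@";
```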

The FAQs give the simplest of simple examples and are not real world.
Multi-part regular expressions are fraught with danger in the hands of the novice
(hell, even the experts).

So, be sure to put this note in the book you are having everybody write for you.
Sorry, no code examples, check the faq.

-sln
 
sln

Oh, btw, grep() sucks. It can't be stopped once you have found what you need.
So don't use it for that. It's just a filter (that can't be stopped).
Put that in your book.
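
When only the first hit is wanted, one alternative is `first` from the core List::Util module, which stops scanning as soon as its block returns true. A minimal sketch with hypothetical data; the `$checked` counter is there only to show the early exit:

```perl
use strict;
use warnings;
use List::Util qw(first);

# Hypothetical sample data, not from the thread.
my @array = ('image.png', 'notes.txt', 'script.pl');
my $re    = qr/\.(?:txt|pl)$/;

my $checked = 0;
my $hit = first { $checked++; /$re/ } @array;   # stops at the first true result

print "first match: $hit after $checked checks\n";   # prints first match: notes.txt after 2 checks
```

The third element is never tested, whereas grep would have evaluated all three.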

-sln
 
