Perl regex - How to make my greedy quantifier greedier?

cibalo

Hello,

I would like to try some string matching in Perl, as described in the title.
Let's create some test files as follows.
$ mkdir -vp testing/dir.a/dir_b/dir-c; cd testing/dir.a/dir_b/dir-c; \
touch This_is_testing1_org.txt This-is-testing2_org.txt \
this_is_testing3_org.txt this-is-testing4_org.txt; cd

What I am looking for is a result similar to:
$ find testing -type f -name "[a-z]*\.txt"
testing/dir.a/dir_b/dir-c/this-is-testing4_org.txt
testing/dir.a/dir_b/dir-c/this_is_testing3_org.txt
I know it is easier to get the result this way.

Now I try with a Perl regex:
$ ls testing/dir.a/dir_b/dir-c/* | perl -ne '/^(.*\/)([a-z].*)$/;
print $1, " - ", $2, "\n";'
testing/dir.a/dir_b/ - dir-c/This_is_testing1_org.txt
testing/dir.a/dir_b/ - dir-c/This-is-testing2_org.txt
testing/dir.a/dir_b/dir-c/ - this_is_testing3_org.txt
testing/dir.a/dir_b/dir-c/ - this-is-testing4_org.txt
Actually, I want my leftmost greedy quantifier, (.*\/), to be greedy
enough that it keeps the first two output lines from being listed.

What interests me most is this:
$ ls testing/dir.a/dir_b/dir-c/* | perl -ne '/^(.*\/)([A-Z].*)$/;
print $1, " - ", $2, "\n";'
testing/dir.a/dir_b/dir-c/ - This_is_testing1_org.txt
testing/dir.a/dir_b/dir-c/ - This-is-testing2_org.txt
testing/dir.a/dir_b/dir-c/ - This-is-testing2_org.txt
testing/dir.a/dir_b/dir-c/ - This-is-testing2_org.txt
"This-is-testing2_org.txt" is repeated three times.

Can you please let me know what I'm missing?

Thank you very much in advance!!!

Best Regards,
cibalo
 
Damien Wyart

* cibalo said:
Now I try with a Perl regex:
$ ls testing/dir.a/dir_b/dir-c/* | perl -ne '/^(.*\/)([a-z].*)$/;
print $1, " - ", $2, "\n";'
testing/dir.a/dir_b/ - dir-c/This_is_testing1_org.txt
testing/dir.a/dir_b/ - dir-c/This-is-testing2_org.txt
testing/dir.a/dir_b/dir-c/ - this_is_testing3_org.txt
testing/dir.a/dir_b/dir-c/ - this-is-testing4_org.txt
Actually, I want my leftmost greedy quantifier, (.*\/), to be greedy
enough that it keeps the first two output lines from being listed.
[...]

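Strictly speaking, what you were asking for is the possessive quantifier
'*+', but it will not work in your regex: a possessive '.*+' swallows the
whole line and never gives back the final '/', so the pattern cannot match
at all. Instead, you need to exclude '/' from the second group so that it
matches only the filename.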
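
A minimal sketch of that fix, assuming the four file names from the original post. The helper name split_lowercase_path is made up for illustration. The second group uses [^/]* instead of .*, so it cannot span a slash, and the first group is forced to take the whole directory part. The last statement shows the possessive '.*+' failing to match (possessive quantifiers need Perl 5.10 or later).

```perl
use strict;
use warnings;

# Hypothetical helper: returns (directory, filename) when the filename
# starts with a lowercase letter, or an empty list when nothing matches.
sub split_lowercase_path {
    my ($path) = @_;
    # [^/]* cannot cross a '/', so group 2 is confined to the last component.
    return $path =~ m{^(.*/)([a-z][^/]*)$} ? ($1, $2) : ();
}

# Same file names as in the original post.
my @paths = map { "testing/dir.a/dir_b/dir-c/$_" } qw(
    This_is_testing1_org.txt This-is-testing2_org.txt
    this_is_testing3_org.txt this-is-testing4_org.txt
);

for my $path (@paths) {
    # An empty return list makes the assignment false, so we skip the path.
    my ($dir, $file) = split_lowercase_path($path) or next;
    print "$dir - $file\n";
}

# A possessive .*+ takes the whole string and never backtracks to the
# last '/', so this pattern cannot match any of the paths.
print "possessive: no match\n"
    unless $paths[2] =~ m{^(.*+/)([a-z].*)$};
```

With these paths, only the two lowercase file names are printed, each with testing/dir.a/dir_b/dir-c/ as its directory part.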

You can read more on the topic (using regexes with paths and filenames)
here: http://stackoverflow.com/questions/169008/regex-for-parsing-directory-and-filename
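
As for "This-is-testing2_org.txt" being printed three times in the [A-Z] run: when a match fails, Perl leaves $1 and $2 holding the values from the last successful match, and the one-liner prints them whether or not the current line matched. A small sketch of that behaviour and of the usual guard, assuming the same four file names; matching_lines is a hypothetical helper name.

```perl
use strict;
use warnings;

my $re = qr{^(.*/)([A-Z][^/]*)$};

# Hypothetical helper: one "dir - file" string per path that matches.
sub matching_lines {
    my @out;
    for my $path (@_) {
        # Read $1 and $2 only when this match succeeded.
        push @out, "$1 - $2" if $path =~ $re;
    }
    return @out;
}

# Stale captures in isolation: the second match fails, so $1 and $2
# still hold the values set by the first one.
"dir/Upper.txt" =~ $re;    # succeeds, sets $1 and $2
"dir/lower.txt" =~ $re;    # fails, $1 and $2 are left alone
print "$1 - $2\n";         # prints "dir/ - Upper.txt"

my @paths = map { "testing/dir.a/dir_b/dir-c/$_" } qw(
    This_is_testing1_org.txt This-is-testing2_org.txt
    this_is_testing3_org.txt this-is-testing4_org.txt
);
print "$_\n" for matching_lines(@paths);
```

In the one-liner, the same guard would be a trailing "if" on the print statement, so that nothing is printed for lines that do not match.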
 
