Perl regex - How to make my greedy quantifier greedier?

Discussion in 'Perl Misc' started by cibalo, May 17, 2013.

  1. cibalo

    cibalo Guest

    Hello,

    I would like to do the string matching described in the title in Perl.
    Let's create some test files as follows.
    $ mkdir -vp testing/dir.a/dir_b/dir-c; cd testing/dir.a/dir_b/dir-c; \
    touch This_is_testing1_org.txt This-is-testing2_org.txt \
    this_is_testing3_org.txt this-is-testing4_org.txt; cd

    What I am looking for is a result similar to:
    $ find testing -type f -name "[a-z]*\.txt"
    testing/dir.a/dir_b/dir-c/this-is-testing4_org.txt
    testing/dir.a/dir_b/dir-c/this_is_testing3_org.txt
    I know it is easier to get the result this way.

    Now I try it with a Perl regex:
    $ ls testing/dir.a/dir_b/dir-c/* | perl -ne '/^(.*\/)([a-z].*)$/;
    print $1, " - ", $2, "\n";'
    testing/dir.a/dir_b/ - dir-c/This_is_testing1_org.txt
    testing/dir.a/dir_b/ - dir-c/This-is-testing2_org.txt
    testing/dir.a/dir_b/dir-c/ - this_is_testing3_org.txt
    testing/dir.a/dir_b/dir-c/ - this-is-testing4_org.txt
    Actually, I want my leftmost greedy quantifier, (.*\/), to be so
    greedy that it keeps the first two output lines from being listed.

    What interests me most is this:
    $ ls testing/dir.a/dir_b/dir-c/* | perl -ne '/^(.*\/)([A-Z].*)$/;
    print $1, " - ", $2, "\n";'
    testing/dir.a/dir_b/dir-c/ - This_is_testing1_org.txt
    testing/dir.a/dir_b/dir-c/ - This-is-testing2_org.txt
    testing/dir.a/dir_b/dir-c/ - This-is-testing2_org.txt
    testing/dir.a/dir_b/dir-c/ - This-is-testing2_org.txt
    "This-is-testing2_org.txt" is repeated three times.

    Can you please let me know what I'm missing?

    Thank you very much in advance!!!

    Best Regards,
    cibalo
    cibalo, May 17, 2013
    #1

  2. Damien Wyart

    Damien Wyart Guest

    * cibalo in comp.lang.perl.misc:
    > [...]


    > Now I try it with a Perl regex:
    > $ ls testing/dir.a/dir_b/dir-c/* | perl -ne '/^(.*\/)([a-z].*)$/;
    > print $1, " - ", $2, "\n";'
    > testing/dir.a/dir_b/ - dir-c/This_is_testing1_org.txt
    > testing/dir.a/dir_b/ - dir-c/This-is-testing2_org.txt
    > testing/dir.a/dir_b/dir-c/ - this_is_testing3_org.txt
    > testing/dir.a/dir_b/dir-c/ - this-is-testing4_org.txt
    > Actually, I want my leftmost greedy quantifier, (.*\/), to be so
    > greedy that it keeps the first two output lines from being listed.
    > [...]


    Strictly speaking, the answer to your question is the possessive
    quantifier '*+'; but it will not work in your regex. Instead, you need
    to exclude '/' from the second group so that it matches only the
    filename.

    You can read more on the topic (using regexes with paths and filenames)
    here: http://stackoverflow.com/questions/169008/regex-for-parsing-directory-and-filename

    --
    DW
    Damien Wyart, May 17, 2013
    #2