Expressing AND, OR, and NOT in a Single Pattern

Discussion in 'Perl Misc' started by usaims, Mar 1, 2007.

  1. usaims

    usaims Guest

    I'm having a little problem with this example in the Perl Cookbook.

    True if pattern BAD does not match, but pattern GOOD does:
    /(?=(?:(?!BAD).)*$)GOOD/s

    My objective is to print only lines that have 'suspended' but not
    'Data_services'. It is still printing lines with 'suspended' and
    'Data_services' in the same line. So, ideally, this script should
    not print any lines. Correct me if I am wrong.

    ##############################
    #!/usr/bin/perl
    use strict;
    use diagnostics;
    use warnings;

    my @stuff = <DATA>;

    foreach my $foo (@stuff) {
        if ($foo =~ /(?=(?:(?!Data_services).)*$)suspended/s) {
            print $foo;
        }
    }
    close(DATA);

    __DATA__
    <Query id='Data_services.LSSI_Weekly.42' suspended='1' error='Loading
    Data Only - cannot run query' wuid='W20070227-140132'
    associatedName='libW20070227-140132.so'/>
    <Query id='Data_services.SSNMapKeys.14' suspended='1' error='Loading
    Data Only - cannot run query' wuid='W20070105-115230'
    associatedName='libW20070105-114650.so'/>
    <Query id='Data_services.WatercraftKeys.5' suspended='1'
    error='Loading Data Only - cannot run query' wuid='W20070123-114242'
    associatedName='libW20070123-114242.so'/>
    usaims, Mar 1, 2007
    #1

  2. usaims

    Scott Bryce Guest

    usaims wrote:

    > My objective is to print only lines that have 'suspended' but not
    > 'Data_services'.


    I prefer to use index for something like this.

    > It is still printing lines with 'suspended' and
    > 'Data_services' in the same line. So, ideally, this script should
    > not print any lines. Correct me if I am wrong.


    There are no lines in your given data that meet your criteria.

    Here's my shot at it...

    use strict;
    use warnings;

    while (<DATA>)
    {
        next if index($_, 'Data_services') > -1;
        print $_ if index($_, 'suspended') > -1;
    }

    __DATA__
    <Query id='Data_services.LSSI_Weekly.42' suspended='1' error='Loading
    Data Only - cannot run query' wuid='W20070227-140132'
    associatedName='libW20070227-140132.so'/>
    <Query id='Data_services.SSNMapKeys.14' suspended='1' error='Loading
    Data Only - cannot run query' wuid='W20070105-115230'
    associatedName='libW20070105-114650.so'/>
    <Query id='Data_services.WatercraftKeys.5' suspended='1' error='Loading
    Data Only - cannot run query' wuid='W20070123-114242'
    associatedName='libW20070123-114242.so'/>
    <Query id='Other_services.SSNMapKeys.14' suspended='1' error='Loading
    Data Only - cannot run query' wuid='W20070105-115230'
    associatedName='libW20070105-114650.so'/>
    Scott Bryce, Mar 1, 2007
    #2

  3. usaims

    Guest

    "usaims" <> wrote:
    > I'm having a little problem with this example in the Perl Cookbook.
    >
    > True if pattern BAD does not match, but pattern GOOD does:
    > /(?=(?:(?!BAD).)*$)GOOD/s


    No character from the start of the match to the end of the string may
    be the start of a match of BAD. However, if BAD occurs before GOOD, the
    regex can still match, simply by not starting the match until after the
    B of BAD.

    You want the forced exclusion to start at the beginning of the string
    and run to the end:

    /^(?=(?:(?!BAD).)*$).*GOOD/;

    But I'd just use two different regexes.

    Xho

    , Mar 1, 2007
    #3
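
A minimal sketch of the point made in post #3, assuming simplified one-line records modelled on the original data. The sample strings and the sub names cookbook_match and anchored_match are hypothetical. It compares the Cookbook pattern with the anchored variant on a line where 'Data_services' comes before 'suspended'.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample lines (assumption: each record fits on one line).
my $both = "id='Data_services.SSNMapKeys.14' suspended='1'\n";
my $good = "id='Other_services.SSNMapKeys.14' suspended='1'\n";

# Cookbook form: the lookahead only inspects text from the match start onward.
sub cookbook_match {
    my ($line) = @_;
    return $line =~ /(?=(?:(?!Data_services).)*$)suspended/s ? 1 : 0;
}

# Anchored form: the lookahead is forced to begin at the start of the string.
sub anchored_match {
    my ($line) = @_;
    return $line =~ /^(?=(?:(?!Data_services).)*$).*suspended/s ? 1 : 0;
}

for my $line ($both, $good) {
    printf "cookbook=%d anchored=%d  %s",
        cookbook_match($line), anchored_match($line), $line;
}
```

On the first sample the Cookbook form reports a match, because the engine is free to start the match at 'suspended', after 'Data_services' has already gone by. The anchored form rejects that line. Both forms accept the second sample.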
  4. usaims

    h3xx Guest

    I like doing things in one line:

    print grep { /suspended/ && ! /Data_services/ } <DATA>;
    h3xx, Mar 1, 2007
    #4
  5. usaims

    gf Guest

    On Mar 1, 3:18 pm, "h3xx" <> wrote:
    > I like doing things in one line:
    >
    > print grep { /suspended/ && ! /Data_services/ } <DATA>;



    I prefer this method too. For clarity and long-term maintenance it is
    much better because the esoterica of regex can make the desired
    results hard to figure out and the bugs in the pattern even harder to
    find.

    Also, speed-wise, this is a lot faster. The regex engine has to do a
    lot of work that can be short-circuited by the booleans.

    Sometimes it's better to break the search for matching patterns into
    single lines too. It's kind of macho programmer-wise to string it all
    together into one mondo regex pattern and have it work, but the logic
    can get fragile.

    The only thing I'd do differently to these patterns is add an anchor
    to the 'Data_services' pattern, like so...

    /^<Query id='Data_services/

    Anchors can speed up a regex a great deal. I ran benchmarks of index()
    against various ways of using regexes, and an anchored qr// that was
    compiled outside the loop was the fastest at finding patterns inside
    long strings when the pattern was at the end of the string. At the
    beginning of a string it should be equal to index(). index() was
    faster when finding a fixed string somewhere in the middle of another
    string.
    gf, Mar 2, 2007
    #5
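
A rough sketch of how a comparison like the one described in post #5 could be set up with the core Benchmark module. The test string, the iteration count, the end anchor, and the candidate names are all assumptions, not the benchmark gf actually ran, and the ranking may well differ between perl versions and inputs.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical haystack: long padding with the target at the very end.
my $needle   = "suspended='1'";
my $haystack = ('x' x 10_000) . $needle;

# Compile the patterns once, outside the timed code.
my $plain    = qr/suspended='1'/;    # unanchored
my $anchored = qr/suspended='1'$/;   # anchored to the end of the string

my %candidates = (
    index    => sub { index($haystack, $needle) > -1 },
    regex    => sub { $haystack =~ $plain },
    anchored => sub { $haystack =~ $anchored },
);

# Run each candidate a fixed number of times and print a comparison table.
cmpthese(200_000, \%candidates);
```

Moving the needle to the start or the middle of the padding is one way to check the other cases mentioned in the post.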
  6. usaims

    gf Guest

    gf, Mar 2, 2007
    #6
  7. usaims

    Brian McCauley Guest

    On Mar 1, 10:10 pm, wrote:
    > "usaims" <> wrote:
    > > I'm having a little problem with this example in the Perl Cookbook.
    >
    > > True if pattern BAD does not match, but pattern GOOD does:
    > > /(?=(?:(?!BAD).)*$)GOOD/s
    >
    > Every character from the start of the match to the end of the string
    > has to not (be the start of a) match to BAD. However, if BAD occurs before
    > GOOD, the regex can still match, simply by not initiating the match until
    > after the B of BAD.
    >
    > You want the forced exclusion to start at the beginning of the string
    > and run to the end:
    >
    > /^(?=(?:(?!BAD).)*$).*GOOD/;


    That's exponentially (er, factorially?) inefficient!

    /^(?!.*BAD).*GOOD/;

    > But I'd just use two different regex.


    Yes, of course, that's still the best way.
    Brian McCauley, Mar 4, 2007
    #7
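
A small sketch of the simplified pattern from post #7 next to the two-regex test the posters prefer, assuming one-line records. The sample records and the sub names single_pattern and two_patterns are hypothetical. Note that without /s the dot does not cross a newline, so this form only looks at the first line of a multi-line string.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Single-pattern form: reject at the start if BAD occurs anywhere on the line.
sub single_pattern {
    my ($line) = @_;
    return $line =~ /^(?!.*Data_services).*suspended/ ? 1 : 0;
}

# Two-pattern form: one positive test and one negative test.
sub two_patterns {
    my ($line) = @_;
    return ($line =~ /suspended/ && $line !~ /Data_services/) ? 1 : 0;
}

# Hypothetical one-line records modelled on the data in the thread.
my @records = (
    "id='Data_services.LSSI_Weekly.42' suspended='1'\n",
    "id='Other_services.SSNMapKeys.14' suspended='1'\n",
    "suspended='1' id='Data_services.WatercraftKeys.5'\n",
    "id='Other_services.Idle.1' active='1'\n",
);

# Only the second record has 'suspended' without 'Data_services'.
for my $rec (@records) {
    print $rec if single_pattern($rec);
}
```

The two subs should agree on every single-line input, which makes the two-pattern version a convenient cross-check for the lookahead.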
  8. usaims

    Guest

    "Brian McCauley" <> wrote:
    > On Mar 1, 10:10 pm, wrote:
    > > "usaims" <> wrote:
    > > > I'm having a little problem with this example in the Perl Cookbook.
    > >
    > > > True if pattern BAD does not match, but pattern GOOD does:
    > > > /(?=(?:(?!BAD).)*$)GOOD/s
    > >
    > > Every character from the start of the match to the end of the string
    > > has to not (be the start of a) match to BAD. However, if BAD occurs
    > > before GOOD, the regex can still match, simply by not initiating the
    > > match until after the B of BAD.
    > >
    > > You want the forced exclusion to start at the beginning of the
    > > string and run to the end:
    > >
    > > /^(?=(?:(?!BAD).)*$).*GOOD/;
    >
    > That's exponentially (er, factorially?) inefficient!


    Under what conditions is it exponential? With the patterns I've tested,
    it seems to be linear, not exponential. (But it is still quite a lot
    slower than yours, for reasons I don't quite understand. It would make
    more sense to me if it were exponentially slower, rather than a constant
    30 times slower.)

    Xho

    >
    > /^(?!.*BAD).*GOOD/;
    >
    > > But I'd just use two different regex.

    >
    > Yes, of course, that's still the best way.


    , Mar 4, 2007
    #8
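
One way to probe the linear-versus-exponential question raised in post #8 is to time both patterns on inputs of doubling length. This is only a sketch: the input strings, the lengths, the repetition count, and the helper name time_matches are assumptions, and it covers just the case where BAD never occurs.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes ();

# The two single-regex forms discussed in the thread, compiled once.
my $tempered = qr/^(?=(?:(?!BAD).)*$).*GOOD/;
my $simple   = qr/^(?!.*BAD).*GOOD/;

# Run $reps matches of $re against $text and return the elapsed seconds.
sub time_matches {
    my ($re, $text, $reps) = @_;
    my $start = Time::HiRes::time();
    for (1 .. $reps) {
        my $matched = $text =~ $re;
    }
    return Time::HiRes::time() - $start;
}

for my $len (1_000, 2_000, 4_000, 8_000) {
    my $text = ('x' x $len) . 'GOOD';    # hypothetical input without BAD
    printf "%5d chars: tempered %.4fs  simple %.4fs\n",
        $len,
        time_matches($tempered, $text, 200),
        time_matches($simple,   $text, 200);
}
```

If each doubling of the length roughly doubles the time, growth is linear for that input. A constant ratio between the two columns would match the "constantly slower" behaviour described in the post.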
  9. usaims

    Mirco Wahab Guest

    Brian McCauley wrote:
    > On Mar 1, 10:10 pm, wrote:
    >> You want the forced exclusion to start at the beginning of the string
    >> and run to the end:
    >>
    >> /^(?=(?:(?!BAD).)*$).*GOOD/;
    >
    > That's exponentially (er, factorially?) inefficient!
    >
    > /^(?!.*BAD).*GOOD/;
    >
    >> But I'd just use two different regex.

    >
    > Yes, of course, that's still the best way.


    This

    /^(?!.*BAD).*GOOD/

    is, in my opinion, of "Maxwellian beauty".

    I tried for some time to simplify the
    original expression, but in the end I
    threw in the towel.

    Thanks,

    Mirco
    Mirco Wahab, Mar 4, 2007
    #9