Regex problem, match if line contains <a>, unless it also contains <b>

Discussion in 'Perl' started by James Dyer, Feb 19, 2004.

  1. James Dyer

    James Dyer Guest

    I'm having problems getting a regex to work.
    Basically, given two search parameters ($search1 and $search2), it
    should allow me to filter a log file such that lines with the $search1
    string in are printed, unless the $search2 string is also in that line
    somewhere (either before or after $search1).

    I'm creating my regex like this:
    $compiled_regex = qr/^(?!.*$search2)$search1(?!.*$search2)/;

    I then use it:

    while( <> ) {
    next if( $_ !~ /$compiled_regex/ );
    print $_ . "\n";
    }

    With the following test data:

    2004-02-18 04:06:50 1AtIua-0001Hh-00 -> R=lookuphost T=remote_smtp H=mxhost-1.foo.bar [0.0.0.0]
    2004-02-19 04:02:02 1AtfNx-0008DC-00 -> R=lookuphost T=remote_smtp H=mxhost-1.foo.bar [0.0.0.0]
    2004-02-19 04:07:26 1AtfO5-0008Gs-00 -> R=lookuphost T=remote_smtp H=mxhost-1.foo.bar [0.0.0.0]

    If $search1 is set to 'sysadmin' and $search2 is set to '0008Gs',
    none of the lines in the data are displayed, whereas I would expect the
    first two to be displayed.

    With this test data:
    foo
    foo foo
    foo foo foo
    foo bar
    bar foo
    foo bar foo
    foo bar bar
    bar foo bar
    bar foo foo
    bar
    bar bar
    bar bar bar

    With $search1 set to 'foo' and $search2 set to 'bar', I get the
    expected results (foo, foo foo and foo foo foo displayed).

    I just can't figure out why nothing is being displayed in my first test case.
    My gut instinct is that it's got something to do with the special'ish
    characters in the data ('-', '>' etc.), but I'm not sure.

    Any thoughts?

    J
     
    James Dyer, Feb 19, 2004
    #1

  2. James Dyer

    toylet Guest

    Re: Regex problem, match if line contains <a>, unless it also contains<b>

    shouldn't it be ".*$search2?" rather than "?.*$search2" ?

    > I'm creating my regex like this:
    > $compiled_regex = qr/^(?!.*$search2)$search1(?!.*$search2)/;


    --
    .~. Might, Courage, Vision. In Linux We Trust.
    / v \ http://www.linux-sxs.org
    /( _ )\ Linux 2.4.22-xfs
    ^ ^ 4:14pm up 5:47 1 user 1.01 1.00
     
    toylet, Feb 20, 2004
    #2

  3. James Dyer

    James Dyer Guest

    OK, I was being stupid, and really not thinking about what my regex
    was actually doing.
    I've now solved the problem - for those of you who are interested,
    this appears to work:

    $compiled_regex = qr/^(?!.*$search2).*$search1/;

    J


     
    James Dyer, Feb 20, 2004
    #3
  4. James Dyer

    toylet Guest

    Re: Regex problem, match if line contains <a>, unless it also contains<b>

    > I've now solved the problem - for those of you who are interested,
    > this appears to work:
    > $compiled_regex = qr/^(?!.*$search2).*$search1/;


    what's the meaning of "?!" in the regex?

    --
    .~. Might, Courage, Vision. In Linux We Trust.
    / v \ http://www.linux-sxs.org
    /( _ )\ Linux 2.4.22-xfs
    ^ ^ 7:42pm up 9:15 1 user 0.97 0.93
     
    toylet, Feb 20, 2004
    #4
  5. James Dyer

    toylet Guest

    Re: Regex problem, match if line contains <a>, unless it also contains<b>

    >> I've now solved the problem - for those of you who are interested,
    >> this appears to work:
    >> $compiled_regex = qr/^(?!.*$search2).*$search1/;

    >
    > what's the meaning of "?!" in the regex?


    I figured it out. You need to force the context of the $! variable.

    print int($!) . $!;

    int($!) prints the error number; the second $! prints the message.
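
    A minimal sketch of that dual behaviour, assuming a file path that does not exist (the path below is hypothetical): $! gives the errno number in numeric context and the message text in string context.

```perl
use strict;
use warnings;

# Hypothetical path, chosen only so that open() fails and sets $!.
my $path = '/nonexistent/hypothetical-file.txt';

my ($errno, $message);
unless (open my $fh, '<', $path) {
    $errno   = $! + 0;   # numeric context: the errno value
    $message = "$!";     # string context: the error text
}

print "errno=$errno message=$message\n";
```

    Adding 0 is just another way to force numeric context; int($!) as in the post above does the same job.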


    --
    .~. Might, Courage, Vision. In Linux We Trust.
    / v \ http://www.linux-sxs.org
    /( _ )\ Linux 2.4.22-xfs
    ^ ^ 7:54pm up 9:27 1 user 1.00 0.94
     
    toylet, Feb 20, 2004
    #5
  6. James Dyer

    Guest

    (James Dyer) wrote in message news:<>...

    > $compiled_regex = qr/^(?!.*$search2)$search1(?!.*$search2)/;


    Ignoring the possibility that $search1 matches a newline, the second
    (?!.*$search2) is redundant. It can never fail to match, since the regex
    engine wouldn't get that far if there was a match for $search2
    anywhere in the data.

    $compiled_regex = qr/^(?!.*$search2)$search1/s;
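
    As a rough check of the redundancy claim, the sketch below runs both patterns over the foo/bar sample from the original post. The variable names $with_both and $with_first are made up for illustration, and the sample lines contain no newlines, so the /s flag makes no difference here.

```perl
use strict;
use warnings;

my ($search1, $search2) = ('foo', 'bar');

# Original pattern (two lookaheads) and the shortened one (one lookahead).
my $with_both  = qr/^(?!.*$search2)$search1(?!.*$search2)/;
my $with_first = qr/^(?!.*$search2)$search1/s;

my @lines = (
    'foo', 'foo foo', 'foo foo foo', 'foo bar', 'bar foo', 'foo bar foo',
    'foo bar bar', 'bar foo bar', 'bar foo foo', 'bar', 'bar bar', 'bar bar bar',
);

my @both  = grep { $_ =~ $with_both  } @lines;
my @first = grep { $_ =~ $with_first } @lines;

print 'both:  ', join(' | ', @both),  "\n";
print 'first: ', join(' | ', @first), "\n";
```

    On this data both patterns select the same three lines, which is consistent with the second lookahead adding nothing.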

    > 2004-02-18 04:06:50 1AtIua-0001Hh-00 -> R=lookuphost T=remot


    > If $search1 is set to 'sysadmin', and search2 is set to '0008Gs',


    You are only looking for $search1 at the start of the string. You
    probably wanted:

    $compiled_regex = qr/^(?!.*$search2).*$search1/s;
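
    To illustrate the difference, here is a small sketch comparing the anchored pattern with the ".*" version. The log lines are hypothetical (the sample data quoted in the thread does not show 'sysadmin'), and $anchored and $anywhere are illustrative names. (?!...) is a negative lookahead: it succeeds only if the enclosed pattern does not match at that point, and it consumes no characters.

```perl
use strict;
use warnings;

my ($search1, $search2) = ('sysadmin', '0008Gs');

# Anchored: $search1 must sit at the very start of the line.
my $anchored = qr/^(?!.*$search2)$search1/s;
# With .* in front, $search1 may appear anywhere in the line.
my $anywhere = qr/^(?!.*$search2).*$search1/s;

# Hypothetical log lines, not taken from the thread.
my @lines = (
    '1AtIua-0001Hh-00 delivered to sysadmin',
    '1AtfNx-0008DC-00 delivered to sysadmin',
    '1AtfO5-0008Gs-00 delivered to sysadmin',
    '1AtfZz-0009Ab-00 delivered to someone-else',
);

my @anchored_hits = grep { $_ =~ $anchored } @lines;
my @anywhere_hits = grep { $_ =~ $anywhere } @lines;

print scalar(@anchored_hits), " line(s) match the anchored pattern\n";
print "$_\n" for @anywhere_hits;
```

    The anchored pattern only happened to work on the foo/bar data because every wanted line there begins with 'foo'.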

    Note - using a single regex for this is probably not a good idea
    unless you are forced into doing so by the fact that you are calling
    an existing function that you can't modify and that takes a single
    regex as an argument.

    If you are not compelled to use a single regex, it is clearer, and
    probably faster, to use two.

    /$search1/ && !/$search2/
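
    A possible way to wrap the two-regex test, sketched under one assumption: that the search terms are meant as literal strings, which the thread never states. \Q...\E quotes any regex metacharacters, which also deals with the worry about "special-ish" characters. The function name filter_lines is hypothetical.

```perl
use strict;
use warnings;

# Return the lines that contain $want but do not contain $unwanted.
# \Q...\E makes both terms match literally (assumption: literals, not patterns).
sub filter_lines {
    my ($want, $unwanted, @lines) = @_;
    return grep { /\Q$want\E/ && !/\Q$unwanted\E/ } @lines;
}

my @kept = filter_lines('foo', 'bar', 'foo', 'foo bar', 'bar foo', 'foo foo', 'bar');
print "$_\n" for @kept;
```

    Inside the original while (<>) loop the same test would read: print if /\Q$search1\E/ && !/\Q$search2\E/;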

    > Any thoughts?


    Well since you ask...

    This topic has been frequently discussed in the Perl newsgroups that
    exist on Usenet. I think you should have done a search before you
    posted. Having decided you wanted to post I think you should have
    done so to a newsgroup that still exists. This one doesn't (see FAQ)
    so very few people will see what you post here.
     
    , Feb 20, 2004
    #6
