Regex to find paired characters only

Discussion in 'Perl Misc' started by Graham Drabble, Jan 21, 2006.

  1. I have the following data in a file

    p p p p p p p p
    p b b p p p p p
    b b p p p p p
    s b b p p p p p
    p b b p b b p p
    s b p p p p p p
    s b b b p p p p
    p b b b b b p p
    b b p b p p p
    b b b b p p p
    p p p p p b b
    p s b s p p p b

    I'm looking to write a regular expression that will only match if a
    'b ' are paired together.

    I've currently got

    use strict;
    use warnings;

    open IN, '<', 'pb.txt' or die "Can't open IN: $!";

    while (<IN>){
        chomp;
        my $regex;
        if (!/[sp] b [sp]/){
            $regex = 1;
        }else{
            $regex = 0;
        }

        print "$_\t $regex\n";
    }

    which produces

    p p p p p p p p 1
    p b b p p p p p 1
    b b p p p p p 1
    s b b p p p p p 1
    p b b p b b p p 1
    s b p p p p p p 0
    s b b b p p p p 1
    p b b b b b p p 1
    b b p b p p p 0
    b b b b p p p 1
    p p p p p b b 1
    p s b s p p p b 0

    It should (if it was doing what I wanted!) produce

    p p p p p p p p 1
    p b b p p p p p 1
    b b p p p p p 1
    s b b p p p p p 1
    p b b p b b p p 1
    s b p p p p p p 0
    s b b b p p p p 0 ****
    p b b b b b p p 0 ****
    b b p b p p p 0
    b b b b p p p 1
    p p p p p b b 1
    p s b s p p p b 0

    (The **** are not output but are there to indicate the lines that it
    gets wrong.)

    The problem is that it treats 'b b b' as good where it shouldn't
    (although b b b b is fine).
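
    A minimal sketch of why this happens, assuming the same pattern as in the script above (the sub name passes_original_test is made up for illustration): /[sp] b [sp]/ only fires on a 'b' that has an 's' or 'p' on both sides, so in a run of three no single 'b' is flanked that way and the negated test passes.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the test from the script above:
# returns 1 when no 'b' is flanked by 's' or 'p' on both sides.
sub passes_original_test {
    my ($line) = @_;
    return $line =~ /[sp] b [sp]/ ? 0 : 1;
}

# Sample lines taken from the data file shown earlier.
for my $line ('s b p p p p p p', 's b b b p p p p', 'b b b b p p p') {
    print "$line\t", passes_original_test($line), "\n";
}
```

    The middle line is the interesting one: it holds three 'b's, yet none of them sits between two 's'/'p' characters, so the test lets it through.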

    Any suggestions?

    --
    Graham Drabble
    http://www.drabble.me.uk/
    Graham Drabble, Jan 21, 2006
    #1

  2. Xicheng

    Graham Drabble wrote:
    > I have the following data in a file
    > I'm looking to write a regular expression that will only match if a
    > 'b ' are paired together.

    you dont really need regex, use tr/d// to count the number of 'b' in
    your string and %2 to know if they are paired together:

    > use strict;
    > use warnings;
    >
    > open IN, '<', 'pb.txt' or die "Can't open IN: $!";
    >

    while (<IN>) {
        chomp;
        my $num_of_b = tr/b//;                  # count the 'b' characters
        my $pairs_flag = ($num_of_b + 1) % 2;   # 1 if the count is even
        print "$_\t$pairs_flag\n";
    }

    Xicheng, Jan 21, 2006
    #2

  3. Xicheng

    Xicheng wrote:
    > Graham Drabble wrote:
    > > I have the following data in a file
    > > I'm looking to write a regular expression that will only match if a
    > > 'b ' are paired together.

    > you dont really need regex, use tr/d// to count the number of 'b' in
    > your string and %2 to know if they are paired together:
    >
    > > use strict;
    > > use warnings;
    > >
    > > open IN, '<', 'pb.txt' or die "Can't open IN: $!";
    > >

    while (<IN>) {
        chomp;
        #my $num_of_b = tr/b//;
        # or use m//g in list context to count a target that may be
        # longer than one character
        my $num_of_b = () = /b/g;
        my $pairs_flag = ($num_of_b + 1) % 2;
        print "$_\t$pairs_flag\n";
    }

    Xicheng
    Xicheng, Jan 21, 2006
    #3
  4. On 21 Jan 2006 "Xicheng" wrote:

    > Graham Drabble wrote:
    >> I have the following data in a file
    >> I'm looking to write a regular expression that will only match if
    >> a 'b ' are paired together.


    > you dont really need regex, use tr/d// to count the number of 'b'
    > in your string and %2 to know if they are paired together:

    (I assume you mean tr/b//)

    That wouldn't take into account whether the b are next to each
    other.

    >> It should (if it was doing what I wanted!) produce

    [..]
    >> p s b s p p p b 0


    Your solution would return 1 for this.
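
    To make the objection concrete, here is a small sketch of the parity check applied to that line (the sub name parity_flag is made up for this example; the arithmetic is the same as in the suggestion being quoted):

```perl
use strict;
use warnings;

# Hypothetical helper: parity of the number of 'b' characters in a line.
sub parity_flag {
    my ($line) = @_;
    my $count = ($line =~ tr/b//);   # tr with an empty replacement only counts
    return ($count + 1) % 2;         # 1 when the count is even
}

for my $line ('p s b s p p p b', 's b b b p p p p') {
    print "$line\t", parity_flag($line), "\n";
}
```

    The first line contains two 'b's that are nowhere near each other, but an even count is all the parity check looks at.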


    --
    Graham Drabble
    http://www.drabble.me.uk/
    Graham Drabble, Jan 21, 2006
    #4
  5. Xicheng

    Graham Drabble wrote:
    > On 21 Jan 2006 "Xicheng" wrote:
    >
    > > Graham Drabble wrote:
    > >> I have the following data in a file
    > >> I'm looking to write a regular expression that will only match if
    > >> a 'b ' are paired together.

    >
    > > you dont really need regex, use tr/d// to count the number of 'b'
    > > in your string and %2 to know if they are paired together:

    > (I assume you mean tr/b//)

    Sorry, I made a lot of typos: I tested it on the command line with
    very short variable names, and errors crept in when I renamed them
    while copying the code back. :(

    > That wouldn't take into account whether the b are next to each
    > other.
    > >> It should (if it was doing what I wanted!) produce

    > [..]
    > >> p s b s p p p b 0

    Maybe you can remove every 'b b' and then check whether any 'b' is
    left over:

    while (<>) {
        chomp;
        (my $rest = $_) =~ s/b b//g;    # strip every adjacent 'b b' pair
        print "$_\t", ( $rest =~ /b/ ? 0 : 1 ), "\n";    # a leftover 'b' is unpaired
    }

    Xicheng

    > Your solution would return 1 for this.
    > --
    > Graham Drabble
    > http://www.drabble.me.uk/
    Xicheng, Jan 21, 2006
    #5
  6. yong

    Graham Drabble wrote:
    > I have the following data in a file
    > I'm looking to write a regular expression that will only match if a
    > 'b ' are paired together.


    -----------------------------------
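    use strict;
    use warnings;

    open(my $fh, '<', 'pb.txt') or die "could not open file ($!). stop";
    while (<$fh>) {
        chomp;
        my $line_2 = $_;
        $line_2 =~ s/b\sb//g;     # remove every adjacent 'b b' pair
        if ($line_2 =~ /b/) {     # a leftover 'b' means an unpaired one
            print $_ . "\t0\n";
        } else {
            print $_ . "\t1\n";
        }
    }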
    -----------------------------------
    yong, Jan 21, 2006
    #6
  7. Graham Drabble wrote:
    > I have the following data in a file
    > I'm looking to write a regular expression that will only match if a
    > 'b ' are paired together.


    /^(b b|[^b])*$/

    Or if you are feeling pedantic...

    /^(?:b b|[^b])*$/

    This, of course, only works because 'b' is a single character, so
    [^b] can stand for "anything that is not the target". It does not
    generalise to longer targets.
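
    A short sketch of the pattern run against a few lines from the original data (the sub name is_paired is made up for this example):

```perl
use strict;
use warnings;

# Hypothetical helper: every 'b' must start a 'b b' pair, and every
# other character must be something other than 'b'.
sub is_paired {
    my ($line) = @_;
    return $line =~ /^(?:b b|[^b])*$/ ? 1 : 0;
}

# Sample lines taken from the data file in the original question.
my @lines = (
    'p b b p b b p p',
    's b b b p p p p',
    'p b b b b b p p',
    'b b b b p p p',
    'p s b s p p p b',
);
print "$_\t", is_paired($_), "\n" for @lines;
```

    Because [^b] can never consume a 'b', the only way past one is the 'b b' alternative, which is what makes an odd run fail.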
    Brian McCauley, Jan 21, 2006
    #7
  8. Guest

    Graham Drabble wrote:
    > I have the following data in a file
    > I'm looking to write a regular expression that will only match if a
    > 'b ' are paired together.


    Yet another possibility:

    my $c = () = /b b/g; print $c*2 == tr/b// ? 1 : 0;

    --
    Charles DeRykus
    , Jan 23, 2006
    #8
  9. On 21 Jan 2006 "Brian McCauley" wrote:

    > /^(b b|[^b])*$/
    >
    > Or if you are feeling pedantic...
    >
    > /^(?:b b|[^b])*$/


    Many thanks for this (and to all those of you who have found alternate
    solutions).

    Is there any real benefit from using the latter solution?
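
    For what it's worth, a small sketch of the practical difference as far as these two patterns go (variable names are made up for the example): both accept and reject the same strings, but the capturing form also stores the text of the group's last repetition in $1, which the non-capturing form skips.

```perl
use strict;
use warnings;

my $line = 'p b b';

# Capturing group: $1 holds whatever the last repetition of the group matched.
my $captured = $line =~ /^(b b|[^b])*$/ ? $1 : undef;

# Non-capturing group: same match result, but nothing is stored in $1.
my $matched     = $line =~ /^(?:b b|[^b])*$/ ? 1 : 0;
my $has_capture = defined $1 ? 1 : 0;

print "captured: '$captured'\n";
print "matched: $matched, capture set: $has_capture\n";
```

    So for a plain yes/no test the result is the same; the non-capturing form just avoids saving a value nobody reads.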

    --
    Graham Drabble
    http://www.drabble.me.uk/
    Graham Drabble, Jan 23, 2006
    #9