Regex to find paired characters only

Graham Drabble

I have the following data in a file

p p p p p p p p
p b b p p p p p
b b p p p p p
s b b p p p p p
p b b p b b p p
s b p p p p p p
s b b b p p p p
p b b b b b p p
b b p b p p p
b b b b p p p
p p p p p b b
p s b s p p p b

I'm looking to write a regular expression that will only match if a
'b ' are paired together.

I've currently got

use strict;
use warnings;

open IN, '<', 'pb.txt' or die "Can't open IN: $!";

while (<IN>){
chomp;
my $regex;
if (!/[sp] b [sp]/){
$regex = 1;
}else{
$regex = 0;
}

print "$_\t $regex\n";
}

which produces

p p p p p p p p 1
p b b p p p p p 1
b b p p p p p 1
s b b p p p p p 1
p b b p b b p p 1
s b p p p p p p 0
s b b b p p p p 1
p b b b b b p p 1
b b p b p p p 0
b b b b p p p 1
p p p p p b b 1
p s b s p p p b 0

It should (if it was doing what I wanted!) produce

p p p p p p p p 1
p b b p p p p p 1
b b p p p p p 1
s b b p p p p p 1
p b b p b b p p 1
s b p p p p p p 0
s b b b p p p p 0 ****
p b b b b b p p 0 ****
b b p b p p p 0
b b b b p p p 1
p p p p p b b 1
p s b s p p p b 0

(The **** are not output but are there to indicate the lines that it
gets wrong.)

The problem is that it treats 'b b b' as good where it shouldn't
(although b b b b is fine).

Any suggestions?
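
A small sketch of why the test above lets 'b b b' through (the sub name original_test is made up for illustration). The pattern only fires on a single 'b' that has an 's' or 'p' on both sides, so the odd 'b' in a run of three, or a lone 'b' at the end of a line, never triggers it:

```perl
use strict;
use warnings;

# The test from the script above: a line "passes" (1) unless it contains
# a single 'b' with an 's' or 'p' on both sides.
sub original_test {
    my ($line) = @_;
    return $line =~ /[sp] b [sp]/ ? 0 : 1;
}

# An isolated 'b', a run of three, and a lone 'b' at the end of a line.
for my $line ('s b p p', 's b b b p p', 'p p p b') {
    printf "%-12s %d\n", $line, original_test($line);
}
```

Only the first of the three lines is rejected; the other two pass even though each contains an unpaired 'b'.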
 
Xicheng

Graham said:
I have the following data in a file
I'm looking to write a regular expression that will only match if a
'b ' are paired together.
you don't really need regex, use tr/d// to count the number of 'b' in
your string and %2 to know if they are paired together:
use strict;
use warnings;

open IN, '<', 'pb.txt' or die "Can't open IN: $!";
while (<IN>) {
chomp;
# tr with an empty replacement list only counts, it changes nothing
my $num_of_b = tr/b//;
# 1 when the count is even, 0 when it is odd
my $pairs_flag = ($num_of_b + 1) % 2;
print "$_\t$pairs_flag\n";
}
 
Xicheng

Xicheng said:
you don't really need regex, use tr/d// to count the number of 'b' in
your string and %2 to know if they are paired together:
while (<IN>) {
chomp;
#my $num_of_b = tr/b//;
#or use m//g in list context to count a target longer than one character
my $num_of_b = () = /b/g;
my $pairs_flag = ($num_of_b + 1) % 2;
print "$_\t$pairs_flag\n";
}

Xicheng
 
Graham Drabble

you don't really need regex, use tr/d// to count the number of 'b'
in your string and %2 to know if they are paired together:
(I assume you mean tr/b//)

That wouldn't take into account whether the b are next to each
other.
It should (if it was doing what I wanted!) produce [..]
p s b s p p p b 0

Your solution would return 1 for this.
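
To make the objection concrete, here is a minimal sketch of the parity check on its own (even_count is a hypothetical name, not from the posts above). It only asks whether the number of 'b's is even, so two 'b's that are nowhere near each other still pass:

```perl
use strict;
use warnings;

# Parity check only: 1 when the line holds an even number of 'b's.
sub even_count {
    my ($line) = @_;
    my $n = ($line =~ tr/b//);   # empty replacement list: count, don't modify
    return ($n + 1) % 2;
}

# Two 'b's that are not adjacent, then a genuine pair.
print even_count('p s b s p p p b'), "\n";
print even_count('p b b p p p p p'), "\n";
```

Both lines get the same flag, which is why the count alone cannot tell them apart.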
 
Xicheng

Graham said:
(I assume you mean tr/b//)
Sorry, I made lots of typos: I tested it on the command line with
very short variable names, and errors crept in when I renamed them
while copying it back. :(
That wouldn't take into account whether the b are next to each
other.
It should (if it was doing what I wanted!) produce [..]
p s b s p p p b 0
Maybe you can remove all 'b b' and then see if there is any 'b' left to
make a judgement:

while (<>) {
chomp;
# strip every adjacent pair; any 'b' left over is unpaired
(my $rest = $_) =~ s/b b//g;
print "$_\t", ($rest =~ /b/ ? 0 : 1), "\n";
}

Xicheng
 
yong

Graham said:
I'm looking to write a regular expression that will only match if a
'b ' are paired together.

-----------------------------------
use strict;
use warnings;

open(my $fh, '<', 'pb.txt') or die "could not open file($!).stop";
while (<$fh>) {
chomp;
my $line_2 = $_;
# remove every adjacent pair, then look for a leftover 'b'
$line_2 =~ s/b\sb//g;
if ($line_2 =~ /b/) {
print $_."\t0\n";
} else {
print $_."\t1\n";
}
}
-----------------------------------
 
Brian McCauley

Graham said:
I'm looking to write a regular expression that will only match if a
'b ' are paired together.

/^(b b|[^b])*$/

Or if you are feeling pedantic...

/^(?:b b|[^b])*$/

This, of course, only works because 'b' is a single character. It does
not generalise to longer targets.
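
As a rough check, the pattern can be run against the sample lines from the original question. This is only a sketch: the sub name all_paired is invented here, and the data is pasted inline rather than read from pb.txt.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the pattern above: the whole line must be
# made of 'b b' pairs and characters that are not 'b'.
sub all_paired {
    my ($line) = @_;
    return $line =~ /^(?:b b|[^b])*$/ ? 1 : 0;
}

my @lines = (
    'p p p p p p p p', 'p b b p p p p p', 'b b p p p p p',
    's b b p p p p p', 'p b b p b b p p', 's b p p p p p p',
    's b b b p p p p', 'p b b b b b p p', 'b b p b p p p',
    'b b b b p p p',   'p p p p p b b',   'p s b s p p p b',
);

print "$_\t", all_paired($_), "\n" for @lines;
```

A lone 'b' can be consumed by neither alternative, so any line with an unpaired 'b' fails the anchored match.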
 
ced

Graham said:
I'm looking to write a regular expression that will only match if a
'b ' are paired together.

Yet another possibility:

my $c = () = /b b/g; print $c*2 == tr/b// ? 1 : 0;
 
Graham Drabble

/^(b b|[^b])*$/

Or if you are feeling pedantic...

/^(?:b b|[^b])*$/

Many thanks for this (and to all those of you who have found alternate
solutions).

Is there any real benefit from using the latter solution?
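
For what it is worth, a small sketch of the one visible difference between the two patterns (match_info is a hypothetical helper). Both accept exactly the same lines; the capturing form also stores the last thing the group matched in $1, while the (?:...) form stores nothing, which may be marginally cheaper and leaves $1 alone:

```perl
use strict;
use warnings;

# Hypothetical helper: reports whether $line matches $re and what,
# if anything, ended up in $1.
sub match_info {
    my ($line, $re) = @_;
    return unless $line =~ $re;   # empty list when there is no match
    my $captured = $1;            # undef when the pattern has no capturing group
    return (1, $captured);
}

my $line = 'p b b p p p p p';
for my $re (qr/^(b b|[^b])*$/, qr/^(?:b b|[^b])*$/) {
    my ($ok, $captured) = match_info($line, $re);
    printf "%s\tmatched=%d\t\$1=%s\n",
        $re, $ok ? 1 : 0, defined $captured ? "'$captured'" : 'undef';
}
```

For a simple yes/no test like this one the choice is mostly a matter of tidiness.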
 
