Check columns/tabs between two patterns

Marek

Hello all!


I have a tab-separated text file and want to check whether the
contents are in the right fields. I have constructed a small example
here. The question is how to check for the right number of tabs between
"pattern" and "number". My example checks for only one kind of error.
To be clear, take this example:

(pattern3)\t{4,}(number3)

This checks for 4 or more tabs between pattern3 and number3, which is
wrong. Only 3 tabs are right! So I would also have to check for 2 tabs
or 1 tab, which would be wrong too ...


Hope this was clear


best greetings marek


#! /usr/local/bin/perl

use warnings;
use strict;

while (<DATA>) {

    s/\s+#.+//;
    next if /^\s*$/;

    if (/(pattern1)\t{2,}(number1)\s*/i) {
        print "Wrong number of tabs between \"$1\" and the number \"$2\" in the line:\n\t$_\n\n";
    }
    elsif (/(pattern2)\t{3,}(number2)\s*/i) {
        print "Wrong number of tabs between \"$1\" and the number \"$2\" in the line:\n\t$_\n\n";
    }
    elsif (/(pattern3)\t{4,}(number3)\s*/i) {
        print "Wrong number of tabs between \"$1\" and the number \"$2\" in the line:\n\t$_\n\n";
    }
    else {
        print "\nno match!\n\n";
    }
}

__DATA__

pattern1 number1
pattern2 number2
pattern3 number3
pattern1 number1 # wrong number of tabs (2tabs)
pattern2 number2 # wrong number of tabs (3tabs)
pattern3 number3 # wrong number of tabs (4tabs)
pattern2 number2 # wrong number of tabs (1tab)
pattern3 number3 # wrong number of tabs (2tabs)
 
Dr.Ruud

Marek wrote:
I have a tab-separated text file and want to check whether the
contents are in the right fields.

Maybe you can use this approach:

Read a line, split on /\t/ and store in an array, then do tests.


while ( <> ) {
    /^\s*(?:#|$)/ and next; # comment- and blank lines

    my @data = split /\t/;
    ... # tests
}
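
A minimal sketch of what such a test could look like, assuming each line holds one pattern and one number separated only by tabs. The helper name `count_tabs` is hypothetical and not from the thread. Passing -1 as the limit makes split keep trailing empty fields, so trailing tabs are counted too.

```perl
use strict;
use warnings;

# Hypothetical helper: number of tab characters in a line.
# split with limit -1 keeps empty fields, so N tabs give N+1 fields.
sub count_tabs {
    my ($line) = @_;
    chomp $line;
    my @data = split /\t/, $line, -1;
    return @data ? $#data : 0;    # an empty line splits into no fields
}

for my $line ("pattern2\t\tnumber2\n", "pattern2\tnumber2\n") {
    my $tabs = count_tabs($line);
    print "found $tabs tab(s)\n";
}
```

Two adjacent tabs show up as an empty element in @data, so a test can also look at which field the number lands in.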

"Wrong number of tabs between \"$1\" and the number \"$2\" in the line:\n\t$_\n\n";
[...]
"Wrong number of tabs between \"$1\" and the number \"$2\" in the line:\n\t$_\n\n";
[...]
"Wrong number of tabs between \"$1\" and the number \"$2\" in the line:\n\t$_\n\n";

See also `perldoc -f sprintf`; you want to create a single format string
for that.
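
A minimal sketch of the single-format-string idea, assuming the message text stays the same as in the original script. The names `$WRONG_TABS_FMT` and `wrong_tabs_message` are hypothetical.

```perl
use strict;
use warnings;

# One shared format string instead of three copies of the same message.
# In qq{} the \n and \t escapes are interpolated; %s is left for sprintf.
my $WRONG_TABS_FMT =
    qq{Wrong number of tabs between "%s" and the number "%s" in the line:\n\t%s\n\n};

# Hypothetical helper that fills in the three placeholders.
sub wrong_tabs_message {
    my ($pattern, $number, $line) = @_;
    return sprintf $WRONG_TABS_FMT, $pattern, $number, $line;
}

print wrong_tabs_message('pattern3', 'number3', "pattern3\t\tnumber3");
```

Using qq{} also removes the need to backslash-escape the double quotes inside the message.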
 
Tad J McClellan

Marek said:
The question is how to check for the right number of tabs between
"pattern" and "number". My example checks for only one kind of error.
To be clear, take this example:

(pattern3)\t{4,}(number3)

This checks for 4 or more tabs between pattern3 and number3, which is
wrong. Only 3 tabs are right! So I would also have to check for 2 tabs
or 1 tab, which would be wrong too ...

elsif (/(pattern3)\t{4,}(number3)\s*/i) {


elsif ( ! /(pattern3)\t{3}(number3)\s*/i ) {
 
Martijn Lievaart

Marek said:
Hello all!


I have a tab-separated text file and want to check whether the contents
are in the right fields. I have constructed a small example here. The
question is how to check for the right number of tabs between "pattern"
and "number". My example checks for only one kind of error. To be clear,
take this example:

(pattern3)\t{4,}(number3)

This checks for 4 or more tabs between pattern3 and number3, which is
wrong. Only 3 tabs are right! So I would also have to check for 2 tabs
or 1 tab, which would be wrong too ...

Something like (untested) /(pattern3)(?:\t{0,2}|\t{4,})(number3)/ maybe?

But I would do something like:

elsif (/(pattern2)(\t+)(number2)\s*/i) {
    print "Wrong number of tabs between \"$1\" and the number \"$2\" in the line:\n\t$_\n\n"
        if length($2) != 2;
}

And even that can be optimized further, but this should get you going.
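
One possible direction for that optimization, as a sketch only: keep the expected tab count per pattern in a hash so a single regex handles every case. The hash `%expected_tabs` and the helper `tabs_ok` are hypothetical names, and the mapping (patternN expects N tabs) is taken from the example data in this thread.

```perl
use strict;
use warnings;

# Assumed mapping from the thread's example data.
my %expected_tabs = (pattern1 => 1, pattern2 => 2, pattern3 => 3);

# Returns undef for lines without a known pattern,
# 1 if the tab count is right, 0 if it is wrong.
sub tabs_ok {
    my ($line) = @_;
    my ($pattern, $tabs, $number) = $line =~ /(pattern\d+)(\t+)(number\d+)/i
        or return;
    my $want = $expected_tabs{ lc $pattern };
    return unless defined $want;
    return length($tabs) == $want ? 1 : 0;
}

for my $line ("pattern2\t\tnumber2", "pattern2\tnumber2", "something else") {
    my $ok = tabs_ok($line);
    print defined $ok ? ($ok ? "ok\n" : "wrong number of tabs\n") : "no match\n";
}
```

Adding a fourth pattern then only needs one more hash entry rather than another elsif branch.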

HTH,
M4
 
Marek

Thank you all for your answers! I will stick to

elsif (/(pattern2)(\t+)(number2)\s*/i) {
    print "Wrong number of tabs between \"$1\" and the number \"$2\" in the line:\n\t$_\n\n"
        if length($2) != 2;
}

Martijn's suggestion! Instead of enumerating every possible wrong
number of tabs, it is easier to test for anything other than the right
number of tabs ...

I did not know about checking with "length". Fantastic :)


Thank you all again


marek
 
Martijn Lievaart

Marek said:
Thank you all for your answers! I will stick to

elsif (/(pattern2)(\t+)(number2)\s*/i) {
    print "Wrong number of tabs between \"$1\" and the number \"$2\" in the line:\n\t$_\n\n"
        if length($2) != 2;
}

Martijn's suggestion! Instead of enumerating every possible wrong
number of tabs, it is easier to test for anything other than the right
number of tabs ...

I did not know about checking with "length". Fantastic :)

I like Tad's suggestion better. I'm a bit ashamed I didn't think of that
myself.

M4
 
Marek

But I don't understand Tad's suggestion!

If the test is for something *not* matching

elsif ( ! /(pattern3)\t{3}(number3)\s*/i ) { ... }

it will be true for every line that is not a correct pattern3 line.
Consider the first line of __DATA__:

pattern1 number1

This line would already trigger the branch, precisely because it *does
not* match, because of the > ! <

But I am a simple beginner; probably I did not understand something
here!



greetings marek
 
Tad J McClellan

Marek said:
But I don't understand Tad's suggestion!

If the test is for something *not* matching

elsif ( ! /(pattern3)\t{3}(number3)\s*/i ) { ... }


You said "Only 3 tabs are right", so to test if the data is "right":

elsif ( /(pattern3)\t{3}(number3)\s*/i ) { ... } # right

but the body of the elsif deals with when it is NOT right, so you need:

elsif ( ! /(pattern3)\t{3}(number3)\s*/i ) { ... } # not right
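
One caveat worth sketching: a bare negated match is true for every line that fails the regex, including correct pattern1 and pattern2 lines. A possible guard, assuming every line names its pattern, is to require that the line mentions pattern3 before judging its tabs. The sub names below are hypothetical.

```perl
use strict;
use warnings;

# Bare negation: flags every line that is not a correct pattern3 line.
sub bare_negation_flags {
    my ($line) = @_;
    return $line !~ /pattern3\t{3}number3/i ? 1 : 0;
}

# Guarded negation: only lines that mention pattern3 are judged at all.
sub guarded_negation_flags {
    my ($line) = @_;
    return ($line =~ /pattern3/i && $line !~ /pattern3\t{3}number3/i) ? 1 : 0;
}

for my $line ("pattern1\tnumber1", "pattern3\t\tnumber3", "pattern3\t\t\tnumber3") {
    (my $shown = $line) =~ s/\t/\\t/g;    # make the tabs visible in the output
    printf "%-28s bare=%d guarded=%d\n",
        $shown, bare_negation_flags($line), guarded_negation_flags($line);
}
```

In an if/elsif chain the bare form fires in the first branch for most lines, so later branches are rarely reached.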
 
Marek

Sorry Tad,

but I understood you as follows:

#! /usr/local/bin/perl

use warnings;
use strict;

while (<DATA>) {

    s/\s+#.+//;
    next if /^\s*$/;

    if ( ! /(pattern1)\t(number1)\s*/i) {
        print "[First if:] Wrong number of tabs between \"$1\" and the number \"$2\" in the line:\n\t$_\n\n";
    }
    elsif ( ! /(pattern2)\t{2}(number2)\s*/i) {
        print "[First elsif:] Wrong number of tabs between \"$1\" and the number \"$2\" in the line:\n\t$_\n\n";
    }
    elsif ( ! /(pattern3)\t{3}(number3)\s*/i) {
        print "[Second elsif:] Wrong number of tabs between \"$1\" and the number \"$2\" in the line:\n\t$_\n\n";
    }
    else {
        print "\nno match!\n\n";
    }
}

__DATA__

pattern1 number1
pattern2 number2
pattern3 number3
pattern1 number1 # wrong number of tabs (2tabs)
pattern2 number2 # wrong number of tabs (3tabs)
pattern3 number3 # wrong number of tabs (4tabs)
pattern2 number2 # wrong number of tabs (1tab)
pattern3 number3 # wrong number of tabs (2tabs)

And this is not working. Surely a misunderstanding on my part? And here
is the version as I understood Petr and Martijn ...

#! /usr/local/bin/perl

use warnings;
use strict;

while (<DATA>) {

    s/\s+#.+//;
    next if /^\s*$/;

    if (/(pattern1)(\t+)(number1)\s*/i) {
        print "Wrong number of tabs between \"$1\" and the number \"$2\" in the line:\n\t$.: $_\n\n"
            if length($2) != 1;
    }
    elsif (/(pattern2)(\t+)(number2)\s*/i) {
        print "Wrong number of tabs between \"$1\" and the number \"$2\" in the line:\n\t$.: $_\n\n"
            if length($2) != 2;
    }
    elsif (/(pattern3)(\t+)(number3)\s*/i) {
        print "Wrong number of tabs between \"$1\" and the number \"$2\" in the line:\n\t$.: $_\n\n"
            if length($2) != 3;
    }
    else {
        print "\nno match!\n\n";
    }
}

__DATA__

pattern1 number1
pattern2 number2
pattern3 number3
pattern1 number1 # wrong number of tabs (2tabs)
pattern2 number2 # wrong number of tabs (3tabs)
pattern3 number3 # wrong number of tabs (4tabs)
pattern2 number2 # wrong number of tabs (1tab)
pattern3 number3 # wrong number of tabs (2tabs)
 
Martijn Lievaart

Marek said:
But I don't understand Tad's suggestion!

If the test is for something *not* matching

elsif ( ! /(pattern3)\t{3}(number3)\s*/i ) { ... }

it will be true for every line that is not a correct pattern3 line.
Consider the first line of __DATA__:

pattern1 number1

This line would already trigger the branch, precisely because it *does
not* match, because of the > ! <

But I am a simple beginner; probably I did not understand something
here!

No, you're quite right, and I was quite wrong, as was Tad. Sorry,
brain misfiring I guess.

M4
 
Tim Greer

Marek said:
Sorry Tad,
.... snip...

And this is not working. Surely a misunderstanding on my part? And here
is the version as I understood Petr and Martijn ...

#! /usr/local/bin/perl

use warnings;
use strict;

while (<DATA>) {

    s/\s+#.+//;
    next if /^\s*$/;

    if (/(pattern1)(\t+)(number1)\s*/i) {
        print "Wrong number of tabs between \"$1\" and the number \"$2\" in the line:\n\t$.: $_\n\n"
            if length($2) != 1;
    }
    elsif (/(pattern2)(\t+)(number2)\s*/i) {
        print "Wrong number of tabs between \"$1\" and the number \"$2\" in the line:\n\t$.: $_\n\n"
            if length($2) != 2;
    }
    elsif (/(pattern3)(\t+)(number3)\s*/i) {
        print "Wrong number of tabs between \"$1\" and the number \"$2\" in the line:\n\t$.: $_\n\n"
            if length($2) != 3;
    }
    else {
        print "\nno match!\n\n";
    }
}

__DATA__

pattern1 number1
pattern2 number2
pattern3 number3
pattern1 number1 # wrong number of tabs (2tabs)
pattern2 number2 # wrong number of tabs (3tabs)
pattern3 number3 # wrong number of tabs (4tabs)
pattern2 number2 # wrong number of tabs (1tab)
pattern3 number3 # wrong number of tabs (2tabs)


You are capturing $2 and printing the tab(s) as the number in your
output when the number is wrong. You should use $3 for the actual
number in the output.

Also, just for the sake of getting rid of all of the if/else's, I've
modified the example to use one line and capture the number itself
(from pattern) to work more dynamically:

Note: Watch for word wrapping in the below code example:

#!/usr/bin/perl
use warnings;
use strict;

while (<DATA>) {
    s/\s+#.+//;
    next if /^\s*$/;
    if (/pattern(\d+)(\t+)number(\d+)\s*/i) {
        print "Wrong number of tabs between \"pattern$1\" ",
              "and the number \"number$3\" in the line:\n\t$.:",
              " $_\n\n" if length($2) != $1;
    } else {
        print "\nno match!\n\n";
    }
}

__DATA__
pattern1 number1
pattern2 number2
pattern3 number3
pattern1 number1 # wrong number of tabs (2tabs)
pattern2 number2 # wrong number of tabs (3tabs)
pattern3 number3 # wrong number of tabs (4tabs)
pattern2 number2 # wrong number of tabs (1tab)
pattern3 number3 # wrong number of tabs (2tabs)
 
