extract range of lines using range op bug?

  • Thread starter it_says_BALLS_on_your forehead

it_says_BALLS_on_your forehead

i was looking in the Cookbook (2nd ed. pg. 199), and tried to extract a
range of lines using the examples given. one uses '..' (inclusive) and
the other uses '...' (exclusive...supposedly).

it appears that they both do the same thing however. for 3dot, i expect
lines between the patterns. for 2dot, i expect lines between and
including the patterns. i get lines between and including for both
however. can anyone explain this? or am i misinterpreting something?

here is the code:
use strict; use warnings;

my $file = 'data/regex_ranges.txt';
open my $fh, '<', $file or die "can't open $file: $!\n";
while ( <$fh> ) {
    chomp;
    if ( /START/ ... /END/ ) {
        print "3dot: >>$_<<\n";
    }
}
close $fh;

print "\n", '-' x 40, "\n";

open my $fh2, '<', $file or die "can't open $file: $!\n";
while ( <$fh2> ) {
    chomp;
    if ( /START/ .. /END/ ) {
        print "2dot: >>$_<<\n";
    }
}
close $fh2;


# i used files, b/c with __DATA__ the first while used it up, but the
# file contains the same thing as __DATA__

__DATA__
START test
first name=Homer
middle name=Jay
last name=Simpson
END test
START test
first name=Bart
middle name=
last name=Simpson
END test
START test
first name=Lisa
last name=Simpson
END test
 

Paul Lalli

it_says_BALLS_on_your forehead said:
i was looking in the Cookbook (2nd ed. pg. 199), and tried to extract a
range of lines using the examples given. one uses '..' (inclusive) and
the other uses '...' (exclusive...supposedly).

it appears that they both do the same thing however. for 3dot, i expect
lines between the patterns. for 2dot, i expect lines between and
including the patterns. i get lines between and including for both
however. can anyone explain this? or am i misinterpreting something?

I don't have my copy of the Cookbook with me, so I don't know the exact
explanation it gives, but the terms 'inclusive' and 'exclusive' sound
suspicious to me. Here's how perldoc perlop defines the difference
between the two:

[the .. operator] is false as long as its left operand is false.
Once the left operand is true, the range operator stays true
until the right operand is true, AFTER which the range
operator becomes false again. It doesn't become false till
the next time the range operator is evaluated. It can test
the right operand and become false on the same evaluation it
became true (as in awk), but it still returns true once. If
you don't want it to test the right operand till the next
evaluation, as in sed, just use three dots ("...") instead
of two. In all other regards, "..." behaves just like ".."
does.


So the only difference between .. and ... is that .. checks the right
operand on the same iteration that the left operand became true. The
... operator ignores the right operand until the following iteration.

Example:
#!/usr/bin/perl
use strict;
use warnings;

my @lines = <DATA>;
my $num = 0;
for (@lines) {
    $num++;
    chomp;
    print "Line $num: '$_'\n" if /START/ .. /END/;
}

print "First loop done\n";

$num = 0;
for (@lines) {
    $num++;
    chomp;
    print "Line $num: '$_'\n" if /START/ ... /END/;
}


__DATA__
some stuff
the STARTing line
more stuff
the ENDing line
more more stuff
the next STARTing line - that also ENDs
and final more stuff
that will only be printed for ...
until it ENDs


Output:
Line 2: 'the STARTing line'
Line 3: 'more stuff'
Line 4: 'the ENDing line'
Line 6: 'the next STARTing line - that also ENDs'
First loop done
Line 2: 'the STARTing line'
Line 3: 'more stuff'
Line 4: 'the ENDing line'
Line 6: 'the next STARTing line - that also ENDs'
Line 7: 'and final more stuff'
Line 8: 'that will only be printed for ...'
Line 9: 'until it ENDs'

In the first one, Line 6 caused the operator to become true and then
immediately false, because both the left and right operands were true
on that line. In the second one, Line 6 caused the operator to become
true, but it did not become false again until Line 9.
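
One way to picture what the two loops are doing is to write the flip-flop out by hand. This is only a rough sketch of the behavior perlop describes, not how perl implements it. make_flip_flop is a made-up helper and the sample lines are invented for the illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Perl): returns a closure that mimics
# the flip-flop. $three_dot selects "..." behavior; $start/$end are regexes.
sub make_flip_flop {
    my ($start, $end, $three_dot) = @_;
    my $active = 0;    # are we currently inside a range?
    return sub {
        my ($line) = @_;
        if (!$active) {
            return 0 unless $line =~ $start;
            $active = 1;
            # ".." tests the right operand on the very line that switched
            # it on; "..." waits until the next line to test it.
            $active = 0 if !$three_dot && $line =~ $end;
            return 1;
        }
        $active = 0 if $line =~ $end;
        return 1;      # the closing line still counts as part of the range
    };
}

# Invented sample input: one line matches both patterns.
my @lines = ('a', 'START and END', 'b', 'END', 'c');
my $two   = make_flip_flop(qr/START/, qr/END/, 0);
my $three = make_flip_flop(qr/START/, qr/END/, 1);

my @two_dot   = grep { $two->($_) } @lines;
my @three_dot = grep { $three->($_) } @lines;

print "2dot: ", join(' | ', @two_dot),   "\n";
print "3dot: ", join(' | ', @three_dot), "\n";
```

With this input the two-dot version selects only the line that contains both patterns, while the three-dot version keeps going until the next line that matches END.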

Hope that clears it up.

Paul Lalli
 

it_says_BALLS_on_your forehead

nvm, it's my misinterpretation. if END appears on the same line as
START, then the 2dot will not proceed to the next line, while 3dot
will. that's the difference. 3dot never tries to test both operands on
the same line, as it says in the book...
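
for anyone who does want only the lines strictly between the patterns (what the original question expected from 3dot): perlop documents that in scalar context the range operator returns a sequence number starting at 1, with "E0" appended on the last line of the range. a minimal sketch built on that, with made-up sample records modeled on the data above:

```perl
use strict;
use warnings;

# Hypothetical sample records, modeled on the data in the question.
my @lines = (
    'START test',
    'first name=Homer',
    'last name=Simpson',
    'END test',
    'START test',
    'first name=Bart',
    'END test',
);

my @between;
for (@lines) {
    # Scalar context: '' when false, otherwise a sequence number starting
    # at 1; the final number of each range has "E0" appended to it.
    my $seq = /START/ .. /END/;
    next unless $seq;          # outside any range
    next if $seq == 1;         # the START line itself
    next if $seq =~ /E0$/;     # the END line itself
    push @between, $_;
}

print "$_\n" for @between;
```

only the name lines end up in @between. the START and END lines are skipped by checking the sequence number rather than by re-matching the patterns.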
 

it_says_BALLS_on_your forehead

Paul said:
I don't have my copy of the Cookbook with me, so I don't know the exact
explanation it gives, but the terms 'inclusive' and 'exclusive' sound
suspicious to me.

you are correct. the book does use the term 'inclusive', but i dragged
it kicking and screaming out of its context, to my own chagrin when i
discovered my error. and then i supplied the word 'exclusive' on my
own. which exacerbated my embarrassment.

Paul said:
Here's how perldoc perlop defines the difference
between the two:

[the .. operator] is false as long as its left operand is false.
Once the left operand is true, the range operator stays true
until the right operand is true, AFTER which the range
operator becomes false again. It doesn't become false till
the next time the range operator is evaluated. It can test
the right operand and become false on the same evaluation it
became true (as in awk), but it still returns true once. If
you don't want it to test the right operand till the next
evaluation, as in sed, just use three dots ("...") instead
of two. In all other regards, "..." behaves just like ".."
does.


So the only difference between .. and ... is that .. checks the right
operand on the same iteration that the left operand became true. The
... operator ignores the right operand until the following iteration.


yup, that's what i discovered, thx Paul :).
 
