Burned using the .. operator

J. Romano

Dear Perl community,

I was recently burned using the .. operator in a Perl program. I
initially thought it was a bug that was causing the problem, but I
eventually figured out the reason behind the "burn." Therefore, I'll
share my findings here in the hopes that it might help someone in the
future not to be burned by it like I was.

Basically, I had code that read an input file a line at a time,
processed the line, and then put the processed line into an output
file. In the loop that read each line, there was a condition that was
something like:

if ($mandatory or m/BEGIN/ .. m/END/)
{
    print OUT $_;
}

The condition was supposed to mean: "If the line is mandatory or it is
between the 'BEGIN' and 'END' lines, print the line to the output
file."

Sounds simple enough.

Well, what was happening was that sometimes that condition worked
exactly as expected, but other times it always evaluated to true, even
when the line was neither mandatory nor between the BEGIN and END lines.

I thought this was a bug in Perl, because when I ran the program in
the debugger, stopped at a breakpoint set at that condition, and
evaluated the condition with a command like:

print "True" if ($mandatory or m/BEGIN/ .. m/END/);

the word "True" would not be printed. But when I hit "n" to advance
to the next line, the program counter would advance to the print
statement, as if it evaluated to true!

I was ready to call it a Perl bug until I realized that the lines:

if ($mandatory or m/BEGIN/ .. m/END/) # line 1

and

if (m/BEGIN/ .. m/END/ or $mandatory) # line 2

are NOT equivalent!

Because of a thing called "short-circuit evaluation," the ..
operator is skipped in line 1 whenever $mandatory is true, whereas it
is always evaluated in line 2. And if you expect the .. operator to be
evaluated on a line when in fact it isn't, you may think the ..
operator has switched to "false" when in reality it never saw the END
line, and so it continues to return "true."

That's what was happening with me. Because the .. operator was not
being evaluated when I thought it was, it never got a chance to "shut
itself off," and since I was thinking of it in terms of the current
line being between the BEGIN and END lines, I could not figure out why
it was behaving the way it did.
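
A minimal sketch of the effect, assuming a made-up input in which the END line also happens to be mandatory. The sample lines and the %is_mandatory hash are hypothetical and are not taken from the original program; only the two conditions come from "line 1" and "line 2" above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical input: the "END" line is also flagged as mandatory.
my %is_mandatory = ('END' => 1);

my (@mandatory_first, @flipflop_first);
for ('a', 'BEGIN', 'b', 'END', 'c', 'd') {
    my $mandatory = $is_mandatory{$_} ? 1 : 0;

    # "line 1": the flip-flop is skipped whenever $mandatory is true,
    # so it never sees the END line and never switches off.
    push @mandatory_first, $_ if ($mandatory or m/BEGIN/ .. m/END/);

    # "line 2": the flip-flop is evaluated for every line.
    push @flipflop_first, $_ if (m/BEGIN/ .. m/END/ or $mandatory);
}

print "mandatory first: @mandatory_first\n";   # BEGIN b END c d
print "flip-flop first: @flipflop_first\n";    # BEGIN b END
```

With this input the first form keeps selecting "c" and "d" after the END line, because the one line that would have switched the flip-flop off was short-circuited away.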

So why did typing the condition right into the debugger return the
behavior that I was expecting (which was opposite of the program
behavior)? Because, when I typed the line:

print "True" if ($mandatory or m/BEGIN/ .. m/END/);

into the debugger, it was evaluating that particular .. operator FOR
THE FIRST TIME. And since the current line was neither mandatory nor
between the BEGIN and END lines, it correctly evaluated to false.
Although they looked identical, the .. operator I typed into the
debugger was NOT the same one as the one in the Perl script, and
therefore they kept track of their own states, independent of each
other.

Which brings me to another point to be careful of:

Anytime you type a line like this (that contains the .. or the ...
operator) in the debugger:

print "True" if m/BEGIN/ .. m/END/;

it is essentially the same as the line:

print "True" if m/BEGIN/;

because that line, when typed in the debugger, only gets evaluated
once, and therefore only the first part of the .. operator will make a
difference when evaluated. This is true even if you type that
condition into the debugger multiple times (because, according to the
interpreter, those are separate instantiations of the .. operator, and
therefore they all keep their own separate state).

Chances are, if you are using the line:

if ($mandatory or m/BEGIN/ .. m/END/)

in a Perl program you probably meant to use:

if (m/BEGIN/ .. m/END/ or $mandatory)

instead, since short-circuit evaluation won't affect the state of
$mandatory (but can definitely affect the state of the .. operator).
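
Another way to take operand order out of the picture, sketched here under the same kind of made-up input (the sample lines, %is_mandatory, and $in_range are hypothetical names), is to evaluate the .. operator unconditionally into a variable and then test that variable:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical input: the "END" line is also flagged as mandatory.
my %is_mandatory = ('END' => 1);

my @printed;
for ('a', 'BEGIN', 'b', 'END', 'c') {
    my $mandatory = $is_mandatory{$_} ? 1 : 0;

    # Scalar assignment, so this is the flip-flop form of "..".
    # It runs on every line, whatever $mandatory is.
    my $in_range = (m/BEGIN/ .. m/END/);

    # The order of the operands no longer matters here.
    push @printed, $_ if $mandatory or $in_range;
}

print "@printed\n";   # BEGIN b END
```

Since the flip-flop is no longer an operand of "or", short-circuiting cannot skip it.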

So be careful of using the .. and ... operators in a condition
where short-circuit evaluation is an issue. Hopefully fewer
programmers will be burned by this now that I have shared this with
you.

-- Jean-Luc
 

Michele Dondi

J. Romano wrote:
> So be careful of using the .. and ... operators in a condition
> where short-circuit evaluation is an issue. Hopefully fewer
> programmers will be burned by this now that I have shared this with
> you.

Well, I appreciate your efforts, but IMHO, and I say IMHO, short
circuiting of logical operators is so characteristic of Perl, and so
often used to one's great advantage, that I doubt that any but a very
small minority of programmers could get burnt by it in connection with
C<..> and C<...>; these operators have other gotchas of their own, due
to their "exotic nature," that are more likely to bite an
(inexperienced) user on the neck...


Michele
 

Ala Qumsieh

J. Romano wrote:

[snip description of short-circuiting]
> Which brings me to another point to be careful of:
>
> Anytime you type a line like this (that contains the .. or the ...
> operator) in the debugger:
>
> print "True" if m/BEGIN/ .. m/END/;
>
> it is essentially the same as the line:
>
> print "True" if m/BEGIN/;
>
> because that line, when typed in the debugger, only gets evaluated
> once, and therefore only the first part of the .. operator will make a
> difference when evaluated. This is true even if you type that
> condition into the debugger multiple times (because, according to the
> interpreter, those are separate instantiations of the .. operator, and
> therefore they all keep their own separate state).

I don't understand what you mean here. The following two lines are
different:

print "True" if /BEGIN/ .. /END/;
print "True" if /BEGIN/;

The perlop docs explain the range operator:

In scalar context, ".." returns a boolean value. The operator is
bistable, like a flip-flop, and emulates the line-range (comma)
operator of sed, awk, and various editors. Each ".." operator
maintains its own boolean state. It is false as long as its left
operand is false. Once the left operand is true, the range operator
stays true until the right operand is true, *AFTER* which the range
operator becomes false again.

> So be careful of using the .. and ... operators in a condition
> where short-circuit evaluation is an issue. Hopefully fewer
> programmers will be burned by this now that I have shared this with
> you.

Short-circuiting is very useful and ubiquitous in Perl. Have you never
used the following idiom:

open my $fh, $file or die $!;

? Short-circuiting is what prevents the die() from executing if the
open() succeeds.

--Ala
 

J. Romano

Ala Qumsieh said:
> Short-circuiting is very useful and ubiquitous in Perl. Have you never
> used the following idiom:
>
> open my $fh, $file or die $!;
>
> ? Short-circuiting is what prevents the die() from executing if the
> open() succeeds.

Oh, yes, I've used that all the time. I happen to find
short-circuit evaluation very useful (especially when handling C
strings). But this was a case where I got stung by it because I was
treating the .. operator like a "between" operator, instead of a
"flip-flop" operator.

> I don't understand what you mean here. The following two lines are
> different:
>
> print "True" if /BEGIN/ .. /END/;
> print "True" if /BEGIN/;

That's part of my point. They certainly look different, but they
behave the same WHEN TYPED INTO THE DEBUGGER. If you don't believe
me, try it out yourself. Here, I made a sample program that you can
try out (explanation follows):


#!/usr/bin/perl -w
use strict;
$| = 1; # autoflush

while (<DATA>)
{
    last if m/^__END__$/;

    print "At line $.: $_";

    if (m/BEGIN/ .. m/END/) # set breakpoint here (line 11)
    {
        print ".. operator returned true\n";
    }
}

__DATA__
1
2
3 BEGIN
4
5
6 END
7
8
9
__END__


Save this program to a file named "dotdot.pl" and start the
debugger with the command:

perl -d dotdot.pl

Then set a breakpoint at line 11 with the command "b 11". Then type
"c" to start the program. As you go through the program with the "c"
command, the first three lines behave exactly as you'd expect,
returning true only for the third line.

But the fourth time you hit that breakpoint, don't continue.
Instead, type out the condition yourself, with a line like:

print "True" if (m/BEGIN/ .. m/END/)

Will it print "True"? Many people would think so, because the fourth
line falls between the BEGIN and END lines. But if you try it out, it
doesn't print "True"! Strangely enough, if you hit "n" in the
debugger to advance to the next line, it will go into the block, as if
the condition "m/BEGIN/ .. m/END/" evaluated to true!

So what is going on? Well, like I said in a previous post, the
following two lines are equivalent when typed into the debugger:

print "True" if /BEGIN/ .. /END/;
print "True" if /BEGIN/;

I have already explained why, but because apparently this isn't
intuitive, I'll explain it again:

Each instance of the .. and ... operators (in scalar context)
maintains its own state. If identical .. conditions happen more than
once in a script their states will be independent of each other,
meaning that one .. condition could return true and another could
return false, even if they look identical. That also means that every
time you type a condition with a .. operator in the debugger it will
use its own state, and act like it was invoked for the very first
time, making the condition "if (m/BEGIN/ .. m/END/)" behave the same
as "if (m/BEGIN/)".

(Like I said before, test this in the debugger if you don't believe
me. Try to find an instance where the above two lines won't evaluate
to the same thing when typed one after the other in the debugger.)
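
The independence of look-alike .. operators can also be seen without the debugger. The following is only a sketch with made-up input: two textually identical flip-flops sit in the same loop, but the second one is deliberately kept from seeing the BEGIN line (the guard and the array names are hypothetical).

```perl
#!/usr/bin/perl
use strict;
use warnings;

my (@first, @second);
for ('BEGIN', 'x', 'END', 'y') {
    # This flip-flop is evaluated for every line.
    push @first, $_ if m/BEGIN/ .. m/END/;

    # Same text, separate operator, separate state. The guard stops it
    # from ever seeing the BEGIN line, so it never switches on.
    push @second, $_ if $_ ne 'BEGIN' and (m/BEGIN/ .. m/END/);
}

print "first:  @first\n";                          # BEGIN x END
print "second: ", scalar(@second), " line(s)\n";   # 0 line(s)
```

On the "x" line the first operator is true and the second is false, even though the two expressions are spelled identically.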

If you think this is confusing, you're not alone. It took me a
while to find this, which is why I originally thought this was a bug
in Perl. But it's not a bug in Perl -- it's behaving exactly as it
should. It's not intuitive (at least not to me), which is exactly the
reason I got stung.

So that's why I'm sharing it here -- so that others who use the ..
and ... operators with short-circuit evaluation won't think they are
going crazy when the debugger logic seems to be contradicting the
logic written in the script.

-- Jean-Luc
 

Joe Smith

J. Romano said:
> That's part of my point. They certainly look different, but they
> behave the same WHEN TYPED INTO THE DEBUGGER.

You mean, they act differently when one of them is a string eval().

#!/usr/bin/perl -w
use strict;
$| = 1; # autoflush

use vars qw($temp);
while (<DATA>)
{
    last if m/^__END__$/;

    print "At line $.: $_";

    if (m/BEGIN/ .. m/END/) # compiled once, so its state persists
    {
        print ".. operator returned true\n";
    }

    # The string is recompiled on every pass, so this .. starts fresh each time.
    eval '$temp = m/BEGIN/ .. m/END/';
    print "eval() returned '$temp'\n";
}

__DATA__
1
2
3 BEGIN
4
5
6 END
7
8
9
__END__

The .. operator when executed by a string eval (which the debugger does)
does not keep state in the same way that the .. operator does when it
is part of a compiled statement.
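
Put another way, what seems to matter is how often the code containing the .. gets compiled, not the eval itself. Here is a rough sketch of that idea with made-up input: the same text is recompiled on every pass in one case, and compiled a single time into an anonymous sub in the other (the names $flip, @fresh, and @kept are hypothetical).

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @lines = ('BEGIN', 'x', 'END', 'y');
my (@fresh, @kept);

# Recompiled on every iteration: each pass gets a brand-new ".."
# with brand-new state, so it only ever acts like m/BEGIN/.
for (@lines) {
    my $hit = eval 'my $r = (m/BEGIN/ .. m/END/); $r';
    push @fresh, $_ if $hit;
}

# Compiled once by a single string eval, then called repeatedly:
# the one ".." inside the sub keeps its state between calls.
my $flip = eval 'sub { my $r = (m/BEGIN/ .. m/END/); $r }' or die $@;
for (@lines) {
    push @kept, $_ if $flip->();
}

print "recompiled each time: @fresh\n";   # BEGIN
print "compiled once:        @kept\n";    # BEGIN x END
```

Each line typed at the debugger prompt corresponds to the first loop: it is compiled afresh, so its .. operator has no memory of earlier lines.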
-Joe
 

J. Romano

Joe Smith said:
> You mean, they act differently when one of them is a string eval().
> [instead of part of a compiled statement]

Sure, that sounds about right. It never really crossed my mind
that the debugger was doing a string eval(), but if that's what it's
doing, that would explain that particular behavior.

Thanks for pointing that out, Joe.

-- Jean-Luc
 
