Requiring Lexical $_ / Obliterating Global $_?

Tim McDaniel

Currently I do
local $_;
at the start of each sub, but it's an annoyance and it's not
fail-safe (I can forget).

Just found out about Perl 5.10. Pretty cool.

Lexical $_

The default variable $_ can now be lexicalized, by declaring it
like any other lexical variable, with a simple
my $_;

The operations that default on $_ will use the lexically-scoped
version of $_ when it exists, instead of the global $_.

In a map or a grep block, if $_ was previously my'ed, then the $_
inside the block is lexical as well (and scoped to the block).

In a scope where $_ has been lexicalized, you can still have
access to the global version of $_ by using $::_, or, more simply,
by overriding the lexical declaration with our $_. (Rafael
Garcia-Suarez)

I like the convenience of the implicit USE of $_ in things like
"s/foo/bar/", but I've had an abhorrence of implicitly-declared
variables since FORTRAN days, and hatred of accidental variable
leakage since C.

Is there any way that I can dispose of any possibility of accidentally
using the global dynamic variable $_? That is, where by default

foreach (@array)

errors out (ideally at compile time), requiring that I do

foreach my $_ (@array)

or declare it per sub or whatever I like?

Should I look into playing with main's package symbol table (I know
nothing about the subject other than that it exists)?
 
Martijn Lievaart

[...]

Is there any way that I can dispose of any possibility of accidentally
using the global dynamic variable $_? That is, where by default

foreach (@array)

errors out (ideally at compile time), requiring that I do

foreach my $_ (@array)

or declare it per sub or whatever I like?

If I read that correctly, using 'my $_;' once at the top of your program
should do what you want.

HTH,
M4
 
Dr.Ruud

Tim McDaniel schreef:
Currently I do
local $_;
at the start of each sub

Why? At least I never needed to do that, and I assume virtually nobody
does.

The rule is simple:
Either a sub can change $_ in a documented way, or it leaves it
untouched.
 
Peter J. Holzer

Currently I do
local $_;
at the start of each sub, but it's an annoyance and it's not fail-safe
(I can forget).

Just found out about Perl 5.10. Pretty cool.

Lexical $_ [...]

I like the convenience of the implicit USE of $_ in things like
"s/foo/bar/", but I've had an abhorrence of implicitly-declared
variables since FORTRAN days, and hatred of accidental variable leakage
since C.

Is there any way that I can dispose of any possibility of accidentally
using the global dynamic variable $_? That is, where by default

foreach (@array)

errors out (ideally at compile time), requiring that I do

foreach my $_ (@array)

or declare it per sub or whatever I like?

If I read that correctly, using 'my $_;' once at the top of your program
should do what you want.

I don't think so. In that case, if he forgets to declare $_ in a sub,
the globally defined lexical $_ will be used, which isn't much of an
improvement over using the predefined $_. In both cases the omission
will not be noticed and two subs may inadvertently change the same $_.

hp
 
Peter J. Holzer

Tim McDaniel schreef:

Why? At least I never needed to do that, and I assume virtually nobody
does.

I needed to do that in a number of cases. I've never made a habit of it
although I can see why that might be a good idea.

The rule is simple:
Either a sub can change $_ in a documented way, or it leaves it
untouched.

Right. And if your sub should leave $_ untouched, but either changes it
itself (e.g., by using "while (<>) {}") or calls another sub which does,
then adding "local $_;" at the start ensures that your sub indeed leaves
$_ untouched (as far as the caller can tell).
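
A minimal sketch of that failure mode, assuming two hypothetical line-counting helpers (the sub names and the in-memory filehandle are made up for illustration). Inside a foreach, $_ is an alias for the current element, so a callee that runs "while (<$fh>)" overwrites the caller's data unless it localizes $_ first:

```perl
use strict;
use warnings;

# Hypothetical helper: counts lines but forgets to localize $_.
sub count_lines_leaky {
    my ($text) = @_;
    open my $fh, '<', \$text or die "open: $!";
    my $n = 0;
    while (<$fh>) { $n++ }    # assigns each line, then undef, to the caller's $_
    return $n;
}

# Same helper with "local $_;" so the caller's $_ is restored on return.
sub count_lines_safe {
    my ($text) = @_;
    local $_;
    open my $fh, '<', \$text or die "open: $!";
    my $n = 0;
    while (<$fh>) { $n++ }
    return $n;
}

my @clobbered = ('alpha', 'beta');
count_lines_leaky("a\nb\n") for @clobbered;    # elements end up undef

my @intact = ('alpha', 'beta');
count_lines_safe("a\nb\n") for @intact;        # elements are unchanged

printf "leaky: %d defined, safe: %d defined\n",
    scalar(grep { defined } @clobbered), scalar(grep { defined } @intact);
```

The only difference between the two subs is the "local $_;" line, which is the whole point of the habit.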

hp
 
Tim McDaniel

Tim McDaniel schreef:


Why? At least I never needed to do that, and I assume virtually
nobody does.

The rule is simple:
Either a sub can change $_ in a documented way, or it leaves it
untouched.

That's like saying "In FORTRAN, the rule is simple: never use an
undeclared variable". What about typos and oversights? If having
rules was sufficient to prevent bugs, there'd be little need for error
messages or warnings.

But I didn't consider sufficiently the fact that foreach, map, and
grep localize changes to $_. If I do an accidental "s///" without
"=~", then the bug will probably be noticed just because it's using an
unexpected value of $_ rather than whatever variable I meant.
"while (<>)" and explicit assignment to $_ (to make further m// and
s/// actions more concise) are the only idioms I can think of where a
persistent change to $_ occurs naturally.

But I do both of those with some frequency.
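
To make the distinction concrete, here is a small sketch (the values are arbitrary) contrasting the constructs that restore $_ on exit with a plain assignment, which persists:

```perl
use strict;
use warnings;

$_ = 'outer value';

for (1 .. 3) { }                         # $_ is aliased to 1, 2, 3 inside the loop
my @doubled = map  { $_ * 2 } 1 .. 3;    # likewise inside the map block
my @odd     = grep { $_ % 2 } 1 .. 3;    # and inside the grep block

print "after loops: $_\n";               # still 'outer value'

# A persistent change: a bare block does not localize anything.
{
    $_ = 'changed inside a bare block';
}
print "after block: $_\n";
```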
 
Martijn Lievaart

I don't think so. In that case, if he forgets to declare $_ in a sub,
the globally defined lexical $_ will be used, which isn't much of an
improvement over using the predefined $_. In both cases the omission
will not be noticed and two subs may inadvertently change the same $_.

You are right.

But then, where do you draw the line? Every block? Every sub? I can see
the advantage of implicitly my-ing $_ in every sub, but it's not very
logical in my eyes.

What the OP asks for, forbidding use of the global $_, would be a
definite improvement. It still leaves room for errors, but would be
worthwhile nonetheless.

M4
 
Tim McDaniel

You are right.

But then, where do you draw the line? Every block? Every sub?

I'd like to draw the line in the same place I draw the line on
declaring any other variable. I want a way to make $_ work by the
same rules as any other variable, except for being the implicit
operand in a number of operators.

I would do
foreach my $_ (...)
in the same circumstances that I would do
foreach my $i (...)
or
my $_;
in the same circumstances that I would do
my $file_line;
Because of the ugliness of this [1]
while (defined(my $_ = <FILE>)) {
I'd probably have to make $_ a block or sub my variable to be able to
still write
while (<FILE>) {

Perhaps, in places where I have to use a Perl older than 5.10, I should
be local-ing it in more places, in the ways shown above.


[footnote 1] Can I just say that I still can't wrap my head around how
a declaration can be buried in the middle of a statement, which is
bizarre to someone who grew up on FORTRAN, C, or most procedural
languages, and that its "enclosing" scope is outside the {...}?
 
Dr.Ruud

Tim McDaniel schreef:
[ accidentally changing $_ ]
What about typos and oversights? If having
rules was sufficient to prevent bugs, there'd be little need for error
messages or warnings.

I find it hard to come up with an example of that in the last year, in
the group of about 30 Perl developers that I work with.
The only one I remember is a bug found in legacy code that looks like
$s =~ /x/ and /y/ and run();
which was probably once $_ oriented code (I never actually checked).

If I do an accidental "s///" without
"=~", then the bug will probably be noticed just because it's using an
unexpected value of $_ rather than whatever variable I meant.

Yes, that is one that someone might do wrong. The scope of an implicit
localized $_ is best small, often not more than 1 or 2 lines.
Just use a "for my $_value (@values) { ... }" where feasible.

I do see some use now for something like "DEBUG>1 and local $_;" at the
start of each sub.

explicit assignment to $_ (to make further m// and
s/// actions more concise)

Stop doing that. You can use

for ( $scalar ) { ... }

and code like

s/^\s+//, s/\s+$//, s/\s+/ /g for $old, $new;

instead.
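
For readers who have not met the idiom, a runnable sketch of the second form (the sample strings are made up). Each variable is aliased to $_ in turn, so the substitutions edit it in place and no assignment to $_ is ever written:

```perl
use strict;
use warnings;

my $old = "  some   old  text ";
my $new = "\tnew\ttext  ";

# Strip leading and trailing whitespace, then collapse inner runs to one space.
s/^\s+//, s/\s+$//, s/\s+/ /g for $old, $new;

print "[$old] [$new]\n";    # [some old text] [new text]
```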
 
Martijn Lievaart

I'd like to draw the line in the same place I draw the line on declaring
any other variable. I want a way to make $_ work by the same rules as
any other variable, except for being the implicit operand in a number of
operators.

The implication is that $::_ simply should not be defined[1], which
could be set by a (to be proposed) pragma, I suppose. I like it.

[footnote 1] Can I just say that I still can't wrap my head around how a
declaration can be buried in the middle of a statement, which is bizarre
to someone who grew up on FORTRAN, C, or most procedural languages, and
that its "enclosing" scope is outside the {...}?

To me, having moved from FORTRAN/Basic/COBOL to C to C++ to Perl, among
others, it is bizarre not to be able to do this. Although I can see why
embedding a complex declaration in the middle of a C statement would
pose a parsing challenge.

[1] Not set to undef, just does not exist!
 
Tim McDaniel

I do see some use now for something like "DEBUG>1 and local $_;" at
the start of each sub.

Where DEBUG is a numeric constant? I think it's a nightmare to have
code change its behavior in subtle ways while debugging. ("Subtle
ways" meaning not the usual debug actions, like outputting log
messages, assertion checking, and such, which usually shouldn't affect
the program's state.) A $_ leakage problem would silently go away the
moment you turned on debugging!

Stop doing that.

That came across as rather a harsh command and, um, you're not my
boss.

You can use

for ( $scalar ) { ... }

and code like

s/^\s+//, s/\s+$//, s/\s+/ /g for $old, $new;

instead.

Thank you for the suggestions, though I'd like to mull over how
understandable "foreach" with one item would be to other readers.
 
Dr.Ruud

Tim McDaniel schreef:
Dr.Ruud:

Where DEBUG is a numeric constant?

Put

use constant DEBUG => 0;

in a convenient place. That DEBUG is actually a "constant sub". Perl's
compiler folds the constant at compile time, so when the condition is
false the guarded code is dropped from the compiled program instead of
being tested at run time. See `perldoc constant` for details.

I think it's a nightmare to have
code change its behavior in subtle ways while debugging.

That is exactly why the ">1" is there, so you can still set DEBUG to 1
to not have the line.
Give it more thought, because it is useful.


$ perl -MO=Deparse -e'
use constant DEBUG => 0;
DEBUG>1 and local $_;
'
use constant ('DEBUG', 0);
'???';
-e syntax OK


$ perl -MO=Deparse -e'
use constant DEBUG => 1;
DEBUG>1 and local $_;
'
use constant ('DEBUG', 1);
'???';
-e syntax OK


$ perl -MO=Deparse -e'
use constant DEBUG => 2;
DEBUG>1 and local $_;
'
use constant ('DEBUG', 2);
local $_;
-e syntax OK


The 1 is arbitrary; you can set the lower limit as high as you need.


As you know, because "local $_;" sets $_ to undef, it can expose some
bugs, like in

$ perl -wle 's/a/b/'
Use of uninitialized value in substitution (s///) at -e line 1.

(I am assuming that "use strict; use warnings;" are in every file.)
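
A sketch of how that plays out inside a sub, assuming DEBUG is set to 2 and using a hypothetical normalize() with a deliberate bug. The warning handler is only there to capture the message so it can be printed at the end:

```perl
use strict;
use warnings;
use constant DEBUG => 2;    # assumed setting; with 0 or 1 the guard is compiled away

my @warnings;
$SIG{__WARN__} = sub { push @warnings, $_[0] };

# Hypothetical sub with a bug: the substitution was meant for $text.
sub normalize {
    my ($text) = @_;
    DEBUG > 1 and local $_;    # with DEBUG >= 2, $_ starts out undef here
    s/a/b/;                    # oops: operates on $_, not on $text
    return $text;
}

$_ = 'banana';
my $result = normalize('data');

print "caller's \$_ is still '$_'\n";
print "captured: $_" for @warnings;
```

With the guard active, the stray s/// sees an undefined $_ and warns, and the caller's 'banana' is left alone. With the guard compiled away, the same s/// would silently edit the caller's $_.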


("Subtle
ways" meaning not the usual debug actions, like outputting log
messages, assertion checking, and such, which usually shouldn't affect
the program's state.) A $_ leakage problem would silently go away the
moment you turned on debugging!

Once you grok the ">1", you'll see that you have the choice.

That came across as rather a harsh command and, um, you're not my
boss.

One is what one reads. I can't mind you picking it up as harsh, because
I can't know in what mood you will be when reading what I wrote. Always
read such things as if written by a friend, it really helps.

And it was a reference to the well known story that starts with "Doctor,
it hurts when I do this.".

Thank you for the suggestions, though I'd like to mull over how
understandable "foreach" with one item would be to other readers.

Do with it whatever you want. I like it a lot.
My second example has two items, but only after I substituted "$input"
by "$old, $new".
 
Ben Morrow

Quoth Tim McDaniel:
Currently I do
local $_;
at the start of each sub, but it's an annoyance and it's not
fail-safe (I can forget).

Just found out about Perl 5.10. Pretty cool.

Lexical $_
Is there any way that I can dispose of any possibility of accidentally
using the global dynamic variable $_? That is, where by default

You want to tie (the global) $_ to a class that looks like

sub FETCH { croak "Global \$_ forbidden" }
sub STORE { croak "Global \$_ forbidden" }

I swear there's a module on CPAN that implements this, but I can't find
it :(.
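
A fuller sketch of that idea, with several assumptions: the package name is hypothetical (it is not the CPAN module I'm thinking of), a TIESCALAR constructor is added because tie needs one, and plain die stands in for croak so the sketch does not depend on how Carp behaves while $_ is tied. As far as I can tell, foreach, map and grep alias $_ to each element rather than going through the tied scalar, so this catches direct reads and writes of the global $_ only, and only at run time:

```perl
use strict;
use warnings;

# Hypothetical package name, for illustration only.
package ForbidGlobalUnderscore;

sub TIESCALAR { my ($class) = @_; return bless {}, $class }
sub FETCH     { die "Global \$_ forbidden\n" }
sub STORE     { die "Global \$_ forbidden\n" }

package main;

tie $_, 'ForbidGlobalUnderscore';

my $store_error = eval { $_ = 'oops';    1 } ? '' : $@;    # direct write dies
my $fetch_error = eval { my $copy = $_;  1 } ? '' : $@;    # direct read dies

# foreach aliases $_ to each element, so the tie is not consulted here.
my @words = ('foo', 'bar');
s/o/0/g for @words;

untie $_;

print "store: $store_error";
print "fetch: $fetch_error";
print "words: @words\n";
```

So this does not make a bare "foreach (@array)" error out, which is what the original question asked for, but it does flag the "while (<>)" and explicit-assignment cases.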

Ben
 
Martijn Lievaart

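Tim McDaniel schreef:
Thank you for the suggestions, though I'd like to mull over how
understandable "foreach" with one item would be to other readers.

[ piggybacking here, didn't see the original ]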

As it's a fairly standard idiom, I would expect any competent Perl
programmer to either know it, or look it up.

HTH,
M4
 
Peter J. Holzer

I'd like to draw the line in the same place I draw the line on
declaring any other variable. I want a way to make $_ work by the
same rules as any other variable, except for being the implicit
operand in a number of operators.

I would do
foreach my $_ (...)
in the same circumstances that I would do
foreach my $i (...)

I would not want this.

foreach (...) {
}

already implicitly declares a new $_ for a defined scope. Making the
declaration explicit just adds noise, not readability. Same for map and
grep.

However:
my $_;
in the same circumstances that I would do
my $file_line;

I agree with this, and I'd like a less ugly version of this:
Because of the ugliness of this [1]
while (defined(my $_ = <FILE>)) {

(Actually, I'm not even sure if this does what I think it should do
(restrict the scope of $_ to the body of the loop).)

hp
 
Ben Morrow

Quoth Peter J. Holzer:
Because of the ugliness of this [1]
while (defined(my $_ = <FILE>)) {

There's no need for it to be that ugly: you get the 'defined' for free:

~% perl -MO=Deparse -e'while (my $x = <>) {1}'
while (defined(my $x = <ARGV>)) {
'???';
}
-e syntax OK
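
A small sketch of why the implicit defined() matters, using a named lexical and an in-memory filehandle (both my choice for the example). The last line of the input is "0" with no trailing newline, which is false as a string but still defined:

```perl
use strict;
use warnings;

my $text = "first\n0";
open my $fh, '<', \$text or die "open: $!";

my @seen;
while (my $line = <$fh>) {    # perl compiles this as defined(my $line = <$fh>)
    chomp $line;
    push @seen, $line;
}

print scalar(@seen), " lines: @seen\n";    # 2 lines: first 0
```
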

(Actually, I'm not even sure if this does what I think it should do
(restrict the scope of $_ to the body of the loop).)

You get a fresh copy of $_ for each iteration:

~% echo "\n" | perl5.10.0 -lE'
print "global ", \$_;
my $_;
print "outer ", \$_;
my @a;
while (my $_ = <>) {
print "inner ", \$_;
push @a, \$_; # we must keep a ref, or perl reuses
} # the memory anyway
print "outer ", \$_'
global SCALAR(0x28361380)
outer SCALAR(0x28361340)
inner SCALAR(0x28361680)
inner SCALAR(0x28307310)
outer SCALAR(0x28361340)

Ben
 
Tad J McClellan

Martijn Lievaart said:
Tim McDaniel schreef:

[ piggybacking here, didn't see the original ]

As it's a fairly standard idiom, I would expect any competent Perl
programmer to either know it, or look it up.


While it is a common enough idiom, I still don't like it "bare".

I usually take the extra 4 seconds to type something like

for ( $scalar ) # because I'm gonna do a bunch of pattern matches
 
Ben Morrow

Quoth Tad J McClellan said:
Martijn Lievaart said:
Tim McDaniel schreef:
Thank you for the suggestions, though I'd like to mull over how
understandable "foreach" with one item would be to other readers.

[ piggybacking here, didn't see the original ]

As it's a fairly standard idiom, I would expect any competent Perl
programmer to either know it, or look it up.


While it is a common enough idiom, I still don't like it "bare".

I usually take the extra 4 seconds to type something like

for ( $scalar ) # because I'm gonna do a bunch of pattern matches

This is clearly a stylistic matter, so I should probably stay out of it,
but...

I really don't like comments like that. IMHO comments should explain
algorithms and the purpose of the content of the code, not the syntax.
If you don't think

for ($scalar) {
s/foo/bar/;
s/baz/quux/;
}

is clear enough as it stands, I would say you should pick a clearer
idiom rather than try to explain an unclear one with a comment.

Of course, with 5.10 you can say

given ($scalar) {
s/foo/bar/;
}

without fear of misunderstanding (once everyone's familiar with 5.10 :)).

Ben
 
Tim McDaniel

I would not want this.

foreach (...) {
}

already implicitly declares a new $_ for a defined scope.

But they're different scopes: "my" gives lexical scope, but the implicit
declaration is dynamically scoped. Dynamic Scope Considered Harmful.
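
A short sketch of what I mean, with a hypothetical helper sub. A sub called from inside a plain foreach sees the loop's $_ even though nothing was passed to it, whereas a "my" loop variable is invisible to the callee:

```perl
use strict;
use warnings;

# Hypothetical helper, only to show which variables a callee can see.
sub peek_default { return defined $_ ? "sees '$_'" : 'sees undef' }

my @results;

for ('from loop') {
    push @results, peek_default();    # callee sees the loop's $_
}

for my $item ('from loop') {
    push @results, peek_default();    # $item is lexical; the global $_ is still undef
}

print "$_\n" for @results;
```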
 
Tim McDaniel

Tim McDaniel schreef:

[ piggybacking here, didn't see the original ]

As it's a fairly standard idiom,

I've never seen
for ($some_scalar) {...}
While I've not read much Perl code written by others, I've not seen it
in examples either. ... though when looking up something else in
"man perlsyn", I see it's used there.

It's also not intuitive to me. I'm used to foreach iterating over a
list. My first thought was "you're guaranteed that there's no list.
Why bother with a 'loop' that you know will run only once?"

{
local $_ = $some_scalar;
...
}
is equivalent even down to allowing "last" and "continue" blocks.

While it's a few characters longer, it's also immediately clear what's
going on. And with "my $_ = ...", it's got lexical scoping.
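
One difference worth checking before treating the two forms as interchangeable, sketched with made-up data: "for ($scalar)" aliases $_ to the variable, while "local $_ = $scalar" works on a copy, so any edits have to be copied back out. The sketch uses local rather than "my $_", since lexical $_ was later marked experimental (Perl 5.18) and then removed (Perl 5.24):

```perl
use strict;
use warnings;

my $aliased = "  padded  ";
for ($aliased) {           # $_ is an alias: edits land in $aliased
    s/^\s+//;
    s/\s+$//;
}

my $copied = "  padded  ";
my $trimmed;
{
    local $_ = $copied;    # $_ is a copy: edits stay in $_
    s/^\s+//;
    s/\s+$//;
    $trimmed = $_;         # must be copied back out explicitly
}

print "[$aliased] [$copied] [$trimmed]\n";    # [padded] [  padded  ] [padded]
```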
 
