Lexical reference to an anonymous recursive subroutine: impossible?

florian

Hello

It might sound slightly like a joke, but I am precisely trying to
implement the above. The reasons are as follows:

- I am programming a library function (in a module) that has a helper
function. Ideally, this helper function should be visible only to the
library function which uses it. Knowing that it is possible to say

{
    my $lexical_var = 0;
    sub foo {$lexical_var++;}
    sub bar {print $lexical_var;}
}

with the effect that

foo(); foo(); bar();

will print "2", but

print $lexical_var;

will yield an uninitialized variable error. I've tried to do the same
with a subroutine declaration, saying "my sub helper_function { ... }",
only to learn that this is not yet implemented. As it is possible to store
a reference to an anonymous subroutine in a variable:

$helper_function = sub { ... };

and use it saying

$helper_function->(<args, if any>); # or:
&$helper_function(<args, if any>);

I've resourcefully tried to use that knowledge to implement it.
However, the fact that my helper function is recursive on top of it
all seems to be a problem: The following works:

$times = 0;

$code_ref = sub {
    $times++;
    print "Hello, there!\n";
    $code_ref->() until ($times == 3);
};

$code_ref->(); # prints "Hello, there!" three times

but this (the line numbers are not part of the code):

 1  $times = 0;
 2
 3  {
 4
 5      my $code_ref = sub {
 6          $times++;
 7          print "Hello, there!\n";
 8          $code_ref->() until ($times == 3);
 9      };
10
11      $code_ref->();
12
13  }

prints
Hello, there!
Undefined subroutine &main:: called at script line 8.

Thus, the anonymous subroutine referenced by $code_ref is called
exactly once (from line 11), but apparently it cannot be found when
called from within itself (at line 8). (More precisely, I assume, one
should say "from within the subroutine, the variable is not visible",
or "it cannot be dereferenced via this variable".)

The explanation that the variable $code_ref has gone out of scope
within the subroutine declaration seems to suggest itself, but I
can't claim to understand this, since the declaration is within
the same block (and hence I'm anything but confident that it is
in fact the reason).

Is there anybody who can, and would care to, explain the reasons for
this?

Thanks very much!

Florian
 
Brian McCauley

> but this (the line numbers are not part of the code):

Please refrain from modifying your code in any way so that we can't
simply cut-n-paste.

If you need to refer to lines of your sample code in the narrative do
so by putting comments in the code and referring to those.

>  1  $times = 0;
>  2
>  3  {
>  4
>  5      my $code_ref = sub {
>  6          $times++;
>  7          print "Hello, there!\n";
>  8          $code_ref->() until ($times == 3);
>  9      };
> 10
> 11      $code_ref->();
> 12
> 13  }

> prints

You forgot to use strict. Had you used strict you'd have got a
different (and much more helpful) error. How much pain do you need
before you use strict?

> Thus, the anonymous subroutine referenced by $code_ref is called
> exactly once (from line 11), but apparently it cannot be found when
> called from within itself (at line 8). (More precisely, I assume, one
> should say "from within the subroutine, the variable is not visible",
> or "it cannot be dereferenced via this variable".)

Yes, the lexical variable is not in scope so you are accessing
$main::code_ref.

> The explanation that the variable $code_ref has gone out of scope
> within the subroutine declaration seems to suggest itself,

No, it's not that it has gone out of scope, but rather it hasn't come
into scope yet.

> Is there anybody who can, and would care to, explain the reasons for
> this?

Exactly the same reason as you'll get an error from

use strict;
my $foo = $foo;

The scope of a lexical variable starts at the statement following the
declaration. Within the declaration statement the lexical variable is
not yet in scope.
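
A runnable way to watch the same rule at work, without tripping strict, is to shadow a variable that already exists. This is only an illustrative sketch; $x and @log are made-up names. On the right-hand side of the inner my, $x still means the outer variable:

```perl
use strict;
use warnings;

my $x = 10;      # outer variable (hypothetical name)
my @log;         # collects values so they can be inspected afterwards

{
    # The new $x does not come into scope until the next statement,
    # so the $x on the right-hand side is still the outer one (10).
    my $x = $x + 1;
    push @log, $x;    # inner $x
}
push @log, $x;        # outer $x, unchanged

print "@log\n";       # prints "11 10"
```

With a sub on the right-hand side the situation is the same: any $code_ref mentioned inside the sub body is looked up before the new lexical exists.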

This is one of the few times when it actually makes sense to separate
the declaration and initialisation.

use strict;   # Always!
use warnings; # Always!

my $times = 0;
{

    my $code_ref;
    $code_ref = sub {
        $times++;
        print "Hello, there!\n";
        $code_ref->() until ($times == 3);
    };

    $code_ref->();
    undef $code_ref;
}

Note the final undef. This is because you've created a circular
reference. Without the explicit undef, when the execution pointer
passes the point where the variable $code_ref goes out of scope the
closure will not get garbage collected.

Personally I don't like having to manually make sure that $code_ref
gets undef()ed on _every_ _possible_ execution path so I use a package
variable and local.
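
For readers who don't want to follow the links, here is a rough sketch of the package-variable-and-local idea. It is an assumption about the general shape of the approach, not a copy of the code in the linked posts; $helper, @seen and library_function are hypothetical names. The sub refers to the package variable by name instead of closing over a lexical, so no reference cycle is formed, and local restores the old value on every exit path:

```perl
use strict;
use warnings;

our $helper;   # package variable (hypothetical name)
my @seen;      # records the recursion for inspection

sub library_function {
    my ($count) = @_;

    # local saves the old value of $helper and restores it when this
    # scope is left, however it is left (return, die, ...).
    local $helper = sub {
        my ($n) = @_;
        push @seen, $n;
        $helper->($n - 1) if $n > 1;   # looks up $main::helper at run time
    };

    $helper->($count);
    return scalar @seen;
}

my $calls = library_function(3);
print "calls: $calls, order: @seen\n";   # prints "calls: 3, order: 3 2 1"
```

The trade-off is that the helper is reachable as $main::helper while library_function is running, so it is less private than a lexical.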

For details see my previous posts on this matter:

http://groups.google.com/group/comp..._frm/thread/caa4307138ea1063/488fd05e61b50583
http://groups.google.com/group/comp..._frm/thread/c53ddd46fa3891be/757984a52cd091ae
http://groups.google.com/group/comp..._frm/thread/e00e827ea3f9ab6f/2fbfe18a8d5e7acd
 
Uri Guttman

BM> my $times = 0;
BM> {

BM> my $code_ref;
BM> $code_ref = sub {
BM> $times++;
BM> print "Hello, there!\n";
BM> $code_ref->() until ($times == 3);
BM> };

BM> $code_ref->();
BM> undef $code_ref;
BM> }

BM> Note the final undef. This is because you've created a circular
BM> reference. Without the explicit undef, when the execution pointer
BM> passes the point where the variable $code_ref goes out of scope the
BM> closure will not get garbage collected.

that code ref should be garbage collected even without the undef as it
leaves scope. nothing outside that block refers to it so it has only one
ref count which goes to 0 upon block exit. should be very simple to test
for this by also blessing it and creating a DESTROY method to print that
it was destroyed.
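
The test described here can be sketched roughly as follows. Tracked is a hypothetical class whose only job is to count destructions, and run_once is a made-up wrapper that builds the recursive closure with or without the final undef:

```perl
use strict;
use warnings;

{
    package Tracked;               # hypothetical class, only counts destructions
    our $destroyed = 0;
    sub DESTROY { $destroyed++ }
}

sub run_once {
    my ($break_cycle) = @_;

    my $code_ref;
    $code_ref = bless sub {
        my ($n) = @_;
        $code_ref->($n - 1) if $n > 1;   # the closure captures $code_ref
    }, 'Tracked';

    $code_ref->(3);
    undef $code_ref if $break_cycle;     # drops the only outside reference
    return;
}

run_once(0);
my $freed_without_undef = $Tracked::destroyed;

run_once(1);
my $freed_with_undef = $Tracked::destroyed - $freed_without_undef;

print "freed without undef: $freed_without_undef\n";   # prints 0
print "freed with undef:    $freed_with_undef\n";      # prints 1
```

Without the undef the closure survives until global destruction, because the closure holds $code_ref and $code_ref holds the closure.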

uri
 
Mirco Wahab

florian said:
> - I am programming a library function (in a module) that has a helper
> function. Ideally, this helper function should be visible only to the
> library function which uses it. Knowing that it is possible to say
> ...
> ...
> 5 my $code_ref = sub {
> 6 $times++;
> 7 print "Hello, there!\n";
> 8 $code_ref->() until ($times == 3);
> 9 };
> 10
> 11 $code_ref->();
> ...
>
> Is there anybody who can, and would care to, explain the reasons for
> this?

There have already been several correct solutions
posted, I'll only make a small addendum.

Depending on what exactly you are trying to do,
you could drop the external variables and work
on the stack in your inner sub:

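{
    my $code_ref;
    $code_ref = sub {
        warn "@_\n";
        # decrement the counter in $_[1], then recurse with the same @_
        # (&$code_ref without parens) for as long as it is non-zero
        push @_,-1+pop@_ and $_[1] and &$code_ref
    };

    $code_ref->('Hello, there!', 3);

}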

This would recursively invoke the sub
and modify stack values directly, so
it can't be interfered with from outside.
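
On Perl 5.16 or later there is also a way to recurse without naming the closure at all: the current_sub feature provides __SUB__, a reference to the currently running subroutine. Nobody in this thread uses it, so treat the following as a sketch of an alternative rather than anyone's recommended solution; @log is a hypothetical variable used to collect the output. Since the sub never captures $code_ref, the declaration and the assignment can stay in one statement and no reference cycle arises:

```perl
use strict;
use warnings;
use feature 'current_sub';   # provides __SUB__, needs Perl 5.16+

my @log;   # collects the messages so they can be inspected afterwards

{
    my $code_ref = sub {
        my ($n) = @_;
        push @log, "Hello, there! $n";
        __SUB__->($n - 1) if $n > 1;   # recurse via the running sub itself
    };

    $code_ref->(3);
}

print "$_\n" for @log;
# prints:
# Hello, there! 3
# Hello, there! 2
# Hello, there! 1
```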

Regards

M.
 
florian

> The way my works in this situation is rather irritating. The problem
> is that my only declares $code_ref after the end of the my statement. So
> $code_ref isn't in scope on the RHS of the assignment.

Oh indeed ... thanks very much for this simple explanation. Now I even
spotted the corresponding sentence in the documentation for 'my' (Camel
Book):

A private variable is not visible until the statement /after/ its
declaration.

(i.e. not within the same statement.) I just marked that sentence in
yellow (because it is much more important than a lot of other stuff
explained there). In a sense, I have learned, a statement such as

my $var = $var + 1;

(I know - who would want to do such a thing?) must be read right side
first, something which is actually familiar, as this is the case with
all assignments, come to think of it. In

$var = $var + 1; # clumsy, generic alternative to $var++;

the expression $var + 1 is evaluated _first_, and _then_ the result is
assigned to $var.

Thanks very much!

Florian
 
Brian McCauley

> that code ref should be garbage collected even without the undef as it
> leaves scope. nothing outside that block refers to it so it has only one
> ref count which goes to 0 upon block exit. should be very simple to test
> for this by also blessing it and creating a DESTROY method to print that
> it was destroyed.

Please see the previous thread (2nd one referenced in my post) where
you asserted exactly the same thing. You were wrong then (and you
admitted it). You're wrong again.
 
florian

> You forgot to use strict. Had you used strict you'd have got a
> different (and much more helpful) error. How much pain do you need
> before you use strict?

Hmm ...

> Exactly the same reason as you'll get an error from
>
> use strict;
> my $foo = $foo;

Frankly, I wouldn't have seen the similarity without knowing. But
you're probably right with using strict. I tried this out on my
example, and

use strict;
use warnings;

$main::times = 0;

{
    my $code_ref = sub {
        $main::times++;
        print "Hello, there!\n";
        $code_ref->() until ($main::times == 3); # (line 12)
    };

    $code_ref->();

}

yielded the error

Global symbol "$code_ref" requires explicit package name at script
line 12.

which, in fact, tells me that $code_ref in this line is not the
lexical one, while the one in line 15 apparently is. (Not at the very
first glance, but on second look, yes.)
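
The compile-time error can be reproduced without aborting a whole script by compiling the faulty declaration in a string eval. This is just a small sketch; $ok and $error are made-up names, and the exact wording after "requires explicit package name" differs between Perl versions:

```perl
use strict;
use warnings;

# Compile the faulty declaration in a string eval so the compile-time
# error ends up in $@ instead of stopping the script.
my $ok = eval q{
    use strict;
    my $code_ref = sub { $code_ref->() };   # inner $code_ref is not the lexical
    1;
};

my $error = $ok ? '' : $@;
print "compile failed: $error" unless $ok;
```

The message names $code_ref as a global symbol, which is the hint that the name inside the sub does not refer to the lexical being declared.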


> Note the final undef. This is because you've created a circular
> reference. Without the explicit undef, when the execution pointer
> passes the point where the variable $code_ref goes out of scope the
> closure will not get garbage collected.
> For details see my previous posts on this matter: ...

I've looked into these, and also read the discussion about garbage
collection in this thread. The Camel Book seems to confirm what you
say, but I'm actually rather confused about this, as I'm not advanced
enough (in particular, I've shunned OO and some related stuff so far)
to follow the discussions. But if I've understood the book correctly,
the lexical variable references the anonymous subroutine, while
something inside the subroutine references the variable in turn, which
is why their 'internal reference counts' (whatever that may be) can
never go to zero, even though they go out of scope when the block is
left. I also understand why undef-ing the variable should break this
circle, so I'll do this. Thanks very much for pointing this out!

> Personally I don't like having to manually make sure that $code_ref
> gets undef()ed on _every_ _possible_ execution path so I use a package
> variable and local.

Where I intend to apply it, it is never invoked casually but only
inside a certain block in the module file, so it should not be a
problem, I think. (That was actually the point - making it private to
one function.) I know I could have achieved a comparable effect by
using an ordinary subroutine with Exporter and simply not exporting
the subroutine, which would have restricted it to the module file, but
this seems even more compelling.

Thanks very much for your help!

Florian
 
Uri Guttman

BM> Please see the previous thread (2nd one referenced in my post) where
BM> you asserted exactly the same thing. You were wrong then (and you
BM> admitted it). You're wrong again.

i bet you are right because of the ref inside the closure. i was
definitely groggy when i posted that earlier. :) i do use closures a bit
but rarely recursive ones and even more rarely recursive ones that i
need to destroy.

uri
 
Mirco Wahab

florian said:
> I've looked into these, and also read the discussion about garbage
> collection in this thread. The Camel Book seems to confirm what you
> say, but I'm actually rather confused about this, as I'm not advanced
> enough (in particular, I've shunned OO and some related stuff so far)
> to follow the discussions. But if I've understood the book correctly,
> the lexical variable references the anonymous subroutine, while
> something inside the subroutine references the variable in turn, which
> is why their 'internal reference counts' (whatever that may be) can
> never go to zero, even though they go out of scope when the block is
> left. I also understand why undef-ing the variable should break this
> circle, so I'll do this. Thanks very much for pointing this out!

You can also spot the closure disaster
by using the Devel::Cycle module,
in your case (in my code variant):

use strict;
use warnings;
use Devel::Cycle;

{
    my $code_ref;
    $code_ref = sub {
        print "@_\n";
        push@_,-1+pop@_ and $_[1] and &$code_ref
    };

    $code_ref->('Hello, there!', 3);
    find_cycle($code_ref);
}

Would print (here):

Hello, there! 3
Hello, there! 2
Hello, there! 1
Cycle (1):
$A variable $code_ref => \&A

Adding the Devel::Peek module (use Devel::Peek;) would show the
reference count before leaving the braces:

...
find_cycle($code_ref);
Dump($code_ref);
}
...

(will print "2"!)
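
Once a cycle like this is confirmed, one way to break it without a manual undef is to let the closure capture a weak copy of the reference, using weaken from the core Scalar::Util module. This is a sketch of that alternative, not something proposed in the thread; $weak_ref, @log and the Tracked class (which only counts destructions) are hypothetical names:

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

{
    package Tracked;               # hypothetical class, only counts destructions
    our $destroyed = 0;
    sub DESTROY { $destroyed++ }
}

my @log;   # collects the messages so they can be inspected afterwards

{
    my $weak_ref;                  # the copy the closure captures
    my $code_ref = $weak_ref = bless sub {
        my ($text, $n) = @_;
        push @log, "$text $n";
        $weak_ref->($text, $n - 1) if $n > 1;
    }, 'Tracked';
    weaken($weak_ref);             # the closure no longer keeps itself alive

    $code_ref->('Hello, there!', 3);
}   # $code_ref, the only strong reference, goes away here

print "$_\n" for @log;
print "freed at block exit: $Tracked::destroyed\n";   # prints 1
```

The strong reference in $code_ref has to outlive every call; if it goes away first, $weak_ref becomes undef and the recursive call dies.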

Regards

M.



 
Michael Carman

> I know I could have achieved a comparable effect by using an ordinary
> subroutine with Exporter and simply not exporting the subroutine, which
> would have restricted it to the module file

Just to be clear, Perl doesn't have a concept of "private" functions like some
other languages do. An unexported subroutine can still be called using a fully
qualified name (e.g. Foo::Bar::baz()).

For any lurking novices out there, I'd also like to point out that in Perl it's
customary to expect the users of your module to play nice and not call
undocumented functions. Sometimes people prefix sub names with an underscore
(e.g. _mysub) to provide an extra indication that a sub is intended to be
private. While forcibly preventing bad behavior is an interesting exercise, it's
rare for people to actually do so.
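
A small sketch of what this means in practice; My::Library, _countdown and countdown_string are hypothetical names. The underscore marks the helper as private by convention, yet a fully qualified call from outside the package still works:

```perl
use strict;
use warnings;

{
    package My::Library;            # hypothetical module name

    # Leading underscore: "private by convention", nothing more.
    sub _countdown {
        my ($n) = @_;
        return $n <= 0 ? () : ($n, _countdown($n - 1));
    }

    # The documented entry point that uses the helper.
    sub countdown_string {
        my ($n) = @_;
        return join ' ', _countdown($n);
    }
}

my $public = My::Library::countdown_string(3);
my $direct = join ' ', My::Library::_countdown(3);   # nothing prevents this

print "$public\n";   # prints "3 2 1"
print "$direct\n";   # prints "3 2 1"
```

A named sub like this can also call itself without any of the scoping or reference-cycle issues discussed above.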

-mjc
 
