get local packages symbols with require


Marc Girod

Hello,

I am still trying to improve the CPAN ClearCase::Wrapper package:
http://cpansearch.perl.org/src/DSB/ClearCase-Wrapper-1.16/Wrapper.pm

In particular, somebody brought to my attention the fact that errors
in sub-modules which Wrapper loads, are ignored.
I decided that the point to fix was the snippet:

local $^W = 0; # in case a function is redefined
eval {
    local @INC = ($dir); # make %INC come out right
    eval "require $pkg";
};
warn $@, next if $@;

I restored the warnings, excepting 'redefine'.
I moved the 'warn $@' inside the eval block, right after the require.
This gave me errors for all the modules used.
I removed the @INC restriction, and got to the next problem.

Now I got into %{"${pkg}::"} not only the local symbols, but also all
imported functions (e.g. 'find' from File::Find).
The problem is that I need this information (the local symbols) for
the remaining part.
Most of the symbols are indeed found in the autosplit.ix file which is
also accessed, but not all:
(first) some functions might have been excluded from the
autosplitting, and (second) functions may have been aliased to shorter
names, not matching any new *.al file (listed in autosplit.ix).

My next move was to try to (optionally) 'require' twice, first in an
open verbose mode, then as previously.
In the meanwhile, I restored %{"${pkg}::"} to its initial state.

This led to the next error: 'use vars', in the sub-modules, was
evaluated only once, the first time.

Deep in what looks like a dead end, I ask for critique and help.
Thanks,
Marc
 
Marc Girod

Hi Ben,

I start from the bottom.

> What is your actual problem? You appear to be thrashing about, trying a
> whole lot of things you don't really understand and making a horrible
> mess, but you haven't explained what you're trying to do and how it
> isn't working.

Sorry. I find this a bit unfair, but I asked for it.
It is absolutely correct that I touch things I don't fully understand.
On the other hand, I believe that the guy who wrote this package did
understand what he did.
This doesn't make the result perfect.
What I try to do, I explained in my introduction: the context is to
improve an existing module, which has users, hence respecting its
backwards compatibility.

Now there are precise issues:

Apart from that, this module works satisfactorily and provides a useful
service.
What service? It doesn't really make sense that I copy the doc which I
referred to, on CPAN.
This sets up an infrastructure for developing wrappers around a
proprietary tool, extending or modifying this one's behavior.

I am developing one such wrapper myself.

> What is $dir?

This is enclosed in a:

for my $dir (@INC) { ... }

loop. Wrapper modules will be searched under ClearCase/Wrapper
hierarchies below each such $dir.
Setting @INC to ($dir) restricts the number of visible packages.
Most commonly, the wrapper modules will be found under site_perl,
whereas strict or File::Find are located under lib.
Hence this restriction will prevent accessing many useful modules
specified with 'use'.

> How did you do that? If you used 'warnings', it won't do what you
> expect. The reason for using $^W is that it propagates into required
> modules; 'warnings' does not.

Right. Thanks. I indeed used 'no warnings q(redefine)'.
Redefining functions is not something generally recommendable.
It has only come necessary for backwards compatibility reasons, while
adding a top level loop to the wrapper infrastructure, hence requiring
to catch existing 'exit' and 'exec' calls.
It may well stay limited to one level.

> OTOH, $^W will only silence warnings in
> code not under 'warnings', so that's not terribly useful either.
> If you want to do this right, you either need to find a way to avoid
> redefining functions, or you need to set up a local __WARN__ handler
> which filters out the 'redefined' warnings.

I had a look at that and couldn't quite figure out how to do it.
But let's leave it for now.
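
A minimal sketch of the local __WARN__ handler suggested above. The helper name without_redefine_warnings and the $dropped counter are hypothetical, and the pattern assumes the usual "Subroutine ... redefined" wording of perl's warning:

```perl
use strict;
use warnings;

my $dropped = 0;    # counts the warnings the filter swallowed

# Hypothetical helper (not part of ClearCase::Wrapper): run $code with a
# local __WARN__ handler that drops only "... redefined" warnings.
sub without_redefine_warnings {
    my ($code) = @_;
    local $SIG{__WARN__} = sub {
        my ($msg) = @_;
        if ($msg =~ /^(?:Constant s|S)ubroutine \S+ redefined\b/) {
            $dropped++;
            return;
        }
        warn $msg;    # the handler is disabled while it runs, so no recursion
    };
    return $code->();
}

# Redefine a sub on purpose; the string eval is compiled under 'use warnings'.
my $result = without_redefine_warnings(sub {
    eval 'sub demo { 1 } sub demo { 2 } demo()';
});
print "result: $result, dropped: $dropped\n";
```

Because the handler is installed with local, it also covers code loaded by a require inside the callback, whether or not that code uses 'warnings'. Every other warning still reaches STDERR.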
> What errors?

Cannot find File::Find was one.
Sorry, I am home now, without access to my logs or the environment.

> ...so, @INC = $dir was completely wrong, and nothing was getting loaded
> at all, but you weren't seeing errors because you threw them away? OK.

As far as I can tell, no.
It must have done something useful despite the hidden errors.
Or I am understanding even less than you think.
What does 'require' do in presence of errors?
Somehow it must have skipped them in some way...

> Sub::Identify will tell you where a sub comes from.

Thanks... Interesting.

> Be careful about grubbing around in the symbol tables: there can be
> things in there you may not expect. For instance, a constant created
> with 'constant' is represented as a scalar ref in the symbol table,
> where you would expect a glob to be. If you look up the glob by name
> perl will transform it into a real glob for you, but if you try to
> follow the ref from the stash you will find it isn't what you expect.

OK.



> What did you think that would do?

I thought the first pass would populate the symbol table with
everything useful plus extra symbols I didn't want. It would give me
the useful errors and warnings I was lacking, without spurious extra
ones which obviously didn't affect the behavior.
I would throw away the whole result and the second pass would do as it
had always done, and whatever this is.
Now, as told, I noticed that I failed to 'throw away the whole
result', and that the first pass affected the second.

> Be *even* *more* careful about changing the stashes directly. This is an
> area where there are still (probably) bugs in perl: here be segfaults.

OK. This was my attempt to clean up the result of the first phase.
Naive maybe.

> If you require the same file twice, without clearing the %INC entry, the
> second time will do nothing. This is (pretty-much) the whole point of
> require.

OK. I could clean up %INC. Thanks.
Er... Was that a recommendation?
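
To make the point about %INC concrete, here is a small self-contained sketch. Local::Demo is a hypothetical throwaway module written to a temporary directory; it only bumps a counter each time its file is actually executed:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Path qw(make_path);

# Hypothetical throwaway module, written to a temp dir for the demonstration.
my $dir = tempdir(CLEANUP => 1);
make_path("$dir/Local");
open my $fh, '>', "$dir/Local/Demo.pm" or die "cannot write module: $!";
print {$fh} "package Local::Demo;\n\$main::loads++;\n1;\n";
close $fh or die "cannot close module: $!";

our $loads = 0;
unshift @INC, $dir;

require Local::Demo;            # compiles and runs the file
require Local::Demo;            # no-op: $INC{'Local/Demo.pm'} is already set
my $after_two_requires = $loads;

delete $INC{'Local/Demo.pm'};   # forget that the file was loaded
require Local::Demo;            # runs the file again
my $after_reload = $loads;

print "loads after two requires: $after_two_requires\n";
print "loads after clearing %INC: $after_reload\n";
```

Note that clearing %INC only makes the file run again. Anything the first pass already put into the symbol table stays there, which is consistent with the first pass affecting the second.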

I got both.
In summary, you offer me two avenues. Either:
- use Sub::Identify to skip some entries in %{"${pkg}::"}, without
needing to touch it after the single pass needed.
- cleaning %INC in addition to what I already did.

Clearly the first one looks preferable, especially as I still don't
understand how the 'use vars' were affected (and probably the 'use
constant' as well).
There obviously remains something unsettled about what actually
happens in the original code, with the require under a restricted
@INC, and throwing errors away.
I would be *very* surprised if nothing at all.
In fact, I don't know where else the modules could be loaded...
And, they are.

Thanks,
Marc

Marc Girod

Hi again Ben, and of course others,

> - use Sub::Identify to skip some entries in %{"${pkg}::"}, without
> needing to touch it after the single pass needed.

I decided to give it a try from home, and indeed, this looks
promising:

tmp> perl -MFile::Find -M'Sub::Identify q(:all)' -le \
'for (keys %{"File::Find::"}){ \
my $sn=sub_name(\&$_); \
my $p = stash_name(\&$_); \
defined $sn and $p eq q(File::Find) and print "$sn in $p"}'
find in File::Find
finddepth in File::Find

(skipped: 44)

Using B instead trades off only functions which would be loaded from C
libraries, right?
I think this is excluded in my case.

tmp> perl -MFile::Find -MB -le \
'for (keys %{"File::Find::"}){ \
my $coderef=\&$_; \
ref $coderef or next; \
my $cv = B::svref_2object($coderef); \
$cv->isa('B::CV') or next; \
$cv->GV->isa('B::SPECIAL') and next; \
$p=$cv->GV->STASH->NAME; \
if($p eq q(File::Find)){ \
print "$p::",$cv->GV->NAME} \
else{$skip++}} \
print"skipped: $skip"'
find
finddepth
skipped: 44

Thanks,
Marc
 
Marc Girod

Hi Ben,

> ...*why*? That's exactly what perl does when you call 'require' without
> messing about with @INC.

OK. I must be able to explain. Even for myself.
This is essential to this infrastructure, and the key word to describe
it is probably 'overlay'.

The point is not to run known code, but to allow unknown modules,
mostly auto-split/loaded, to complete and redefine some functions.
These functions are made available through a shared driver (the
cleartool.plx wrapper, aliased to ct), from the command line, such as
e.g.:

ct describe
ct checkin
ct find

...so that the different functions may be provided by different
mixins, and some may override others. E.g. checkin can be offered by
ClearCase::Wrapper::DSB and ClearCase::Wrapper::MGi.
If you install only DSB, you get its definition. If you install both,
MGi's takes over (because of alphabetic order for the time being).

The ct wrapper should work whatever the mix of modules installed.
The modules are written independently by different authors.

So, the main wrapper implements a dispatch table, and forces the
functions to autoload as needed.
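
The overlay idea can be sketched as follows. This is only an illustration, not the actual ClearCase::Wrapper code: the Local::Wrapper::* packages are hypothetical stand-ins defined inline, and the rule "the package sorting last wins" is taken from the DSB/MGi description above.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for independently written overlay modules.
{
    package Local::Wrapper::DSB;
    sub checkin  { return 'DSB checkin' }
    sub describe { return 'DSB describe' }
}
{
    package Local::Wrapper::MGi;
    sub checkin { return 'MGi checkin' }
}

# Build the dispatch table in alphabetic package order, so that a package
# sorting later overrides entries registered by an earlier one.
my %dispatch;
for my $pkg (sort qw(Local::Wrapper::MGi Local::Wrapper::DSB)) {
    no strict 'refs';
    for my $name (keys %{"${pkg}::"}) {
        my $full = "${pkg}::$name";
        $dispatch{$name} = \&{$full} if exists &{$full};
    }
}

print $dispatch{checkin}->(), "\n";     # provided by both, MGi registered last
print $dispatch{describe}->(), "\n";    # provided by DSB only
```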
> The only effect this will have is that if the
> module you are loading loads more modules in its turn, they will have to
> be in the same directory. Is that the desired effect here? ISTM it will
> just cause problems, unless there's something funny going on you haven't
> posted yet.

In fact, there is. This part is just setting up the table. The real
meat (code) is in the *.al files, and its loading takes place from the
cleartool.plx driver.

There is in fact little code that would not be autosplit/loaded, and
it may well be that until now, it could not 'use' other modules. There
has however always been a 'use strict', which I have never seen to
affect in any way...
Something, though: some data, and function aliases.
And I have been able to place there some small routines, shared by
different autoload files.

I'd wish to be able to put more.
> (It's not quite clear to me yet how much of this you wrote, and how much
> is someone else's code you're trying to improve.

2% and 98%. My part is growing, but started from 0% 5 years ago.

> If you didn't, can you see anything elsewhere in
> the code to suggest why this was done? Do you have any VCS logs you can
> look through?)

Sure.

> The second is that overriding builtins like exit and exec is quite
> different from overriding functions. Which are you trying to do, or are
> you trying to both?

Both. As explained above, allowing to redefine functions is important
too, although the ordering makes it so that the first holds. The
others can be ignored.

> OK. Basically, you would want something like

Thanks. Will be useful.

Marc
 
Marc Girod

Hi,

> Using B instead trades off only functions which would be loaded from C
> libraries, right?

I tried this in my code, after a clean require:

for my $subdir (qw(ClearCase/Wrapper)) {
    for my $dir (@INC) {
        my @pms = sort glob("$dir/$subdir/*.pm");
        for my $pm (@pms) {
            $pm =~ s%^$dir/(.*)\.pm$%$1%;
            (my $pkg = $pm) =~ s%[/\\]+%::%g;
            {
                eval {
                    eval "require $pkg";
                    warn $@ if $@;
                };
                next if $@;
                my $ix = "auto/$pm/autosplit.ix";
                if (-e "$dir/$ix") {
                    eval { require $ix };
                    warn $@ if $@;
                }
            }
            my %names = %{"${pkg}::"};
            for (keys %names) {
                my $coderef = \&$_;
                ref $coderef or next;
                my $cv = B::svref_2object($coderef);
                $cv->isa('B::CV') or next;
                $cv->GV->isa('B::SPECIAL') and next;
                my $p=$cv->GV->STASH->NAME;
                next unless $p eq $pkg;
                ....

Now, under the debugger, this gives, positioned on the last line:

DB<4> x $pkg, $dir, $subdir, scalar keys %names
0 'ClearCase::Wrapper::MGi'
1 '/home/emagiro/perl/lib/perl5'
2 'ClearCase/Wrapper'
3 105
DB<8> p $_
annotate
DB<9> x $p, $pkg
0 'ClearCase::Wrapper'
1 'ClearCase::Wrapper::MGi'
DB<10> x @pms
0 'ClearCase/Wrapper/MGi'

Yet, the function 'annotate' is only defined in
ClearCase::Wrapper::MGi.
Now, it gets assigned (wrongly) to ClearCase::Wrapper.

So... I hope I am progressing, but I am not home yet.

Thanks,
Marc
 
Marc Girod

> So... I hope I am progressing, but I am not home yet.

I am making progress reading Dave's original code.
Ten years ago he was already ahead of where I am now.
Now, he had the following:

my $tglob = "${pkg}::$_";
....
if ($] >= 5.006) {
next unless eval { exists &{$tglob} };
}

which shows me my error.
I keep the B code:

my $coderef = \&{$tglob};
ref $coderef or next;
my $cv = B::svref_2object($coderef);
$cv->isa('B::CV') or next;
$cv->GV->isa('B::SPECIAL') and next;
my $p=$cv->GV->STASH->NAME;
next unless $p eq $pkg;

And this works fine on various symbols:

DB<16> x $tglob
0 'ClearCase::Wrapper::MGi::find'
....
DB<19> x $cv->GV->STASH->NAME
0 'File::Find'

DB<22> $tglob = 'ClearCase::Wrapper::MGi::annotate'
....
DB<27> x $cv->GV->STASH->NAME
0 'ClearCase::Wrapper::MGi'
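
Put together, the lookup can be sketched as a standalone function. Local::Overlay and local_subs are hypothetical names used for illustration. The essential detail is that the code reference is taken from the fully qualified name ("${pkg}::$name"), not from the bare stash key, which would be resolved in the current package:

```perl
use strict;
use warnings;
use B ();

{
    package Local::Overlay;             # hypothetical stand-in for a wrapper module
    use File::Find qw(find);            # imported: appears in the stash as well
    sub annotate { return 'annotate' }  # defined locally
}

# Return the names of the subs in $pkg's stash that were compiled in $pkg.
sub local_subs {
    my ($pkg) = @_;
    my @found;
    no strict 'refs';
    for my $name (keys %{"${pkg}::"}) {
        my $tglob = "${pkg}::$name";
        next unless exists &{$tglob};           # skip entries without code
        my $cv = B::svref_2object(\&{$tglob});
        next unless $cv->isa('B::CV');
        my $gv = $cv->GV;
        next if $gv->isa('B::SPECIAL');
        push @found, $name if $gv->STASH->NAME eq $pkg;
    }
    my @sorted = sort @found;
    return @sorted;
}

print join(',', local_subs('Local::Overlay')), "\n";
```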

At least the smoke test is through... even on the command line:

ClearCase-Wrapper-MGi> wrapdebug lsgen -d 0 -a MGi.pm
MGi.pm@@/main/lx/18 (MG, MG_4.106)
ClearCase-Wrapper-MGi> ct lsgen -d 0 -a MGi.pm
MGi.pm@@/main/lx/18 (MG, MG_4.106)
ClearCase-Wrapper-MGi> wrapdebug find . -name MGi.pm -print
./MGi.pm@@

Marc
 
Peter J. Holzer

> In fact, there is. This part is just setting up the table. The real
> meat (code) is in the *.al files, and its loading takes place from the
> cleartool.plx driver.
>
> There is in fact little code that would not be autosplit/loaded, and
> it may well be that until now, it could not 'use' other modules. There
> has however always been a 'use strict', which I have never seen to
> affect in any way...

Use loads each module only once. Presumably your scripts all start with
"use strict", so strict is loaded long before you mess with @INC. All
modules loaded after that see that strict is already loaded and don't
care where it came from.
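
A small sketch of that effect, assuming File::Find has not been loaded earlier in the same process: with @INC emptied, an already loaded module is still satisfied from %INC, while a fresh one cannot be located.

```perl
use strict;
use warnings;

my ($strict_ok, $find_ok, $find_error);
{
    local @INC = ();    # nothing can be located on disk inside this block
    $strict_ok  = eval { require strict; 1 }     ? 1 : 0;   # satisfied from %INC
    $find_ok    = eval { require File::Find; 1 } ? 1 : 0;   # not loaded yet
    $find_error = $@;
}

print "strict: $strict_ok, File::Find: $find_ok\n";
print "error: $find_error" if $find_error;
```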

hp
 
Marc Girod

> Use loads each module only once. Presumably your scripts all start with
> "use strict", so strict is loaded long before you mess with @INC. All
> modules loaded after that see that strict is already loaded and don't
> care where it came from.

Good point.
What this shows (confirms) is that backwards compatibility is probably
easier to achieve than I thought: no module has ever been able to load
anything, except from the autosplit functions.

Marc
 
Rainer Weikusat

Dr.Ruud said:
Don't test $@, but test what the eval returns.

This is a hack intended to work around destructors clearing or
changing $@ which happen to be executed automatically before the 'eval
scope' is left using a version of Perl older than 5.14, based on the
'ideological standpoint' that fixing broken code is Just Completely
Wrong[tm] and that one should always prefer adding bizarre
workarounds, ideally, even in places where they aren't needed, instead.

Anything else just makes the code too easy to understand and since "it
was hard to write, it should be hard to read" ...
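
For reference, the idiom under discussion looks roughly like this. The helper name try_require is hypothetical. The string eval ends in a true value, so the caller tests the return value of eval and only reads $@ for the message:

```perl
use strict;
use warnings;

# Hypothetical helper: load a package by name, report failure via the
# return value of eval instead of testing $@ alone.
sub try_require {
    my ($pkg) = @_;
    # Accept only plain package names, since the name goes into a string eval.
    die "invalid package name: $pkg" unless $pkg =~ /\A\w+(?:::\w+)*\z/;
    my $ok = eval "require $pkg; 1";
    return 1 if $ok;
    my $err = $@ || "unknown error\n";
    warn "cannot load $pkg: $err";
    return 0;
}

print "File::Find loaded\n" if try_require('File::Find');
```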
 
