Coderef usage in complex data structures

kz

Hi Gurus,

Given the following text file (proprietary database format, thus no CPAN
modules available)
0001 08 01 24 22 24 25 22 64
0002 06 09 42 22 f3 8f
where
aaaa bb cc dd ee ff gg hh ii jj ....
aaaa: sequence number (hex, starting from 0001)
bb: length of record including this byte
cc: record type (01..2a)
bytes dd to EOL are the data. Each record type requires a specific amount of
data bytes (bb-2) and needs to be parsed using different criteria, therefore
I decided to code for each record type a different sub named type_01 thru
type_2a (#$% load of work, though...)
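
The layout described above can be checked mechanically before any per-type parsing happens. A minimal sketch, assuming whitespace-separated hex fields exactly as in the two sample lines; parse_record and its hash keys are illustrative names, not part of the original script:

```perl
use strict;
use warnings;

# Hypothetical helper: split one line of the format described above into
# its fields. Returns a hash reference, or nothing if the line is malformed.
sub parse_record {
    my ($line) = @_;
    my ($seq, $len, $type, @data) = split ' ', $line;
    return unless defined $type;

    # Every field must be hexadecimal with the stated width.
    return unless $seq =~ /^[0-9a-fA-F]{4}$/;
    return if grep { !/^[0-9a-fA-F]{2}$/ } $len, $type, @data;

    # bb counts itself and the type byte, so bb - 2 data bytes must follow.
    return unless hex($len) - 2 == @data;

    return {
        sequence   => $seq,
        length     => hex($len),
        recordtype => $type,
        records    => \@data,
    };
}

for my $line ('0001 08 01 24 22 24 25 22 64', '0002 06 09 42 22 f3 8f') {
    my $rec = parse_record($line) or next;
    printf "%s: type %s, %d data bytes\n",
        $rec->{sequence}, $rec->{recordtype}, scalar @{ $rec->{records} };
}
```

With the two sample lines this reports record types 01 and 09 with 6 and 4 data bytes, which matches bb - 2 in both cases.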

After consulting the perldsc manpage I came up with the code snippet below,
which does exactly what is needed, though I'm not confident that this is the
best way to do it.

#!/usr/bin/perl
use strict;
use warnings;

my %DATABASE;
my $dbfile = $ARGV[0];
open(my $fh, '<', $dbfile)
    or die "ERROR: unable to open log file, error: $!\n";

# read data
while (my $line = <$fh>) {
    chomp $line;
    # fields: sequence number, length, record type, data bytes
    if ($line =~ /^([0-9a-f]{4})\s+([0-9a-f]{2})\s+([0-9a-f]{2})\s*(.*)$/i) {
        $DATABASE{$1}{length}     = $2;
        $DATABASE{$1}{recordtype} = $3;
        $DATABASE{$1}{records}    = [split /\s/, $4];
        $DATABASE{$1}{parse}      = \&{"type_".$3};
        # I'm surprised that above line works at all....
    }
}
close($fh);

# process data
foreach my $key (sort keys %DATABASE) {
    $DATABASE{$key}{parse}->($DATABASE{$key});
    # I'm surprised that above line works at all....
}
exit 0;
sub type_01 {
    print "type 01\n";
    my $hh = $_[0];
    print $hh->{recordtype}, "\n";
    print $hh->{length}, "\n";
    print join("-", @{$hh->{records}}), "\n";
    # further processing
}
....
sub type_09 {
    print "type 09\n";
    # further processing
}
.......
sub type_2a {
    print "type 2a\n";
    # further processing
}

Criticism and suggestions are welcome.

Thanks in advance,

Zoltan Kandi, M. Sc.
 
Ben Morrow

kz said:
[snip]
bytes dd to EOL are the data. Each record type requires a specific amount of
data bytes (bb-2) and needs to be parsed using different criteria, therefore
I decided to code for each record type a different sub named type_01 thru
type_2a (#$% load of work, though...)

No... using the symbol table (in this case, the names of your subs)
instead of a real data structure is always wrong. Make an array of anon
subs:

my @process = (
    sub {
        print "type 01";
        my $hh = shift;
        ...
    },
    sub {
        print "type 02";
        ...
    },
    ...
    sub {
        print "type 2a";
        ...
    },
);

and then call with
# record types start at 01, array indices at 0, hence the - 1
$process[hex($DATABASE{$key}{recordtype}) - 1]->($DATABASE{$key});

kz said:
$DATABASE{$1}{parse} = \&{"type_".$3};
# I'm surprised that above line works at all....

So'm I... are you sure you didn't turn 'use strict' off?

kz said:
$DATABASE{$key}{parse}->($DATABASE{$key});
# I'm surprised that above line works at all....

That, on the other hand, is perfectly straightforward and simply the way
you invoke a subref.

Ben
 
Brian McCauley

Ben Morrow said:
using the symbol table (in this case, the names of your subs)
instead of a real data structure is always wrong.

Damn, I kinda liked being able to use objects. :)

--
\\ ( )
. _\\__[oo
.__/ \\ /\@
. l___\\
# ll l\\
###LL LL\\
 
kz

Hi,

Ben Morrow said:
[snip original question]

No... using the symbol table (in this case, the names of your subs)
instead of a real data structure is always wrong. Make an array of anon
subs:

[snip code]

If these routines are relatively short, I'm fine with it. Since I'm planning
to wrap the whole thing in Tk (maybe not the most correct way of saying this),
these subs might get relatively big and the readability of the whole code
might suffer. So I might still end up coding named subs and pushing their
coderefs onto an array.
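
Named subs and a real data structure are not mutually exclusive. A minimal sketch of that combination, using a hash keyed by the record type string instead of the array suggested above (a substitution made here so that no index arithmetic is needed); the handler bodies, %parser and dispatch are placeholders invented for illustration:

```perl
use strict;
use warnings;

# Named handlers can be as long as they need to be. The bodies here are
# placeholders, not the real parsing logic.
sub type_01 {
    my ($rec) = @_;
    return "type 01: " . join('-', @{ $rec->{records} });
}

sub type_09 {
    my ($rec) = @_;
    return "type 09: " . scalar(@{ $rec->{records} }) . " bytes";
}

# The dispatch table maps the record type string to a code reference.
my %parser = (
    '01' => \&type_01,
    '09' => \&type_09,
);

# Look the handler up in the table; unknown types get a message, not a crash.
sub dispatch {
    my ($rec) = @_;
    my $handler = $parser{ lc $rec->{recordtype} }
        or return "no handler for type $rec->{recordtype}";
    return $handler->($rec);
}

for my $rec (
    { recordtype => '01', records => [qw(24 22 24 25 22 64)] },
    { recordtype => '2a', records => [] },
) {
    my $msg = dispatch($rec);
    print "$msg\n";
}
```

The table keeps the list of supported record types in one visible place, and a type without a handler is reported instead of producing a call to a sub that does not exist.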

Ben Morrow said:
$DATABASE{$1}{parse} = \&{"type_".$3};
# I'm surprised that above line works at all....

So'm I... are you sure you didn't turn 'use strict' off?

Obviously not, I cut-and-pasted the whole test code as-is. Still, could
someone explain to me why this line works without any evals? This coderef
cannot be created at compile time as $3 is not known at that point. Does
the compiler try to match that string to an existing sub on every loop
iteration?

Ben Morrow said:
That, on the other hand, is perfectly straightforward and simply the way
you invoke a subref.

....provided $DATABASE{$1}{parse} = \&{"type_".$3}; created a correct
coderef...
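
On the "why does it work without eval" question: \&{"type_".$3} is evaluated at run time, each time the statement executes, and the resulting string is looked up in the symbol table. No eval is involved. The catch is that the expression yields a code reference even when no sub of that name is defined, and the failure only shows up when the reference is called. A minimal sketch of a guard, where handler_for is a hypothetical helper name:

```perl
use strict;
use warnings;

sub type_01 { return "handled by type_01" }

# Hypothetical helper: resolve a handler by name at run time.
# \&{"name"} is exempt from strict 'refs' and yields a code reference
# even when no such sub exists, so check with defined() first.
sub handler_for {
    my ($type) = @_;
    my $name = "type_$type";
    return defined(&{$name}) ? \&{$name} : undef;
}

for my $type (qw(01 7f)) {
    my $code = handler_for($type);
    print "type $type: ", ($code ? $code->() : "no such sub"), "\n";
}
```

Here type 01 is dispatched normally, while type 7f is reported as missing instead of dying with an "Undefined subroutine" error at call time.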

Where have I seen it yesterday? Oh yes it was my bubble-sorting script I
drew up the other day ;-)
(e-mail address removed)

Thanks Ben, I've got some re-coding to do now...

Cheers,

Zoltan Kandi, M. Sc.
 
Ben Morrow

kz said:
Obviously not, I cut-and-pasted the whole test code as-is. Still, could
someone explain to me why this line works without any evals? This coderef
can not be created at compile time as $3 is not known at this point. Does
the compiler try to match that string on every loop iteration to an existing
sub?

Interesting... it seems strict 'refs' doesn't apply to coderefs... I
didn't know that.

This is a straightforward symref. If you turn strict 'refs' off it works
for all ref types:

no strict 'refs';
my $scalar;
my $foo = "lar";
my $ref = \${"sca" . $foo};

However it seems that it works for coderefs even with strict 'refs' on.
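
One detail worth keeping in mind with the scalar example: symbolic references go through the symbol table, so they only reach package variables, never a variable declared with my. A minimal sketch, with all variable names invented for illustration:

```perl
use strict;
use warnings;

our $scalar = "package variable";   # lives in the symbol table as $main::scalar
my  $hidden = "lexical variable";   # a 'my' variable has no symbol table entry

my $suffix = "lar";
my ($via_symref, $lexical_lookup);
{
    no strict 'refs';               # symbolic scalar refs need this
    $via_symref     = ${ "sca" . $suffix };
    $lexical_lookup = ${ "hid" . "den" };   # looks up $main::hidden, which is unset
}

print "symref found: $via_symref\n";
print "lexical reachable by name: ", (defined $lexical_lookup ? "yes" : "no"), "\n";

# Without 'no strict refs' the same scalar dereference is a run-time error.
my $ok = eval { my $ref = \${ "sca" . $suffix }; 1 };
print "under strict: ", ($ok ? "allowed" : "refused"), "\n";
```

The symref finds the package variable, does not see the lexical, and the scalar form is refused once strict 'refs' is back in force, unlike the \&{...} form discussed in this thread.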

Ben
 
Tad McClellan

Ben Morrow said:
Interesting... it seems strict 'refs' doesn't apply to coderefs... I
didn't know that.

This is a straightforward symref. If you turn strict 'refs' off it works
for all ref types:

no strict 'refs';
my $scalar;
my $foo = "lar";
my $ref = \${"sca" . $foo};

However it seems that it works for coderefs even with strict 'refs' on.


It looks to me like the docs for strict.pm are incomplete|misleading|wrong.


perldoc strict

"strict refs"
...
There is one exception to this rule:

$bar = \&{'foo'};
&$bar;


The docs do not provide a specification for the exception, only
an example of an exception.

The example has no runtime-ness in the symbol, so you could
reasonably conclude that the exception is only at compile-time,
while it looks like we are seeing runtime stuff being excepted...
 
kz

Tad McClellan said:
[snip]

The example has no runtime-ness in the symbol, so you could
reasonably conclude that the exception is only at compile-time,
while it looks like we are seeing runtime stuff being excepted...

....at least that is what the snippet below suggests when run on XP with ASPN
631: both the runtime and the compile-time cases are excepted.

What would be the correct/expected behaviour of Perl?

Side question: why does this manpage (and lots of others as well) still
suggest calling a sub with &? Is there any difference between &$bar and
$bar->()?
I have done some RTFM but still ... which FM should I believe?

use strict;
use warnings;
my $num = "01";
my $bar = \&{"sub_".$num};
my $baz = \&{"sub_02"};
print "before...\n";
$bar->();
$baz->();
print "after...\n";
exit 0;

sub sub_01 {
    print "I am inside sub_01...\n";
}

sub sub_02 {
    print "I am inside sub_02...\n";
}
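
On the side question, there is one practical difference between the two call forms: &$bar written without parentheses hands the caller's current @_ to the called sub, while &$bar() and $bar->() pass an explicit, empty argument list. A minimal sketch, where count_args and caller_demo are names made up for the demonstration:

```perl
use strict;
use warnings;

sub count_args { return scalar @_ }

my $bar = \&count_args;

sub caller_demo {
    # Whatever this sub was called with is still sitting in @_.
    my $amp_no_parens = &$bar;       # reuses the caller's @_
    my $amp_parens    = &$bar();     # explicit empty argument list
    my $arrow         = $bar->();    # explicit empty argument list
    return ($amp_no_parens, $amp_parens, $arrow);
}

my @seen = caller_demo(qw(a b c));
print "&\$bar: $seen[0], &\$bar(): $seen[1], \$bar->(): $seen[2]\n";
```

Called with three arguments, the bare &$bar form sees all three, and the other two forms see none.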

Cheers,

Zoltan
 
nobull

Tad McClellan said:
It looks to me like the docs for strict.pm are incomplete|misleading|wrong.

Yes, they are.

Tad McClellan said:
perldoc strict

"strict refs"
...
There is one exception to this rule:

$bar = \&{'foo'};
&$bar;


The docs do not provide a specification for the exception, only
an example of an exception.

The example has no runtime-ness in the symbol, so you could
reasonably conclude that the exception is only at compile-time,
while it looks like we are seeing runtime stuff being excepted...

Well, the exception would be rather meaningless if it were restricted
to constant expressions. I don't really think it would be
_reasonable_ to conclude that the exception is only for constant
expressions.

So if we assume the exception is not totally pointless then we can
infer that it must mean that you can dereference a symbolic coderef as
the argument to the \ operator without having to relax strict. I
agree that it would be better for the docs to spell this out.

Actually, however, this is not the full story. You can also
dereference a symbolic coderef as the argument to goto(), defined() or
exists().

Indeed strict.pm goes on to mention goto() - thus having said there's
only one exception it lists two!

The purpose of use strict is to avoid the compiler thinking you want a
symref when you accidentally get a string where you wanted a reference.
This is unlikely to be the case in the above situations so it's more
convenient to have them excluded.

However, until this is documented I'm going to continue to put "no
strict 'refs'" in my AUTOLOAD functions.
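
The exempt and non-exempt cases listed above can be tried side by side. A minimal sketch under strict, where foo and via_goto are placeholder subs invented for the test:

```perl
use strict;
use warnings;

sub foo { return "foo called with (@_)" }

my $name = 'foo';    # a plain string, not a reference

# Exempt from strict 'refs': taking a reference, defined(), exists(), goto.
my $ref        = \&$name;
my $is_defined = defined &$name;
my $exists     = exists &$name;
sub via_goto { goto &$name }

# Not exempt: calling through the string itself dies at run time.
my $direct = eval { $name->(1, 2) };
my $error  = $@;

my $by_ref  = $ref->(1, 2);
my $by_goto = via_goto(3, 4);
print "$by_ref\n$by_goto\n";
print "defined: ", ($is_defined ? "yes" : "no"),
      ", exists: ", ($exists ? "yes" : "no"), "\n";
print "direct call refused: ", ($error =~ /strict refs/ ? "yes" : "no"), "\n";
```

The four exempt forms go through without relaxing strict, and only the direct call through the string is refused.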

I'm sorry that the following patch is line-wrap damaged, I'm having to
post via Google due to NNTP problems here.

--- perl5/5.8.0/strict.pm Fri Nov 1 15:39:49 2002
+++ strict.pm Fri Feb 27 17:52:35 2004
@@ -37,13 +37,18 @@
$file = "STDOUT";
print $file "Hi!"; # error; note: no comma after $file

-There is one exception to this rule:
-
- $bar = \&{'foo'};
- &$bar;
-
-is allowed so that C<goto &$AUTOLOAD> would not break under stricture.
+There is an exception to this rule. You can dereference a string as a
+subroutine as the argument to C<goto()>, C<defined()>, C<exists()> or
+the backslash operator.
+
+ $symref = 'foo';
+ $ref = \&$symref; # ok
+ goto &$symref; # ok
+ if ( defined &$symref ) { do_stuff() } # ok
+ if ( exists &$symref ) { do_stuff() } # ok

+This is allowed, not because symbolic code references are a good thing,
+but so that AUTOLOAD methods do not need to switch off the stricture.

=item C<strict vars>
 
