Coderef usage in complex data structures

Discussion in 'Perl Misc' started by kz, Feb 25, 2004.

  1. kz

    kz Guest

    Hi Gurus,

    Given the following text file (proprietary database format, thus no CPAN
    modules available)
    0001 08 01 24 22 24 25 22 64
    0002 06 09 42 22 f3 8f
    where
    aaaa bb cc dd ee ff gg hh ii jj ....
    aaaa: sequence number (hex, starting from 0001)
    bb: length of record including this byte
    cc: record type (01..2a)
    bytes dd to EOL are the data. Each record type requires a specific number of
    data bytes (bb-2) and needs to be parsed using different criteria, therefore
    I decided to code a different sub for each record type, named type_01 thru
    type_2a (#$% load of work, though...)

    After consulting the perldsc manpage I came up with the code snippet below,
    which does exactly what is needed, though I'm not confident that this is the
    best way to do it.

    #!/usr/bin/perl
    use strict;
    use warnings;
    my %DATABASE;
    my $dbfile = $ARGV[0];
    # die prints the reason to STDERR and exits with a non-zero status
    open ( README, "<", $dbfile )
        or die "ERROR: unable to open log file, error: $!\n";
    # read data
    while (my $line = <README>) {
        chomp $line;
        # $1 = sequence number, $2 = length, $3 = record type, $4 = data bytes
        if ($line =~ /^(\w{4})\s+(\w{2})\s+(\w{2})\s*(.*)$/)
        {
            $DATABASE{$1}{length} = $2;
            $DATABASE{$1}{recordtype} = $3;
            $DATABASE{$1}{records} = [split /\s/,$4];
            $DATABASE{$1}{parse} = \&{"type_".$3};
            # I'm surprised that above line works at all....
        }
    }
    close ( README );
    # process data
    foreach my $key (sort keys %DATABASE) {
        $DATABASE{$key}{parse}->($DATABASE{$key});
        # I'm surprised that above line works at all....
    }
    exit 0;
    sub type_01 {
        print "type 01\n";
        my $hh = $_[0];
        print $hh->{recordtype},"\n";
        print $hh->{length},"\n";
        print join ("-", @{$hh->{records}}),"\n";
        # further processing
    }
    ....
    sub type_09 {
        print "type 09\n";
        # further processing
    }
    .......
    sub type_2a {
        print "type 2a\n";
        # further processing
    }

    Criticism and suggestions are welcome.

    Thanks in advance,

    Zoltan Kandi, M. Sc.
     
    kz, Feb 25, 2004
    #1

  2. Ben Morrow

    Ben Morrow Guest

    "kz" <> wrote:
    > Hi Gurus,
    >
    > Given the following text file (proprietary database format, thus no CPAN
    > modules available)
    > 0001 08 01 24 22 24 25 22 64
    > 0002 06 09 42 22 f3 8f
    > where
    > aaaa bb cc dd ee ff gg hh ii jj ....
    > aaaa: sequence number (hex, starting from 0001)
    > bb: length of record including this byte
    > cc: record type (01..2a)
    > bytes dd to EOL are the data. Each record type requires a specific amount of
    > data bytes (bb-2) and needs to be parsed using different criteria, therefore
    > I decided to code for each record type a different sub named type_01 thru
    > type_2a (#$% load of work, though...)


    No... using the symbol table (in this case, the names of your subs)
    instead of a real data structure is always wrong. Make an array of anon
    subs:

    my @process = (
        undef, # slot 0 unused: record types start at 01
        sub {
            print "type 01";
            my $hh = shift;
            ...
        },
        sub {
            print "type 02";
            ...
        },
        ...
        sub {
            print "type 2a";
            ...
        },
    );

    and then call with
    $process[hex $DATABASE{$key}{recordtype}]->($DATABASE{$key});
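A minimal runnable sketch of the array-of-subs idea, under a few assumptions: slot 0 is left undefined so that hex("01") selects the first handler, only two placeholder handlers are filled in, and the helper name dispatch is hypothetical. The two input lines are the sample records from the original question.

```perl
use strict;
use warnings;

# Index = numeric record type. Slot 0 is a placeholder (assumption: no type 00).
my @process = (
    undef,
    sub { my $rec = shift; return "type 01: " . join('-', @{ $rec->{records} }); },
    sub { my $rec = shift; return "type 02: " . scalar(@{ $rec->{records} }) . " data bytes"; },
);

# Hypothetical helper: look up the handler and report missing ones instead of dying.
sub dispatch {
    my ($rec) = @_;
    my $handler = $process[ hex $rec->{recordtype} ];
    return "no handler for type $rec->{recordtype}" unless $handler;
    return $handler->($rec);
}

my @out;
for my $line ('0001 08 01 24 22 24 25 22 64', '0002 06 09 42 22 f3 8f') {
    my ($seq, $len, $type, @data) = split ' ', $line;
    push @out, dispatch({ length => $len, recordtype => $type, records => \@data });
}
print "$_\n" for @out;
```

Because the array in this sketch stops at type 02, the second sample record (type 09) falls through to the "no handler" branch rather than calling an undefined value.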

    > After consulting the perldsc manpage I came up with below code snippet,
    > which does exactly what needed, though I'm not confident if this is the best
    > way to do it.
    >
    > #!/usr/bin/perl
    > use strict;
    > use warnings;
    > my %DATABASE;
    > my $dbfile = $ARGV[0];
    > open ( README, "<$dbfile") || print "ERROR: unable to open log file, error:
    > $! \n", exit 1;
    > # read data
    > while (my $line = <README>) {
    > chomp $line;
    > if ($line =~ /^(\w{4})\s+\w{2}\s+\w{2}\s+(\w{2})\s+(\w{2})\s?(.*)$/)
    > {
    > $DATABASE{$1}{length} = $2;
    > $DATABASE{$1}{recordtype} = $3;
    > $DATABASE{$1}{records} = [split /\s/,$4];
    > $DATABASE{$1}{parse} = \&{"type_".$3};
    > # I'm surprised that above line works at all....


    So'm I... are you sure you didn't turn 'use strict' off?

    > } }
    > close ( README );
    > # process data
    > foreach my $key (sort keys %DATABASE) {
    > $DATABASE{$key}{parse}->($DATABASE{$key});
    > # I'm surprised that above line works at all....


    That, on the other hand, is perfectly straightforward and simply the way
    you invoke a subref.

    Ben

    --
    The cosmos, at best, is like a rubbish heap scattered at random.
    - Heraclitus
     
    Ben Morrow, Feb 25, 2004
    #2

  3. Ben Morrow <> writes:

    > using the symbol table (in this case, the names of your subs)
    > instead of a real data structure is always wrong.


    Damn, I kinda liked being able to use objects. :)

    --
    \\ ( )
    . _\\__[oo
    .__/ \\ /\@
    . l___\\
    # ll l\\
    ###LL LL\\
     
    Brian McCauley, Feb 25, 2004
    #3
  4. kz

    kz Guest

    Hi,

    "Ben Morrow" <> wrote in message
    news:c1idn4$dfb$...
    >

    [snip original question]
    >
    > No... using the symbol table (in this case, the names of your subs)
    > instead of a real data structure is always wrong. Make an array of anon
    > subs:
    >
    > my @process = (
    > sub {
    > print "type 01";
    > my $hh = shift;
    > ...
    > },
    > sub {
    > print "type 02";
    > ...
    > },
    > ...
    > sub {
    > print "type 2a";
    > ...
    > },
    > );
    >
    > and then call with
    > $process[hex $DATABASE{$key}{recordtype}]->($DATABASE{$key});


    If these routines are relatively short, I'm fine with it. Since I'm planning
    to wrap the whole thing in Tk (maybe not the most correct way of saying
    this), these subs might get relatively big and the readability of the whole
    code might suffer. So I might still end up coding named subs and pushing
    their coderefs onto an array.
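One way the named-subs variant could look, as a sketch only: the handlers stay ordinary named subs, and the table stores hard references taken with \&name, so no symbol-table lookup by string is involved. A hash keyed by the record type is used here instead of an array (an assumption made to sidestep index arithmetic); the name %handler_for and the handler bodies are hypothetical.

```perl
use strict;
use warnings;

# Named handlers can grow as long as needed; these bodies are placeholders.
sub type_01 { my $rec = shift; return "type 01, length $rec->{length}"; }
sub type_09 { my $rec = shift; return "type 09, length $rec->{length}"; }

# Hard references to the named subs, keyed by the type as it appears in the file.
my %handler_for = (
    '01' => \&type_01,
    '09' => \&type_09,
);

my $rec = { recordtype => '09', length => '06', records => [qw(42 22 f3 8f)] };
my $handler = $handler_for{ $rec->{recordtype} }
    or die "no handler for record type $rec->{recordtype}\n";
my $result = $handler->($rec);
print "$result\n";
```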

    > > After consulting the perldsc manpage I came up with below code snippet,
    > > which does exactly what needed, though I'm not confident if this is the best
    > > way to do it.
    > >
    > > #!/usr/bin/perl
    > > use strict;
    > > use warnings;
    > > my %DATABASE;
    > > my $dbfile = $ARGV[0];
    > > open ( README, "<$dbfile") || print "ERROR: unable to open log file, error:
    > > $! \n", exit 1;
    > > # read data
    > > while (my $line = <README>) {
    > > chomp $line;
    > > if ($line =~ /^(\w{4})\s+\w{2}\s+\w{2}\s+(\w{2})\s+(\w{2})\s?(.*)$/)
    > > {
    > > $DATABASE{$1}{length} = $2;
    > > $DATABASE{$1}{recordtype} = $3;
    > > $DATABASE{$1}{records} = [split /\s/,$4];
    > > $DATABASE{$1}{parse} = \&{"type_".$3};
    > > # I'm surprised that above line works at all....

    >
    > So'm I... are you sure you didn't turn 'use strict' off?


    Obviously not, I cut-and-pasted the whole test code as-is. Still, could
    someone explain to me why this line works without any evals? This coderef
    can not be created at compile time as $3 is not known at this point. Does
    the compiler try to match that string on every loop iteration to an existing
    sub?

    >
    > > } }
    > > close ( README );
    > > # process data
    > > foreach my $key (sort keys %DATABASE) {
    > > $DATABASE{$key}{parse}->($DATABASE{$key});
    > > # I'm surprised that above line works at all....

    >
    > That, on the other hand, is perfectly straightforward and simply the way
    > you invoke a subref.


    ....provided $DATABASE{$1}{parse} = \&{"type_".$3}; created a correct
    coderef...

    > Ben
    >
    > --
    > The cosmos, at best, is like a rubbish heap scattered at random.
    > - Heraclitus


    Where have I seen it yesterday? Oh yes it was my bubble-sorting script I
    drew up the other day ;-)

    >


    Thanks Ben, I've got some re-coding to do now...

    Cheers,

    Zoltan Kandi, M. Sc.
     
    kz, Feb 26, 2004
    #4
  5. kz

    kz Guest

    "kz" <> wrote in message
    news:Gj2%b.4$...
    > Hi Gurus,
    >

    [snipping code]

    Thanks Ben, Brian and David for your valuable help.

    Best regards,

    Zoltan Kandi, M. Sc.
     
    kz, Feb 26, 2004
    #5
  6. Ben Morrow

    Ben Morrow Guest

    "kz" <> wrote:
    > > > $DATABASE{$1}{parse} = \&{"type_".$3};
    > > > # I'm surprised that above line works at all....

    > >
    > > So'm I... are you sure you didn't turn 'use strict' off?

    >
    > Obviously not, I cut-and-pasted the whole test code as-is. Still, could
    > someone explain to me why this line works without any evals? This coderef
    > can not be created at compile time as $3 is not known at this point. Does
    > the compiler try to match that string on every loop iteration to an existing
    > sub?


    Interesting... it seems strict 'refs' doesn't apply to coderefs... I
    didn't know that.

    This is a straightforward symref. If you turn strict 'refs' off it works
    for all ref types:

    no strict 'refs';
    our $scalar; # symrefs only see package variables, not 'my' lexicals
    my $foo = "lar";
    my $ref = \${"sca" . $foo};

    However it seems that it works for coderefs even with strict 'refs' on.
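The behaviour described here can be checked with a short sketch. It assumes a package variable $scalar and a sub type_01 that exist only for this illustration; the eval block is there so the failing scalar case can be observed without ending the script.

```perl
use strict;
use warnings;

our $scalar = "package variable";      # illustrative name
sub type_01 { return "called type_01" }

my $sub_name = "type_" . "01";         # name assembled at run time
my $var_name = "sca" . "lar";

# A string as the operand of \& is accepted under strict 'refs'.
my $code = \&{$sub_name};
my $ok   = $code->();

# The same trick for a scalar is rejected at run time.
my $ref = eval { \${$var_name} };
my $err = $@;

print "$ok\n";
print "scalar symref failed: $err" unless defined $ref;
```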

    Ben

    --
    Heracles: Vulture! Here's a titbit for you / A few dried molecules of the gall
    From the liver of a friend of yours. / Excuse the arrow but I have no spoon.
    (Ted Hughes, [ Heracles shoots Vulture with arrow. Vulture bursts into ]
    /Alcestis/) [ flame, and falls out of sight. ]
     
    Ben Morrow, Feb 26, 2004
    #6
  7. Ben Morrow <> wrote:
    >
    > "kz" <> wrote:
    >> > > $DATABASE{$1}{parse} = \&{"type_".$3};
    >> > > # I'm surprised that above line works at all....
    >> >
    >> > So'm I... are you sure you didn't turn 'use strict' off?

    >>
    >> Obviously not, I cut-and-pasted the whole test code as-is. Still, could
    >> someone explain to me why this line works without any evals? This coderef
    >> can not be created at compile time as $3 is not known at this point. Does
    >> the compiler try to match that string on every loop iteration to an existing
    >> sub?

    >
    > Interesting... it seems strict 'refs' doesn't apply to coderefs... I
    > didn't know that.
    >
    > This is a straightforward symref. If you turn strict 'refs' off it works
    > for all ref types:
    >
    > no strict 'refs';
    > my $scalar;
    > my $foo = "lar";
    > my $ref = \${"sca" . $foo};
    >
    > However it seems that it works for coderefs even with strict 'refs' on.



    It looks to me like the docs for strict.pm are incomplete|misleading|wrong.


    perldoc strict

    "strict refs"
    ...
    There is one exception to this rule:

    $bar = \&{'foo'};
    &$bar;


    The docs do not provide a specification for the exception, only
    an example of an exception.

    The example has no runtime-ness in the symbol, so you could
    reasonably conclude that the exception is only at compile-time,
    while it looks like we are seeing runtime stuff being excepted...


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, Feb 26, 2004
    #7
  8. kz

    kz Guest

    "Tad McClellan" <> wrote in message
    news:...
    >

    [snip]
    >
    > It looks to me like the docs for strict.pm are incomplete|misleading|wrong.
    >
    >
    > perldoc strict
    >
    > "strict refs"
    > ...
    > There is one exception to this rule:
    >
    > $bar = \&{'foo'};
    > &$bar;
    >
    >
    > The docs do not provide a specification for the exception, only
    > an example of an exception.
    >
    > The example has no runtime-ness in the symbol, so you could
    > reasonably conclude that the exception is only at compile-time,
    > while it looks like we are seeing runtime stuff being excepted...
    >
    >
    > --
    > Tad McClellan SGML consulting
    > Perl programming
    > Fort Worth, Texas


    ....at least that is what the snippet below suggests when run on XP with
    ASPN 631: both runtime and compile-time names are excepted.

    What would be the correct/expected behaviour of Perl?

    Side question: why does this manpage (and lots of others as well) still
    suggest calling a sub with &? Is there any difference between &$bar and
    $bar->()?
    I have done some RTFM but still ... which FM should I believe?

    use strict;
    use warnings;
    my $num = "01";
    my $bar = \&{"sub_".$num};
    my $baz = \&{"sub_02"};
    print "before...\n";
    $bar->();
    $baz->();
    print "after...\n";
    exit 0;
    sub sub_01 {
    print "I am inside sub_01...\n";
    }

    sub sub_02 {
    print "I am inside sub_02...\n";
    }
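On the side question about &$bar versus $bar->(): a small sketch of the difference described in perlsub. The two forms behave the same when parentheses are given; without parentheses, &$bar passes the caller's current @_ through to the called sub. The names $count_args and caller_with_args are made up for this illustration.

```perl
use strict;
use warnings;

# Reports how many arguments it received.
my $count_args = sub { return scalar @_ };

sub caller_with_args {
    my $via_arrow     = $count_args->();   # explicit empty argument list
    my $via_amp_paren = &$count_args();    # same as the arrow form
    my $via_amp_bare  = &$count_args;      # no parens: callee sees the caller's @_
    return ($via_arrow, $via_amp_paren, $via_amp_bare);
}

my @seen = caller_with_args('a', 'b', 'c');
print "@seen\n";   # prints "0 0 3"
```

At file level, where @_ is empty, the two forms are indistinguishable, which is why the snippet above behaves the same either way.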

    Cheers,

    Zoltan
     
    kz, Feb 27, 2004
    #8
  9. kz

    Guest

    Tad McClellan <> wrote in message news:<>...
    >
    > It looks to me like the docs for strict.pm are incomplete|misleading|wrong.


    Yes, they are.

    >
    > perldoc strict
    >
    > "strict refs"
    > ...
    > There is one exception to this rule:
    >
    > $bar = \&{'foo'};
    > &$bar;
    >
    >
    > The docs do not provide a specification for the exception, only
    > an example of an exception.
    >
    > The example has no runtime-ness in the symbol, so you could
    > reasonably conclude that the exception is only at compile-time,
    > while it looks like we are seeing runtime stuff being excepted...


    Well, the exception would be rather meaningless if it were restricted
    to constant expressions. I don't really think it would be
    _reasonable_ to conclude that the exception is only for constant
    expressions.

    So if we assume the exception is not totally pointless then we can
    infer that it must mean that you can dereference a symbolic coderef as
    the argument to the \ operator without having to relax strict. I
    agree that it would be better for the docs to spell this out.

    Actually, however, this is not the full story. You can also
    dereference a symbolic coderef as the argument to goto(), defined() or
    exists().

    Indeed strict.pm goes on to mention goto() - thus having said there's
    only one exception it lists two!

    The purpose of use strict is to avoid the compiler thinking you want a
    symref when you accidentally get a string where you wanted a reference.
    This is unlikely to be the case in the above situations so it's more
    convenient to have them excluded.
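A sketch of how the defined() case could be put to work for the record handlers discussed earlier in the thread: check that a sub with the computed name exists before taking a reference to it, all with strict left on. The helper name handler_for is hypothetical, and only type_01 is defined here so that the missing case is visible.

```perl
use strict;
use warnings;

sub type_01 { return "handled by type_01" }

# Hypothetical helper: resolve a record type to a coderef, or undef if no such sub.
sub handler_for {
    my ($type) = @_;
    my $name = "type_$type";
    return defined(&$name) ? \&$name : undef;
}

my $found   = handler_for('01');
my $missing = handler_for('2a');

my $msg = $found->();
print "$msg\n";
print "no sub named type_2a\n" unless $missing;
```

Testing with defined() first matters because \&$name on its own also succeeds for a name that has no sub behind it, and the failure then only shows up when the reference is called.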

    However, until this is documented I'm going to continue to put "no
    strict 'refs'" in my AUTOLOAD functions.

    I'm sorry that the following patch is line-wrap damaged, I'm having to
    post via Google due to NNTP problems here.

    --- perl5/5.8.0/strict.pm Fri Nov 1 15:39:49 2002
    +++ strict.pm Fri Feb 27 17:52:35 2004
    @@ -37,13 +37,18 @@
    $file = "STDOUT";
    print $file "Hi!"; # error; note: no comma after $file

    -There is one exception to this rule:
    -
    - $bar = \&{'foo'};
    - &$bar;
    -
    -is allowed so that C<goto &$AUTOLOAD> would not break under stricture.
    +There is an exception to this rule. You can dereference a string as a
    +subroutine as the argument to C<goto()>, C<defined()>, C<exists()> or
    +the backslash operator.
    +
    + $symref = 'foo';
    + $ref = \&$symref; # ok
    + goto &$symref; # ok
    + if ( defined &$symref ) { do_stuff() } # ok
    + if ( exists &$symref ) { do_stuff() } # ok

    +This is allowed, not because symbolic code references are a good thing,
    +but so that AUTOLOAD methods do not need to switch off the stricture.

    =item C<strict vars>
     
    , Feb 27, 2004
    #9
