Need syntax/small footprint help

Discussion in 'Perl Misc' started by robic0@yahoo.com, Mar 26, 2005.

  1. Guest

    Hello, last month I wrote a rather large 10,000 line program that
    parses xml (expat). The data is read into compound structures
    then analyzed later. Below is one of the subroutines. I would like
    to make it smaller so that its easier to read and understand and
    also make the regular expressions less complicated for performance.
    Any ideas?

    ---------------------------
    sub eCHK_RegDel ## -- checks Redundant Regkey Delete // Regkey
    exists --
    {
    my ($details) = @_;
    return 1 if (@{$details} == 0);
    my @regdel = ();

    for (@$details) {
    my $aref = $_->[0];
    push (@regdel, $aref) if ($aref->[5] eq $REGISTRY && $aref->[6] eq
    $DELETE);
    }
    if (@regdel)
    {
    ## check that KeyName exists
    for (@regdel) {
    my $aref = $_;
    next if (exists ($aref->[4]{$KEYNAME}) &&
    length($aref->[4]{$KEYNAME}) > 0);
    # log error
    ## 0701 -- test 07,01
    ReportItem ('E', $CLOGINFO, $CERRLOG, '0701', glicn($aref), (0,0),
    '', $XML_File, 1,0,1);
    }
    ## check redundant regkey deletes
    my $first_kname = '';
    my $first_vname_exists = 0;
    my $aref_first = undef;

    my @regx_esc_codes = ( "\\", '/', '(', ')', '[', ']', '?', '|',
    '+', '.', '*', '$', '^', '{',
    '}', '@' );
    for (@regdel)
    {
    my $aref = $_;
    next if (!(exists ($aref->[4]{$KEYNAME}) &&
    length($aref->[4]{$KEYNAME}) > 0));
    my $kname = $aref->[4]{$KEYNAME};
    my $vname = '';
    my $kv = 'key (branch)';
    if (exists ($aref->[4]{$VALUENAME})) {
    if (length($aref->[4]{$VALUENAME}) > 0) {
    $vname = $aref->[4]{$VALUENAME};
    # strip leading spaces
    while ($vname =~ s/^ //) {}
    }
    $kv = 'value';
    }
    for (@regx_esc_codes)
    {
    my $tc = $_;
    my $xx = "\$kname =~ s/\\$tc/\\\\\\$tc/g;"; # code template for
    regex
    eval $xx;
    #print "$xx\n";
    }
    $kname = $kname."\\\\";
    my $fnd = 0;
    # -- #
    my $ctmpl = "if (\$kname =~ /^$first_kname/i) {\$fnd = 1;}"; #
    code template for first key name
    eval $ctmpl;
    # -- #
    #print
    "-------------\n$fnd,$ctmpl\nkname=$kname\n--------------\n\n";
    #print
    "-------------\n$fnd,$ctmpl\nkname=$kname\nfirst=$first_kname\n--------------\n\n";
    if ($@) {
    ## Check the $ctmpl, get the control code, log this error as a code
    issue.
    ## This shouldn't happen ... the compiler will show the escape
    char, add
    ## the char to "@regx_esc_codes", now its fixed!
    $@ =~ s/^[\x20\n\t]+//; $@ =~ s/[\x20\n\t]+$//;
    ## 0702 -- test 07,02
    ReportItem ('E', $CLOGINFO, $CERRLOG, '0702', (0,0,0,0), $@,
    $XML_File, 1,0,1);
    }
    if ((!$fnd || length ($first_kname) == 0))
    {
    # because of sort,the first non valuename
    # will always be in the new keyname action

    $first_kname = uc($kname);
    for (@regx_esc_codes)
    {
    my $tc = $_;
    my $xx = "\$first_kname =~ s/\\$tc/\\\\\\$tc/g;"; # code
    template for regex
    eval $xx;
    }
    $aref_first = $aref;
    if (length ($vname) > 0) {$first_vname_exists = 1} else {
    $first_vname_exists = 0}
    next;
    }
    next if ($first_vname_exists);
    if (length($vname) == 0) {
    if ($fnd) {
    # log error, this sub-key is already being deleted
    ## 0703 -- test 07,03
    ReportItem ('E', $CLOGINFO, $CERRLOG, '0703', glicn($aref),
    glicn($aref_first), '', $XML_File, 1,0,1);
    }
    next;
    }
    # log error, the key for this value is already being deleted
    ## 0704 -- test 07,04
    ReportItem ('E', $CLOGINFO, $CERRLOG, '0704', glicn($aref),
    glicn($aref_first), '', $XML_File, 1,0,1);
    }
    }
    }
    -----------------------------
     
    , Mar 26, 2005
    #1
    1. Advertising

  2. wrote:

    > Hello, last month I wrote a rather large 10,000 line program that
    > parses xml (expat). The data is read into compound structures
    > then analyzed later. Below is one of the subroutines. I would like
    > to make it smaller so that its easier to read and understand and
    > also make the regular expressions less complicated for performance.
    > Any ideas?
    >
    > ---------------------------
    > sub eCHK_RegDel ## -- checks Redundant Regkey Delete // Regkey
    > exists --
    > {
    > my ($details) = @_;
    > return 1 if (@{$details} == 0);


    return 1 unless @$details;

    But actually this is, I think, redundant. There is no point since if
    @$details is empty so will @regdel so nothing will happen anyhow.

    Why are you returning 1? You don't seem to be bothering about the
    returning anything in other situations so I infer that nothing is
    considering the return value of eCHK_RegDel.

    > my @regdel = ();
    > for (@$details) {
    > my $aref = $_->[0];
    > push (@regdel, $aref) if ($aref->[5] eq $REGISTRY && $aref->[6] eq
    > $DELETE);
    > }


    my @regdel =
    grep { $_->[5] eq $REGISTRY && $_->[6] eq $DELETE )
    map { $_->[0] } @$details;


    > if (@regdel)
    > {


    Again this is redundant. "If the next line would do nothing then don't
    do it". But if the next line would do nothing there's no cost in doing
    it anyhow.

    > ## check that KeyName exists
    > for (@regdel) {
    > my $aref = $_;


    You don't need the value in $_ so simply:

    for my $aref (@regdel) {

    > next if (exists ($aref->[4]{$KEYNAME}) &&
    > length($aref->[4]{$KEYNAME}) > 0);


    You mean defined() not exists().

    length() can never be negative so >0 is redundant.

    There could be a case for simply using no warnings 'uninitialized' here.

    > # log error
    > ## 0701 -- test 07,01
    > ReportItem ('E', $CLOGINFO, $CERRLOG, '0701', glicn($aref), (0,0),
    > '', $XML_File, 1,0,1);
    > }
    > ## check redundant regkey deletes
    > my $first_kname = '';
    > my $first_vname_exists = 0;
    > my $aref_first = undef;
    >
    > my @regx_esc_codes = ( "\\", '/', '(', ')', '[', ']', '?', '|',
    > '+', '.', '*', '$', '^', '{',
    > '}', '@' );


    This is invarient and should be taken outside the loop or maybe even
    outside the subroutine. But you don't need it at all probably.

    > for (@regdel)
    > {
    > my $aref = $_;
    > next if (!(exists ($aref->[4]{$KEYNAME}) &&
    > length($aref->[4]{$KEYNAME}) > 0));


    Why the second loop? Why not a do it in a single pass?

    > my $kname = $aref->[4]{$KEYNAME};
    > my $vname = '';
    > my $kv = 'key (branch)';
    > if (exists ($aref->[4]{$VALUENAME})) {
    > if (length($aref->[4]{$VALUENAME}) > 0) {
    > $vname = $aref->[4]{$VALUENAME};
    > # strip leading spaces
    > while ($vname =~ s/^ //) {}


    s/^ +//;

    > }
    > $kv = 'value';
    > }
    > for (@regx_esc_codes)
    > {
    > my $tc = $_;
    > my $xx = "\$kname =~ s/\\$tc/\\\\\\$tc/g;"; # code template for
    > regex
    > eval $xx;
    > #print "$xx\n";
    > }


    Do not use eval unless there is a reason to do so. The above is more simply

    for my $tc (@regx_esc_codes) {
    $kname =~ s/(\Q$tc\E)/\\$1/g;
    }

    But I suspect you simply wanted

    $kname = quotemeta $kname;

    But I'm guessing you wouldn't need to do anything if you got rid of some
    more of the pointless eval()s in your code.

    All in all I think you are working way too. Hard. I'll stop now with
    the line-by-line and go have another look at your code in its entirity.
    Get back to you soon.
     
    Brian McCauley, Mar 27, 2005
    #2
    1. Advertising

  3. Brian McCauley wrote:

    > All in all I think you are working way too hard. I'll stop now with
    > the line-by-line and go have another look at your code in its entirity.
    > Get back to you soon.


    Below find an, untested but much simpler version. I have not changed
    the alorithm just removed some of the tortuous obfuscations.

    I've subsequently noticed what I presume is a flaw in the algorithm (but
    that's not Perl related so OT here).
    Consider:

    { $KEYNAME => 'foo', $VALUENAME => 'bar' }
    { $KEYNAME => 'foo\blab' }
    { $KEYNAME => 'foo\blab', $VALUENAME => 'bar' }

    I'm guessing the 3rd one above should be reported as redundant but won't be.

    ## -- checks Redundant Regkey Delete // Regkey exists --
    sub eCHK_RegDel {
    my ($details) = @_;
    my ($first_kname,$first_vname_exists,$aref_first);
    for (@$details) {
    my $aref = $_->[0];
    next unless $aref->[5] eq $REGISTRY && $aref->[6] eq $DELETE;
    my $kname = $aref->[4]{$KEYNAME};

    # Can input data really contain $kname='' ?
    # if not then the check of length($kname) is redundant.

    unless ( defined($kname) && length($kname) ) {
    # log error
    ## 0701 -- test 07,01
    ReportItem ('E', $CLOGINFO, $CERRLOG, '0701', glicn($aref), (0,0),
    '', $XML_File, 1,0,1);
    next;
    }
    my $vname = $aref->[4]{$VALUENAME};

    # redunant surely - does input really contain $vname
    # with leading spaces or empty strings?
    if ( defined $vname ) {
    $vname =~ s/^ +//;
    undef $vname unless length $vname;
    }

    $kname .= "\\\\";
    unless ( $first_kname && $kname =~ /^\Q$first_kname/i ) {
    # because of sort,the first non valuename
    # will always be in the new keyname action

    $first_kname = $kname;
    $aref_first = $aref;
    $first_vname_exists = defined($vname);
    next;
    }
    next if $first_vname_exists;
    ReportItem ('E', $CLOGINFO, $CERRLOG,
    defined($vname) ? '0703' : '0704',
    glicn($aref), glicn($aref_first), '',
    $XML_File, 1,0,1);
    }
    }
     
    Brian McCauley, Mar 27, 2005
    #3
  4. Guest

    On Sun, 27 Mar 2005 16:27:35 +0000, Brian McCauley <>
    wrote:

    >
    >
    >Brian McCauley wrote:
    >
    >> All in all I think you are working way too hard. I'll stop now with
    >> the line-by-line and go have another look at your code in its entirity.
    >> Get back to you soon.

    >
    >Below find an, untested but much simpler version. I have not changed
    >the alorithm just removed some of the tortuous obfuscations.
    >
    >I've subsequently noticed what I presume is a flaw in the algorithm (but
    >that's not Perl related so OT here).
    >Consider:
    >
    > { $KEYNAME => 'foo', $VALUENAME => 'bar' }
    > { $KEYNAME => 'foo\blab' }
    > { $KEYNAME => 'foo\blab', $VALUENAME => 'bar' }
    >
    >I'm guessing the 3rd one above should be reported as redundant but won't be.
    >
    >## -- checks Redundant Regkey Delete // Regkey exists --
    >sub eCHK_RegDel {
    > my ($details) = @_;
    > my ($first_kname,$first_vname_exists,$aref_first);
    > for (@$details) {
    > my $aref = $_->[0];
    > next unless $aref->[5] eq $REGISTRY && $aref->[6] eq $DELETE;
    > my $kname = $aref->[4]{$KEYNAME};
    >
    > # Can input data really contain $kname='' ?
    > # if not then the check of length($kname) is redundant.
    >
    > unless ( defined($kname) && length($kname) ) {
    > # log error
    > ## 0701 -- test 07,01
    > ReportItem ('E', $CLOGINFO, $CERRLOG, '0701', glicn($aref), (0,0),
    > '', $XML_File, 1,0,1);
    > next;
    > }
    > my $vname = $aref->[4]{$VALUENAME};
    >
    > # redunant surely - does input really contain $vname
    > # with leading spaces or empty strings?
    > if ( defined $vname ) {
    > $vname =~ s/^ +//;
    > undef $vname unless length $vname;
    > }
    >
    > $kname .= "\\\\";
    > unless ( $first_kname && $kname =~ /^\Q$first_kname/i ) {
    > # because of sort,the first non valuename
    > # will always be in the new keyname action
    >
    > $first_kname = $kname;
    > $aref_first = $aref;
    > $first_vname_exists = defined($vname);
    > next;
    > }
    > next if $first_vname_exists;
    > ReportItem ('E', $CLOGINFO, $CERRLOG,
    > defined($vname) ? '0703' : '0704',
    > glicn($aref), glicn($aref_first), '',
    > $XML_File, 1,0,1);
    > }
    >}
    >


    Hey man, thanks. This is just a followup test msg using Agent, lost
    the google crap, now I can see mo better.
    Yeah, you got some interresting stuff, I am going to correlate it with
    the code once I set up everything. Some stuff you overlooked,
    but some good stuff I have to check out. This is just a test
    of Agent so I just in "check" mode for that. Obvious of interrest
    is the extreme back slashing regx wise... I'll explain that too later.
    Anyway thanks for the look Brian... be back to ya !!!

    -RFC
     
    , Mar 29, 2005
    #4
  5. Guest

    On Mon, 28 Mar 2005 21:44:40 -0800, wrote:
    >Hey man, thanks. This is just a followup test msg using Agent, lost
    >the google crap, now I can see mo better.
    >Yeah, you got some interresting stuff, I am going to correlate it with
    >the code once I set up everything. Some stuff you overlooked,
    >but some good stuff I have to check out. This is just a test
    >of Agent so I just in "check" mode for that. Obvious of interrest
    >is the extreme back slashing regx wise... I'll explain that too later.
    >Anyway thanks for the look Brian... be back to ya !!!
    >
    >-RFC


    I've got Forté Agent v 1.91, seems to be working good.
    Most of the code in this program was written fast and diddn't
    have the luxury of detailed scrutiny and because of its size
    and scope had to be written without the use of hidden
    shortcuts or anything hard to read. For that reason it was
    kept strictly on a C lang construct basis anything deeper
    would have slowed down the write.

    The "\\\\\\\\" 's et should be examined, along with what
    this function is trying to do. This was a tough one to
    program to say the least. Escape sequences in
    hard code are analogous in chemistry to being
    "out of solution". You don't worry about "in solution"
    unles its brought out to be operated on. There's a
    non-apparent unusual twist in this technique that
    could be usefull maybe. Anyway, I've found alot
    of other tricks doing this project. I think I will post
    some of them on this thread. They need help too.
    -RFC
     
    , Mar 29, 2005
    #5
  6. Guest

    On Sun, 27 Mar 2005 10:28:12 +0000, Brian McCauley <>
    wrote:

    >
    >
    > wrote:
    >
    >> Hello, last month I wrote a rather large 10,000 line program that
    >> parses xml (expat). The data is read into compound structures
    >> then analyzed later. Below is one of the subroutines. I would like
    >> to make it smaller so that its easier to read and understand and
    >> also make the regular expressions less complicated for performance.
    >> Any ideas?
    >>
    >> ---------------------------
    >> sub eCHK_RegDel ## -- checks Redundant Regkey Delete // Regkey
    >> exists --
    >> {
    >> my ($details) = @_;
    >> return 1 if (@{$details} == 0);

    >
    > return 1 unless @$details;

    yes, the same
    >
    >But actually this is, I think, redundant. There is no point since if
    >@$details is empty so will @regdel so nothing will happen anyhow.
    >

    return from @details empty check comes b4 @regdel population?

    >Why are you returning 1? You don't seem to be bothering about the
    >returning anything in other situations so I infer that nothing is
    >considering the return value of eCHK_RegDel.


    right, I paid no attention to the return condition at the end of the
    sub. where its called from pays no attention to return codes as of
    yet.
    >
    >> my @regdel = ();
    >> for (@$details) {
    >> my $aref = $_->[0];
    >> push (@regdel, $aref) if ($aref->[5] eq $REGISTRY && $aref->[6] eq
    >> $DELETE);
    >> }

    >
    >my @regdel =
    > grep { $_->[5] eq $REGISTRY && $_->[6] eq $DELETE )
    > map { $_->[0] } @$details;
    >
    >
    >> if (@regdel)
    >> {

    >
    >Again this is redundant. "If the next line would do nothing then don't
    >do it". But if the next line would do nothing there's no cost in doing
    >it anyhow.


    It may look weird but I assure you its the same push return addr onto
    the stack register without a goto.
    also, its proper form that more code may end up below "if" statement,
    large code dictates you leave the door open.
    >
    >> ## check that KeyName exists
    >> for (@regdel) {
    >> my $aref = $_;

    >
    >You don't need the value in $_ so simply:
    >
    > for my $aref (@regdel) {
    >

    writing freehand code from top to bottom, it is a rarety you will know
    what the final outcome will be. not alot of time to go back...
    but sometimes
    >> next if (exists ($aref->[4]{$KEYNAME}) &&
    >> length($aref->[4]{$KEYNAME}) > 0);

    >
    >You mean defined() not exists().
    >
    >length() can never be negative so >0 is redundant.
    >

    Not true... $aref->[4]{$KEYNAME} is dynamic. it MUST exist
    and if it does is sometimes zero length, which is a condition
    that is unacceptable...

    >There could be a case for simply using no warnings 'uninitialized' here.
    >
    >> # log error
    >> ## 0701 -- test 07,01
    >> ReportItem ('E', $CLOGINFO, $CERRLOG, '0701', glicn($aref), (0,0),
    >> '', $XML_File, 1,0,1);
    >> }
    >> ## check redundant regkey deletes
    >> my $first_kname = '';
    >> my $first_vname_exists = 0;
    >> my $aref_first = undef;
    >>
    >> my @regx_esc_codes = ( "\\", '/', '(', ')', '[', ']', '?', '|',
    >> '+', '.', '*', '$', '^', '{',
    >> '}', '@' );

    >
    >This is invarient and should be taken outside the loop or maybe even
    >outside the subroutine. But you don't need it at all probably.


    Not true... it is not inside a loop. global scope had no use for
    it.... in general, scoping in Perl is just like C++, dump it as soon
    as possible !!

    >
    >> for (@regdel)
    >> {
    >> my $aref = $_;
    >> next if (!(exists ($aref->[4]{$KEYNAME}) &&
    >> length($aref->[4]{$KEYNAME}) > 0));

    >
    >Why the second loop? Why not a do it in a single pass?
    >

    for alot of reasons... one is the side affects of trying to do such
    a complicated maneuver all at once... generally a good
    precautionarry measure especially since the first does not
    slow down the second at all, since the @regdel array of
    structures has been pre-conditioned via a sort engine
    that maximizes this kind of check ahead of time.
    Pre-fetching like this occurs sometimes 6 levels deep
    before analysis begins. Its a trade-off where clarity wins
    vs. bug introduction, but the engines must be designed
    for this type of thing ahead of time... which were.

    >> my $kname = $aref->[4]{$KEYNAME};
    >> my $vname = '';
    >> my $kv = 'key (branch)';
    >> if (exists ($aref->[4]{$VALUENAME})) {
    >> if (length($aref->[4]{$VALUENAME}) > 0) {
    >> $vname = $aref->[4]{$VALUENAME};
    >> # strip leading spaces
    >> while ($vname =~ s/^ //) {}

    >
    > s/^ +//;
    >
    >> }
    >> $kv = 'value';
    >> }
    >> for (@regx_esc_codes)
    >> {
    >> my $tc = $_;
    >> my $xx = "\$kname =~ s/\\$tc/\\\\\\$tc/g;"; # code template for
    >> regex
    >> eval $xx;
    >> #print "$xx\n";
    >> }

    >
    >Do not use eval unless there is a reason to do so. The above is more simply
    >

    #--#
    >for my $tc (@regx_esc_codes) {
    > $kname =~ s/(\Q$tc\E)/\\$1/g;
    >}

    #--#
    I don't know what this is... I guess I should check it.
    If you can resolve ALL the "$tc" with the "\Q"
    in regx then this should work... have you tried this method yourself?
    Without eval on the @regx codes you can't esc an "odd" number
    of '\' without muting the '$' in "$tc". I hope it works I will try it
    tommorow. @regx is array of single chars, this won't work on
    expressions as are the eval in the dynamic strings compared later.
    Course thats why its done here right....
    >
    >But I suspect you simply wanted
    >
    > $kname = quotemeta $kname;
    >
    >But I'm guessing you wouldn't need to do anything if you got rid of some
    >more of the pointless eval()s in your code.
    >
    >All in all I think you are working way too. Hard. I'll stop now with
    >the line-by-line and go have another look at your code in its entirity.
    > Get back to you soon.


    You know, I'm being paid far too much money for it too .... hehe!!!
     
    , Apr 1, 2005
    #6
  7. Guest

    On Fri, 01 Apr 2005 00:33:32 -0800, wrote:

    >for alot of reasons... one is the side affects of trying to do such
    >a complicated maneuver all at once... generally a good
    >precautionarry measure especially since the first does not
    >slow down the second at all, since the @regdel array of

    i meant @details instead of @regdel
    >structures has been pre-conditioned via a sort engine
    >that maximizes this kind of check ahead of time.
    >Pre-fetching like this occurs sometimes 6 levels deep
    >before analysis begins. Its a trade-off where clarity wins
    >vs. bug introduction, but the engines must be designed
    >for this type of thing ahead of time... which were.
     
    , Apr 1, 2005
    #7
  8. Anno Siegel Guest

    <> wrote in comp.lang.perl.misc:

    > return from @details empty check comes b4 @regdel population?


    Please use complete English words. Things like "b4" have been a childish
    affectation since the last 300 baud modem was scrapped.

    Anno
     
    Anno Siegel, Apr 1, 2005
    #8
  9. Guest

    On Mon, 28 Mar 2005 21:44:40 -0800, wrote:

    >On Sun, 27 Mar 2005 16:27:35 +0000, Brian McCauley <>
    >wrote:
    >
    >>
    >>
    >>Brian McCauley wrote:
    >>
    >>> All in all I think you are working way too hard. I'll stop now with
    >>> the line-by-line and go have another look at your code in its entirity.
    >>> Get back to you soon.

    >>
    >>Below find an, untested but much simpler version. I have not changed
    >>the alorithm just removed some of the tortuous obfuscations.
    >>
    >>I've subsequently noticed what I presume is a flaw in the algorithm (but
    >>that's not Perl related so OT here).
    >>Consider:
    >>
    >> { $KEYNAME => 'foo', $VALUENAME => 'bar' }
    >> { $KEYNAME => 'foo\blab' }
    >> { $KEYNAME => 'foo\blab', $VALUENAME => 'bar' }
    >>
    >>I'm guessing the 3rd one above should be reported as redundant but won't be.


    Since we're talking about deleting a key, the 3rd one would be
    reported as the root key (2nd) is already being deleted.. check the
    code.
    BTW, the
    while ($vname =~ s/^ //) {} was replace with:
    $vname =~ s/^ +//;
    and so the trailing spaces:
    $vname =~ s/ +$//;
    although its not used in this sub. i have a reduction sub
    (layer) that conditions data and takes away redudendency for
    "equivalence" purposes before a typical check. Its called
    "sub reduce" its got like 8 reductions now... more or less
    depending on how I handle url's and directories (mixed
    and across os'). u understand what I mean given these
    are human typed?

    >>
    >>## -- checks Redundant Regkey Delete // Regkey exists --
    >>sub eCHK_RegDel {
    >> my ($details) = @_;
    >> my ($first_kname,$first_vname_exists,$aref_first);
    >> for (@$details) {
    >> my $aref = $_->[0];
    >> next unless $aref->[5] eq $REGISTRY && $aref->[6] eq $DELETE;
    >> my $kname = $aref->[4]{$KEYNAME};
    >>
    >> # Can input data really contain $kname='' ?
    >> # if not then the check of length($kname) is redundant.


    $kname is dynamic... again, if it exists, it MUST have value.
    there is no ambiguity here... what good does existence without
    value in an initialized world in the middle of life? prozac....
    >>
    >> unless ( defined($kname) && length($kname) ) {
    >> # log error
    >> ## 0701 -- test 07,01
    >> ReportItem ('E', $CLOGINFO, $CERRLOG, '0701', glicn($aref), (0,0),
    >> '', $XML_File, 1,0,1);
    >> next;
    >> }
    >> my $vname = $aref->[4]{$VALUENAME};
    >>
    >> # redunant surely - does input really contain $vname
    >> # with leading spaces or empty strings?


    In the real world $vname contains raw human input, yes..
    Unfortunately, raw spaces have length. Is a space
    acceptical name on a credit card? no... so we
    emulate life programitally sometimes in code to factor
    out human mistakes.

    >> if ( defined $vname ) {
    >> $vname =~ s/^ +//;
    >> undef $vname unless length $vname;
    >> }
    >>
    >> $kname .= "\\\\";
    >> unless ( $first_kname && $kname =~ /^\Q$first_kname/i ) {
    >> # because of sort,the first non valuename
    >> # will always be in the new keyname action
    >>
    >> $first_kname = $kname;
    >> $aref_first = $aref;
    >> $first_vname_exists = defined($vname);
    >> next;
    >> }
    >> next if $first_vname_exists;
    >> ReportItem ('E', $CLOGINFO, $CERRLOG,
    >> defined($vname) ? '0703' : '0704',
    >> glicn($aref), glicn($aref_first), '',
    >> $XML_File, 1,0,1);
    >> }
    >>}
    >>

    >
    >Hey man, thanks. This is just a followup test msg using Agent, lost
    >the google crap, now I can see mo better.
    >Yeah, you got some interresting stuff, I am going to correlate it with
    >the code once I set up everything. Some stuff you overlooked,
    >but some good stuff I have to check out. This is just a test
    >of Agent so I just in "check" mode for that. Obvious of interrest
    >is the extreme back slashing regx wise... I'll explain that too later.
    >Anyway thanks for the look Brian... be back to ya !!!
    >
    >-RFC


    OK thanks again Brian, keep up the good work.
    That "\Q" thing has got me to find out at work tommorow what it is.
    Thanks.
    -rfc
     
    , Apr 1, 2005
    #9
  10. wrote:

    > On Sun, 27 Mar 2005 10:28:12 +0000, Brian McCauley <>
    > wrote:
    >
    >
    >>
    >> wrote:
    >>
    >>
    >>>Hello, last month I wrote a rather large 10,000 line program that
    >>>parses xml (expat). The data is read into compound structures
    >>>then analyzed later. Below is one of the subroutines. I would like
    >>>to make it smaller so that its easier to read and understand and
    >>>also make the regular expressions less complicated for performance.
    >>>Any ideas?
    >>>
    >>>---------------------------
    >>>sub eCHK_RegDel ## -- checks Redundant Regkey Delete // Regkey
    >>>exists --
    >>>{
    >>> my ($details) = @_;
    >>> return 1 if (@{$details} == 0);

    >>
    >> return 1 unless @$details;

    >
    > yes, the same
    >
    >>But actually this is, I think, redundant. There is no point since if
    >>@$details is empty so will @regdel so nothing will happen anyhow.
    >>

    >
    > return from @details empty check comes b4 @regdel population?


    You think it is relevant why? It is always better to write code with
    fewer cases handled as exceptional.

    >>> if (@regdel)
    >>> {

    >>
    >>Again this is redundant. "If the next line would do nothing then don't
    >>do it". But if the next line would do nothing there's no cost in doing
    >>it anyhow.

    >
    >
    > It may look weird but I assure you its the same push return addr onto
    > the stack register without a goto.


    I have no idea what you just said.

    > also, its proper form that more code may end up below "if" statement,
    > large code dictates you leave the door open.


    What you call "leaving the door open" is what makes your code large.
    If someone is going to add more code that needs to be outside the loop
    but conditional on @regdel being non-empty then the additional effort of
    adding the if is trivial.

    Remember: always write less code unless this would make your code less
    readable.

    >>> next if (exists ($aref->[4]{$KEYNAME}) &&
    >>>length($aref->[4]{$KEYNAME}) > 0);

    >>
    >>You mean defined() not exists().
    >>
    >>length() can never be negative so >0 is redundant.
    >>

    >
    > Not true...


    Why?

    > $aref->[4]{$KEYNAME} is dynamic. it MUST exist
    > and if it does is sometimes zero length, which is a condition
    > that is unacceptable...


    That supports my assertion. When disagreeing with someone it is
    convetional to put forward reasons why they are wrong, not why they are
    right.

    >
    >>> my @regx_esc_codes = ( "\\", '/', '(', ')', '[', ']', '?', '|',
    >>> '+', '.', '*', '$', '^', '{',
    >>>'}', '@' );

    >>
    >>This is invarient and should be taken outside the loop or maybe even
    >>outside the subroutine. But you don't need it at all probably.

    >
    >
    > Not true... it is not inside a loop.


    Sorry my mistake.

    > global scope had no use for
    > it.... in general, scoping in Perl is just like C++, dump it as soon
    > as possible !!


    Yes in general, yes. There are however exceptions. Constants are one.


    >>> for (@regdel)
    >>> {
    >>> my $aref = $_;
    >>> next if (!(exists ($aref->[4]{$KEYNAME}) &&
    >>>length($aref->[4]{$KEYNAME}) > 0));

    >>
    >>Why the second loop? Why not a do it in a single pass?
    >>

    >
    > for alot of reasons... one is the side affects of trying to do such
    > a complicated maneuver all at once... generally a good
    > precautionarry measure especially since the first does not
    > slow down the second at all, since the @regdel array of
    > structures has been pre-conditioned via a sort engine
    > that maximizes this kind of check ahead of time.
    > Pre-fetching like this occurs sometimes 6 levels deep
    > before analysis begins. Its a trade-off where clarity wins
    > vs. bug introduction, but the engines must be designed
    > for this type of thing ahead of time... which were.


    I have no idea what you just said. I think you are saying that if an
    error is detected in thre first loop then the second loop is not
    executed. That's simply not true.
    >> $kname =~ s/(\Q$tc\E)/\\$1/g;

    >
    > I don't know what this is... I guess I should check it.
    > If you can resolve ALL the "$tc" with the "\Q"
    > in regx then this should work... have you tried this method yourself?


    Yes I have used quotemeta (or \Q) in Perl a lot.

    > Without eval on the @regx codes you can't esc an "odd" number
    > of '\' without muting the '$' in "$tc". I hope it works I will try it
    > tommorow. @regx is array of single chars, this won't work on
    > expressions as are the eval in the dynamic strings compared later.


    I have no idea what you just said there. You seem to be saying you
    think eval() is somehow the solution to some problem you were having.
    As far as I can see all the problems you mention are a _consequence_ of
    your decision to use eval().

    Most of your code seems devoted to checking if string $kname starts with
    string $first_kname. In most languages there's an easy way to do this.
    Surely the fact that you spent a couple of dozen lines on it should
    have rung alarm bells.

    In Perl this is simply:

    $kname =~ /^\Q$first_kname/

    ....or...

    !index($kname,$first_kname)

    To make it case insensative you can simply put a /i on the refeg or
    convert everything to lowercase.

    > Course thats why its done here right....


    I have no idea what you are talking about. What are you suggesting you
    have done right?

    >>But I suspect you simply wanted
    >>
    >> $kname = quotemeta $kname;
    >>
    >>But I'm guessing you wouldn't need to do anything if you got rid of some
    >>more of the pointless eval()s in your code.
    >>
    >>All in all I think you are working way too. Hard. I'll stop now with
    >>the line-by-line and go have another look at your code in its entirity.
    >> Get back to you soon.

    >
    >
    > You know, I'm being paid far too much money for it too .... hehe!!!


    So that's why you'll spend ten units of effort working around a bug you
    could fix in one unit.
     
    Brian McCauley, Apr 1, 2005
    #10
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Francisco Garcia

    Small footprint XP app

    Francisco Garcia, Jan 9, 2004, in forum: ASP .Net
    Replies:
    5
    Views:
    630
    Shannon Broskie
    Feb 5, 2004
  2. Beatrice Rutger

    JMS memory footprint size

    Beatrice Rutger, Jun 5, 2005, in forum: Java
    Replies:
    0
    Views:
    382
    Beatrice Rutger
    Jun 5, 2005
  3. gbrun
    Replies:
    1
    Views:
    467
    Andrey Kuznetsov
    Feb 19, 2006
  4. vahania

    small footprint soap

    vahania, Jun 26, 2003, in forum: XML
    Replies:
    1
    Views:
    1,021
    Heinz Häberle
    Jun 27, 2003
  5. Tim Morgan
    Replies:
    0
    Views:
    212
    Tim Morgan
    May 18, 2005
Loading...

Share This Page