Trying to avoid passing params to subs through globals

J

Justin C

Further to me recent problem of unblessed references, I'm re-writing my
code, putting more of it into subroutines so that the main thread is
more obvious. I'm finding that where I was using global variables there
appear to be a lot of variables that need passing to some sub-routines.

I'm thinking of creating a few hashes of references to these variables,
and then passing the whole hash (or a reference to it) to the
subroutine.

my %excel = (
"worksheet" => \$ws,
"workboot" => \$wb,
"format_1" => \$format_1,
"format_2" => \$format_2,
);

my %document = (
"pdf" => \$pdf,
"table" => \$table,
);

Firstly, if these are defined near the begining of the program they look
just like globals to me! What's the difference? OK, so I can just
declare "my %excel =();" and then have a subroutine(s) populate it on an
ad-hoc basis, that, I suppose, looks less global.

To avoid a whole bunch of vars being global I can see that I'm going to
have to nest a lot of sub-routines so that I can keep scope minimal.

The other thing is, if $ws, for example, was returned as a reference (I
think I really need to see where I should return references, and where I
should return the actual thing... if it's an object return the thing, if
it's scalar, array or hash return a ref? maybe that's too crude) I
surely shouldn't be creating a reference to it (in the hash
assignement), so it should be "worksheet" => $ws" instead of "\$ws",
shouldn't it?

Sorry if the above sounds a bit random, it's hitting the keyboard as
it's coming to mind, and I shouldn't really be doing this now, I'm
supposed to be working... though the work I'm supposed to be doing is
getting this program to do what we want - it's a little chicken and egg
here at the moment.

Justin.
 
R

Randal L. Schwartz

Justin> Further to me recent problem of unblessed references, I'm re-writing my
Justin> code, putting more of it into subroutines so that the main thread is
Justin> more obvious. I'm finding that where I was using global variables there
Justin> appear to be a lot of variables that need passing to some sub-routines.

Sounds like related state correlated with related behavior.

That's called an "object".

Might wanna look into that. :)

If you need, I can recommend a good book (or two :).

print "Just another Perl hacker,"; # the original
 
U

Uri Guttman

SRB> Although, I've actually found this a very helpful paradigm:

actually that is a poor paradigm.

SRB> sub do_something {
SRB> ...
SRB> my $handler = sub {
SRB> ...
SRB> };

SRB> if ($cond_a) {
SRB> return $handler->(do_a());
SRB> } elsif ($cond_b) {

why the else (of the if) when you just returned? make it cleaner by just
falling through to a plain if.

SRB> return $handler->(do_b());
SRB> } else {
SRB> return $handler->(do_c());
SRB> }
SRB> }

better yet, use a dispatch table. i hate elsif's and try to never use
them. they are a marker for poor logic. hell, i eschew else as well. in
over 10k lines in one project i had about 3 elsif's and 11 else's. and
the code is very readable. it is just done with very clean logic and
code flow.

uri
 
U

Uri Guttman

SRB> On 12/14/2009 01:46 PM, Uri Guttman wrote:
SRB> if ($cond_a) {
SRB> return $handler->(do_a());
SRB> } elsif ($cond_b) {SRB> return $handler->(do_b());
SRB> } else {
SRB> return $handler->(do_c());
SRB> }
SRB> }
SRB> Definitely =) Conditionals were mostly just fodder for the sake of
SRB> the anonymous sub (which was the major point). Mostly I use this kind
SRB> of thing for failure routines:
SRB> if (my $result = try_something()) {
SRB> handle_success($result);
SRB> } elsif (my $result = try_fallback()) {
SRB> handle_succesS($result);
SRB> }
SRB> and so on ... I don't really have a good paradigm for that kind of
SRB> thing that would involve a dispatch table though (is there one?).

you can still drop the elsif and make them if's if the handlers are done
and you need to exit. that alone makes for a better coding style.

as for that try/handle stuff, if it is as regular as you show, then a
simple table/loop will do it better:

my @try_subs = ( \&try_foo, \&try_bar, ... ) ;

foreach my $try ( @try_subs ) {

my $result = $try->() ;
next unless $result ;

handle_success( $result ) ;
last ;
}

nary an else or elsif in the code!

thanx,

uri
 
J

Jochen Lehmeier

in over 10k lines in one project i had about 3 elsif's and 11 else's.

I understand your trouble with elsif (well, at least I can think of cases
where elsif may not be optimal).

But "else"? How could "else" possibly be a sign of bad logic?
 
J

Jochen Lehmeier

SRB> if (my $result = try_something()) {
SRB> handle_success($result);
SRB> } elsif (my $result = try_fallback()) {
SRB> handle_succesS($result);
SRB> }
my @try_subs = ( \&try_foo, \&try_bar, ... ) ;

foreach my $try ( @try_subs ) {

my $result = $try->() ;
next unless $result ;

handle_success( $result ) ;
last ;
}

nary an else or elsif in the code!

I guess if we would work in a team, we would regularly have heated
discussions on things like this. Replacing a standard, efficient, easy to
parse (mentally), absolutely normal coding style with a rather convoluted
one, just for the abstract goal of avoiding "else" and "elsif" (*)...
well, I have no useful, public-compatible comment really.

Besids, your pattern breaks down as soon as your functions take different
arguments, or your handle_success() differ from each other, though I guess
you could work around that with closures too, if you wanted.

Note, I do not recall that the OP mentioned lots and lots of elsif's -
only 2 or maybe 3. For larger numbers of cases, and especially when
re-used, this pattern might be kind of useful, but not for simple cases
like this, and not just for getting rid of els(e|if).

Aside from that,

my $result = try_something() || try_fallback();
handle_success($result) if $result;

would be the Perl-like variant I'd use (or straigtforward "elsif", just
like SRB), which scales perfectly to large numbers of try_()'s if you
insert line breaks in strategical places.
 
U

Uri Guttman

JL> I understand your trouble with elsif (well, at least I can think of
JL> cases where elsif may not be optimal).

JL> But "else"? How could "else" possibly be a sign of bad logic?

not a sign of bad logic but of inefficient logic in perl.

my favorite counter example is this style which is all too common:

you enter a sub and have a validity check for something and then you do
the work if the data is ok:

if ( valid() ) {

do lots of code here
}
else {
return -1 ;
}

return $result ;

vs my style:

return -1 unless valid() ;

lots of code
return $result ;


look ma, no else!! no else is needed!! it is shorter, no extra block
entries, no extra indent, no wasted pixels for the {} chars!

you do your checking and get out immediately if bad. then the main good
work is done inline and not in an indented block. perl allows and
encourages this style with statement modifiers and good flow control
ops.

there are many variations on that style but they all eschew else
statements. you just don't need them that often in good perl. most of
the time elses are used for error cases. other times a conditional
expression is better (that is another commonly abused thing, else's
assigning to the same var as the then clause). if you show me code with
lots of else's i can usually rewrite it better without them.

uri
 
J

Jochen Lehmeier

not a sign of bad logic but of inefficient logic in perl.

Define "bad", define "inefficient". Try to avoid the term "logic" please,
it is much too big in this context, c/f http://en.wikipedia.org/wiki/Logic.
you enter a sub and have a validity check for something and then you do
the work if the data is ok:

if ( valid() ) {

do lots of code here
}
else {
return -1 ;
}

return $result ;

vs my style:

return -1 unless valid() ;

lots of code
return $result ;
look ma, no else!! no else is needed!!

Well, this is completely different code from what we talked about before,
and you didn't mention my remarks on the original topic at all (Logical
Fallacy: Irrelevant Conclusion). Of course your second code is shorter
than the longer code up there (and I've written enough fragements closer
to your second than your first), but I still fail to see how this is so
specifically *because* there is no "else" (Logical Fallacies: Affirming
the Consequent, Denying the Antecedent, and maybe Non Sequitur). And while
I am with you in that your second code is better than the first, you still
have to convince me that your second code is "better" than (just to name
an example)

my $result=-1;
if (valid())
{
lots of code
}
return $result;

You seem to be very focused on "else". I still fail to see why. There are
many ways to write code with or without "else", all with different
lengths, your measurement of code simply seems to consist of the number of
bytes and number of "else"s.
you do your checking and get out immediately if bad. then the main good
work is done inline and not in an indented block. perl allows and
encourages this style with statement modifiers and good flow control
ops.

Yes, yes, but go back to the original topic please - very different stuff
back there (Logical Fallacy: Fallacy of Many Questions or Loaded Question)
there are many variations on that style but they all eschew else
statements. you just don't need them that often in good perl.[...]
if you show me code with
lots of else's i can usually rewrite it better without them.

For your definition of "better".

The reason I reply here and try to argue with you is that you make "else"
out to be totally bad and evil, like someone saying "goto is evil" (the
latter being true ;-) ). I would hate for some Perl newcomer to grow up
with FUD or hearsay like this. Saying "a rather short than long coding
style, getting out of subs ASAP" is perfectly fine with me. But saying
"avoid else at all costs" is not.

You are correct in saying that there is the long-winded coding style of
people who are not familiar with the usual Perl idioms which can be used
to make Perl code short. But there is nothing at all specifically pointing
to "else" in that reasoning. And it is absolutely not true that in all
circumstances the shorter code is "better".

Besides, consider: you suggest to work a lot with "return" or "last"
mingled in between other statements. This is one thing which surely leads
to shorter code, but also leads very quickly to code which is much harder
to maintain, since it gets harder to see the flow of execution at a
glance. I.e., if I have a sub that does *not* contain any "return" or
"last", then I can be absolutely sure that the flow of execution will, at
some point, arrive at the end of the method; and I can be absolutely sure
that I can add code there that will be executed after everything else in
that sub (for example, something which modifies, logs or validates the
return code). With frequent "return" and "last", this gets harder. (Yes, I
am purposefully ignoring "die" here).

There are is also the Theory of Computation, which, for example, concerns
itself with proofs of correctness. In those areas, if/else is often *much*
easier to work with than return/last.

Check out LISP, just to broaden the horizon. The language does not even
have the concept of "return" or "last" at all, and heavily uses "else".
But it's still *very* logical, indeed.

PS: Sorry for the OT Logic stuff, folks. ;-)
 
U

Uri Guttman

JL> Define "bad", define "inefficient". Try to avoid the term "logic"
JL> please, it is much too big in this context, c/f
JL> http://en.wikipedia.org/wiki/Logic.

i use logic as in the decisions made in flow control. those are all
logical choices. it is a fine term for this.

JL> Well, this is completely different code from what we talked about
JL> before, and you didn't mention my remarks on the original topic at
JL> all (Logical Fallacy: Irrelevant Conclusion). Of course your second
JL> code is shorter than the longer code up there (and I've written
JL> enough fragements closer to your second than your first), but I still
JL> fail to see how this is so specifically *because* there is no "else"
JL> (Logical Fallacies: Affirming the Consequent, Denying the Antecedent,
JL> and maybe Non Sequitur). And while I am with you in that your second
JL> code is better than the first, you still have to convince me that
JL> your second code is "better" than (just to name an example)

it is faster as in no block entry which costs you in perl.

it is shorter so it is easier to read on a screen

it is simpler as you don't need to check inside blocks for what is
happening and where.

easier to follow the MAIN logic which doing the work as you already
handled the easy error exit case. no worries about what ELSE should be
done as there is no else clause.

one less indent makes it easier to read and you can use longer lines
with fewer wrapped lines. that also makes the code shorter and easier to read.

is that enough to qualify it as BETTER code? 5 good reasons with others
i can scrounge up if needed.

JL> my $result=-1;
JL> if (valid())
JL> {
JL> lots of code
JL> }
JL> return $result;

JL> You seem to be very focused on "else". I still fail to see why. There
JL> are many ways to write code with or without "else", all with
JL> different lengths, your measurement of code simply seems to consist
JL> of the number of bytes and number of "else"s.

i don't focus on else. i just easily code away from it. it takes zero
effort for me to do it and it generates better code by the above
reasons. it is the coding style i teach for perl and it usually wins
over most who learn it.


JL> Yes, yes, but go back to the original topic please - very different
JL> stuff back there (Logical Fallacy: Fallacy of Many Questions or
JL> Loaded Question)

huh?? we are talking a minimal change in style to lower the else count
and make better code. there is no loaded question nor fallacy here. it
is a proven style improvement. simple and effective.
there are many variations on that style but they all eschew else
statements. you just don't need them that often in good perl.[...]
if you show me code with
lots of else's i can usually rewrite it better without them.

JL> For your definition of "better".

and i get paid to judge code so my word has backing on this. i rate code
for recruitment and for code review projects. too many elses can always
be improved. it is just one of the many things i see which can be
improved. hell, my code can be improved. please do so. my cpan id is
URI. go for it and publish it here. if you show me better code, i will
use it gladly. but you have to also prove it to be better. i know i can
back up my code against weak attacks.

JL> The reason I reply here and try to argue with you is that you make
JL> "else" out to be totally bad and evil, like someone saying "goto
JL> is evil" (the latter being true ;-) ). I would hate for some Perl
JL> newcomer to grow up with FUD or hearsay like this. Saying "a
JL> rather short than long coding style, getting out of subs ASAP" is
JL> perfectly fine with me. But saying "avoid else at all costs" is
JL> not.

not evil, just not needed as often as it is commonly used. it is needed
but just rarely. i don't go out of my way to force no elses. i just can
do it naturally in perl. in other langs it wouldn't be as easy. this is
a perl style point, not a general thing against elses like with no gotos.

JL> You are correct in saying that there is the long-winded coding style
JL> of people who are not familiar with the usual Perl idioms which can
JL> be used to make Perl code short. But there is nothing at all
JL> specifically pointing to "else" in that reasoning. And it is
JL> absolutely not true that in all circumstances the shorter code is
JL> "better".

shorter code is generally better code given that it isn't forced into
obsfucation. there is a limit but so much code is long winded and can be
easily improved by shortening it.

JL> Besides, consider: you suggest to work a lot with "return" or "last"
JL> mingled in between other statements. This is one thing which surely
JL> leads to shorter code, but also leads very quickly to code which is
JL> much harder to maintain, since it gets harder to see the flow of
JL> execution at a glance. I.e., if I have a sub that does *not* contain
JL> any "return" or "last", then I can be absolutely sure that the flow
JL> of execution will, at some point, arrive at the end of the method;
JL> and I can be absolutely sure that I can add code there that will be
JL> executed after everything else in that sub (for example, something
JL> which modifies, logs or validates the return code). With frequent
JL> "return" and "last", this gets harder. (Yes, I am purposefully
JL> ignoring "die" here).

then your code is too complex. i keep things short and sweet and flow
control is everything in doing that.

JL> There are is also the Theory of Computation, which, for example,
JL> concerns itself with proofs of correctness. In those areas, if/else
JL> is often *much* easier to work with than return/last.

who cares about that when working in the real world?

JL> Check out LISP, just to broaden the horizon. The language does not
JL> even have the concept of "return" or "last" at all, and heavily uses
JL> "else". But it's still *very* logical, indeed.

lisp sucks donkey balls. and i know from plenty of experience in
it. lisp is for computers. perl is for people. your choice!

uri
 
U

Uri Guttman

DH> On Mon, 14 Dec 2009 16:04:27 -0500 in comp.lang.perl.misc, "Uri Guttman"

DH> my $result = try_foo()
DH> || try_bar()
DH> || ... ;

DH> handle_success( $result ) if $result;

and how would you handle undef values? or a list return there? undef can
be checked with // in 5.10 but lists can't be handled with ||. the loop
version can easily be modified to handle list returns.

also if you need to pass args to those subs it gets redundant to see
them in each call and noisy as well. the loop version keeps the calling
code simple and extendable while keeping the list of calls separate
where you can easily see and edit it.

this assumes a decent sized list of calls, not just 2 or 3 of them.

uri
 
E

Eric Pozharski

On 2009-12-14 said:
it is faster as in no block entry which costs you in perl.

While I totaly agree on everything skipped I beg to differ on this one:

#!/usr/bin/perl

use strict;
use warnings;
use Benchmark qw{ cmpthese timethese };

use feature 'switch';
my( $x, $y );

cmpthese timethese -5, {
code00 => sub { if( ++$x % 2 ) { $y++ } elsif( not $x % 2 ) { $y++ }; },
code01 => sub { if( ++$x % 2 ) { $y++ }; if( not $x % 2 ) { $y++ }; },
code02 => sub { $y++ if ++$x % 2; $y++ if not $x % 2; },
code03 => sub { ++$x % 2 and $y++; (not $x % 2) and $y++; },
code04 => sub { given( ++$x % 2 ) { when( 1 ) { $y++ } when( 0 ) { $y++ } }; },
code05 => sub { if( ++$x % 2 ) { my $z = 0 } elsif( not $x % 2 ) { my $z = 1 }; },
};

__END__
Benchmark: running code00, code01, code02, code03, code04, code05 for at least 5 CPU seconds...
code00: 6 wallclock secs ( 5.31 usr + 0.00 sys = 5.31 CPU) @ 3023788.32/s (n=16056316)
code01: 6 wallclock secs ( 5.02 usr + 0.01 sys = 5.03 CPU) @ 2340337.77/s (n=11771899)
code02: 3 wallclock secs ( 5.04 usr + 0.00 sys = 5.04 CPU) @ 2320529.37/s (n=11695468)
code03: 5 wallclock secs ( 5.32 usr + 0.00 sys = 5.32 CPU) @ 2176516.17/s (n=11579066)
code04: 4 wallclock secs ( 5.07 usr + 0.00 sys = 5.07 CPU) @ 1295559.17/s (n=6568485)
code05: 4 wallclock secs ( 5.32 usr + 0.00 sys = 5.32 CPU) @ 2562540.23/s (n=13632714)
Rate code04 code03 code02 code01 code05 code00
code04 1295559/s -- -40% -44% -45% -49% -57%
code03 2176516/s 68% -- -6% -7% -15% -28%
code02 2320529/s 79% 7% -- -1% -9% -23%
code01 2340338/s 81% 8% 1% -- -9% -23%
code05 2562540/s 98% 18% 10% 9% -- -15%
code00 3023788/s 133% 39% 30% 29% 18% --

5.10.1 isn't the first perl that gives me a strong feeling that perl
somewhat shortcuts I<elsif>s.

*SKIP*
lisp sucks donkey balls. and i know from plenty of experience in
it. lisp is for computers. perl is for people. your choice!

I don't wish you lack in this fight -- you don't need it.
 
U

Uri Guttman

DH> On Tue, 15 Dec 2009 03:19:11 -0500 in comp.lang.perl.misc, "Uri Guttman"
DH> my $result = try_foo()
DH> || try_bar()
DH> || ... ;
DH> There is no indication that any of those things are needed. You are
DH> moving the goalpost. Let the subroutines return 0 on failure. Or, the
DH> above can _more_ easily be modified using the appropriate expression if
DH> one of them cannot be altered.

i have seen too many subs which return undef on fail. in older perls
that is problematic with ||.

DH> If the subs need args then most likely they need different args.
DH> How are you going to accommodate that in the loop version?

a hash ref and a table of hashes (probably along with the sub
call). this is called table driven code and i have very nice working
examples of it in the test code in Sort::Maker on cpan. you can create
tables of args, expected return values, golden sort code, etc. it
removes tons of redundant code and makes it much easier to make
variations than if you did a standard if/else tree.

DH> My list of calls is no uglier than your array list for any size.

for some definition of ugly. i will take mine. my perl eyes are very
experienced and used by many others for business.

DH> But mainly it is straightforward and easily understood code, eliminating
DH> all that loopyness for a task that is not much like a loop to begin
DH> with. Consider it as a response to the elseif version, as yours was,
DH> and then maybe you will like it more.

a long || list can be very annoying to some. it has several weaknesses
and it is inflexible. i don't prefer it so that is all i have to say.

uri
 
U

Uri Guttman

BM> Can you explain how? AFAIK undef has always been false...

because you can't tell undef from '' or 0 or '0' with ||. 5.10 has //
which solves that problem.

uri
 
U

Uri Guttman

DH> On Tue, 15 Dec 2009 22:00:07 -0500 in comp.lang.perl.misc, "Uri Guttman"

DH> But that's not a problem in this thread, where undef is completely
DH> acceptable as a false result from a sub indicating failure. When you
DH> wrote that undef was problematic with || I took that as meaning that it
DH> didn't always work as documented in older perls. After searching, I can
DH> find no indication of that being true. So it is not problematic; it
DH> just doesn't always do everything you might imagine.

DH> In another thread you suggest:
DH> my $month = $ARGV[0] || '' ;

you can't get undef from @ARGV.

uri
 
J

jl_post

I am with you in that your second code is better than
the first, you still have to convince me that your second
code is "better" than (just to name an example)

my $result=-1;
if (valid())
{
lots of code
}
return $result;

You seem to be very focused on "else". I still fail to
see why. There are many ways to write code with or
without "else", all with different lengths, your
measurement of code simply seems to consist of the
number of bytes and number of "else"s.


What Uri is trying to avoid is not "else"s per se, but rather
unnecessary blocks to begin with. Every time you employ a block (with
proper indentation, hopefully), you leave the maintainer necessarily
having to figure out how the logic differs between code that takes
that block and code that skips that block.

In the example you gave as you wrote it, it's easy to see that if
the if() block is not taken, then the function is returned from
immediately. But what if, in place of the "lots of code", there was
literally a hundred lines of code? If I was looking at that if()
condition, I'd know that in some cases the logic enters that block,
but I wouldn't know that in others the logic is done with that
function altogether.

In order to know that, in certain conditions, that we're done with
the function and we can return, I necessarily would have to scroll all
the way down to the end of the if() block, where I would notice that
the $result is returned untouched. It's much easier for me ("me"
being the maintainer, not the writer of the code) to know that when I
see:

return -1 if not valid();
lots of code
return result;

that non-valid inputs results in an immediate return.

Of course, this is just a simple example. In real life, I often
come across functions with the following skeleton:

sub f
{
my (...) = @_;

my $found = 0;
my $result;

if (...)
{
# do something quick here

if (...)
{
# do something quick here

if (...)
{
while (... && !$found)
{
if (...)
{
# do something quick here

if (...)
{
# lots of code

if (...)
{
# We found what we're looking for!
$result = ...;
$found = 1;
}
}
}
}
}
}
}

return $result;
}

All this function does is search an input set for something we
want, and when we find it, it gets returned. However, when we know
the inputs aren't what we want at the beginning, we still in the
function until the very end. And when we know the element we're at in
the while() loop isn't what we want, we're still in it for the whole
loop.

And when we finally do get the answer we want, we have to employ
the use of a $found variable to jump out of the loop to get to the end
of the function, where only then we can return.

And notice how many '}' we have at the end. Which matches to
what? Can you tell at a glance? I wrote the code myself, and I can't
immediately tell.

When maintaining, when I come across code like the above, I'll see
the first if() and wonder, "What happens to code that doesn't take
this route? <scroll, scroll, scroll> Oh, nothing." Then I'll come
across the second if(), scroll, and think the same thing. Same for
the third if().

Then I come across the first if() in the while() loop and wonder,
"What happens to code that doesn't take this route? <scroll, scroll>
Oh, this element has nothing more to do with this iteration of the
loop." Same with the next if().

And when I come across the final if() (the one that reaches the
what we're looking for), I think, how does setting the $found and
$result variables affect future behavior of this function? <scroll>
Oh, nothing more of consequence happens except that $result gets
returned."

Instead of using the code above as I wrote it, how about I re-write
it like this:

sub f
{
my (...) = @_;

return if not ...;

# do something quick here

return unless ...;

# do something quick here

return if not ...;

while (...)
{
next if not ...;

# do something quick here

next unless ...;

# lots of code

# Return the result if we found what we're looking for:
return $result if ...;
}
}

Is this new code any better? I think so. For one thing, it
doesn't depend on the $found and $result variables, as they aren't
needed here.

Also, look at those early "return" statements. They tell the
maintainer at-a-glance that the rest of the function is no longer
necessary if certain conditions are met. (You might think that's true
of the previous code, but if you have to scroll down hundreds of lines
to be greeted with half a dozen (or more) of '}'s (sometimes code
between them), you might find yourself wishing that there were a lot
less blocks to navigate through.

Note the "next" statements in the while() loop. They let you know
right away when the element you're examining is one you're interested
in and want to keep around. If you know you don't need it, you can
just invoke "next" and skip right to the next element.

Because of these "next" and extra "return" statements, I no longer
find myself wondering what follows code that doesn't meet the if()
conditions. I know right away what happens, without the need to jump
around to the end of the loop (all the while hoping that the '}' that
I think is the end of the loop really is the right one).

Also note that, with the exception of the top-level block that
defines the f() function, there is only one block, and it's just for
the while() loop. A maintainer can see that there is only one place
where the logic of the code diverges (excepting, of course, the early
exits and skips caused by the "return" and "next" statements).

Have you ever read a recipe in a culinary cookbook? If you have,
you might have noticed that most recipes consist of instructions that
are mostly nicely lined up with the page. Occasionally an instruction
will include multiple "sub-instructions" that are bulleted and/or
indented. However, excessive (and nested) bulleting and indeting is
frowned upon, since it tends to turn the recipe into a logic puzzle,
which is not the original intent of the recipe writer.

Likewise, I freely use early "return"s and "next"s, because I don't
want my code to become a puzzle to the maintainers of my code. I want
to make it clear when the rest of the function is no longer of any use
(by using a "return" statement). The maintainer shouldn't have to
scroll down dozens of lines and match up the right brace just to
figure that out for him/herself.

Similarly, If I'm looping through a list or data structure and
determine that the current element won't be needed, I'll just use a
"next" statemnet to skip it. Otherwise, the maintainer will
necessarily have to scroll down to the bottom of the loop (and match
up the proper braces) just to see that nothing else happens.

Nevertheless, some coders I've worked with still insist on using
the overly-indented style, adamantly avoiding the early "return"s and
"next"s. When asked why, one coder stated it's because he was taught
in a college C course that "every function should have exactly one
entrance point and exactly one exit point."

While that may be good avice for C (in C you often need to perform
"clean up" code to free up memory and close file handles), that advice
is not so good for languages like C++ and Perl, which encourage the
use of self-cleaning objects. (You can read up on "RAII" if you want
to learn more about it for C++.)

In properly written C++ and Perl, if you exit a block or a routine
early, the variables created in the scope(s) you just left simply
clean themselves up! With this kind of convenience, there's no reason
to prohibit "next"s and multiple "return" statements.

In truth, some coders will avoid "next"s and early "return"s simply
because they've never used them and are not used to them (and
sometimes they'll even use that argument to justify why nobody else
should use them, either!). But if this were a valid argument, then
everybody would necessarily have to code to the lowest common
denominator of knowledge, which is just plain silly. Imagine that,
whenever a new coder is hired, a company would have to examine the new
coder's knowledge to determine what keywords and coding styles should
never be used again. (You probably see where this is going, and why
avoiding code styles just because they're unfamiliar is a terrible
idea.)

The use of less blocks is also less error-prone. I've seen code
where someone accidentally put a line of code inbetween the wrong pair
of '}' braces (among half a dozen braces). When I was debugging the
code, I noticed that all but two of the '}' could have been eliminate
(by judicial use of "next" and "return"). If there had only been two
'}' to begin with, then the bug caused by the misplaced code probably
would have never existed, as the code probably wouldn't have been
misplaced in the first place.

So I think that Uri's point (and mine, definitely) is to try toward
the minimal number of blocks, since minimizing the number of blocks
tends to reduce code errors and is easier to maintain. And "next"s
and multiple "return"s may FEEL wrong to some coders (especially to
those who have been taught not to use them), but if used correctly
they are a real blessing to the maintainer.

I hope this clarifies things, Jochen.

-- Jean-Luc Romano
 
J

Jürgen Exner

What Uri is trying to avoid is not "else"s per se, but rather
unnecessary blocks to begin with. Every time you employ a block (with
proper indentation, hopefully), you leave the maintainer necessarily
having to figure out how the logic differs between code that takes
that block and code that skips that block.

In the example you gave as you wrote it, it's easy to see that if
the if() block is not taken, then the function is returned from
immediately. But what if, in place of the "lots of code", there was
literally a hundred lines of code?

While I agree with you in general this particular argument backfires.

If you have a block with hundred lines of code then it is overdue to be
split up into 6-10 subs. Yes, there are certainly examples where
exceedingly long blocks of code are appropriate, but in general the old
rule of thumb "No sub longer than a single [VT100] screen" still has its
merrits.

jue
 
J

Jürgen Exner

Of course, this is just a simple example. In real life, I often
come across functions with the following skeleton:

sub f
{
my (...) = @_;

my $found = 0;
my $result;

if (...)
{
# do something quick here

if (...)
{
# do something quick here

if (...)
{
while (... && !$found)
{
if (...)
{
# do something quick here

if (...)
{
# lots of code

if (...)
{
# We found what we're looking for!
$result = ...;
$found = 1;
}
}
}
}
}
}
}

return $result;
}
[...]
When maintaining, when I come across code like the above,

.... then I wonder, if the original programmer has never heard of subs.

jue
 
U

Uri Guttman

DH> On Wed, 16 Dec 2009 16:35:47 -0500 in comp.lang.perl.misc, "Uri Guttman"
DH> In another thread you suggest:
DH> my $month = $ARGV[0] || '' ;
DH> use Data::Dumper;
DH> my $month = $ARGV[0];
DH> print Dumper \$month;

DH> yields:
DH> $VAR1 = \undef;

that is because you didn't test @ARGV for elements. please make
intelligent tests. if @ARGV has elements, those elements cannot be undef
because the shell can't pass undef via exec. checking $ARGV[0] directly
is bad coding but i didn't address that in this thread.

uri
 
J

jl_post

If you have a block with hundred lines of code then it is
overdue to be split up into 6-10 subs. Yes, there are certainly
examples where exceedingly long blocks of code are appropriate,
but in general the old rule of thumb "No sub longer than a
single [VT100] screen" still has its merrits.


(For the record, I was going to type "hundreds of lines of code,"
but then I thought better of it and decided to write "a hundred lines
of code" instead.)

But my point is that I don't control how other code is written, and
if it I run across a function I didn't write that has literally has
hundreds of lines of code, then I'm stuck debugging it regardless.
(In other words, I can't refuse to debug it just because it doesn't
look well-written.)

The reason I encourage coders to use prodigious use of "next" and
"return" statements in Perl is because they are almost always done
correctly when used. In other words, their use almost always makes my
job as a maintainer easier.

However, I tend not to encourage breaking up long code into smaller
functions quite as often, for three main reasons:

1. Sometimes the algorithm is a valid -- yet long --
algorithm that works well as a single function and
would compromise the flow of the main logic if
broken up.

2. For some reason, breaking up functions causes
many programmers to resort to using global
variables (instead of correctly implementing
function parameter passing) and this is a HUGE
pet peeve of mine. (I'd rather deal with
overly-long functions than with unnecessary
global variables.)

3. Even if the function should be broken up,
a long function that doesn't rely on global
variables is not that much harder to debug
than one that is broken up into smaller
functions (especially if the function is
well written, with easily distinguishable
sections of code). In other words, a long
function might be a coding sin, but there
are plenty worse sins out there (in my opinion
as a maintainer).

I'm not saying that large functions shouldn't be broken up into
smaller ones; what I'm saying is that, if I were to write a
"Maintainer's Wish List", my wish to have coders use "next" and
"return" prodigiously (which cuts down on the over-use of bracketed
blocks) ranks much higher than my wish to break up large functions
into smaller ones (especially since many codes mistakenly think that
breaking up functions gives them permission to use global variables).

-- Jean-Luc
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,766
Messages
2,569,569
Members
45,042
Latest member
icassiem

Latest Threads

Top