I am with you in that your second code is better than
the first, you still have to convince me that your second
code is "better" than (just to name an example)
my $result=-1;
if (valid())
{
lots of code
}
return $result;
You seem to be very focused on "else". I still fail to
see why. There are many ways to write code with or
without "else", all with different lengths, your
measurement of code simply seems to consist of the
number of bytes and number of "else"s.
What Uri is trying to avoid is not "else"s per se, but rather
unnecessary blocks to begin with. Every time you employ a block (with
proper indentation, hopefully), you leave the maintainer necessarily
having to figure out how the logic differs between code that takes
that block and code that skips that block.
In the example you gave as you wrote it, it's easy to see that if
the if() block is not taken, then the function is returned from
immediately. But what if, in place of the "lots of code", there was
literally a hundred lines of code? If I was looking at that if()
condition, I'd know that in some cases the logic enters that block,
but I wouldn't know that in others the logic is done with that
function altogether.
In order to know that, in certain conditions, that we're done with
the function and we can return, I necessarily would have to scroll all
the way down to the end of the if() block, where I would notice that
the $result is returned untouched. It's much easier for me ("me"
being the maintainer, not the writer of the code) to know that when I
see:
return -1 if not valid();
lots of code
return result;
that non-valid inputs results in an immediate return.
Of course, this is just a simple example. In real life, I often
come across functions with the following skeleton:
sub f
{
my (...) = @_;
my $found = 0;
my $result;
if (...)
{
# do something quick here
if (...)
{
# do something quick here
if (...)
{
while (... && !$found)
{
if (...)
{
# do something quick here
if (...)
{
# lots of code
if (...)
{
# We found what we're looking for!
$result = ...;
$found = 1;
}
}
}
}
}
}
}
return $result;
}
All this function does is search an input set for something we
want, and when we find it, it gets returned. However, when we know
the inputs aren't what we want at the beginning, we still in the
function until the very end. And when we know the element we're at in
the while() loop isn't what we want, we're still in it for the whole
loop.
And when we finally do get the answer we want, we have to employ
the use of a $found variable to jump out of the loop to get to the end
of the function, where only then we can return.
And notice how many '}' we have at the end. Which matches to
what? Can you tell at a glance? I wrote the code myself, and I can't
immediately tell.
When maintaining, when I come across code like the above, I'll see
the first if() and wonder, "What happens to code that doesn't take
this route? <scroll, scroll, scroll> Oh, nothing." Then I'll come
across the second if(), scroll, and think the same thing. Same for
the third if().
Then I come across the first if() in the while() loop and wonder,
"What happens to code that doesn't take this route? <scroll, scroll>
Oh, this element has nothing more to do with this iteration of the
loop." Same with the next if().
And when I come across the final if() (the one that reaches the
what we're looking for), I think, how does setting the $found and
$result variables affect future behavior of this function? <scroll>
Oh, nothing more of consequence happens except that $result gets
returned."
Instead of using the code above as I wrote it, how about I re-write
it like this:
sub f
{
my (...) = @_;
return if not ...;
# do something quick here
return unless ...;
# do something quick here
return if not ...;
while (...)
{
next if not ...;
# do something quick here
next unless ...;
# lots of code
# Return the result if we found what we're looking for:
return $result if ...;
}
}
Is this new code any better? I think so. For one thing, it
doesn't depend on the $found and $result variables, as they aren't
needed here.
Also, look at those early "return" statements. They tell the
maintainer at-a-glance that the rest of the function is no longer
necessary if certain conditions are met. (You might think that's true
of the previous code, but if you have to scroll down hundreds of lines
to be greeted with half a dozen (or more) of '}'s (sometimes code
between them), you might find yourself wishing that there were a lot
less blocks to navigate through.
Note the "next" statements in the while() loop. They let you know
right away when the element you're examining is one you're interested
in and want to keep around. If you know you don't need it, you can
just invoke "next" and skip right to the next element.
Because of these "next" and extra "return" statements, I no longer
find myself wondering what follows code that doesn't meet the if()
conditions. I know right away what happens, without the need to jump
around to the end of the loop (all the while hoping that the '}' that
I think is the end of the loop really is the right one).
Also note that, with the exception of the top-level block that
defines the f() function, there is only one block, and it's just for
the while() loop. A maintainer can see that there is only one place
where the logic of the code diverges (excepting, of course, the early
exits and skips caused by the "return" and "next" statements).
Have you ever read a recipe in a culinary cookbook? If you have,
you might have noticed that most recipes consist of instructions that
are mostly nicely lined up with the page. Occasionally an instruction
will include multiple "sub-instructions" that are bulleted and/or
indented. However, excessive (and nested) bulleting and indeting is
frowned upon, since it tends to turn the recipe into a logic puzzle,
which is not the original intent of the recipe writer.
Likewise, I freely use early "return"s and "next"s, because I don't
want my code to become a puzzle to the maintainers of my code. I want
to make it clear when the rest of the function is no longer of any use
(by using a "return" statement). The maintainer shouldn't have to
scroll down dozens of lines and match up the right brace just to
figure that out for him/herself.
Similarly, If I'm looping through a list or data structure and
determine that the current element won't be needed, I'll just use a
"next" statemnet to skip it. Otherwise, the maintainer will
necessarily have to scroll down to the bottom of the loop (and match
up the proper braces) just to see that nothing else happens.
Nevertheless, some coders I've worked with still insist on using
the overly-indented style, adamantly avoiding the early "return"s and
"next"s. When asked why, one coder stated it's because he was taught
in a college C course that "every function should have exactly one
entrance point and exactly one exit point."
While that may be good avice for C (in C you often need to perform
"clean up" code to free up memory and close file handles), that advice
is not so good for languages like C++ and Perl, which encourage the
use of self-cleaning objects. (You can read up on "RAII" if you want
to learn more about it for C++.)
In properly written C++ and Perl, if you exit a block or a routine
early, the variables created in the scope(s) you just left simply
clean themselves up! With this kind of convenience, there's no reason
to prohibit "next"s and multiple "return" statements.
In truth, some coders will avoid "next"s and early "return"s simply
because they've never used them and are not used to them (and
sometimes they'll even use that argument to justify why nobody else
should use them, either!). But if this were a valid argument, then
everybody would necessarily have to code to the lowest common
denominator of knowledge, which is just plain silly. Imagine that,
whenever a new coder is hired, a company would have to examine the new
coder's knowledge to determine what keywords and coding styles should
never be used again. (You probably see where this is going, and why
avoiding code styles just because they're unfamiliar is a terrible
idea.)
The use of less blocks is also less error-prone. I've seen code
where someone accidentally put a line of code inbetween the wrong pair
of '}' braces (among half a dozen braces). When I was debugging the
code, I noticed that all but two of the '}' could have been eliminate
(by judicial use of "next" and "return"). If there had only been two
'}' to begin with, then the bug caused by the misplaced code probably
would have never existed, as the code probably wouldn't have been
misplaced in the first place.
So I think that Uri's point (and mine, definitely) is to try toward
the minimal number of blocks, since minimizing the number of blocks
tends to reduce code errors and is easier to maintain. And "next"s
and multiple "return"s may FEEL wrong to some coders (especially to
those who have been taught not to use them), but if used correctly
they are a real blessing to the maintainer.
I hope this clarifies things, Jochen.
-- Jean-Luc Romano