Error returns and repeated code, GOTOs, etc.

B

Ben Bacarisse

Aaron W. Hsu said:
Knuth has a famous "retort" to the GOTO article.

I am sure you know (hence the quotes) that it is not a refutation of
what Dijkstra wrote. Knuth says:

I argue for the elimination of go to's in certain cases, and for their
introduction in others. I believe that by presenting such a view I am
not in fact disagreeing sharply with Dijkstra's ideas, since he
recently wrote the following: "Please don't fall into the trap of
believing that I am terribly dogmatical about [the go to statement]."

Both are models of clarity, and neither is dogmatic. I feel sure that
if all gotos in all programs had been as well-reasoned for as those in
Knuth's paper, Dijkstra would never have written his.
 
B

Ben Bacarisse

mike3 said:
To me, it looks like they're trying to reuse the TRACE call.

Yes, but what came first? If the tracing was there originally and the
logic of the function was subsequently made more complex, doing it with
gotos makes no sense to me. If the logic was there and used multiple
returns when the tracing was added, I can see some value in making the
minimum of changes -- just replace the returns with jumps.
So that
one could modify the TRACE call without having to modify a clone at
another point in the routine. Would it be OK to duplicate the line at
each return point?

No, of course not.

Mmmm. So how would you write those returns?

Having got this far -- a single return -- I would add any more. I'd
just write the function using natural structures rather than jumps:

idn_result_t
idn_normalizer_add(idn_normalizer_t ctx, const char *scheme_name) {
idn_result_t r;
void *v;

assert(ctx != NULL && scheme_name != NULL);
TRACE(("idn_normalizer_add(scheme_name=%s)\n", scheme_name));
assert(INITIALIZED);

if (idn__strhash_get(scheme_hash, scheme_name, &v) != idn_success) {
ERROR(("idn_normalizer_add(): invalid scheme \"%-.30s\"\n",
scheme_name));
r = idn_invalid_name;
}
else {
assert(ctx->nschemes <= ctx->scheme_size);

if (ctx->nschemes != ctx->scheme_size ||
(r = expand_schemes(ctx)) == idn_success) {
ctx->schemes[ctx->nschemes++] = v;
r = idn_success;
}
}
TRACE(("idn_normalizer_add(): %s\n", idn_result_tostring(r)));
return r;
}

(I've also removed an unneeded variable: "scheme", but it would probably
be better to rename "v" as "scheme" now.)
 
M

mike3

mike3 <[email protected]> writes:

Yes, but what came first?  If the tracing was there originally and the
logic of the function was subsequently made more complex, doing it with
gotos makes no sense to me.  If the logic was there and used multiple
returns when the tracing was added, I can see some value in making the
minimum of changes -- just replace the returns with jumps.

You mean a goto?

Mmmm. So how would you write those returns?

Having got this far -- a single return -- I would add any more.  I'd
just write the function using natural structures rather than jumps:

  idn_result_t
  idn_normalizer_add(idn_normalizer_t ctx, const char *scheme_name) {
       idn_result_t r;
       void *v;

       assert(ctx != NULL && scheme_name != NULL);
       TRACE(("idn_normalizer_add(scheme_name=%s)\n", scheme_name));
       assert(INITIALIZED);

       if (idn__strhash_get(scheme_hash, scheme_name, &v) != idn_success) {
            ERROR(("idn_normalizer_add(): invalid scheme \"%-..30s\"\n",
                   scheme_name));
            r = idn_invalid_name;
       }
       else {
            assert(ctx->nschemes <= ctx->scheme_size);

            if (ctx->nschemes != ctx->scheme_size ||
                (r = expand_schemes(ctx)) == idn_success) {
                 ctx->schemes[ctx->nschemes++] = v;
                 r = idn_success;
            }
       }
       TRACE(("idn_normalizer_add(): %s\n", idn_result_tostring(r)));
       return r;
  }

(I've also removed an unneeded variable: "scheme", but it would probably
be better to rename "v" as "scheme" now.)

Hmm. This looks like the "nest" approach suggested on this thread by
Ike Naar. I suppose it'd be OK in this case, since there are only a
couple of
error-code-returning calls and so the nesting does not get too deep.
But
what if you had like 5 calls or more (and with something that's not a
bignum
package, so you couldn't, e.g. pass error codes inside bignums)?
 
M

mike3

"Aaron W.  Hsu" <[email protected]> writes:
Both are models of clarity, and neither is dogmatic.  I feel sure that
if all gotos in all programs had been as well-reasoned for as those in
Knuth's paper, Dijkstra would never have written his.

But I take it that, from what I've seen on this thread, using gotos to
jump to error code handlers is _not_ a "well-reasoned" use, right, and
so
Djikstra's rules still apply?
 
M

mike3

But I take it that, from what I've seen on this thread, using gotos to
jump to error code handlers is _not_ a "well-reasoned" use, right, and
so
Djikstra's rules still apply?
^^^^^^^^^
Damn, I did it again! Dijkstra, oops. :(
 
B

BartC

mike3 said:
On May 29, 11:44 pm, mike3 <[email protected]> wrote:
I've made an interesting and disturbing discovery.

I found THIS:
YES, that's right -- those are GOTOs in there, and they do this a LOT
in this program!

I've just looked at the Python source code. I counted around 2000 gotos in
200,000 lines of code. Mostly they seemed to be used in place of an early
return, where there is common code to be executed before a return, so this
is just written once at the end of a function.

It seems clear enough. Some people don't like goto because it can lead to
spaghetti code (eg. as used by io_x), but it doesn't need to.
Djikstra!!!

I've no idea who you're talking about there...
 
M

mike3

Indeed, that is why I tend to refer people to the Knuth response after
reading Dijkstra's original.  Even today there is an unnecessary
avoidance of GOTO, as if they are the root cause of all problems, when
the problem both of the above authors really want to address is code that
is not clear, reasoned, and justified. Spaghetti code should not exist,
regardless of whether or not you use Gotos to write spaghetti code.

So does using GOTOs to jump to error handlers make the code less
clear, reasoned, and justified (see the example in my OP)? How do they
affect the clarity of the "real world program" code snippet I posted?
 
T

Thad Smith

On a side note, the term ERROR_SUCCESS appears contradictory to me. I
would prefer something like NO_ERROR or SUCCESS if you had control
over the error definitions.

I agree. With an ERROR prefix I would use ERROR_NONE. My preference, though,
is to use <module>_SUCCESS (or <module>_OK), where <module> is replaced by a
name that designates the particular module whose status is being reported.

Thad
 
B

Ben Bacarisse

It's been a while since I read it, so I read it again and I am no longer
so sure.

The surprising thing is that the note (it less then two pages long) is
about go to in programming languages -- not in programs. In effect, it
is a call to language designers to provide the right set of control
structures so that go to's are no longer needed.

Of course, the problem he so clearly identifies -- the fact that
unbridled use of the go to (note: _unbridled use_) leads to code that
can't be understood without maintaining a very complex view of the
code's execution sate -- applies to code written in those languages that
already have a go to statement.
So does using GOTOs to jump to error handlers make the code less
clear, reasoned, and justified (see the example in my OP)? How do they
affect the clarity of the "real world program" code snippet I posted?

To my mind, yes. However, the code is short and simple to re-write
without goto, so it is neither a good example of what Dijkstra was
arguing against, nor is it a good example of what Knuth was arguing to
retain. Because it is short, it is not particularly hard to reason
about (though it is harder to reason about than the goto-free version)
and because the pattern is simple to re-write without goto, it not a
case that shows the value of retaining goto in the language.
 
B

Ben Bacarisse

mike3 said:
But I take it that, from what I've seen on this thread, using gotos to
jump to error code handlers is _not_ a "well-reasoned" use, right,

A well-reasoned use needs reasoning. Your recent example did not
include any reasoning, just a slightly facetious remark that you'd found
a goto in real code (multiple exclamation marks!!!).

In certain cases, depending on the logic, the case for a jump to a block
of common clean-up code can be made. It needs care to get it right
since you are often cleaning up an incomplete configuration using
variables whose cope is, by definition, nearly all of (and in C90 is all
of) the function body.

Your example was a bad one because the jump-free version is, to my mind,
clearer and simpler.
and so Djikstra's rules still apply?

(You've corrected the name error.)

Go read the note -- it less then two pages. Honestly, go read it. It
does not present a rule for programmers, so much as a call for language
designers to provide control structures that are easy to reason about.
If gotos are still needed, it suggests a lack of suitable alternatives.
 
J

Joe keane

I agree gotos are better.

I agree gotos are better.

int func(int x)
{
int err;
struct bar *a;
struct bar *b;
struct bar *c;
int err2;

err = allocate_bar(&a);
if (err != X_OK)
goto done;

err = allocate_bar(&b);
if (err != X_OK)
goto free_a;

err = allocate_bar(&c);
if (err != X_OK)
goto free_b;

...

/* normal exit */

err = free_bar(c);

err2 = free_bar(b);
err = err != X_OK ? err : err2;

err2 = free_bar(a);
err = err != X_OK ? err : err2;
return err;

/* error exit */

err2 = free_bar(c);
(void) err2;

free_b:
err2 = free_bar(b);
(void) err2;

free_a:
err2 = free_bar(a);
(void) err2;

done:
return err;
}

do that in C++!
 
B

Ben Bacarisse

mike3 said:
mike3 <[email protected]> writes:

Yes, but what came first?  If the tracing was there originally and the
logic of the function was subsequently made more complex, doing it with
gotos makes no sense to me.  If the logic was there and used multiple
returns when the tracing was added, I can see some value in making the
minimum of changes -- just replace the returns with jumps.

You mean a goto?
Yes.
Mmmm. So how would you write those returns?

Having got this far -- a single return -- I would add any more.  I'd
just write the function using natural structures rather than jumps:

  idn_result_t
  idn_normalizer_add(idn_normalizer_t ctx, const char *scheme_name) {
       idn_result_t r;
       void *v;

       assert(ctx != NULL && scheme_name != NULL);
       TRACE(("idn_normalizer_add(scheme_name=%s)\n", scheme_name));
       assert(INITIALIZED);

       if (idn__strhash_get(scheme_hash, scheme_name, &v) != idn_success) {
            ERROR(("idn_normalizer_add(): invalid scheme \"%-.30s\"\n",
                   scheme_name));
            r = idn_invalid_name;
       }
       else {
            assert(ctx->nschemes <= ctx->scheme_size);

            if (ctx->nschemes != ctx->scheme_size ||
                (r = expand_schemes(ctx)) == idn_success) {
                 ctx->schemes[ctx->nschemes++] = v;
                 r = idn_success;
            }
       }
       TRACE(("idn_normalizer_add(): %s\n", idn_result_tostring(r)));
       return r;
  }

(I've also removed an unneeded variable: "scheme", but it would probably
be better to rename "v" as "scheme" now.)

Hmm. This looks like the "nest" approach suggested on this thread by
Ike Naar. I suppose it'd be OK in this case, since there are only a
couple of error-code-returning calls and so the nesting does not get
too deep. But what if you had like 5 calls or more (and with
something that's not a bignum package, so you couldn't, e.g. pass
error codes inside bignums)?

Give an example, and I'll comment. I can't see exactly what it is you
are saying. I presume it is not just the code that you originally
posted, since I commented on that.

Having, multiple early error cases does not always lead to conceptually
deep nesting. If you have:

if (C1) {
rc = E1;
goto error;
}
if (C2) {
rc = E2;
goto error;
}
if (C3) {
rc = E3;
goto error;
}
/* main error-free code */
error:
/* common clean-up code */
return rc;

you can turn it into:

if (C1)
rc = E1;
else if (C2)
rc = E2;
else if (C3)
rc = E3;
else {
/* main error-free code */
}
/* common clean-up code */
return rc;

That's deeply nested in the most literal sense, but no one would have
any trouble reasoning about the nesting, eleven if were 100 levels deep.
 
M

Malcolm McLean

בת×ריך ×™×•× ×©×‘×ª,2 ביוני 2012 03:46:45 UTC+1, מ×ת mike3:
But I take it that, from what I've seen on this thread, using gotos to
jump to error code handlers is _not_ a "well-reasoned" use, right, and
so Djikstra's rules still apply?
Gotos for error handling ia a well reasoned use. Being able to give cogent reasons for doing something doesn't mean that it is right, or that someone else might be able to give even better reasons against. But it's rather different from the situation where an action is taken without consideration, or in ignorance of the problems that may be associated with it.
 
J

Jean-Christophe

I agree gotos are better.

#define SIZE 3
int func(int x)
{
int i, err;
struct bar *s[SIZE];

// init
for( err=X_OK, i=0; (err==X_OK) && (i<SIZE); i++ )
err = allocate_bar( &s );

// check success
if( err == X_OK )
{ // process ...
}

// free only allocated bar(s)
while( --i >= 0 ) free_bar( s );

// done
return err; // zero = success
}
 
B

Ben Bacarisse

I agree gotos are better.

int func(int x)
{
int err;
struct bar *a;
struct bar *b;
struct bar *c;
int err2;

err = allocate_bar(&a);
if (err != X_OK)
goto done;

err = allocate_bar(&b);
if (err != X_OK)
goto free_a;

err = allocate_bar(&c);
if (err != X_OK)
goto free_b;

...

/* normal exit */

err = free_bar(c);

err2 = free_bar(b);
err = err != X_OK ? err : err2;

err2 = free_bar(a);
err = err != X_OK ? err : err2;
return err;

/* error exit */

err2 = free_bar(c);
(void) err2;

This code is unreachable but I think it's not needed by any path so it's
probably just there due to an editing error.
free_b:
err2 = free_bar(b);
(void) err2;

free_a:
err2 = free_bar(a);
(void) err2;

done:
return err;
}

do that in C++!

Why do you think this is not already C++?

BTW, to my mind this API is broken. I like allocation functions that
return NULL on failure. If you need more data, an error-collecting
parameter can be passed. Having NULL pointers make it possible to write
clean-up code that does not depend on the order of allocation.

However, using the same API, I'd write:

int func(int x)
{
int err_a, err_b, err_c;
struct bar *a, *b, *c;

err_a = allocate_bar(&a);
err_b = allocate_bar(&b);
err_c = allocate_bar(&c);

if (err_a == X_OK && err_b == X_OK && err_c == X_OK) {
/* normal code */
}

if (err_a == X_OK) err_a = free_bar(a);
if (err_b == X_OK) err_b = free_bar(b);
if (err_c == X_OK) err_c = free_bar(c);

return err_a == X_OK && err_b == X_OK && err_c == X_OK;
}

This does not return the same value, as your code but that's not hard to
change. I found your code's preference for one possible error over
another to be a little odd, so I changed it to appear more uniform.
 
B

BartC

Ben Bacarisse said:
To my mind, yes. However, the code is short and simple to re-write
without goto, so it is neither a good example of what Dijkstra was
arguing against, nor is it a good example of what Knuth was arguing to
retain. Because it is short, it is not particularly hard to reason
about (though it is harder to reason about than the goto-free version)
and because the pattern is simple to re-write without goto, it not a
case that shows the value of retaining goto in the language.

The fact that C is quite often used a compilation target for other
languages, is enough reason to retain the goto statement. (Even forgetting
the millions of lines of existing code.)

Any control statement that doesn't exactly match one of C's limited
selection of statements, can easily be translated instead into a series of
conditional statements and gotos. Then there might be housekeeping code,
common return code, etc that doesn't fit into a structured programming
pattern, as well as actual spaghetti code, using gotos, in the source
language.

(Actually, you can probably get rid of most control statements; just leave a
simple if-statement (you don't need else), and goto. These are the most
important.)

While there might be theoretical ways of always being able to eliminate all
gotos by convoluting the code and adding extra state variables, it's far
easier just to use gotos. Especially in target code that no-one is ever
going to read.

But for actually writing in C, obviously you will use them sparingly.
 
B

blmblm

On Jun 1, 8:46 pm, mike3 <[email protected]> wrote:

[ snip ]
^^^^^^^^^
Damn, I did it again! Dijkstra, oops. :(

Maybe you will find the following useful as a way of remembering
(I *think* I got this from a native speaker of Dutch, though I may
be misremembering): "Dykstra" might be a more logical spelling,
and the "i" and "j" together look sort of like a "y". FWIW?
 
J

Joe keane

#define SIZE 3
int func(int x)
{
int i, err;
struct bar *s[SIZE];

// init
for( err=X_OK, i=0; (err==X_OK) && (i<SIZE); i++ )
err = allocate_bar( &s );

// check success
if( err == X_OK )
{ // process ...
}

// free only allocated bar(s)
while( --i >= 0 ) free_bar( s );

// done
return err; // zero = success
}


I used 'struct bar',
for all of them,
for sake of example.

It's very likely that in real world code
they are all different types,
with different functions to allocate them and unallocate them,
and so some code to iterate over them is likely to be
some more confusing than helpful.

If they are the same type,
such that you can put them in an array,
then of course what you say is better.
 
M

Martin Shobe

Joe said:
I agree gotos are better.

int func(int x)
{
int err;
struct bar *a;
struct bar *b;
struct bar *c;
int err2;

err = allocate_bar(&a);
if (err != X_OK)
goto done;

err = allocate_bar(&b);
if (err != X_OK)
goto free_a;

err = allocate_bar(&c);
if (err != X_OK)
goto free_b;

...

/* normal exit */

err = free_bar(c);

err2 = free_bar(b);
err = err != X_OK ? err : err2;

err2 = free_bar(a);
err = err != X_OK ? err : err2;
return err;

/* error exit */

err2 = free_bar(c);
(void) err2;

free_b:
err2 = free_bar(b);
(void) err2;

free_a:
err2 = free_bar(a);
(void) err2;

done:
return err;
}

do that in C++!

If you insist. :)

void func(int x)
{
std::unique_ptr<bar> a = new bar();
std::unique_ptr<bar> b = new bar();
std::unique_ptr<bar> c = new bar();

...
}

Martin Shobe
 
J

Joe keane

I like allocation functions that return NULL on failure.

I like allocation functions that set the carry bit on failure.

My own convention is that if a function returns less than zero, it has
reset your state to the most extent possible, and none of the 'output'
parameters are changed (but be liberal in what you receive).

IMHE the stupid mistakes occur when you mix together conventions [it
-should- be easy to do this, but we are just monkeys].

When you prevent stupid mistakes, you force people to make the more
complicated mistakes.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,786
Messages
2,569,625
Members
45,320
Latest member
icelord

Latest Threads

Top