Cleanup patterns

M

MQ

Hi all

I am just wondering how most people implement cleanup in C functions.
In particular, if the function opens a number of resources, these need
to be released properly should an error occur at any point in the
function (as well as at the end if successful). C++ has exceptions,
the only way I can see to do this neatly in C is to use goto
statements. Is my method of implementing cleanup good, or are their
better ways. Here is an example, which uses the global errno to store
the error.

#define CLEANUP(err) ({errno = (err); goto cleanup})

int example_function()
{
SOMETYPE * a,b,c;
errno = 0;

if(!(a = malloc(sizeof(a))))
CLEANUP(ENOMEM);

if(!(b = malloc(sizeof(b))))
CLEANUP(ENOMEM);

if(!(c = malloc(sizeof(c))))
CLEANUP(ENOMEM);

/* do something here */

cleanup:
if(a)
free(a);
if(b);
free(b);
if(c)
free(c);
if(errno)
return -1;
return 0;
}
 
T

Tom St Denis

MQ said:
Hi all

I am just wondering how most people implement cleanup in C functions.
In particular, if the function opens a number of resources, these need
to be released properly should an error occur at any point in the
function (as well as at the end if successful). C++ has exceptions,
the only way I can see to do this neatly in C is to use goto
statements. Is my method of implementing cleanup good, or are their
better ways. Here is an example, which uses the global errno to store
the error.

#define CLEANUP(err) ({errno = (err); goto cleanup})

Don't use macros with jumps and what not.
int example_function()
{
SOMETYPE * a,b,c;

b and c are not pointers [or at least not at the same level as a]
errno = 0;

if(!(a = malloc(sizeof(a))))
CLEANUP(ENOMEM);

if(!(b = malloc(sizeof(b))))
CLEANUP(ENOMEM);

if(!(c = malloc(sizeof(c))))
CLEANUP(ENOMEM);

I'd do simply

a = malloc(sizeof(*a));
if (a == NULL) { errno = outofmem; goto ERR_A; }
b = ...
if (b == NULL) { errno = outofmem; goto ERR_B; }
....

Then at the end have

errno = 0;

ERR_C:
free(b);
ERR_B:
free(a);
ERR_A:
return errno;
}

Makes the code easy to follow and doesn't involve nasty if's all over
the place (e.g. if you avoided goto all together..)

Tom
 
E

Eric Sosman

MQ said:
Hi all

I am just wondering how most people implement cleanup in C functions.
In particular, if the function opens a number of resources, these need
to be released properly should an error occur at any point in the
function (as well as at the end if successful). C++ has exceptions,
the only way I can see to do this neatly in C is to use goto
statements.

In C, it's typical to use nested if's, and/or carefully
initialized variables, and/or decomposition into multiple
functions, and/or goto.
Is my method of implementing cleanup good, or are their
better ways. Here is an example, which uses the global errno to store
the error.

Your method is Bad(tm). For starters, it won't compile.
That could be fixed by changing the macro definition, but then
you'd be invoking undefined behavior if either of the first two
malloc() calls failed. Also, Standard C doesn't define ENOMEM,
and `errno!=0' is not a valid test for failure.

All of these problems are easily fixed or worked around, but
consider: You've advanced CLEANUP as, essentially, a way to help
write better programs. Since it demonstrably has *not* helped
you to write a better program, perhaps it's time to question its
effectiveness, no?
 
M

MQ

Eric said:
In C, it's typical to use nested if's, and/or carefully
initialized variables, and/or decomposition into multiple
functions, and/or goto.


Your method is Bad(tm). For starters, it won't compile.
That could be fixed by changing the macro definition, but then
you'd be invoking undefined behavior if either of the first two
malloc() calls failed. Also, Standard C doesn't define ENOMEM,
and `errno!=0' is not a valid test for failure.

Hi Eric

I did throw this together rather quickly and no doubt there will be a
number of errors, but after posting I fixed these problems and it does
compile fine. Ignoring this, I just want to know if the general
structure of the code is good practise. You should be able to
ascertain what I am trying to achieve...


MQ.
 
C

CBFalconer

MQ said:
I am just wondering how most people implement cleanup in C
functions. In particular, if the function opens a number of
resources, these need to be released properly should an error
occur at any point in the function (as well as at the end if
successful). C++ has exceptions, the only way I can see to do
this neatly in C is to use goto statements. Is my method of
implementing cleanup good, or are their better ways. Here is an
example, which uses the global errno to store the error.

#define CLEANUP(err) ({errno = (err); goto cleanup})

int example_function()
{
SOMETYPE * a,b,c;
errno = 0;

if(!(a = malloc(sizeof(a))))
CLEANUP(ENOMEM);

if(!(b = malloc(sizeof(b))))
CLEANUP(ENOMEM);

if(!(c = malloc(sizeof(c))))
CLEANUP(ENOMEM);

/* do something here */

cleanup:
if(a)
free(a);
if(b);
free(b);
if(c)
free(c);
if(errno)
return -1;
return 0;
}

I would implement that as:

int example(void)
{
SOMETYPE *a, *b, *c;

errno = 0;
a = b = c = NULL;
if (!(a = malloc(sizeof *a))) errno = ENOMEM;
else if (!(b = malloc(sizeof *b))) errno = ENOMEM;
else if (!(c = malloc(sizeof *c))) errno = ENOMEM
else {
/* dosomething here */
}
free(a); free(b); free(c);
return -(0 != errno);
}

Notice the absence of obfuscating macros. Also the routine no
longer frees undefined pointers or non-pointers. No goto needed,
although I do not hesitate to use them when appropriate. I would
normally choose different return values.
 
E

Eric Sosman

MQ said:
Hi Eric

I did throw this together rather quickly and no doubt there will be a
number of errors, but after posting I fixed these problems and it does
compile fine. Ignoring this, I just want to know if the general
structure of the code is good practise. You should be able to
ascertain what I am trying to achieve...

It appears you are trying to achieve what I described as
typical: "nested if's, and/or carefully initialized variables,
and/or decomposition into multiple functions, and/or goto."
My original question remains: Given that the attempt led you
into four different errors (and perhaps more in the "fixed"
version we haven't seen yet), do you still think you're going
about things in the best way possible? Note that "it does
compile fine" and "it is correct" are not the same thing!
 
M

MQ

It appears you are trying to achieve what I described as
typical: "nested if's, and/or carefully initialized variables,
and/or decomposition into multiple functions, and/or goto."
My original question remains: Given that the attempt led you
into four different errors (and perhaps more in the "fixed"
version we haven't seen yet), do you still think you're going
about things in the best way possible?

Umm, well, that was my question originally. I am looking for feedback.
And the question is more generic than the example above. I want to
know *generically* what are the best methods for implementing cleanup
in C functions. I can say for sure that *something* needs to be done,
as a function that holds potentially numerous resources cannot clean up
at the point of error, otherwise most of the code will be cleanup
stuff, and obscure the real task that the code is doing.

MQ.
 
R

Richard Heathfield

MQ said:
Hi all

I am just wondering how most people implement cleanup in C functions.

I can't speak for most people, but I do it like this:

Attempt to acquire resource
If attempt succeeded
Use resource
Release resource
Else
Handle error
Endif

This code structure is recursively extensible, and functions are a fabulous
mechanism for keeping the indent level down.
 
C

Chris Dollin

MQ said:
Hi all

I am just wondering how most people implement cleanup in C functions.
In particular, if the function opens a number of resources, these need
to be released properly should an error occur at any point in the
function (as well as at the end if successful). C++ has exceptions,
the only way I can see to do this neatly in C is to use goto
statements. Is my method of implementing cleanup good, or are their
better ways. Here is an example, which uses the global errno to store
the error.

#define CLEANUP(err) ({errno = (err); goto cleanup})

int example_function()
{
SOMETYPE * a,b,c;
errno = 0;

if(!(a = malloc(sizeof(a))))
CLEANUP(ENOMEM);

if(!(b = malloc(sizeof(b))))
CLEANUP(ENOMEM);

if(!(c = malloc(sizeof(c))))
CLEANUP(ENOMEM);

/* do something here */

cleanup:
if(a)
free(a);
if(b);
free(b);
if(c)
free(c);
if(errno)
return -1;
return 0;
}

This one I'd write (fixing a couple of things along the way) as:

int exampleFunction()
{
SomeType *a = malloc( sizeof (*a) );
SomeType *b = malloc( sizeof (*b) );
SomeType *c = malloc( sizeof (*c) );
int status = a && b && c ? 0 : -1;
if (status == 0)
{
/* do what must be done */
}
free( a ), free( b ), free( c );
return status;
}

I don't think your example is complex enough to demonstrate whatever
value your macro might have.
 
E

Eric Sosman

CBFalconer said:
I would implement that as:

int example(void)
{
SOMETYPE *a, *b, *c;

errno = 0;
a = b = c = NULL;
if (!(a = malloc(sizeof *a))) errno = ENOMEM;
else if (!(b = malloc(sizeof *b))) errno = ENOMEM;
else if (!(c = malloc(sizeof *c))) errno = ENOMEM

Missing a semicolon here, right after the possibly
undefined identifier ENOMEM.
else {
/* dosomething here */
}
free(a); free(b); free(c);
return -(0 != errno);

Note that free() is permitted to set errno to a non-zero
value. So might /* dosomething here */ if it were not a
comment.

errno is not a good holder for program state, because
it's "transient:" most library functions can change it at
will, whether or not any error has occurred. Park a value
in errno, and it may be "gone in sixty seconds."
 
B

bert

MQ said:
Hi all

I am just wondering how most people implement cleanup in C functions.
In particular, if the function opens a number of resources, these need
to be released properly should an error occur at any point in the
function (as well as at the end if successful). C++ has exceptions,
the only way I can see to do this neatly in C is to use goto
statements. Is my method of implementing cleanup good, or are their
better ways.

I went through several phases of this problem,
and ended up quite satisfied with the following
as a general scheme.

1. Define a filescope structure which can hold
all pointers, error indicators etcetera relevant
to the function.

2. In the function, declare an instance of the
structure; initialise it with NULLs; fill it up
as resource allocation proceeds; overwrite
fields in it with NULL after freeing any resource.

3. On a failure, exit with a statement such as:
return tidy_exit (this_error, alloc_struct);

It is a simple exercise to write the function
tidy_exit( ) so that it makes use of the data
in its second parameter as required, and
finally returns a copy of its first parameter.

There were some times when it seemed a bit
like overkill, but (a) it was arbitrarily extendable,
and (b) it kept the main function looking clean.
--
 
M

Michael

I am just wondering how most people implement cleanup in C functions.
In particular, if the function opens a number of resources, these need
to be released properly should an error occur at any point in the
function (as well as at the end if successful). C++ has exceptions,
the only way I can see to do this neatly in C is to use goto
statements. Is my method of implementing cleanup good, or are their
better ways. Here is an example, which uses the global errno to store
the error.

#define CLEANUP(err) ({errno = (err); goto cleanup})

int example_function()
{
SOMETYPE * a,b,c;
errno = 0;

if(!(a = malloc(sizeof(a))))
CLEANUP(ENOMEM);

if(!(b = malloc(sizeof(b))))
CLEANUP(ENOMEM);

if(!(c = malloc(sizeof(c))))
CLEANUP(ENOMEM);

/* do something here */

cleanup:
if(a)
free(a);
if(b);
free(b);
if(c)
free(c);
if(errno)
return -1;
return 0;
}

I must be the only one who thinks this is good code. I once worked on
a project where we used a style like this and had orders of magnitude
fewer resource leaks than any other project our company had done using
if/then/else. (We were writing a server on Linux that had to process
tons of transactions and stay up for months, so even a small memory
leak would quickly add up and kill it.) The way we did it is a little
subtle, and requires programmer discipline. Basically, every function
that allocates resources must follow certain rules. The good thing is
that it's fairly easy to check whether a given function does, and if
not, to correct it.

Note: what follows is a rough idea, and may suffer the usual lack of
semicolons and stuff - details are left as an exercise to the reader.
In other words, treat the following as pseudo-code.

/* Rule 1: define a couple macros like this so you can use them like
statements
with semicolons, i.e., their syntax isn't too strange when you use
them (macro
is strange, though). We passed around an error message structure,
so our macros
were a little more complex than the version here with errno. */
#define GOTO_IF_BAD_ALLOC(mem, label) \
do {\
if (mem == NULL) { \
errno = ENOMEM; \
goto label;\
}\
while (0)

int example_function(SOMESTRUCT* s)
{
SOMETYPE *a;
SOMETYPE *b;
SOMETYPE *c;
RETURNTYPE r;

/* Rule 2: initialize everything to NULL, and the error to some
generic error condition */
a = NULL;
b = NULL;
c = NULL;
errno = UNDEFINED_ERROR;
/* Note: if the function ever returns UNDEFINED_ERROR, that
indicates a logic error
in the function. (Actual errors, as well as successful
completion, should set the
error code to a real value.) */

/* Rule 3: whenever you allocate, use the macro. */
a = malloc(sizeof(a));
GOTO_IF_BAD_ALLOC(a, cleanup);

b = malloc(sizeof(b));
GOTO_IF_BAD_ALLOC(b, cleanup);

/* Rule 3b: check all function call returns too, with an analogous
macro. */
r = call_some_function(a, b);
GOTO_IF_CALL_RETURNS_ERROR(r, cleanup);
/* Depending how you do it, this macro might need an extra
parameter with error
code, string message, or both. */

c = malloc(sizeof(c));
GOTO_IF_BAD_ALLOC(c, cleanup);
s->someField = c;
c = NULL; /* We no longer own it. */
/* Rule 4: when you pass ownership of a pointer, set it to NULL,
and add a comment
as above if there's even a question that it might be confusing
to some developer
in the future. */

/* do something here */

/* Rule 5: set error code to success right before the cleanup
routine. */
errno = SUCCESS;

/* Rule 6: have a global cleanup routine that all paths go through. */
cleanup:
/* Rule 7: check each owned pointer. If non-NULL, free it. */
if(a)
free(a);
if(b);
free(b);
if(c)
free(c);

/* Rule 8: return value depends on errno. (You could map to 0/1 or
something too,
as long as it depends on error code.) */
return errno;
}

This example shows the major rules, and looks fairly yucky until you
get used to it. However, it scales well if you have some complex
logic, nested ifs, etc. (e.g., if you only allocate c in some set of
circumstances). The basic idea is that every potentially unsafe action
(allocation, function call, etc.) is followed immediately by a macro
checking its return value and jumping to cleanup if there is an error.

Further, it's easy to check whether a given variable may be leaked.
You just check that all the rules are followed for that given variable.
By iterating over all variables this way, you get a robust function.

Like I said, it took a while (probably a week) to get used to this
coding style. But then we were golden, and spent very little time
debugging resource leaks.

Michael
 
B

Bill Reid

Richard Heathfield said:
MQ said:

I can't speak for most people, but I do it like this:

Attempt to acquire resource
If attempt succeeded
Use resource
Release resource
Else
Handle error
Endif

This code structure is recursively extensible, and functions are a fabulous
mechanism for keeping the indent level down.
Well, no, except for all trivial examples that aren't speed-sensitive.

For more complex functions, you'll quite often have contigent
dependancies for each resource acquisition that don't fit neatly into
the "nest", and the overhead of the function calling mechanism
means that functions are not infinitely or always a "fabulous
mechanism" for something as silly as "keeping the indent
level down" (unless you like your "C" code to run as slow
as "Java"!).

For me, the answer is whatever works best for the particular
situation; I do tend to use a few basic "patterns" quite often (and
sorry, I do use "goto"s a lot for these purposes when warranted),
but I think the real mistake here in this thread is to find a
"one size fits all" solution.
 
R

Richard Heathfield

Bill Reid said:
Well, no, except for all trivial examples that aren't speed-sensitive.

Not so far, not for me anyway. If this method ever turns out to be too slow,
I'll call you. In the meantime, I'll keep the code clear and simple.

<snip>
 
M

MQ

Bill said:
For more complex functions, you'll quite often have contigent
dependancies for each resource acquisition that don't fit neatly into
the "nest", and the overhead of the function calling mechanism
means that functions are not infinitely or always a "fabulous
mechanism" for something as silly as "keeping the indent
level down" (unless you like your "C" code to run as slow
as "Java"!).

These are my greatest concerns, as I am writing file system driver code
that needs to be fast efficient.
 
S

santosh

MQ said:
These are my greatest concerns, as I am writing file system driver code
that needs to be fast efficient.

At the processor level both control flow statements like IF, WHILE
etc., as well as GOTOs, will use the same or same class of JUMP
instructions, so there's no reason, atleast from this point of view, to
gratituously favour GOTO over others like IF, WHILE, FOR etc.

For filesystem drivers, your choice of data structures and their
manipulating algorithms will far outweigh micro-optimisations like
GOTOs, with respect to performance. Also don't forget that hardware
bottlenecks are also likely to be more penalising.
 
R

Richard Heathfield

MQ said:
These are my greatest concerns, as I am writing file system driver code
that needs to be fast efficient.

Rule 1: if it doesn't work, it doesn't matter how fast it is.
Rule 2: it is easier to make a correct program fast than it is to make a
fast program correct.
Rule 3: it is easier to make a readable program correct than it is to make a
correct program readable.

And that's before we get anywhere near the Two Rules of Micro-Optimisation.
For completeness, these are:

Rule 1: Don't do it.
Rule 2 (for experts only): Don't do it yet.
 
B

Bill Reid

Richard Heathfield said:
MQ said:

Rule 1: if it doesn't work, it doesn't matter how fast it is.
Rule 2: it is easier to make a correct program fast than it is to make a
fast program correct.
Rule 3: it is easier to make a readable program correct than it is to make a
correct program readable.
I think these are actually the three rules for pontificating about
programming instead of actually programming...years ago I read
the REAL rules for writing "correct" "fast" and "efficient" code, and
of course, as everybody who's not just pontificating knows,
it boils down to one general rule: pick the two you really want, because
a lot of times you CAN'T have all three...
And that's before we get anywhere near the Two Rules of Micro-Optimisation.
For completeness, these are:

Rule 1: Don't do it.
Rule 2 (for experts only): Don't do it yet.
I certainly wasn't talking about anything like "micro-optimization".
I'm talking about great big gobs of "MACRO-optimization", like
slowing down or speeding up what was virtually the IDENTICAL
code by a factor of up to ten-fold (I could make the same program run
in either two seconds or 20 seconds)...

This is not exactly an academic exercise for me, since a lot of the
code I work with takes several HOURS to run on a Pentium-class
machine, and I have to run it EVERY day. If I were to ignore certain
WELL-KNOWN basic principles of writing "fast" code, there
wouldn't be enough hours in the day to run it, and the whole
program would become moot...

However, I will admit that for a lot of the "commercial" programs
out there, the actual differences may only amount to a second or
a few seconds, since they are actually only performing limited data
processing, and that my experience looking at a lot of "professional
programming" reveals that most "C" code is written with a multiplicity
of "atomic" function calls, despite the fact that overhead for those
calls slows the program down substantially...this is just one of
several "bad habits" that "C" programmers seem to fall into "naturally"
that tend to make a lot of "C" code slower (than it could be), less
"readable", more bug-prone, and less secure...
 
W

websnarf

MQ said:
I am just wondering how most people implement cleanup in C functions.
In particular, if the function opens a number of resources, these need
to be released properly should an error occur at any point in the
function (as well as at the end if successful). C++ has exceptions,
the only way I can see to do this neatly in C is to use goto
statements.

Yes, I agree. This is the easiest way of doing it to support complete
generality.
[...] Is my method of implementing cleanup good, or are their
better ways. Here is an example, which uses the global errno to store
the error.

#define CLEANUP(err) ({errno = (err); goto cleanup})

int example_function()
{
SOMETYPE * a,b,c;
errno = 0;

if(!(a = malloc(sizeof(a))))
CLEANUP(ENOMEM);

if(!(b = malloc(sizeof(b))))
CLEANUP(ENOMEM);

if(!(c = malloc(sizeof(c))))
CLEANUP(ENOMEM);

/* do something here */

cleanup:
if(a)
free(a);
if(b);
free(b);
if(c)
free(c);
if(errno)
return -1;
return 0;
}

Well simple mallocs are an especially easy case. You just initialize
them all to NULL, allocate them all, then if any are NULL, just free
them all and return with some "out of memory" error.

The way you can do this in general is:

int retcode = RETURN_OK;
if (NULL == (handleA = resourceA (parms))) {
retcode = ERROCODE(RESA);
Exit1:;
return retcode;
}
if (NULL == (handleB = resourceB (parms))) {
retcode = ERROCODE(RESB);
Exit2:;
destroyresourceA (handleA);
goto Exit1;
}
if (NULL == (handleC = resourceC (parms))) {
retcode = ERROCODE(RESC);
Exit3:;
destroyresourceB (handleB);
goto Exit2;
}

/* Your intended inner loop here. */
/* If you fail fatally here, set retcode and goto Exit4 below. */

retcode = RETURN_OK;
Exit4:;
destroyresourceC (handleC);
goto Exit3;

Which assumes stack-like resource utilization. Its wordy, but its
scalable, and allows fairly straightforward modification (you can
insert, delete or switch the order of the resource utilization fairly
easily) and maintenance. Its also generalized enough to allow you to
fopen files for reading, and construct more complicated data
structures. It also supports and propogates the semantic of "if I
cannot function perfectly, then I will take no action at all and return
with an error code". I.e., you can use this to build ADTs, and to use
ADTs in the face of potential creation failures.

I don't see that there is much you can do to make this clean in C,
other than in especially trivial cases (like just straight flat
mallocs.) You could just store the return code, then jump to a common
switch() statement which would unwind the stack of resource
allocations, but you don't save very much in terms of clarity and you
end up with a code synchronization problem. The advantage of the above
is that the creation and destruction code are right next to each other.
 
R

Roland Pibinger

I am just wondering how most people implement cleanup in C functions.
In particular, if the function opens a number of resources, these need
to be released properly should an error occur at any point in the
function (as well as at the end if successful).

Not all resources are equal and need equal treatment. It hardly makes
sense to check the return value of malloc. Just use a xmalloc function
which can be found in many variants on the internet (e.g.
http://www.tug.org/tex-archive/dviware/dvi2xx/xmalloc.c).

Best wishes,
Roland Pibinger
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,764
Messages
2,569,564
Members
45,039
Latest member
CasimiraVa

Latest Threads

Top