A solution for the allocation failures problem

Keith Thompson · Jan 31, 2008

Eric Sosman said:
Richard said:

Only if another error occurred in the meantime. Functions which succeed
should not (IIRC are not allowed to) set errno.


See Question 12.14 in the comp.lang.c Frequently Asked
Questions (FAQ) list at http://www.c-faq.com/. Alternatively,
see 7.5p3:

"[...] The value of errno may be set to nonzero by a
library function call whether or not there is an error,
provided the use of errno is not documented in the
description of the function [...]"

And the reason for this is that the function might call other
functions, which themselves might fail without causing the calling
function to fail.
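
As a minimal sketch of what this means for calling code (the helper name parse_long and its return codes are hypothetical, not from the thread): clear errno before the call, and consult it only once the function's own result has indicated a failure.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parse a decimal long.
   Returns 0 on success, -1 if no digits were found, -2 on overflow. */
int parse_long(const char *s, long *out)
{
    char *end;
    long v;

    errno = 0;                      /* clear any stale value first */
    v = strtol(s, &end, 10);

    if (end == s)
        return -1;                  /* nothing was converted */
    if ((v == LONG_MAX || v == LONG_MIN) && errno == ERANGE)
        return -2;                  /* the result AND errno report overflow */

    *out = v;                       /* success: errno is not consulted */
    return 0;
}
```

On the success path errno is never inspected, so it does not matter whether some inner call left it nonzero.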

Malcolm McLean · Jan 31, 2008

Kelsey Bjarnason said:
If compiled with NDEBUG defined, that test magically disappears, meaning
that xmalloc( -3 ) now attempts to process the -3.

There is a case to be made against assert(). However you've chosen the wrong
function to demonstrate it.

That -3, since it isn't trapped, now merrily attempts to allocate some
18446744073709551613 bytes, with who knows what consequences.

The consequence can only be to return zero for an insufficient memory
condition.

An application that asks for a negative amount of memory is bugged. There's
no ideal way to treat a bugged application.
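
For illustration only, here is a sketch of a check that does not disappear under NDEBUG. The wrapper name checked_alloc and its signed parameter are assumptions made for this example; the xmalloc() under discussion is not reproduced here. The second function shows the conversion behind the large number quoted above: with a 64-bit size_t, -3 becomes 18446744073709551613.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper: takes a signed byte count so that a negative
   request can be detected before it is converted to size_t. */
void *checked_alloc(long long nbytes)
{
    if (nbytes < 0)
        return NULL;                /* ordinary check, unaffected by NDEBUG */
    if ((unsigned long long)nbytes > SIZE_MAX)
        return NULL;                /* does not fit in size_t */
    return malloc((size_t)nbytes);
}

/* What a negative int turns into when converted to size_t. */
size_t as_size(int n)
{
    return (size_t)n;               /* -3 yields SIZE_MAX - 2 */
}
```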

Malcolm McLean · Jan 31, 2008

Keith Thompson said:
1. Ignore it and keep going (dangerous).
2. Use an allocator that immediately aborts the program on failure.
3. Check for failure on each allocation. On failure:
3a. Immediately abort the program (equivalent to 2).
3b. Clean up and abort the program.
3c. Clean up and continue processing (this can be difficult, but
it's important if you want the program to be robust).
4. If the language supports exceptions (C doesn't), catch the error at
whatever level is appropriate, not necessarily immediately after
the attempted allocation. Perform appropriate cleanup and recovery
in the exception handler.

5. Demand user intervention to supply the required memory.

Eric Sosman · Jan 31, 2008

jacob said:
[...]
Mr Bjarnason is fond of saying how clever he is. This is of course
not verifiable here. He can claim he never makes allocation mistakes
but this sounds very hollow to me. Like other claims done by the
"regulars" here, they sound pathetic, like the one of the famous
Dan Pop that claimed that he never had a crash of a program written by
him.

Boasting here is very easy.

This, from the person who launched the thread with an
admission that his own programming skills were such that
he found it "impossible" to check for malloc() failure ...

Malcolm McLean · Jan 31, 2008

Eric Sosman said:
This, from the person who launched the thread with an
admission that his own programming skills were such that
he found it "impossible" to check for malloc() failure ...

He means impossible to write and test custom allocation failure handling
code for each call.
Any idiot can put malloc() in a wrapper or inside an if statement. It's what
goes in the body of that if statement that can cause problems. Not for
every large program, in my opinion, but for some programs.

dj3vande · Jan 31, 2008

Keith Thompson said:
It's *always* possible for an attempted memory allocation to fail.

(Or any other resource allocation)

The possible ways to handle this possibility are:

1. Ignore it and keep going (dangerous).
2. Use an allocator that immediately aborts the program on failure.
3. Check for failure on each allocation. On failure:
3a. Immediately abort the program (equivalent to 2).
3b. Clean up and abort the program.
3c. Clean up and continue processing (this can be difficult, but
it's important if you want the program to be robust).

3d. Clean up and abort the current operation, and report failure to
the caller.
Let the caller choose how to handle it, which might be any of
the ways listed here. (Note that 1 may be less dangerous at
a higher level, depending on the operation that failed.)

4. If the language supports exceptions (C doesn't), catch the error at
whatever level is appropriate, not necessarily immediately after
the attempted allocation. Perform appropriate cleanup and recovery
in the exception handler.

(Approximately equivalent to 3d.)

dave
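
A minimal sketch of option 3d, assuming a made-up struct name_pair and helper names (none of them come from the thread): the function undoes its own partial work and reports failure with a null pointer, leaving the policy decision to the caller.

```c
#include <stdlib.h>
#include <string.h>

struct name_pair {
    char *first;
    char *last;
};

/* Copy a string into fresh storage; returns NULL if malloc fails. */
static char *dup_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Returns NULL on allocation failure, with nothing left allocated. */
struct name_pair *name_pair_create(const char *first, const char *last)
{
    struct name_pair *p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;

    p->first = dup_string(first);
    p->last = dup_string(last);
    if (p->first == NULL || p->last == NULL) {
        free(p->first);             /* free(NULL) is a no-op */
        free(p->last);
        free(p);
        return NULL;                /* the caller decides what happens next */
    }
    return p;
}

void name_pair_destroy(struct name_pair *p)
{
    if (p == NULL)
        return;
    free(p->first);
    free(p->last);
    free(p);
}
```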

jacob navia · Jan 31, 2008

Malcolm said:
He means impossible to write and test custom allocation failure handling
code for each call.
Any idiot can put malloc() in a wrapper or inside an if statement. It's
what goes in the body of that if statement that can cause problems.
Not for every large program, in my opinion, but for some programs.

"There is no man blinder as the one that doesn't want to see"

Paul Hsieh · Jan 31, 2008

There are many situations when you can't add at each level a complex
unwinding code to take care of an allocation failure in the middle of
the construction of a complex data structure. It is much more sensible
to set an a priori strategy and not check at each call.

Ah! I see what you mean now. I have actually run across this, and
found that as long as you allocate in a consistent nested order that
there is always a way to clean up properly; though in the worst cases
it requires a bunch of gotos that *could* actually be seen as
spaghetti code.

I suppose if this nested consistency could not be guaranteed, then
these clean up fragments could be very complicated. But then I just
use the other method of getting this done which is to free everything
no matter what -- the point being that there would be some boolean
method for determining if a resource/object/whatever has not even yet
been initialized, in which case freeing is a NOOP on it. In the case
of bstrings, for example, if the bstring pointer is NULL, freeing it
becomes a NOOP.
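
The nested-order approach might look roughly like the sketch below. The struct table type and its functions are invented for this illustration. Resources are acquired in a fixed order and the goto labels release them in the reverse order, so each failure point jumps to the label that frees exactly what has been acquired so far.

```c
#include <stdint.h>
#include <stdlib.h>

struct table {
    size_t rows, cols;
    double **row;       /* row[i] points into cells */
    double *cells;      /* rows * cols zero-initialised values */
};

struct table *table_create(size_t rows, size_t cols)
{
    struct table *t;
    size_t i;

    /* reject empty tables and sizes whose byte count would overflow */
    if (rows == 0 || cols == 0 || cols > SIZE_MAX / sizeof(double) / rows)
        return NULL;

    t = malloc(sizeof *t);
    if (t == NULL)
        goto fail_table;
    t->row = malloc(rows * sizeof *t->row);
    if (t->row == NULL)
        goto fail_rows;
    t->cells = calloc(rows * cols, sizeof *t->cells);
    if (t->cells == NULL)
        goto fail_cells;

    for (i = 0; i < rows; i++)
        t->row[i] = t->cells + i * cols;
    t->rows = rows;
    t->cols = cols;
    return t;

    /* unwind in the reverse of the allocation order */
fail_cells:
    free(t->row);
fail_rows:
    free(t);
fail_table:
    return NULL;
}

void table_destroy(struct table *t)
{
    if (t == NULL)
        return;
    free(t->cells);
    free(t->row);
    free(t);
}
```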

But in both cases I am relying on conventions (that I am defining) to
make sure I can do it. I am pretty sure that my conventions don't
impact the ultimate scope of programming that I am capable of, so I
could make a claim that I think that it's always possible to clean up
in a well defined way, but I can't say this with certainty. Clean-up
from failures along the way is certainly the least structured and
therefore least maintainable code I have ever written. So it would
similarly not surprise me if there were some cases where it was
actually *really hard* to write proper clean up code.

No. lcc-win implements try/catch, but of course the regulars here will
start crying "heresy heresy" so I did not mention that.

Hmmm ... right. Personally, I try to separate each of my "extensions"
from the standard. For example, my pstdint.h file, my SuperFastHash()
function, Bstrlib, my primeat library, etc, etc, all are completely
independent of each other. So people who want to use one of them
don't have to buy into my philosophy on any of the other stuff. (The
one major exception is my CSV library, since I have an agenda of
proving that Bstrlib is also equal or better than state of the art
hand crafted Clib string code even where Clib enjoys its greatest
advantage; namely parsing.)

[snip]

Ok, I see what you are doing here, but this is a desperate strategy
that I don't think fully works as well as you are hoping.


It is one way of trying to cope with this. The same strategy is
implemented for the stack under windows. You get a stack overflow
exception with one LAST page still free to be allocated for
the stack. You can then still call some functions and you have
a stack of 4096 bytes reserved for this purpose. This is the
same strategy.

Yes I see. But in the case of the depleted stack there is a platform
specific design problem with your program that cannot be resolved at
runtime; so some instant desperate strategy *must* be employed. With
the heap, things are different. There are many well known algorithm
alternatives which trade off memory footprint for speed -- if you fail
with the memory allocation method, you can often just retry your
algorithm with a slower memory conservative solution. This is
fundamentally why I cannot endorse the xmalloc() design -- it removes
a legitimate programming path in which you literally write recovery
code in your software (if you could no longer use malloc() I mean).
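
One way to picture this kind of retry, as a sketch under assumptions: rotate_left, the alloc_fn parameter and the no_memory stub are all invented here. The allocator is passed in only so that the failure path can be exercised. The fast path uses a scratch buffer; if that allocation fails, the same result is produced in place by three reversals, which needs no extra memory.

```c
#include <stdlib.h>
#include <string.h>

/* Must behave like malloc: the result is released with free(). */
typedef void *(*alloc_fn)(size_t);

/* Reverse the elements a[lo] .. a[hi - 1] in place. */
static void reverse(int *a, size_t lo, size_t hi)
{
    while (lo + 1 < hi) {
        int tmp = a[lo];
        hi--;
        a[lo] = a[hi];
        a[hi] = tmp;
        lo++;
    }
}

/* Rotate a[0..n) left by k positions.
   Returns 1 if the scratch-buffer path ran, 0 if the in-place fallback ran. */
int rotate_left(int *a, size_t n, size_t k, alloc_fn alloc)
{
    int *tmp;

    if (n == 0)
        return 1;
    k %= n;
    if (k == 0)
        return 1;

    tmp = alloc(k * sizeof *tmp);
    if (tmp != NULL) {
        memcpy(tmp, a, k * sizeof *a);              /* save the first k */
        memmove(a, a + k, (n - k) * sizeof *a);     /* slide the rest down */
        memcpy(a + (n - k), tmp, k * sizeof *a);    /* append the saved part */
        free(tmp);
        return 1;
    }

    /* memory-conservative retry: three in-place reversals */
    reverse(a, 0, k);
    reverse(a, k, n);
    reverse(a, 0, n);
    return 0;
}

/* Stub allocator that always fails, for exercising the fallback. */
static void *no_memory(size_t size)
{
    (void)size;
    return NULL;
}
```

Calling rotate_left(a, n, k, malloc) takes the fast path when memory is available; passing no_memory forces the fallback, and both leave the array in the same order.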

This is just a model OBVIOUSLY. No multi-threading considerations are
in this code. It is just an outline of how this could be solved.

Right. I am just sensitive to this sort of thing, so I feel the need
to bring it up when I see it. It's clear the entire C language
committee seems to care less about these things as they continue to
endorse errno(), strtok(), asctime() and so on.

This is a good point.

It must implement a counter so that it doesn't get into an infinite
loop.

Right. In fact it needs to be a thread safe counter.

I just realized that like any event handlers, you want to have a way
to chain them. In this case you will also need:

allocEvent getAllocEvent (enum incident);
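
The snipped proposal is not reproduced in this thread, so the following is only a guess at the general shape, with every name hypothetical (alloc_handler, set_alloc_handler, get_alloc_handler, alloc_with_handler, MAX_RETRIES). It shows two of the ideas mentioned: a retry limit so a handler cannot cause an infinite loop, and a setter that returns the previous handler so handlers can be chained. The retry counter is local to each call; the handler pointer itself is a plain global, so this sketch is single-threaded only.

```c
#include <stddef.h>
#include <stdlib.h>

/* A handler tries to release memory; nonzero means "worth retrying". */
typedef int (*alloc_handler)(size_t wanted);

static alloc_handler current_handler;

/* Install a handler and return the previous one, so that the new
   handler can call it (chaining), much as signal() does. */
alloc_handler set_alloc_handler(alloc_handler h)
{
    alloc_handler previous = current_handler;
    current_handler = h;
    return previous;
}

alloc_handler get_alloc_handler(void)
{
    return current_handler;
}

enum { MAX_RETRIES = 3 };

/* The allocator is a parameter only so failures can be simulated. */
void *alloc_with_handler(size_t size, void *(*alloc)(size_t))
{
    int attempt;

    for (attempt = 0; ; attempt++) {
        void *p = alloc(size);
        if (p != NULL)
            return p;
        if (attempt >= MAX_RETRIES)
            return NULL;            /* bounded: give up after the retries */
        if (current_handler == NULL || !current_handler(size))
            return NULL;            /* nobody can help */
    }
}

/* Demonstration stubs: an allocator that fails a set number of times
   and a handler that only counts how often it was called. */
static int failures_left;
static int handler_calls;

static void *flaky_alloc(size_t size)
{
    if (failures_left > 0) {
        failures_left--;
        return NULL;
    }
    return malloc(size);
}

static int counting_handler(size_t wanted)
{
    (void)wanted;
    handler_calls++;
    return 1;
}
```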

[...]

These are quite reasonable specs. I will try to think them over
and maybe implement them in the lcc-win library.

Well, obviously I have a lot more that I would like to add, like a way
of examining an allocation and getting back the size of the allocation
from it (many compilers have this as an extension already), a counter
for the total number of bytes currently allocated, the ability to walk
through your allocated memory blocks (many compilers have something
*similar* to this) and in fact, a memory classifier function
(something that could look at a pointer and give you either a
definitive classification for the pointer (static, auto, program,
heap, unknown) or at least a guess.) There is also potential for an
improved realloc() but there were some issues with my original
thinking about it, and I haven't revisited it.
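
Two of these wishes, the size of a block and a running total of bytes in use, can be approximated in portable C by a wrapper that stores a small header in front of each block. This is a sketch with invented names (tracked_malloc and friends), not any compiler's extension, and it assumes C11 for max_align_t.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Header placed in front of every block; the union keeps the user
   data that follows it suitably aligned for any object type. */
typedef union {
    size_t size;
    max_align_t align;
} block_header;

static size_t bytes_in_use;

void *tracked_malloc(size_t size)
{
    block_header *h;

    if (size > SIZE_MAX - sizeof *h)
        return NULL;                /* header + size would overflow */
    h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->size = size;
    bytes_in_use += size;
    return h + 1;                   /* user data starts after the header */
}

/* Size originally requested for a block from tracked_malloc. */
size_t tracked_size(const void *p)
{
    return p == NULL ? 0 : ((const block_header *)p - 1)->size;
}

void tracked_free(void *p)
{
    block_header *h;

    if (p == NULL)
        return;
    h = (block_header *)p - 1;
    bytes_in_use -= h->size;
    free(h);
}

size_t tracked_bytes_in_use(void)
{
    return bytes_in_use;
}
```

Pointers from tracked_malloc must only be released with tracked_free, since the address handed to the caller is not the one malloc returned.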

I think we need to recognize that debugging is a standard part of
development that should be made a standard part of the library.
Because of C's default "handle everything about memory" design, in
order to even compete, in the long term, with GC languages, the
language *needs* to expose as much information as possible to
demonstrate that it's still feasible to stick with its approach. Right
now, C programmers have a situation where we not only have to deal
with all memory allocation by ourselves but we have no idea what
global state our memory manager is in, or even what the memory
utilization of our program is. In other words C programmers don't
actually know more about the memory state than a Java programmer does
even though C has a totally deterministic interface to memory.

Paul Hsieh · Jan 31, 2008

Changing the syntax for memory allocation doesn't magically solve
anything.

It varies from language to language. Typically either the allocator
returns a null pointer on failure (exactly what C's malloc() does), or
it throws/raises an exception (if the language supports exceptions).

It's *always* possible for an attempted memory allocation to fail.
The possible ways to handle this possibility are:

1. Ignore it and keep going (dangerous).

This is not a solution.

2. Use an allocator that immediately aborts the program on failure.

You only do this because you don't know how to solve the problem
without all that memory, or else, you *can't* solve the problem
without all that memory (rare.)

3. Check for failure on each allocation. On failure:
3a. Immediately abort the program (equivalent to 2).
3b. Clean up and abort the program.
3c. Clean up and continue processing (this can be difficult, but
it's important if you want the program to be robust).

3a = 3b = 2. Aborting doesn't require that you clean up first -- any
real OS(TM) will clean you up upon aborting anyways.

Your 3c only makes sense if you are talking about your whole program,
not just the code fragments where the error occurs (because that
typically will make no sense.) But you are inherently omitting how
you *achieve* it. In straight C you do it by observing the following
in your primitive calls:

3c1) Undo all operations in your scope and return from the current
function with an error code (my typical approach is NULL for pointers
and negative numbers for status codes).

which is applied recursively throughout your "primitives" call chain.
And in your policy/management calls:

3c2) If a strategy fails, go to the next strategy until all are
exhausted, then if everything fails log or return an error code that
amounts to "INCOMPLETE DUE TO LACK OF RESOURCES".

Of course this means you have to understand the distinction between
primitive calls and policy/management calls in your program.
Primitives make and handle data structures, whereas policy performs
actions on these data structures.
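
A policy-level loop in the spirit of 3c2 could be sketched as follows. The status codes, the strategy_fn type and the two stub strategies are assumptions for the example; negative numbers are used for failures, following the convention in 3c1.

```c
#include <stddef.h>

enum {
    JOB_OK = 0,
    JOB_NO_MEMORY = -1,     /* this strategy ran out of memory */
    JOB_INCOMPLETE = -2     /* every strategy ran out of memory */
};

/* A strategy is a primitive that cleans up after itself and
   returns one of the status codes above. */
typedef int (*strategy_fn)(void *job);

/* Try each strategy in turn until one does not fail for lack of memory. */
int run_strategies(const strategy_fn *strategies, size_t count, void *job)
{
    size_t i;

    for (i = 0; i < count; i++) {
        int status = strategies[i](job);
        if (status != JOB_NO_MEMORY)
            return status;          /* success, or an error retrying won't fix */
    }
    return JOB_INCOMPLETE;          /* all strategies exhausted */
}

/* Demonstration stubs: the job is just an int recording who finished. */
static int fast_but_hungry(void *job)
{
    (void)job;
    return JOB_NO_MEMORY;           /* pretend the big allocation failed */
}

static int slow_but_frugal(void *job)
{
    *(int *)job = 2;                /* record that strategy 2 did the work */
    return JOB_OK;
}
```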

4. If the language supports exceptions (C doesn't), catch the error at
whatever level is appropriate, not necessarily immediately after
the attempted allocation. Perform appropriate cleanup and recovery
in the exception handler.

This is just a fancy way of implementing 3c with special language
features.

Going back to "at those lines of code" programming solutions I would
add:

5. Defer the error (but retain this information) until it can be
handled in an aggregated way.

6. Provide a stand-in for typical functionality. (I.e., return some
static memory declaration.) This can only be done in rare instances
where contention is a less serious issue than running out of memory,
or if you can guard the memory location with a semaphore, for example.
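
Option 5 can be illustrated with a string builder that remembers a failure instead of reporting it at every call. The type and function names here are hypothetical. Once an append fails, later appends do nothing, and the caller checks the failed flag once at the end.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct builder {
    char *data;     /* NUL-terminated once something has been appended */
    size_t len;     /* length without the terminator */
    size_t cap;     /* bytes allocated */
    int failed;     /* sticky: set on the first allocation failure */
};

void builder_init(struct builder *b)
{
    b->data = NULL;
    b->len = 0;
    b->cap = 0;
    b->failed = 0;
}

void builder_append(struct builder *b, const char *s)
{
    size_t n = strlen(s);
    size_t need;

    if (b->failed)
        return;                     /* an earlier error is still pending */
    if (n >= SIZE_MAX / 4 || b->len >= SIZE_MAX / 4) {
        b->failed = 1;              /* refuse sizes that could overflow below */
        return;
    }
    need = b->len + n + 1;
    if (need > b->cap) {
        size_t cap = b->cap ? b->cap : 16;
        char *grown;

        while (cap < need)
            cap *= 2;
        grown = realloc(b->data, cap);
        if (grown == NULL) {
            b->failed = 1;          /* defer: record it, keep the old data */
            return;
        }
        b->data = grown;
        b->cap = cap;
    }
    memcpy(b->data + b->len, s, n + 1);     /* copies the terminator too */
    b->len += n;
}
```

A caller can chain any number of builder_append calls and then test b.failed a single time before using b.data.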

Flash Gordon · Jan 31, 2008

Paul Hsieh wrote, On 31/01/08 18:47:

Ah! I see what you mean now. I have actually run across this, and
found that as long as you allocate in a consistent nested order that
there is always a way to clean up properly; though in the worst cases
it requires a bunch of gotos that *could* actually be seen as
spaghetti code.

I have a different but possibly related method...

I suppose if this nested consistency could not be guaranteed, then
these clean up fragments could be very complicated.

Or just as simple...

But then I just
use the other method of getting this done which is to free everything
no matter what -- the point being that there would be some boolean
method for determining if a resource/object/whatever has not even yet
been initialized, in which case freeing is a NOOP on it. In the case
of bstrings, for example, if the bstring pointer is NULL, freeing it
becomes a NOOP.

For me the boolean method for determining whether a resource has been
created yet is to set the pointers to NULL at the point where they are
created. For example...

    struct difficult {
        struct nested *ptr1;
        struct more *ptr2;
        ...
    };

    /* {0} initialises every member pointer to a null pointer */
    static const struct difficult difficult_null = {0};

    struct difficult *ptr = malloc(sizeof *ptr);
    if (ptr != NULL)
        *ptr = difficult_null;

Now I can create the rest in an order and if I write all my "destroy"
functions so that like free they are a noop for a null pointer all is
simple.
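
Filled out into something compilable, the convention might look like this sketch. The contents of struct nested and struct more, and all the create/destroy function names, are invented for the example. Because every destroy function accepts a null pointer, a single call to difficult_destroy() cleans up an object at any stage of construction.

```c
#include <stdlib.h>

struct nested { int *values; };     /* hypothetical contents */
struct more { char *text; };        /* hypothetical contents */

struct difficult {
    struct nested *ptr1;
    struct more *ptr2;
};

static const struct difficult difficult_null = {0};

/* Like free(), each destroy function is a no-op for a null pointer. */
void nested_destroy(struct nested *n)
{
    if (n == NULL)
        return;
    free(n->values);
    free(n);
}

void more_destroy(struct more *m)
{
    if (m == NULL)
        return;
    free(m->text);
    free(m);
}

void difficult_destroy(struct difficult *d)
{
    if (d == NULL)
        return;
    nested_destroy(d->ptr1);
    more_destroy(d->ptr2);
    free(d);
}

struct nested *nested_create(size_t count)
{
    struct nested *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->values = calloc(count ? count : 1, sizeof *n->values);
    if (n->values == NULL) {
        free(n);
        return NULL;
    }
    return n;
}

struct more *more_create(size_t length)
{
    struct more *m = malloc(sizeof *m);
    if (m == NULL)
        return NULL;
    m->text = calloc(length + 1, 1);    /* empty string of that capacity */
    if (m->text == NULL) {
        free(m);
        return NULL;
    }
    return m;
}

struct difficult *difficult_create(size_t count, size_t length)
{
    struct difficult *d = malloc(sizeof *d);
    if (d == NULL)
        return NULL;
    *d = difficult_null;            /* all members null before anything else */

    d->ptr1 = nested_create(count);
    if (d->ptr1 != NULL)
        d->ptr2 = more_create(length);
    if (d->ptr1 == NULL || d->ptr2 == NULL) {
        difficult_destroy(d);       /* safe however far construction got */
        return NULL;
    }
    return d;
}
```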

But in both cases I am relying on conventions (that I am defining) to
make sure I can do it. I am pretty sure that my conventions don't
impact the ultimate scope of programming that I am capable of, so I
could make a claim that I think that it's always possible to clean up
in a well defined way, but I can't say this with certainty.

I suspect your convention is probably similar to what I've just shown
and I can't think of how destroying a partially created object can be
difficult if you stick to it.

Clean-up
from failures along the way is certainly the least structured and
therefore least maintainable code I have ever written. So it would
similarly not surprise me if there were some cases where it was
actually *really hard* to write proper clean up code.

Apply appropriate methods at all levels of constructing the object and I
find it hard to see how you can reach a really hard to destroy
situation. Do a bad job of constructing, on the other hand, and you can
make it damn near impossible.

I have code, not written by me, that can manage to tidy up a heck of a
lot through some horrendous structures. At most levels the destroy code
is simple linear code or simple loops, the only places it is tricky is
where IMHO the constructor is truly horrible.

I happen to like try/catch, but it is not part of the C language.

Hmmm ... right. Personally, I try to separate each of my "extensions"
from the standard.

I agree with your implication that your libraries are not really
extensions, just third party libraries. I also agree with keeping them
independent *unless* there is a real dependency. For example, libxslt
depending on libxml makes perfect sense to me. With real extensions
there is a much bigger argument for keeping them separate, and to be
fair to Jacob I believe his try/catch extension is independent from his
other extensions.

<snip>

With the stack I don't think it "frees the stack to make it available
for use during recovery". Rather it uses the already allocated reserve
stack. Linux can have a separate stack for signal handling for probably
similar reasons. So in the malloc instance I would say you make use of
the pre-allocated reserve rather than freeing it so you can do further
mallocs whilst recovering.
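
A pre-allocated reserve that recovery code draws on directly, rather than one that is freed back to malloc, could be sketched as a small bump allocator over a static buffer. The names, the 4096-byte size (borrowed from the stack example above) and the reset function are assumptions; it needs C11 for max_align_t and _Alignof and is not thread-safe.

```c
#include <stddef.h>

#define RESERVE_BYTES 4096

/* Static storage set aside up front; the union forces alignment
   suitable for any object type. */
static union {
    max_align_t align;
    unsigned char bytes[RESERVE_BYTES];
} reserve;

static size_t reserve_used;

/* Hand out memory from the reserve for use while recovering.
   Returns NULL once the reserve itself is exhausted. */
void *reserve_alloc(size_t size)
{
    const size_t a = _Alignof(max_align_t);
    size_t rounded;
    void *p;

    if (size > RESERVE_BYTES)
        return NULL;
    rounded = (size + a - 1) / a * a;       /* keep later blocks aligned */
    if (rounded > RESERVE_BYTES - reserve_used)
        return NULL;
    p = reserve.bytes + reserve_used;
    reserve_used += rounded;
    return p;
}

/* Blocks are not freed one by one; the whole reserve is
   recycled once recovery has finished. */
void reserve_reset(void)
{
    reserve_used = 0;
}
```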

Yes I see. But in the case of the depleted stack there is a platform
specific design problem with your program that cannot be resolved at
runtime; so some instant desperate strategy *must* be employed. With

There is another strategy. You analyse your code and *prove* the maximum
stack usage. This is easiest if you completely avoid recursion. It was
also a requirement for the embedded work I used to do in which recursion
was actually banned.

the heap, things are different. There are many well known algorithm
alternatives which trade off memory footprint for speed -- if you fail
with the memory allocation method, you can often just retry your
algorithm with a slower memory conservative solution.

The same can apply to stack usage. Switch to an algorithm that does not
use recursion for example (maybe implementing a "stack" using malloc ;-)).
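
As a sketch of that idea (the tree type and function names are made up for the example): a binary tree is summed without recursion by keeping the pending nodes on an explicit stack obtained from realloc(). Running out of memory then shows up as an ordinary error return instead of a stack overflow.

```c
#include <stdint.h>
#include <stdlib.h>

struct node {
    int value;
    struct node *left, *right;
};

struct node_stack {
    const struct node **items;
    size_t used, cap;
};

/* Push a node, growing the stack as needed. Null nodes are skipped.
   Returns 0 on success, -1 if the stack could not grow. */
static int push(struct node_stack *s, const struct node *n)
{
    if (n == NULL)
        return 0;
    if (s->used == s->cap) {
        size_t cap = s->cap ? s->cap * 2 : 16;
        const struct node **grown;

        if (cap > SIZE_MAX / sizeof *grown)
            return -1;              /* byte count would overflow */
        grown = realloc(s->items, cap * sizeof *grown);
        if (grown == NULL)
            return -1;
        s->items = grown;
        s->cap = cap;
    }
    s->items[s->used++] = n;
    return 0;
}

/* Sum all values in the tree without recursion.
   Returns 0 and stores the sum in *out, or -1 on allocation failure. */
int tree_sum(const struct node *root, long *out)
{
    struct node_stack s = { NULL, 0, 0 };
    long sum = 0;
    int status = push(&s, root);

    while (status == 0 && s.used > 0) {
        const struct node *n = s.items[--s.used];
        sum += n->value;
        status = push(&s, n->left);
        if (status == 0)
            status = push(&s, n->right);
    }
    free(s.items);
    if (status == 0)
        *out = sum;
    return status;
}
```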

This is
fundamentally why I cannot endorse the xmalloc() design -- it removes
a legitimate programming path in which you literally write recovery
code in your software (if you could no longer use malloc() I mean).
Yup.

Right. I am just sensitive to this sort of thing, so I feel the need
to bring it up when I see it. It's clear the entire C language
committee seems to care less about these things as they continue to
endorse errno(), strtok(), asctime() and so on.

<snip>

errno is not a function ;-) Actually, errno can be implemented in a
thread-safe manner and I believe it is on current Linux systems.
This actually broke some ancient code I inherited which provided its own
external declaration of errno rather than including errno.h

strtok, asctime and other functions which use static data are a pain
though and should be avoided when doing threading. However removing them
from the language would break perfectly good single-threaded code that
uses them correctly, so I don't think they will be removed.
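
For code that does need tokenising in more than one place at once, the static state can be avoided using only standard C. The sketch below uses a hypothetical function, next_token, built on strspn() and strcspn(); the caller owns the position pointer, so independent scans do not interfere with each other. Like strtok, it writes terminators into the string being scanned.

```c
#include <string.h>

/* Hypothetical re-entrant tokenizer. *state starts out pointing at the
   string to split and is updated on every call; NULL means finished. */
char *next_token(char **state, const char *delims)
{
    char *start = *state;
    char *end;

    if (start == NULL)
        return NULL;
    start += strspn(start, delims);         /* skip leading delimiters */
    if (*start == '\0') {
        *state = NULL;
        return NULL;
    }
    end = start + strcspn(start, delims);   /* find the end of the token */
    if (*end != '\0') {
        *end = '\0';                        /* terminate the token in place */
        *state = end + 1;
    } else {
        *state = NULL;                      /* that was the last token */
    }
    return start;
}
```

Two scans can be interleaved freely because each keeps its own state variable.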

Flash Gordon · Jan 31, 2008

Paul Hsieh wrote, On 31/01/08 19:40:

This is not a solution.

Agreed. However, it is what some people seem to do.

You only do this because you don't know how to solve the problem
without all that memory, or else, you *can't* solve the problem
without all that memory (rare.)
Agreed.

3a = 3b = 2. Aborting doesn't require that you clean up first -- any
real OS(TM) will clean you up upon aborting anyways.

<snip>

Cleaning up is not always simply a matter of freeing memory and closing
files. In the clean-up code we have for one application I work on it
also involves...
Sending a message to the client to say it is aborting
Deleting temporary files that were deliberately *not* opened using
tmpfile (there is a reason for this)
Logging that it is crashing
Sending an email saying that it is crashing (I've tested this and it
can manage it some of the time)

Keith Thompson · Jan 31, 2008

Paul Hsieh said:
This is not a solution.

I didn't mean to imply that it is.

You only do this because you don't know how to solve the problem
without all that memory, or else, you *can't* solve the problem
without all that memory (rare.)
Agreed.

3a = 3b = 2. Aborting doesn't require that you clean up first -- any
real OS(TM) will clean you up upon aborting anyways.

No, 3b is not equivalent to 2. Cleaning up can include
application-specific stuff that the OS can't be expected to handle.
For example, a text editor might save a copy of its buffer to disk
before aborting, as opposed to having the memory allocator
preemptively abort the program without giving the application a chance to save
anything.
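
One way to get 3b without checking at every call site is an allocator that lets the application register a clean-up hook to run before the program terminates. This is only a sketch: set_oom_hook, alloc_or_exit, the editor buffer and the file name editor.recover are all invented for the example, and the hook should avoid needing fresh memory itself (note that fopen() may allocate internally).

```c
#include <stdio.h>
#include <stdlib.h>

static void (*oom_hook)(void);      /* run once before the program exits */

void set_oom_hook(void (*hook)(void))
{
    oom_hook = hook;
}

/* Returns usable memory or does not return at all. */
void *alloc_or_exit(size_t size)
{
    void *p = malloc(size ? size : 1);  /* sidestep the ambiguous malloc(0) */

    if (p == NULL) {
        if (oom_hook != NULL)
            oom_hook();             /* application-specific clean-up */
        exit(EXIT_FAILURE);
    }
    return p;
}

/* Example hook for a hypothetical editor: save the buffer to disk. */
static char edit_buffer[256] = "unsaved text";
static int saves_attempted;

static void emergency_save(void)
{
    FILE *f = fopen("editor.recover", "w");    /* hypothetical file name */

    saves_attempted++;
    if (f != NULL) {
        fputs(edit_buffer, f);
        fclose(f);
    }
}
```

The application would call set_oom_hook(emergency_save) once at start-up; as long as allocations succeed the hook is never run.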

Your 3c only makes sense if you are talking about your whole program,
not just the code fragments where the error occurs (because that
typically will make no sense.) But you are inherently omitting how
you *achieve* it.

Yes, I am.

[...]

This is just a fancy way of implementing 3c with special language
features.

Yes, but it can be a lot easier for the programmer (at the expense of
extra work for the implementation).

[snip]

Malcolm McLean · Jan 31, 2008

Paul Hsieh said:
You only do this because you don't know how to solve the problem
without all that memory, or else, you *can't* solve the problem
without all that memory (rare.)

Rarely it makes sense to have adjustable buffer sizes. But only rarely. A
theoretical result is that you _can_ perform any computation with just a
disk file and a finite list of states, but generally if a small amount of
memory is not made available, you might as well forget about the operation.

You've instantly doubled your development costs by insisting on this.

Eric Sosman · Jan 31, 2008

Malcolm said:
Rarely it makes sense to have adjustable buffer sizes. But only rarely.
A theoretical result is that you _can_ perform any computation with just
a disk file and a finite list of states, but generally if a small amount
of memory is not made available, you might as well forget about the
operation.

You've instantly doubled your development costs by insisting on this.

Can you cite any measurement or any research or even
any opium-induced dreams to substantiate this claim? Or
is this just the latest in your long string of unsupported
quantitative assertions?

CBFalconer · Jan 31, 2008

jacob said:
.... snip ...

C can use garbage collection and exceptions. For an implementation
of those features in C see the lcc-win compiler system

No, C can't. C can use an extension library implementing such a
system for some subset of the memory allocation problems. Please
try to keep that distinction in mind. The C standard already
specifies various things about the malloc/calloc/realloc/free
subsystem.

CBFalconer · Feb 1, 2008

jacob said:
.... snip ...

Boasting here is very easy.

Nah. There are two things I never do. Boast, and make mistakes.

CBFalconer · Feb 1, 2008

Malcolm said:
The consequence can only be to return zero for an insufficient
memory condition.

An application that asks for a negative amount of memory is bugged.
There's no ideal way to treat a bugged application.

You seem to have a reading impediment. It asked for:
18446744073709551613
repeat 18446744073709551613 bytes.

To me, 18446744073709551613 is not a negative number. If the
malloc system has such a memory block available, it should mark it
as used and return a pointer to it.

CBFalconer · Feb 1, 2008

Malcolm said:
5. Demand user intervention to supply the required memory.

Due to my natural intransigence, when machines 'demand' I tend to
ignore them. This has been known to lead to program failure.

Morris Dovey · Feb 1, 2008

jacob said:
Mr Bjarnason is fond of saying how clever he is. This is of course
not verifiable here. He can claim he never makes allocation mistakes
but this sounds very hollow to me. Like other claims done by the
"regulars" here, they sound pathetic, like the one of the famous
Dan Pop that claimed that he never had a crash of a program written by
him.

Boasting here is very easy.

Sometimes it's good to be an "irregular" so as to be (or at least
feel) exempt from this kind of envy. <g>

Hmm - I'm not terribly clever, so if you consider yourself
particularly clever you're welcome to feel superior. I don't let
my lack of cleverness keep me from boasting a bit from time to
time, but just so that you don't feel threatened I'll offer the
following credentials right up front:

~: ls -al --full-time core
-rw------- 1 mrd 262144 Tue Jan 29 14:17:30 2008 core

Jacob, you can cast yourself in the role of pariah if you choose
- but so doing won't gain you much traction. No offense intended,
it just hasn't ever worked in anyone's favor here...

I don't know Kelsey, but can offer that Dan Pop liked to play the
role of an irascible old curmudgeon. He plays that role well, but
you're less clever than you imagine if you confuse the actor with
the role.

pete · Feb 1, 2008

Pardon me. I haven't programmed for a few years but I was involved in
some fairly complex software development. Why is it not possible to
check every malloc result?

10 Because it's impossible to write "if"
as many times as "malloc".

9
