When to check the return value of malloc


Kelsey Bjarnason

[snips]

I am not in any sense tied to the xmalloc() error-handling method. If
anyone can come up with a better idea, please post. It must not return
null unless no memory is to be found, and it must be easier for the caller
to use than regular malloc().

Your malloc "replacement" is not easier to use than malloc; it is, in
fact, more difficult, as it lies about what it does, then fails to even
meet the details of the lie - and it does erroneous things with perfectly
legitimate allocation requests.

So, let's see. "it must not return NULL unless no memory is to be
found". Hmm. That would be malloc.

Easier to use? You just don't *get* much easier than malloc. To call it
usefully, you have to store the result into a pointer anyhow. Validation
of success is accomplished by a single if statement: it's just about
impossible to get simpler than that and still be useful.

And if nothing else, malloc doesn't lie about what it actually does, and
works on all legitimate size requests, if memory is, in fact, available
to meet the request.
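
For reference, the plain idiom being described amounts to no more than the
following minimal sketch (the buffer size and the error handling are
arbitrary choices, not anything prescribed in the thread):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    size_t n = 100;
    char *buf = malloc(n);          /* request the memory */

    if (buf == NULL)                /* one if statement validates success */
    {
        fputs("out of memory\n", stderr);
        return EXIT_FAILURE;
    }
    memset(buf, 0, n);              /* safe to use the block now */
    free(buf);
    return 0;
}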
 

santosh

Kelsey said:

And if nothing else, malloc doesn't lie about what it actually does,
and works on all legitimate size requests, if memory is, in fact,
available to meet the request.

Unfortunately, only on systems that do not overcommit memory.
 

Richard Tobin

Richard Heathfield said:
If I see an assertion failure dialog box pop up on a shipped program, my
immediate thought is "this wasn't tested properly; I will consider ceasing
to use it".

This is an argument in favour of proper testing, not against leaving
assertions enabled.

Richard Heathfield also said:
Now, I agree that in this circumstance I am better off for
knowing that the program wasn't tested properly. But I'd far rather have a
program that *was* tested properly! :)

I'd rather have a program that was tested properly *and* left assertions
(or something similar) enabled.

-- Richard
 

Malcolm McLean

Kelsey Bjarnason said:
The whole point of the function is to take error-handling code which
will never[1] be executed out of calling code.

If the error handling code will never be called - because malloc will
never fail - then there's simply no need for your function, the whole
point of which is to cope with situations where malloc fails.

The point is to have a program that is correct.
Let's say we've got a function that sorts random data. For some reason, it
will crash if inputs are in bit-reversed order. As long as the list is
longer than about ten items, we can confidently say that the crash will
never happen.
However there is a point in detecting the bad input and resorting to qsort()
or something. Simply because it is better to have a correct program than an
incorrect one.

Similarly, xmalloc() is for allocations that "can't" fail. However, sometimes
programmers miscalculate - maybe that trivial allocation really will be
called millions of times. Maybe the user is deliberately running the app
with most of the memory disabled. So it handles the out-of-memory condition
reasonably well, but takes the checks out of the caller's source code, where
they would otherwise clutter the functions and add to maintenance costs.
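
A minimal sketch of an xmalloc() along those lines might look like the
following; the reporting and the exit behaviour shown here are assumptions
for illustration, not Malcolm's actual code:

#include <stdio.h>
#include <stdlib.h>

/* Never returns NULL: if the allocation fails, report the problem once,
   centrally, and terminate, so callers need no check at the call site. */
void *xmalloc(size_t size)
{
    void *p = malloc(size);

    if (p == NULL && size > 0)
    {
        fprintf(stderr, "xmalloc: out of memory (%lu bytes requested)\n",
                (unsigned long)size);
        exit(EXIT_FAILURE);
    }
    return p;
}

A string duplicator, say, can then call xmalloc(len + 1) and use the result
without any further checking.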
 

Willem

Malcolm wrote:
) reasonably well, but takes the checks out of the caller's source code, where
) they would otherwise clutter the functions and add to maintenance costs.

You could also argue that, in a program that always checks the return value
of malloc, most of the checks will be extremely similar. Moving often-used
pieces of code into a sub function is a Good Thing(tm).


SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
 

Malcolm McLean

Kelsey Bjarnason said:
Your malloc "replacement" is not easier to use than malloc; it is, in
fact, more difficult, as it lies about what it does, then fails to even
meet the details of the lie - and it does erroneous things with perfectly
legitimate allocation requests.

So, let's see. "it must not return NULL unless no memory is to be
found". Hmm. That would be malloc.

Easier to use? You just don't *get* much easier than malloc. To call it
usefully, you have to store the result into a pointer anyhow. Validation
of success is accomplished by a single if statement: it's just about
impossible to get simpler than that and still be useful.


And if nothing else, malloc doesn't lie about what it actually does, and
works on all legitimate size requests, if memory is, in fact, available
to meet the request.

Sure. Once or twice in your program you may have a legitimate memory request
that could exceed the range of an int. In that case you, the programmer,
should be aware that you are asking for an enormous amount of memory. Unless
you totally control the system, you should also be aware that it is rather
likely that the request won't be honoured.

The test if(ptr == NULL) is trivial. What's non-trivial is what goes in the
body of the if. You might want to put up a box for the user saying "image
too large", in which case your windowing system had better be in scope.

If the allocation is most unlikely to fail, why go through all this,
polluting things like a string duplicator with calls to windowing functions,
or to stderr? The one version will fail if you don't compile for that OS,
whilst the other will fail under MS Windows, though not under X. Or maybe
you should litter the code with #ifdefs. That's where xmalloc() is better.
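
One way to keep windowing calls, stderr and #ifdefs out of the allocator
itself, sketched here purely as an illustration rather than anything posted
in the thread, is to let the application install its own out-of-memory
handler:

#include <stdio.h>
#include <stdlib.h>

static void (*oom_handler)(size_t) = NULL;

/* The application installs whatever failure report suits it: a dialog
   box under a GUI, a message to stderr in a console tool. */
void xmalloc_set_handler(void (*handler)(size_t))
{
    oom_handler = handler;
}

void *xmalloc(size_t size)
{
    void *p = malloc(size);

    if (p == NULL && size > 0)
    {
        if (oom_handler != NULL)
            oom_handler(size);      /* e.g. puts up "image too large" */
        else
            fprintf(stderr, "out of memory (%lu bytes)\n",
                    (unsigned long)size);
        exit(EXIT_FAILURE);
    }
    return p;
}

The string duplicator then calls xmalloc() and knows nothing about MS
Windows, X, or the compilation target.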
 

Richard Heathfield

Richard Tobin said:
This is an argument in favour of proper testing, not against leaving
assertions enabled.

If you leave assertions enabled in production code, you drastically limit
what you can reasonably do with them.

Richard Tobin also said:
I'd rather have a program that was tested properly *and* left assertions
(or something similar) enabled.

Let me give you a real world example. I have a bbst library in which,
before and after every "transaction" that modifies the tree (i.e. an
insertion, a data update, a removal, or a re-balancing), there is an
assertion that checks that the tree is properly constructed and properly
balanced (because, if it isn't, then I've broken the bbst library
somehow).

The check is O(n), obviously, and so if I add n items to the tree, the
overall checking strategy becomes O(n*n), dramatically degrading the
performance of the program. I am prepared to put up with this (up to a
point) in development, but I could not expect real users to tolerate it.
So I *have* to switch off that check for a production release, or remove
it altogether. I'd rather just switch it off.
 

CBFalconer

Richard said:
.... snip ...

If I see an assertion failure dialog box pop up on a shipped
program, my immediate thought is "this wasn't tested properly;
I will consider ceasing to use it". Now, I agree that in this
circumstance I am better off for knowing that the program wasn't
tested properly. But I'd far rather have a program that *was*
tested properly! :)

Or that some hardware failed. Or that a cosmic ray struck. Etc.
Assertions pick up faults from many causes, and are not limited to
programming goofs.
 

Richard Tobin

Richard Heathfield said:
Let me give you a real world example. I have a bbst library in which,
before and after every "transaction" that modifies the tree (i.e. an
insertion, a data update, a removal, or a re-balancing), there is an
assertion that checks that the tree is properly constructed and properly
balanced (because, if it isn't, then I've broken the bbst library
somehow).

The check is O(n), obviously, and so if I add n items to the tree, the
overall checking strategy becomes O(n*n), dramatically degrading the
performance of the program. I am prepared to put up with this (up to a
point) in development, but I could not expect real users to tolerate it.
So I *have* to switch off that check for a production release, or remove
it altogether. I'd rather just switch it off.

Yes, assert() isn't very flexible. For example, it would be useful to
be able to specify a message to be printed along with the failed test.

There are ways to get around your specific problem. For example, you
could have your own macro NEXPENSIVEDEBUG that you set to 1 when you
compile code for production use and 0 otherwise, and use something
like

assert(NEXPENSIVEDEBUG || expensive_test());

I suspect I would often want to turn it off during development too!

If I were writing a large program where I need such fine control I
would probably write my own assert()-like macros.

-- Richard
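
Such home-grown macros might, for instance, take a message and put the
expensive checks behind their own compile-time switch; the macro names
below are invented for illustration:

#include <stdio.h>
#include <stdlib.h>

/* Always-on assertion that also prints a caller-supplied message. */
#define ASSERT_MSG(cond, msg)                                         \
    do {                                                              \
        if (!(cond)) {                                                \
            fprintf(stderr, "%s:%d: assertion failed: %s (%s)\n",     \
                    __FILE__, __LINE__, #cond, (msg));                \
            abort();                                                  \
        }                                                             \
    } while (0)

/* Expensive checks vanish unless EXPENSIVE_CHECKS is defined,
   e.g. with -DEXPENSIVE_CHECKS on the compiler command line. */
#ifdef EXPENSIVE_CHECKS
#define ASSERT_SLOW(cond, msg) ASSERT_MSG(cond, msg)
#else
#define ASSERT_SLOW(cond, msg) ((void)0)
#endif

The O(n) tree check could then be written as something like
ASSERT_SLOW(tree_is_balanced(t), "tree corrupt after insert") and cost
nothing in a production build.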
 

Ian Collins

Richard said:
Let me give you a real world example. I have a bbst library in which,
before and after every "transaction" that modifies the tree (i.e. an
insertion, a data update, a removal, or a re-balancing), there is an
assertion that checks that the tree is properly constructed and properly
balanced (because, if it isn't, then I've broken the bbst library
somehow).

The check is O(n), obviously, and so if I add n items to the tree, the
overall checking strategy becomes O(n*n), dramatically degrading the
performance of the program. I am prepared to put up with this (up to a
point) in development, but I could not expect real users to tolerate it.
So I *have* to switch off that check for a production release, or remove
it altogether. I'd rather just switch it off.

Let me provide a counter example: I had an embedded product in the field
and we received a number of reports from customers that the units were
rebooting. When we checked the assertion log of one, we found a device
was generating an interrupt when it should not, which would have caused
batch of parts and we were able to provide a patch for the vulnerable
units. If there wasn't an assert to catch the problem, we would have
had great difficulty tracking the fault.
 

Ian Collins

Richard said:
Surely it's obvious that Malcolm means that these compilers don't
themselves set NDEBUG in release mode?

What is release mode? Just about every product I've worked on had its
own set of compiler options used for production builds.
 

Richard Heathfield

Ian Collins said:
So I *have* to switch off that check for a production release, or remove
it altogether. I'd rather just switch it off.
Let me provide a counter example: I had an embedded product in the field
and [...war story...]

Perhaps the lesson to draw from this is that whether assertions should be
left on is a more complex question than we normally think, and that the
right answer (to NDEBUG or not to NDEBUG) depends on circumstances that
will vary across the industry, the nature of the application, and perhaps
even the nature of the assertion itself.

I suppose it's no accident that, uniquely amongst standard headers,
<assert.h> can be included multiple times within a source with its effect
changing each time if desired.
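
A small sketch of that property (illustrative only): defining NDEBUG
part-way through a translation unit and including <assert.h> a second time
switches off the assertions that follow.

#include <assert.h>

int is_sorted(const int *a, int n)
{
    int i;

    for (i = 1; i < n; i++)
        if (a[i - 1] > a[i])
            return 0;
    return 1;
}

void cheap_check(int n)
{
    assert(n >= 0);             /* active: NDEBUG is not defined here */
}

/* <assert.h> may, uniquely, be included again; assert() is then
   redefined according to the state of NDEBUG at that point. */
#define NDEBUG
#include <assert.h>

void expensive_check(const int *a, int n)
{
    assert(is_sorted(a, n));    /* now expands to ((void)0) */
}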
 

Malcolm McLean

Ian Collins said:
What is release mode? Just about every product I've worked on had its
own set of compiler options used for production builds.

In gcc you have a -O setting to set the optimisation level. Normally you
leave it unset whilst debugging, and ramp it up to maximum for the
production build.

Microsoft compilers similarly have "debug" and "release" configurations.

In the embedded world things are different. Often you will work with a
"soft" chip which is actually an IO device attached to the PC, and only blow
a genuine device in the final stages.
 

Ian Collins

Malcolm said:
In gcc you have a -O setting to set the optimisation level. Normally you
leave it unset whilst debugging, and ramp it up to maximum for the
production build.

Microsoft compilers similarly have "debug" and "release" configurations.

That's the bit I don't like: I'm sure people just select "debug" or
"release" in some IDE without evaluating all the relevant options.
 

Army1987

Richard said:
Yes, assert() isn't very flexible. For example, it would be useful to
be able to specify a message to be printed along with the failed test.

assert(test() && "message");

The string literal is always nonzero, so it doesn't change the test, but it
shows up in the expression text that assert() prints when the check fails.
 

Army1987

Malcolm said:
In gcc you have a -O setting to set the optimisation level. Normally you
leave it unset whilst debugging, and ramp it up to maximum for the
production build.

Why?
 

Malcolm McLean

Army1987 said:
Why?

Because it compiles faster with the optimisation settings set low. Also, if
you turn debugging or profiling on, it gives a stack trace rather than
merging subroutines.

Then there are a few bugs that appear with high optimisation settings.
 

Ulrich Eckhardt

Richard Tobin wrote:
[ using int vs size_t for the size in a malloc replacement]
I've had several bugs (my own and others) where a miscalculation
resulted in an attempt to malloc() a negative size. If the size
parameter could be negative, you might get a more informative
error message. (Of course, after a while you get to recognise what
malloc failing with an argument of 4294967292 means.)

Well, it might have discovered errors for you, and I don't think anyone
claims that it isn't possible, but that is not what people are arguing
against. No, the problem with using a malloc-like function that takes an
int for the size is that it is not a general replacement. Even worse, you
might call it with 0x14000 and, since int is 16 bits, it will allocate
0x4000! This will obviously cause buffer overflows. Further, a legitimate
allocation of 0x20000 is simply not possible. Lastly, for another program
which just inputs and outputs numbers and therefore never needs more than a
few hundred bytes, the artificial limit of 0x7fff bytes is pure nonsense.

Seriously, the remedy for this has already been mentioned in this thread
long ago: use size_t, as it was meant to be, and use a configurable maximum
allocation size.

Some further notes:
1. I know that int is not 16 bits. ;)
2. Converting size_t to int can cause buffer overflows. Using size_t the
whole way through doesn't guarantee freedom from those errors; you can
still mess up size calculations. Using xcalloc(size, count) or a similar
interface helps.

Uli
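
Putting those suggestions together - size_t throughout, a configurable
ceiling, and an overflow-checked count/size interface - might look roughly
like the sketch below; all of the names are invented for illustration:

#include <stdio.h>
#include <stdlib.h>

/* Reject any single request above this; adjust per application. */
static size_t max_alloc = (size_t)-1;

void set_max_alloc(size_t limit)
{
    max_alloc = limit;
}

/* calloc-like interface: the multiplication is checked, so a bad size
   calculation cannot silently wrap around to a small request. */
void *xcalloc(size_t count, size_t size)
{
    void *p;

    if (size != 0 && count > max_alloc / size)
    {
        fprintf(stderr, "xcalloc: %lu * %lu exceeds the allocation limit\n",
                (unsigned long)count, (unsigned long)size);
        exit(EXIT_FAILURE);
    }
    p = calloc(count, size);
    if (p == NULL && count != 0 && size != 0)
    {
        fprintf(stderr, "xcalloc: out of memory\n");
        exit(EXIT_FAILURE);
    }
    return p;
}

The count > max_alloc / size test also catches the wrap-around that a
miscalculated size would otherwise produce.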
 

Richard Tobin

Ulrich Eckhardt said:
Well, it might have discovered errors for you, and I don't think anyone
claims that it isn't possible, but that is not what people are arguing
against. No, the problem with using a malloc-like function that takes an
int for the size is that it is not a general replacement. Even worse, you
might call it with 0x14000 and, since int is 16 bits, it will allocate 0x4000!

I'm certainly not suggesting that you should use int for a
general-purpose malloc() replacement. Rather, it would be an
argument for size_t being signed.

-- Richard
 
