When to check the return value of malloc


CBFalconer

Ian said:
You forgot option 4, use the operating system's file mapping API
and let the OS take care of the memory allocations.

He also forgot option 1a: replace fgetc with getc and use the
system's builtin buffers.
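
For illustration only, a minimal sketch of option 1a (the function name is
made up for the example): getc() may be a macro and so avoid per-character
function-call overhead, though it reads from the same stdio buffer that
fgetc() uses.

#include <stdio.h>

/* Count the characters in a stream using getc() rather than fgetc().
   getc() may be implemented as a macro, avoiding call overhead. */
long count_chars(FILE *fp)
{
    long n = 0;
    int c;

    while ((c = getc(fp)) != EOF)
        n++;
    return n;
}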
 

David Thompson

Randy said:
Eric Sosman wrote

This thread was useful, now I know I never have to buy extra
memory again.

PROVIDED you malloc it in 20 byte chunks. Since the standard
specifies that freed memory be made available again, you must be
perfectly safe in allocating 4k by:

for (i = 0; i < 20; i++) a[i] = malloc(20);
for (i = 0; i < 20; i++) free(a[i]);
ptr = malloc(4000);

with suitable declarations for a, i, ptr, and all the needed
#includes. Learning is wunnerful.

Apparently this is some strange new value of 20 * 20 of which I was
previously unaware.

(I know it's a joke, but cf. Twain on lightning != lightning bug.)
<G>
- formerly david.thompson1 || achar(64) || worldnet.att.net
 

Randy Howard

Of course. However, the reliability is increased by something like
a factor of 1000 or more. I seem to remember quotes of 100,000
thirty years ago.

I have systems here with ECC memory in them. The memory is official,
supported memory from the hardware vendor. I can replace those memory
modules with a dozen other samples. I can replace the motherboard with
any of several identical motherboards from the same vendor. I can make
it corrupt memory in less than 15 seconds on any variation you choose.
It's called poor signal integrity, and it's far more common than most
people think. They just don't have typical usage patterns that employ
data that makes this problem obvious.
 

Ben Pfaff

Randy Howard said:
I have systems here with ECC memory in them. The memory is official,
supported memory from the hardware vendor. I can replace those memory
modules with a dozen other samples. I can replace the motherboard with
any of several identical motherboards from the same vendor. I can make
it corrupt memory in less than 15 seconds on any variation you choose.
It's called poor signal integrity, and it's far more common than most
people think. They just don't have typical usage patterns that employ
data that makes this problem obvious.

Interesting. What usage patterns make this problem obvious?
 

Randy Howard

Interesting. What usage patterns make this problem obvious?

Something like a BERT tester is common, often implemented in software
for memory testing in a production system. A SmartBits is a common
hardware solution for testing networked devices in a similar fashion.

In essentially random usage, you might experience a lockup, reboot, or
unexplained data corruption every once in a while, and blame it on your
OS, an application you don't like, sunspots, the phase of the moon, or
it being a Monday. :)
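
As a rough illustration of what such a software memory test looks like (a toy
sketch only, not any particular product's code): write a known pattern across
a large block, read it back, and count mismatches.

#include <stdio.h>
#include <stdlib.h>

/* Toy data-pattern test: fill a block with a pattern derived from the
   index, read it back, and report mismatches.  Real BERT-style tests run
   many patterns (walking ones, inversions) over hours or days. */
static unsigned long test_block(unsigned char *buf, size_t len)
{
    unsigned long errors = 0;
    size_t i;

    for (i = 0; i < len; i++)
        buf[i] = (unsigned char)(i * 0x9E);
    for (i = 0; i < len; i++)
        if (buf[i] != (unsigned char)(i * 0x9E))
            errors++;
    return errors;
}

int main(void)
{
    size_t len = 16UL * 1024 * 1024;
    unsigned char *buf = malloc(len);

    if (buf == NULL)
        return EXIT_FAILURE;
    printf("mismatches: %lu\n", test_block(buf, len));
    free(buf);
    return 0;
}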
 

Ben Pfaff

Randy Howard said:
Something like a BERT tester is common, often implemented in software
for memory testing in a production system. A SmartBits is a common
hardware solution for testing networked devices in a similar fashion.

I have had systems that I've run memory testing software
(memtest86) on for days without corruption appearing. It seems
bizarre that you'd have such reproducible corruption in 15
seconds. Makes me wonder whether it's just a badly designed
model of motherboard.
 

Malcolm McLean

santosh said:
You can set a global variable from the call-back to tell the higher
level code the amount of memory that was actually allocated. Ugly, but
doable.
That's a workaround. However, it is easier just to call malloc() if you've
got a failure strategy. xmalloc() is for when the only failure strategy,
realistically, is to abort, or for when you are running in a multi-tasking
environment and you know that you can make small amounts of memory available
by killing off something else or allowing it to finish.
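
A minimal sketch of the kind of xmalloc() being discussed here, where
aborting is the entire failure strategy (illustrative only, not Malcolm's
actual code):

#include <stdio.h>
#include <stdlib.h>

/* Allocate or die: for programs whose only realistic response to an
   out-of-memory condition is to give up. */
void *xmalloc(size_t size)
{
    void *p = malloc(size);

    if (p == NULL) {
        fprintf(stderr, "xmalloc: out of memory requesting %lu bytes\n",
                (unsigned long)size);
        exit(EXIT_FAILURE);
    }
    return p;
}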
 

Malcolm McLean

Kelsey Bjarnason said:
No, it's a case of engineering the application to do what it *should* do,
and engineering it *well* to do that job.

Overengineering is what you get when the designs are significantly beyond
the requirements, like writing a calculator app that handles everything
from bignums to integration, when what was asked for was a simple 4-
function calculator for kitchen-scope math.
No, that's not the definition of over-engineering. Over-engineering is when
the design phase costs more than is justified by the improvement to the
product. This can frequently include over-specifying the result. However,
the example you gave wouldn't normally be called an "over-engineered"
calculator but an "over-specified" one. If you run a 2-week psychological
experiment to see if function buttons along the bottom lead to more input
errors than function buttons down the side, that's over-engineering.
Probably. Not if you are going to sell hundreds of millions of that model.
 

Kelsey Bjarnason

[snips]

Let me provide a counter-example: I had an embedded product in the field,
and we received a number of reports from customers that the units were
rebooting. When we checked the assertion log of one, we found a device
was generating an interrupt when it should not, which would have caused
bad data to enter the system.

And this is impossible to do with an if (...) which wouldn't abort the
program, but instead give you endless opportunities to do other, more
graceful things - or to simply abort, should that be best?

I will never understand this bizarre reliance on assert, when the
language actually contains conditional branching constructs.
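
To make the contrast concrete, a sketch with invented helper names
(irq_is_expected, log_spurious): the assert() version aborts and loses the
check under NDEBUG, while the plain branch lets the program pick its own
response.

#include <assert.h>
#include <stdio.h>

/* Hypothetical helpers, invented purely for this sketch. */
static int irq_is_expected(int irq) { return irq == 5; }
static void log_spurious(int irq)
{
    fprintf(stderr, "unexpected interrupt %d\n", irq);
}

static void handle_irq_with_assert(int irq)
{
    assert(irq_is_expected(irq));   /* aborts; check vanishes under NDEBUG */
    /* ... consume the device's data ... */
}

static void handle_irq_with_if(int irq)
{
    if (!irq_is_expected(irq)) {    /* same test, but we choose the response */
        log_spurious(irq);
        return;                     /* drop the bad data and keep running */
    }
    /* ... consume the device's data ... */
}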
 

Kelsey Bjarnason

You forgot option 4, use the operating system's file mapping API and let
the OS take care of the memory allocations.

You mean memory mapping? I've used that, at least in Windows, and found
that its most significant effect is to grind the system to a halt. I
suspect this may have something to do with the size of the file, the
manner in which it's being processed and the like, that is, a specific-
case problem rather than a general-case problem, but the results were
worse, as in orders of magnitude worse, than doing the same operations
using the slowest of the usual file access methods.

Not saying such functions ain't useful; they just weren't useful, in my
testing, for the code in question.
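
For reference, a rough sketch of "option 4" using the POSIX mmap() interface
(nonportable as far as standard C goes; Windows would use
CreateFileMapping/MapViewOfFile instead). The function name is invented for
the example.

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map a whole file and count its newlines; the OS pages the data in and
   out as needed instead of the program calling malloc()/fread(). */
long count_newlines_mapped(const char *path)
{
    int fd = open(path, O_RDONLY);
    struct stat st;
    const char *p;
    long n = 0;
    off_t i;

    if (fd == -1)
        return -1;
    if (fstat(fd, &st) == -1) {
        close(fd);
        return -1;
    }
    p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }
    for (i = 0; i < st.st_size; i++)
        if (p[i] == '\n')
            n++;
    munmap((void *)p, st.st_size);
    close(fd);
    return n;
}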
 

Kelsey Bjarnason

[snips]

You can set a global variable from the call-back to tell the higher
level code the amount of memory that was actually allocated. Ugly, but
doable.

Ugly and in many cases _not_ doable. CLC aside, many apps use threads.
 

Kelsey Bjarnason

[snips]

No, that's not the definition of over-engineering. Over-engineering is
when the design phase costs more than is justified by the improvement to
the product.

Exactly - like giving 'em a calculator that does bignums and integration
when what they wanted was a 4-function item to do kitchen math.
Which can frequently include over-specifying the result.
However the example you gave wouldn't normally be called an
"over-engineered" calculator but an "over-specified" one.

No, it would be overengineered, as per your definition. The requirement was
something simple. Additional options may well degrade usability. The design
was far beyond what the requirement justified.

Meanwhile, we're still left with xmalloc, arguably the worst possible way
of creating an easy allocator. Its design cannot be considered
overengineering, as its design is virtually non-existent; what is there
is broken, and it doesn't even live up to the lies it tells.

You're not one to prattle about engineering practices, Malcolm.
 

pete

Dereferencing an allocated pointer,
without first checking the value, is wrong.

Whenever an allocated pointer is going to be dereferenced
or used in pointer arithmetic, the value should be checked first.
No.

Imagine you've got 2GB installed and are allocating 20 bytes.
The system is stressed and programs crash or terminate
for lack of memory once a day.
Any more than that, and no-one would tolerate it.
So the chance of the crash being caused by your allocation
is 1 in 100,000,000.

If your program is a mail sorter
and makes an allocation for every piece of mail sorted,
then with odds like that,
your program won't crash when you test it on the machine in the lab,
but your program will crash five times
on the day that it is installed in the post office
(a national mail system sorts on the order of half a billion pieces a day).
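
A minimal example of the practice pete describes, with an invented function
name: check the pointer before it is used, and let the caller decide what
failure means.

#include <stdlib.h>
#include <string.h>

/* Duplicate a string, reporting allocation failure to the caller
   instead of dereferencing a possibly-null pointer. */
char *dup_string(const char *s)
{
    char *copy = malloc(strlen(s) + 1);

    if (copy == NULL)        /* check before the pointer is used */
        return NULL;
    strcpy(copy, s);
    return copy;
}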
 

Kelsey Bjarnason

Kelsey said:
[snips]

Let me provide a counter-example: I had an embedded product in the
field, and we received a number of reports from customers that the
units were rebooting. When we checked the assertion log of one, we
found a device was generating an interrupt when it should not, which
would have caused bad data to enter the system.

And this is impossible to do with an if (...) which wouldn't abort the
program, but instead give you endless opportunities to do other, more
graceful things - or to simply abort, should that be best?
As does the definition of the assert macro.
I will never understand this bizarre reliance on assert, when the
language actually contains conditional branching constructs.
It clearly expresses that the condition being tested should never
happen.

Like when you try to open the app's configuration file and it doesn't
exist, or isn't readable, at which point you have many options: log the
reason for the failure to the system log and exit. Pop something up to
alert the user, who can then either fix the problem and continue or
decide to cancel the app run. Create a default config file, alert the
user (or log, or both) and continue or exit. Skip the file entirely, use
defaults from the app, notify the user (and/or log) and continue or exit.

An assert can do some or all of this with enough jiggery-pokery, I
suppose, but it's an ugly way to do it, particularly as all your
wonderful testing instantly vanishes into the ether the second someone
recompiles with NDEBUG.

Designing error traps which vanish that easily is not, IMO, a good idea,
except perhaps in the case where you can guarantee, absolutely, nobody
anywhere under any circumstances whatsoever will ever be able to compile
your code outside your build environment.

Personally, I prefer methods which aren't so fragile.

For a perfect example of this sort of thing, see Malcolm's xmalloc, which
disallows negative size values, *unless* NDEBUG is defined, in which case
it allows them just fine, thanks.
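
A sketch of that failure mode (not the actual xmalloc source): when the only
guard is an assert(), defining NDEBUG removes it entirely.

#include <assert.h>
#include <stdlib.h>

/* Signed size, as in the allocator under discussion. */
void *xmalloc_sketch(int size)
{
    assert(size >= 0);            /* compiled out when NDEBUG is defined */
    return malloc((size_t)size);  /* with NDEBUG, -3 wraps to a huge size_t */
}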
 

dj3vande

[snips]

xmalloc() takes a signed int rather than a size_t as the size parameter.

Yes, it does, which leads to the immediate question "What benefit is
there to allocating -3 bytes?"

That should be obvious.
I need 3 bytes, and I don't feel like handling an allocation failure.
I'll just allocate -3 bytes, which adds -3 to the total allocated
memory size. Now I know I'm at least 3 bytes below the maximum (since
whatever it was before was obviously not above the maximum and now I'm
3 bytes lower). Then I'll allocate my 3 bytes and IT CAN'T FAIL!!

....Oh, wait, the person who's claiming that this makes sense has
already asserted that allocating three bytes can never fail, even
without going through these contortions first. Never mind.


dave
 

dj3vande

CBFalconer said:
He also forgot option 1a: replace fgetc with getc and use the
system's builtin buffers.

getc and fgetc will use the same stdio buffer, so if the problem is
buffer size that won't solve anything.
If fgetc is too slow, that means either (1) you're doing so little with
the characters that function call overhead is the killer (in this case
getc may help), or (2) you really do need a bigger buffer, and your
options are to hope[1] that your implementation efficiently handles
fread()ing into a bigger buffer than the stdio stream's, or to pass
over stdio entirely and use (nonportable) system calls to do your own
file I/O and buffer management.

(2) seems more likely to me.


dave

[1] or check
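
A rough sketch of option (2): pull the file in through a large
application-side buffer with fread() instead of per-character stdio calls.
The 64 KB size and the function name are arbitrary choices for the example.

#include <stdio.h>

#define CHUNK (64 * 1024)

/* Count newlines by reading the file in large blocks. */
long count_newlines_fread(FILE *fp)
{
    static unsigned char buf[CHUNK];
    long newlines = 0;
    size_t n, i;

    while ((n = fread(buf, 1, sizeof buf, fp)) > 0)
        for (i = 0; i < n; i++)
            if (buf[i] == '\n')
                newlines++;
    return newlines;
}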
 

santosh

Kelsey said:
[snips]

Let me provide a counter-example: I had an embedded product in the
field, and we received a number of reports from customers that the
units were rebooting. When we checked the assertion log of one, we
found a device was generating an interrupt when it should not, which
would have caused bad data to enter the system.

And this is impossible to do with an if (...) which wouldn't abort the
program, but instead give you endless opportunities to do other, more
graceful things - or to simply abort, should that be best?

I will never understand this bizarre reliance on assert, when the
language actually contains conditional branching constructs.

I suppose that using if/else is more work than an assert().
 

Ian Collins

Kelsey said:
[snips]

Let me provide a counter-example: I had an embedded product in the field,
and we received a number of reports from customers that the units were
rebooting. When we checked the assertion log of one, we found a device
was generating an interrupt when it should not, which would have caused
bad data to enter the system.

And this is impossible to do with an if (...) which wouldn't abort the
program, but instead give you endless opportunities to do other, more
graceful things - or to simply abort, should that be best?
As does the definition of the assert macro.
I will never understand this bizarre reliance on assert, when the
language actually contains conditional branching constructs.
It clearly expresses that the condition being tested should never happen.
 

Tim Streater

"Bart C said:
If there are a lot of these malloc()s bunched together, all these checks can
get very untidy and obscure the code. And still leaves the problem of
exactly how to deal with failures, right in the middle of some otherwise
elegant code. Especially annoying when the probability of failure is known
to be low.

You need to re-organise your code to handle the error cases. You are
checking the returned values from other API calls, aren't you? You have
designed your code with the handling of errors in mind, haven't you?
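
One common way to keep a run of allocations tidy (a sketch, not anyone's
actual code from this thread): allocate together, check together, and unwind
through a single error path.

#include <stdlib.h>

/* Allocate three arrays; on any failure, release whatever succeeded
   and report one error to the caller. */
int make_buffers(double **a, double **b, double **c, size_t n)
{
    *a = malloc(n * sizeof **a);
    *b = malloc(n * sizeof **b);
    *c = malloc(n * sizeof **c);
    if (*a == NULL || *b == NULL || *c == NULL) {
        free(*a);                /* free(NULL) is a safe no-op */
        free(*b);
        free(*c);
        *a = *b = *c = NULL;
        return -1;
    }
    return 0;
}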
 

Tim Streater

That's a massive overestimate, probably because you're going about it
the wrong way. _First_ you ask for the memory, _then_ you start
assigning the values. That way, when malloc() fails, all you have to do
is return an error to the caller.


Typically this is not what a wisely written program does. Simply
terminating, throwing all the work up to that point away, is in most
circumstances a bad idea.

Take my current hobby project. It reads in a lot of files with data in
them, and filters out all data which is relevant to it. Then it stores
that data in a tree. I expect all needed data to fit into memory with
room to spare, and I expect never to run out of memory. Nevertheless,
this project _will_ check for memory allocation failure, and should it
occur, the tree constructor will return an error, which will be signaled
by the main program, which will then continue to process as much data as
has already been read and note where it had to stop.
You may think I'm a sucker for writing if-cases and recovery code, but
I'm not. It is actually considerably _easier_ to write the code this way
than to have to cope with randomly aborting programs and then find
another way to process that data. And IME, this is true for the average
program.

Quite right. And randomly aborting programs which you then have to debug
in case there are errors in the processing you are doing. The last C
program I wrote, some 20 years ago, was 30k lines of C and was full of
error handling. It typically ran for 9 months between restarts. And
these were only required because of annual site-wide power maintenance
(they switched from the 240kV line to the 60kV line).
 
