xmalloc string functions

Kelsey Bjarnason

[snips]

assert(user_didnt_try_to_frotz_a_widget_that_was_already_frobbed);

That bug report hasn't gone anywhere. How much do you want to bet that
if a commercial program was dumping core because two internal checks
disagreed about what was valid and a user reported the problem, how to
reproduce it, and a suggested fix, it would be fixed within two years of
being reported?

Well, in all fairness, it's not a commercial/noncommercial issue; most of
the OSS coders I've encountered tend to _be_ OSS coders more or less
because they love coding; unless this bug was, in fact, trivial in the
face of vastly more serious bugs, most would likely have had it fixed in
hours, perhaps days - not years.

That said, it's hardly impossible for a team - commercial or not, OSS or
not - to get into a mindset where such things just get pushed to the
side. Particularly if their generic answer to error conditions is to
abort anyway; this wouldn't even really be a bug for them, just business
as usual, no?
 
santosh

(e-mail address removed) wrote:

For the record (once more): you should not lose the user's
data. Anything changed? Do I contradict myself?
Care to quote?

Yes, but this is made needlessly difficult by GLib which decided to take
high-level action itself, instead of reporting it to its caller.

In error recovery situations things are difficult enough as it is that
you don't want your library routine to try and make things even more
complicated.
 
Malcolm McLean

santosh said:
OTOH a failure of a few kilobytes is a pretty big blow and the only
sensible thing is to try and log a message, try to close your files and
resources and exit.
Or nag the user to close down his other applications and give you more
memory. Which is what the X windows handler to xmalloc() does.
Unfortunately, although the nag window itself is designed to be very
resource light, X itself isn't very robust to allocation failures, and I
can't think of any way round that.
There is also no way of demanding extra resources using just the standard
library, other than prompting the user on stdin, which is unlikely to be
acceptable.
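Malcolm's "nag the user" idea can at least be sketched in portable C. This is a hypothetical illustration (the name nagging_malloc is invented, and this is not how the X handler works); it just shows the one standard-library option he mentions, prompting on stdin:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical sketch: an xmalloc-style wrapper that nags on stderr
 * and waits on stdin before retrying, instead of aborting outright.
 * Blocking on stdin like this is, as noted above, unlikely to be
 * acceptable in a GUI program. */
void *nagging_malloc(size_t n)
{
    void *p = malloc(n);

    while (p == NULL && n > 0)
    {
        int c;
        fputs("Out of memory. Close some other applications, then\n"
              "press Enter to retry (q to give up): ", stderr);
        c = getchar();
        if (c == 'q' || c == EOF)
            return NULL;        /* caller still has to cope with NULL */
        p = malloc(n);
    }
    return p;
}
```

When memory is available the prompt path is never taken, so callers pay nothing in the common case.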
 
ymuntyan

This is why a complex application needs more than one type of
out-of-memory handler. If a fairly large allocation fails, there is a
real chance that a much smaller allocation /can/ succeed, either for
continuing the program's usual work at a slower pace or to pop up a
message box for the user giving him the option of quitting or killing
some other program and retrying.

OTOH, if a small allocation, say a kilobyte or less, fails then we
probably have a serious resource crunch. There is /no/ point in
reducing the allocation size and trying again, nor is there much point
in trying to bring up a dialogue box (though you can try). Probably a
sensible strategy is to attempt saving open files (which may or may not
succeed, but it won't hurt to try), attempt to log a message to stderr
or whatever and then exit with an error status.

But even in this severe case an abort() or a segfault or exiting
silently in a library routine is not a good strategy, IMHO.

Err, for allocations which can "normally" fail, for those big
allocations, you don't use g_malloc, you use g_try_malloc instead.
Which can return NULL. What you don't handle in a normal way (that
is you don't check return value because it won't be NULL, let's not
start again about losing user data) is failure on allocation of
a small chunk of memory, like when you are making a string, or
appending a node to a list, or creating a widget.

Yes. This is why not all allocations in a program can meaningfully have
the same error handling logic. Some allocations may be totally
redundant. Say the program wants to draw a nice splash screen. If
allocation fails here, nothing needs to be done; the program can still
attempt to run. A splash screen is just eye-candy.

Similarly the program may try to allocate a large buffer for efficiency
and if this fails, it can try running with a much smaller buffer. Again
the error handling is different.

So you can do that. g_try_malloc() the buffer if you know that you
will be able to cope with NULL. Same thing with a splash screen:
the core structure, allocation of which will abort (by default,
you can still save your user's data, blah blah blah) your application,
is pretty small. The image you will load into it will be allocated
using g_try_malloc, which can safely fail (at least gtk will do that,
you of course can blindly call g_malloc(MANY_MEGS)).

OTOH a failure of a few kilobytes is a pretty big blow and the only
sensible thing is to try and log a message, try to close your files and
resources and exit.

Yes, this is work, but the other option is to respond uniformly and
unintelligently to each and every resource acquisition failure, always
terminating abruptly on the user. I have used many such applications
and they are a /big/ annoyance.
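The tiered strategy santosh describes could look something like this hypothetical sketch (alloc_buffer and MIN_BUF are invented names, and the cut-off below which a failure is treated as a fatal crunch is an assumption):

```c
#include <stdlib.h>

#define MIN_BUF 1024    /* assumption: below this, it's a real crunch */

/* Hypothetical sketch of the tiered strategy: try a large buffer for
 * efficiency, halving the request on failure; only when even a modest
 * allocation fails does the caller treat it as fatal. */
char *alloc_buffer(size_t want, size_t *got)
{
    while (want >= MIN_BUF)
    {
        char *p = malloc(want);
        if (p != NULL)
        {
            *got = want;        /* run with whatever we could get */
            return p;
        }
        want /= 2;              /* degrade: smaller buffer, slower pace */
    }
    *got = 0;
    return NULL;                /* log, save what you can, exit */
}
```

The point is that the large request and the small request get different error handling, without the caller having to write two code paths at every site.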



No. An MP3 player can still run. Perhaps the stupid user tried to put
10000 titles into the playlist. Perhaps memory for this is unavailable
but enough memory is still available to continue running the current
track. As I said error handling depends (or should depend) intimately
on the exact contextual state of the application and its environment.

Well, if the mp3 player author thinks so, he can use data structures
other than glib's ones. Like use his own list instead of GList.
Normally he won't do that, he will use glib's list, and his
application will abort when you try to load a bajillion-title
playlist (10000 is not enough, if your computer won't get memory
for 10000 items, then you won't run that mp3 player in the first
place, we are talking about memory-greedy hippos here).
Perhaps it's too bad, you won't use such a player. For me
it's okay (I can change my mind when I see that player crash,
perhaps).

A uniform strategy is hardly better than no error handling at all.

First of all, it's not true. Second, huh?

Yevgen
 
Nick Keighley

Yes. It's got to put the data somewhere.

in a preallocated piece of memory maybe?

if ((x = malloc(27)) == NULL)
{
    save_stuff();
    exit(EXIT_FAILURE);
}

list_append(x);

You mean nine versus one is very few? OK.

nine? and some of those are brackets

And a code parser which will find all malloc() calls,
and then a script generator which will generate all
possible code paths to make debugger test those
places. I'd be glad to have such a tool, certainly.



Except where you've got a bug, in the place you won't
test.

sorry, you must have missed "You ALWAYS test the return from malloc()"

So do it. You *can*. Glib doesn't do anything but abort()
by default, naturally, since it can't do anything else.
If you are fine with this, do nothing. If not, write code
which will save the user data.

no this baffles me. If aborting in the malloc() wrapper
isn't the correct option you don't use it...

so...

no I'm still baffled. Why not follow a policy of try and allocate
memory, if it fails take some application specific recovery action?
Only the application can know what to do.

And man, I didn't say we should lose the user data!

but you have no choice if malloc crashes!

So do it? Who said you shouldn't? You can save data
and terminate. But you ought to terminate, with saving
or not, if you use glib.

what? glib only crashes sometimes?

you don't *have* to terminate. You could free some memory.
Or abort some operation. On large long running systems
you don't die just because the user can't display the current
alarm list.

That's it. But I am told that
an application can do more. No it can't.

sometimes it can.

If the main loop
can't push an event onto the event queue, then you're
screwed.

yes, that's a bit extreme. But if it's coming from an external
source you could discard it and wait for the repeat.

No "application performs worse" or "some parts
are not working".

I think you are mistaken.

(Note, it is not about all applications,
don't talk about failsafe mp3 players or about webservers,
those simply shouldn't use glib)

so what class of application should? Games? Mobile radio systems?
Database servers?
 
ymuntyan

in a preallocated piece of memory maybe?

if ((x = malloc(27)) == NULL)
{
    save_stuff();
    exit(EXIT_FAILURE);
}

list_append(x);





nine? and some of those are brackets

<snip>






sorry, you must have missed "You ALWAYS test the return from malloc()"

You missed the meaning of "test".

no this baffles me. If aborting in the malloc() wrapper
isn't the correct option you don't use it...

so...

no I'm still baffled. Why not follow a policy of try and allocate
memory, if it fails take some application specific recovery action?
Only the application can know what to do.

I don't know why not. You can do that.

but you have no choice if malloc crashes!

malloc doesn't crash.

what? glib only crashes sometimes?

It rarely crashes, indeed. What are you talking about?

you don't *have* to terminate. You could free some memory.
Or abort some operation. On large long running systems
you don't die just because the user can't display the current
alarm list

So you write the application in a different way. Gtk
toolkit is not for such an application (not for the
application which leaks so much that it runs out of
memory in the long run).

sometimes it can.

Sometimes yes. Who objects?

yes, that's a bit extreme. But if it's coming from an external
source you could discard it and wait for the repeat.

There won't be a repeat.

I think you are mistaken.


so what class of application should? Games? Mobile radio systems?
Database servers?

Desktop applications.

Yevgen
 
Malcolm McLean

sorry, you must have missed "You ALWAYS test the return from malloc()"

You're mixing up execute with fork. Your policy is to always fork on every
call to malloc(). The question is, do you then execute both branches of the
fork in every debug run?
 
Malcolm McLean

Kelsey Bjarnason said:
I see. It's better to simply kill the app and lose the data, than to
write error handling code which _might_ fail. Yes, I see. Given the
choice between no protection and some, you prefer none. Got it. And you
think this is sane, do you?

Might corrupt the data. It all depends on the data, of course. Sometimes a
corrupt document is better than none at all, sometimes a lot worse.
 
Richard Tobin

Kelsey Bjarnason said:
Glib's documentation says, "If any call to allocate memory fails, the
application is terminated".

Yes, but it also has functions that don't do that. For example, on
the page mentioned earlier, I see:

g_try_new()
...
Attempts to allocate n_structs elements of type struct_type, and
returns NULL on failure. Contrast with g_new(), which aborts the
program on failure.

Do feel free to explain how this ties in with a proper graceful shutdown
procedure saving all the available data.

Losing a user's data is obviously undesirable. But it happens even
with the most careful programming. Is the rate of loss that results
from these functions significant compared with that from power
failures, hardware faults, user error and so on? Perhaps the programs
in question take other steps to minimise the risk, such as auto-saving
frequently. Is it outside the range of normal risks that one takes
every day, such as using a frying pan without gloves on?

-- Richard
 
santosh

Malcolm said:
Might corrupt the data. It all depends on the data, of course.
Sometimes a corrupt document is better than none at all, sometimes a
lot worse.

I don't see how an allocation failure can corrupt data in a well written
program. At most a malloc() failure stops you doing any more work, but
how can it corrupt what already exists? Incomplete data, maybe, but
corrupt?
 
Malcolm McLean

santosh said:
I don't see how an allocation failure can corrupt data in a well written
program. At most a malloc() failure stops you doing any more work, but
how can it corrupt what already exists? Incomplete data, maybe, but
corrupt?

Every call to malloc() has a custom recovery routine in its if(!ptr) fork.
In the nature of things, this will only be tested when the program is hooked
up to a special apparatus that generates null mallocs for the purpose. In
the nature of things, we are halfway through constructing a data structure of
some kind when the allocation failure happens.
So we've got code that is unlikely to be tested as thoroughly as we would
like saving data that is likely to be in an unusual configuration. It is by
no means beyond the bounds of the possible that the data object will be
saved in such a way as to form a corrupt, but maybe valid, save file. For
instance the employee has a linked list of sales for which he receives a
bonus. The allocation fails, but the linked list is still written to file,
minus his last couple of sales. Whether this is worse than aborting without
a save depends on your data. In this case, one rather unhappy salesman, but
probably nothing that accounts can't put straight at cost of minor
secretarial and management time. If it's a biological database, on the other
hand, the error could persist for years, corrupting everyone's analyses.

A well written program won't have these bugs, of course. It's just that some
of us are more sceptical than others about the actual amount of testing of
error-handling code that goes on.
 
santosh

Malcolm said:
Every call to malloc() has a custom recovery routine in its if(!ptr)
fork. In the nature of things, this will only be tested when the
program is hooked up to a special apparatus that generates null
mallocs for the purpose. In the nature of things, we are halfway
through constructing a data structure of some kind when the allocation
failure happens. So we've got code that is unlikely to be tested as
thoroughly as we would like saving data that is likely to be in an
unusual configuration. It is by no means beyond the bounds of the
possible that the data object will be saved in such a way as to form a
corrupt, but maybe valid, save file.

Not necessarily corrupt. There are many cases where the incomplete data
set can be saved and processed at another program run. I personally use
the word "corrupt data" to describe data which has been clobbered by
program errors or hardware errors, like following a wrong pointer,
writing beyond array bounds, memory corruption etc.

An allocation failure usually leaves the program with incomplete data.
Now granted in some cases incomplete data is as good as corrupt data
and we must erase the whole set and start afresh, but in other cases,
it might be possible to write whatever data has been gathered thus far
to disk and resume processing later.

For instance the employee has a
linked list of sales for which he receives a bonus. The allocation
fails, but the linked list is still written to file, minus his last
couple of sales. Whether this is worse than aborting without a save
depends on your data.

Yes. In some cases incomplete data is useless. In other cases it can be
completed later and hence, in such circumstances it is *unacceptable*
to simply throw away all of it just because a part could not be
retrieved or created.

In this case, one rather unhappy salesman, but
probably nothing that accounts can't put straight at cost of minor
secretarial and management time. If it's a biological database, on the
other hand, the error could persist for years, corrupting everyone's
analyses.

If it is properly flagged and corrected (or deleted) later then it is
not an error. Database programs periodically run threads that check the
database for corruption and try to fix it. So do most filesystems.
Given enough effort even corrupt data could be fixable. So it's usually
not wise to throw away data because it's incomplete. Sometimes
incomplete data is normal for the application's line of work.

A well written program won't have these bugs, of course. It's just
that some of us are more sceptical than others about the actual amount
of testing of error-handling code that goes on.

Yes. Really well engineered programs are apparently rare but that's no
reason to get discouraged into writing slack code.
 
Malcolm McLean

santosh said:
Malcolm McLean wrote:

Yes. In some cases incomplete data is useless. In other cases it can be
completed later and hence, in such circumstances it is *unacceptable*
to simply throw away all of it just because a part could not be
retrieved or created.

Useless data can be saved without any problem. The snag comes when corrupt
data is dangerous, but missing data is merely inconvenient. For instance if
a nurse has no drug list for the patient because the program that keeps track
of them is down, that means waking up the doctor. If she has the wrong drugs
on the list because the program has corrupted the data, that means one dead
patient.
 
Richard Heathfield

Malcolm McLean said:

Every call to malloc() has a custom recovery routine in its if(!ptr)
fork. In the nature of things, this will only be tested when the program
is hooked up to a special apparatus that generates null mallocs for the
purpose.

Yes, but you can design the apparatus in software, and you can design it in
such a way that it fails only when you want it to, and then you can design
your tests to make it fail at each point in turn. This is not even
particularly difficult to do. (I have done it myself, so how hard could it
be?)
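The "apparatus in software" Richard describes is commonly a countdown allocator. A minimal hypothetical sketch (test_malloc and fail_after are invented names):

```c
#include <stdlib.h>

/* Hypothetical sketch of the "apparatus in software": a malloc wrapper
 * that fails on demand, so a test harness can drive each if(!ptr)
 * branch in turn by arming a different countdown on each run. */
static long fail_countdown = -1;    /* -1: never inject a failure */

/* Arm the apparatus: the n-th allocation from now returns NULL. */
void fail_after(long n)
{
    fail_countdown = n;
}

void *test_malloc(size_t n)
{
    if (fail_countdown > 0 && --fail_countdown == 0)
        return NULL;                /* injected failure */
    return malloc(n);
}
```

A test driver then loops: arm failure number 1, run the scenario, check the recovery path; arm failure number 2; and so on, until every allocation site has been made to fail at least once.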

<snip>
 
Flash Gordon

Richard Tobin wrote, On 29/01/08 18:01:
Yes, but it also has functions that don't do that. For example, on
the page mentioned earlier, I see:

g_try_new()
...
Attempts to allocate n_structs elements of type struct_type, and
returns NULL on failure. Contrast with g_new(), which aborts the
program on failure.

So which of those does the rest of glib use? You would have to avoid
calling any of the functions which call g_new(), not just avoid calling
it yourself.

Losing a user's data is obviously undesirable. But it happens even
with the most careful programming. Is the rate of loss that results
from these functions significant compared with that from power
failures, hardware faults, user error and so on? Perhaps the programs
in question take other steps to minimise the risk, such as auto-saving
frequently. Is it outside the range of normal risks that one takes
every day, such as using a frying pan without gloves on?

Well, I get the Lotus Notes client reporting that it doesn't have enough
memory to open a new window a *lot* more frequently than I get it
crashing, and the HW it is running on has yet to fail. I also get VMWare
reporting an out of resources message a lot more often than it crashes
on me (I've not bothered verifying what the resource shortage is, but
the only resource I am normally short of is memory). So there you go,
real apps that really do handle out-of-memory situations.
 
Flash Gordon

Malcolm McLean wrote, On 29/01/08 10:06:
I've done this as well. It was adding so much complexity to code, all
because of allocation failures that couldn't happen.

It did not cause me significant extra work. The most likely reason for
the difference is that I designed the entire system knowing that
out-of-resource errors *do* occur so it was part of the entire design
concept rather than an extra I had to try and fit in.

Oh, and my real-time applications degraded gracefully rather than
failing on out-of-time errors as well. Again it was a case of
considering how to do the job properly during the design of the
application, not during coding.

Finally, within BabyX (my X windows toolkit) there was no way I could
think of to propagate the error conditions back to the caller. Flow
control is just too complex, with the whole thing being held together
by a network of function pointers. So I decided BabyX would use
xmalloc().

Well, I would not have used your BabyX library anyway, but now I have
even more reason to avoid it.

Then I realised that this released something for string handling:
because we know that those string functions can never return null, code
using them is so much more expressive and flexible.

Your knowledge is still faulty.
 
Flash Gordon

Malcolm McLean wrote, On 29/01/08 19:29:
Useless data can be saved without any problem. The snag comes when
corrupt data is dangerous, but missing data is merely inconvenient. For
instance if nurse has no drug list for the patient because the program
that keeps track of them is down, that means waking up the doctor. If
she has the wrong drugs on the list because the program has corrupted
the data, that means one dead patient.

A program that correctly reports that it could not provide the drug list
due to lack of memory (either from that program, or from the program that
was meant to write the file but instead wrote an indication to that effect)
is even more useful. Then there is no risk of the nurse assuming that no
drug list means no drugs. Also the hospital will know that they have a
problem causing memory exhaustion rather than anything else which could
help with resolving it long term.
 
ymuntyan

[snips]



Oh yes. Easy. How many lines of code is that?

Just about exactly the number required to handle the error without doing
something as abysmally bad as killing the application, tossing the user's
data, simply because you're too freaking lazy to write code to handle the
allocation failure.

And how many lines of code
will you get if you do it ten times?

Depends on the structure of the code, now don't it?

Yeah. And in a well-designed application, ... Blah blah
blah.

I see. It's better to simply kill the app and lose the data, than to
write error handling code which _might_ fail. Yes, I see. Given the
choice between no protection and some, you prefer none. Got it. And you
think this is sane, do you?

Once again I "lose user data" and so on. Man, am I evil!
Except you put words in my mouth, and I didn't say that.

No, that's *your* asinine approach. I prefer apps written by people who
actually give a damn about the data.


What *are* you babbling about?

Stuff which you have no idea about, I take it. So if you have
no idea what I am talking about, how can you advise on how
to do that properly? Because I am not thinking about
"general" design here, I will do that some other time.
I was talking about Gtk applications, ones which have that
main loop inside. GUI, you know. Main loop, events, stuff
like that. If you are talking about how to write "applications",
then I will agree with all that you said (meaning, I don't care).
If you are talking about those GUI applications which, as
you claim, should not use glib, then learn what you are
talking about first.

[snip]

Take another look. Says right there in the docs, the app is aborted.
You know, *aborted*. Doesn't even get a returned NULL or other indicator
which would indicate "something wrong, save your data", it just gets
summarily nuked.

I suggest you read the docs you refer to. At least that *whole*
page. It's not as simple as

void *g_malloc(size_t n)
{
    void *ptr;

    if (!(ptr = malloc(n)))
        abort();
    return ptr;
}

Your application *can* do stuff on malloc() failure, whether
you see it or not.

Good bye,
Yevgen
 
ymuntyan

Richard Tobin wrote, On 29/01/08 18:01:
So which of those does the rest of glib use? You would have to avoid
calling any of the functions which call g_new(), not just avoid calling
it yourself.

g_malloc() is not a raw wrapper around malloc. You can get into
it, and do stuff where malloc() fails. And *then* return NULL
to glib, and it will abort (or quit the application yourself). Or
even fancier, return memory from some emergency pool, and shut
down the application later, as you would after a signal is caught.
What your imagination allows.
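The emergency-pool idea sketched in words above might look like this in plain C (pool_malloc and the reserve size are invented for illustration; in GLib of that era the hook itself would be installed with g_mem_set_vtable()):

```c
#include <stdlib.h>

#define RESERVE_SIZE (64 * 1024)    /* assumption: enough to shut down */

static char reserve[RESERVE_SIZE];  /* emergency pool, static storage */
static size_t reserve_used;
static int emergency;               /* set once we dip into the pool */

/* Hypothetical sketch: on malloc failure, hand out memory from a
 * static reserve and flag the application to shut down cleanly,
 * much as one would after catching a signal. */
void *pool_malloc(size_t n)
{
    void *p = malloc(n);
    if (p != NULL)
        return p;

    n = (n + sizeof(void *) - 1) & ~(sizeof(void *) - 1);  /* align */
    if (reserve_used + n <= RESERVE_SIZE)
    {
        p = &reserve[reserve_used];
        reserve_used += n;
        emergency = 1;              /* schedule the clean shutdown */
        return p;
    }
    return NULL;                    /* even the reserve is exhausted */
}
```

A real version would also need to make free() safe on pool pointers; this sketch leaves that out, since the process is about to exit anyway.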

Well, I get the Lotus Notes client reporting that it doesn't have enough
memory to open a new windows a *lot* more frequently than I get it
crashing, and the HW it is running on has yet to fail. I also get VMWare
reporting an out of resources message a lot more often than it crashes
on me (I've not bothered verifying what the resource shortage is, but
the only resource I am normally short of is memory). So there you go,
real apps that really do handle out-of-memory situations.

Linux VMWare uses Gtk for its UI, isn't it funny? That "out of
resources message" corresponds to a failed g_try_malloc(), so
what? Yes, you can call g_try_malloc.

As to "frequently", here is another piece of usage data: when
applications crash here, they crash because of random bugs,
never because of malloc failure. Does that mean malloc() never
fails? I mean, yeah, Lotus is great. So what? What losses actually
happen because g_malloc() will abort by default when
malloc() fails? How often, how much?

Yevgen
 
