xmalloc string functions


Randy Howard

Kelsey Bjarnason said:
[snips]

On Mon, 28 Jan 2008 04:17:54 +0000, Yevgen Muntyan wrote:
All this apart from real
problems you have to solve. Yes, *real*. No, g_malloc() aborting an
application is not a real problem. Not for a regular desktop
application.

Except that at least one person *here*, in a comparatively small
community, has reported application crashes *precisely* due to this.

Yes, Kelsey, but so what? Read what he's saying: this is *not* a real
problem. It's only data - how could mere data possibly be important enough
to justify spending a little extra time working out how to salvage it in
the event of an allocation failure?

Apparently, "regular old desktop applications", such as Office suites,
database front-ends, image management programs like Photoshop, Gimp,
etc., Mail applications and such, all contain information too worthless
to be worth wasting the precious developer's time on. What's a few
hours of work here or there on the user's behalf, when the developer
has a LAN party to go to, or needs to practice riding his unicycle up
and down the hall?

I wonder what the reaction would be if their favorite 'all the bells
and whistles' IDE blew up in their face and they lost all the source
code changes they had made in the last few hours? (All four lines of
them)

You're a dinosaur who has failed to adapt to the Postmodernist school of
programming, where anything goes, any old garbage is acceptable, users are
worth nothing, and their data is worth rather less than nothing. Forget
best software engineering practice, because it's old hat nowadays - if you
want robustness, buy a Volvo.

glibfoo->gCreateMassiveAppWithgoverhead(GIGABYTE);
/* pretend like nothing bad can happen here, they're only dumb
* users. Besides, the code is open source, if they don't like
* it, they can bloody well fix it by themselves
*/
 

Randy Howard

The point is that they don't. If fopen() fails, there is nothing
to recover from.

False. If fopen() fails, it could have been because the user had a
typo in a filename he wanted to open in another window to cut/paste
from. When you bail out of the app because this file couldn't be
opened, that's just plain /stupid/. There are cases where a fopen()
failure could be critical with no possible way to recover, continue, or
retry, but they are rare.
Though if all your application does is fopen(),
then you can safely abort() when fopen() fails.

Or, you could try and determine why it failed, and take corrective
action, either automatically, or at the user's direction. But hey,
that cuts into the Xbox360 time, eh?
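
For what it's worth, "determine why it failed" is cheap in C: on POSIX systems fopen() sets errno, so the caller can report the specific reason and let the user retry. A minimal sketch, with the helper name try_open invented for illustration:

#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: report *why* the open failed instead of dying.
 * On POSIX systems fopen() sets errno, so a typo (ENOENT) can be told
 * apart from a permissions problem (EACCES). */
FILE *try_open(const char *path, const char *mode)
{
    FILE *fp = fopen(path, mode);
    if (fp == NULL)
        fprintf(stderr, "can't open '%s': %s\n", path, strerror(errno));
    return fp;  /* NULL means "let the user pick another file", not "abort" */
}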
So, you work with a list, and you append an element to it.
Now you do list = g_list_append(list, something); with malloc
error handling you'll have to test whether list_append()
succeeded. Not much more typing, no. A little bit more, huh?
C++ exceptions would be appropriate here, but manual error
checking in C code *is* much much more typing.

No, it's way better if the guy happens to have a dozen apps open at the
moment, and this one is really critical work that he's been entering
data into for the last 2 hours. You could do this:

Display a message to the effect of "Unable to add new record due to an
out of memory condition, please close some other applications if you
would like to try again, or save the work in progress to prevent data
loss"

But instead, this /wonderfully/ designed application aborts and dumps
all his work. That's so much better, right? Clearly this "simpler"
solution is much better than actually protecting the user's data in
glib land. I just can't figure out how they convinced anyone to use
it.


Maybe the real issue with open source is who it lets into the party,
not how much the cover charge is.
There are about five bazillion allocations, debugger won't do. Random
malloc() failures will do as a nice stress test, yes. But you still
won't be able to test it properly. At least not that piece of code
where it will segfault when user runs it (here I assume that user
will be able to see it, perhaps on windows).

If you check, it won't segfault. That's the whole point. You'll
detect an error, and handle it, instead of merrily trudging along and
counting on the runtime to abort your entire process so you don't have
to worry about branch prediction hits in your wonderfully bloated, yet
somehow pseudo-optimized pile of crap you foist on the user community.
A better thing to do
is to test malloc() failure in one place, and possibly do what you
can do there, and abort.

That is an option, out of many that are available in general. With
glib or other designs with that model, that's pretty much all you're
left with. Any chance of doing anything even remotely professional in
light of an allocation failure is out the window.
Yeah, mozilla leaking too much. Or evolution leaking too much (?).
It would be nice to see something more substantial than "I know
for a fact" (debugged it, looked at the core file?). And would
be nice to hear about gedit or gnumeric crashing because of malloc()
failure.

Turn off overcommit, fire up some quick command line tools to chew up
ram and fill up the swap, and watch them all start crashing. Easy.
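
For anyone who wants to reproduce that: on Linux the vm.overcommit_memory sysctl controls overcommit, and the "quick command line tool to chew up ram" is a few lines of C. A rough sketch; it holds its allocations on purpose, since starving the system is its whole job:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Crude memory hog for stress-testing other programs' malloc()
 * handling: grab a megabyte at a time and touch every page so the
 * memory is really committed, then sit on it until dismissed. */
int main(void)
{
    size_t mb = 0;
    for (;;) {
        char *p = malloc(1024 * 1024);
        if (p == NULL)
            break;
        memset(p, 0xAA, 1024 * 1024);  /* force the pages in */
        mb++;
    }
    printf("malloc() started failing after %zu MB\n", mb);
    getchar();  /* keep the memory tied up while you watch other apps */
    return 0;
}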
Avoiding losing your work in an emergency condition is just
a different story,

No, it isn't. In the majority of cases where a malloc() fails other
than immediately after program launch there is a potential for data
loss, file corruption, leaving stale files laying around, etc.
say your application loses its
terminal or X connection, in those cases you can possibly do
something to save user's work. And that's something you can
(try to) do from inside g_malloc() when malloc() fails. It's
not necessary to write a g_list_append() which can fail for
that.

What about when the g_list_append() is the thing that fails, and it's
trying to add the 4000th new element to this list, of which the 3999
other elements are already in, but the data has not been saved? This
little "I didn't feel it was necessary to handle that" comment isn't
going to make the user happy at all. But who cares, it's free software
right, they have no complaints coming.
So, you got an event, you need to put it into the event queue.
Either you allocate memory for that (and it fails), or you
preallocate memory, it is not enough, and you try to allocate
again (and it fails). What can you do apart from some emergency
action (saving important data or something) and exit? How
do you "recover"?

That little act of "saving important data or something" isn't a minor
thing. You /might/ not be able to recover completely and continue if
nothing else happened. You should be able to save work in progress
though. You might even be able to notify the user that you're
having an issue with grabbing more memory, and let the user try and
shut down some other apps to make it available.

But what the hell, that's just too much trouble.
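
The dialog described above boils down to a retry loop around the allocator. A console-flavoured sketch of the idea; retry_malloc is a made-up name, and a GUI would use a message box rather than stderr/stdin:

#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper: on failure, tell the user what happened and
 * let them free memory elsewhere before retrying, instead of aborting. */
void *retry_malloc(size_t size)
{
    for (;;) {
        void *p = malloc(size);
        if (p != NULL)
            return p;
        fprintf(stderr, "Out of memory (%zu bytes). Close some other "
                "applications, then press Enter to retry, or q to give up: ",
                size);
        int c = getchar();
        if (c == 'q' || c == EOF)
            return NULL;  /* caller decides: save work, degrade, or exit */
        while (c != '\n' && c != EOF)
            c = getchar();  /* discard the rest of the input line */
    }
}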
 

dj3vande

glibfoo->gCreateMassiveAppWithgoverhead(GIGABYTE);
/* pretend like nothing bad can happen here, they're only dumb
* users. Besides, the code is open source, if they don't like
* it, they can bloody well fix it by themselves
*/
assert(user_didnt_try_to_frotz_a_widget_that_was_already_frobbed);

That bug report hasn't gone anywhere. How much do you want to bet that
if a commercial program was dumping core because two internal checks
disagreed about what was valid and a user reported the problem, how to
reproduce it, and a suggested fix, it would be fixed within two years
of being reported?


dave
(yes, it's fixed in my local build. no, that's not the point.)
 

Randy Howard

Kelsey Bjarnason said:
All this apart from real
problems you have to solve. Yes, *real*. No, g_malloc() aborting an
application is not a real problem. Not for a regular desktop
application.
Except that at least one person *here*, in a comparatively small
community, has reported application crashes *precisely* due to this.

Yes, Kelsey, but so what? Read what he's saying: this is *not* a real
problem. It's only data - how could mere data possibly be important enough
to justify spending a little extra time working out how to salvage it in
the event of an allocation failure?

Strawman.

It's not a strawman at all. He's simply taking your position "aborting
an application is not a real problem. Not for a regular desktop
application" and putting it in real world terms.

Apparently there might exist for you some class of "irregular
applications" that are worthy of actual error handling. But "regular
desktop applications" deserve no such effort.
 

ymuntyan

False. If fopen() fails, it could have been because the user had a
typo in a filename he wanted to open in another window to cut/paste
from. When you bail out of the app because this file couldn't be
opened, that's just plain /stupid/. There are cases where a fopen()
failure could be critical with no possible way to recover, continue, or
retry, but they are rare.

There is nothing to recover because failure of fopen()
is a normal situation. Absolutely different from a failure
of malloc() when you are trying to allocate a structure
to push into the event queue to scroll text a bit later.
Or, you could try and determine why it failed, and take corrective
action, either automatically, or at the user's direction. But hey,
that cuts into the Xbox360 time, eh?

Strawman, you see. We were talking about malloc() and
you tell me my application will crash when fopen() fails.

Note that if your application can ask the user about something,
then it does more than fopen(). If it can take any
action when fopen() fails, then it does more. That's
what I meant. cat utility doesn't have much to do
if it can't open the file it's supposed to read.
A gui application doesn't have much to do if it
doesn't have memory to draw stuff it draws.
No, it's way better if the guy happens to have a dozen apps open at the
moment, and this one is really critical work that he's been entering
data into for the last 2 hours. You could do this:

Display a message to the effect of "Unable to add new record due to an
out of memory condition, please close some other applications if you
would like to try again, or save the work in progress to prevent data
loss"

BS. You can't display a message if you don't have memory. You could
reserve some, specially for the message, and *try* to display it.
But it won't show up anyway.
But instead, this /wonderfully/ designed application aborts and dumps
all his work.

It's not necessary to dump the user's work. On the contrary,
you should try to save his work if you can. But I'll be glad
to see what you would do in a dictionary application. Or in
an mp3 player. If you believe that an mp3 player shouldn't
abort when it can't allocate memory for a playlist, it's a
fine opinion, and you simply shouldn't use mp3 players
based on glib. Others still will (you know why? because
it won't abort)
That's so much better, right? Clearly this "simpler"
solution is much better than actually protecting the user's data in
glib land. I just can't figure out how they convinced anyone to use
it.

Maybe the real issue with open source is who it lets into the party,
not how much the cover charge is.

So, strawman.

If you check, it won't segfault. That's the whole point. You'll
detect an error, and handle it, instead of merrily trudging along and
counting on the runtime to abort your entire process so you don't have
to worry about branch prediction hits in your wonderfully bloated, yet
somehow pseudo-optimized pile of crap you foist on the user community.

So, we got to crap finally. Good. Think about what you do when you
next run a shell, or python, or a perl script. Aren't you
afraid of using gui applications, by the way? Or can you
present an example of one which continues working after malloc()
fails? Source code please.
That is an option, out of many that are available in general. With
glib or other designs with that model, that's pretty much all you're
left with. Any chance of doing anything even remotely professional in
light of an allocation failure is out the window.

Professional talking here! Have you written some application
for us mere mortals to learn from?
Turn off overcommit, fire up some quick command line tools to chew up
ram and fill up the swap, and watch them all start crashing. Easy.

Even easier, use kill(1). So?
No, it isn't. In the majority of cases where a malloc() fails other
than immediately after program launch there is a potential for data
loss, file corruption, leaving stale files laying around, etc.


What about when the g_list_append() is the thing that fails, and it's
trying to add the 4000th new element to this list, of which the 3999
other elements are already in, but the data has not been saved? This
little "I didn't feel it was necessary to handle that" comment isn't
going to make the user happy at all. But who cares, it's free software
right, they have no complaints coming.
Huh?


That little act of "saving important data or something" isn't a minor
thing. You /might/ not be able to recover completely and continue if
nothing else happened. You should be able to save work in progress
though. You might even be able to notify the user that you're
having an issue with grabbing more memory, and let the user try and
shut down some other apps to make it available.

But what the hell, that's just too much trouble.

Strawman. Who said you shouldn't save data? Do save it. And
exit the application.
 

Kelsey Bjarnason

[snips]

You shouldn't equate glib with GNOME.

Perhaps not, but if Gnome _uses_ glib, I have to question the thinking
which went into that decision.
Miguel de Icaza and some others
have opined that C is an inadequate tool for what they wish to
accomplish on the desktop.

Possibly, though I still don't see how that excuses "Whoops, can't
allocate a menu item, so screw your data, it don't matter, chuck you
farley, the app is dead."

OTOH, maybe you can. I'm not sure how many people have followed Miguel's
lead. And glib seems, anecdotally, to have become more pervasive in Gtk+
and GNOME apps.

Among other things, apparently. Oh well, best we can do, I guess, at
this point is try to stop the rot from spreading.
 

ymuntyan

[snips]

The point is that they don't. If fopen() fails, there is nothing to
recover from.

If my fopen fails and I can't load my app's config file, I can't get the
preferred options. Depending on the app, this could be anything from a
minor annoyance to a critical failure. In every one of those cases,
there is something to recover from.

If my fopen fails because the user has no read permissions (or no write
permissions, if he's trying to open for write), there is something to
recover from - let him know he has no permissions, go back to the file
select dialog (or whatever called fopen) and let the user decide what to
do about it - choose a different file name, fix the permissions, whatever.

I'm sorry, but "there's nothing to recover from" is just so completely
out of touch with reality I can't believe you said it.
So, you work with a list, and you append an element to it. Now you do
list = g_list_append(list, something); with malloc error handling you'll
have to test whether list_append() succeeded.

That depends entirely on how your list functions work. As a simple
example:

/* NODE and LIST are actually the same thing, just written differently
   for clarity's sake */
void list_append( LIST *list, NODE *new )
{
    NODE *node = list;

    /* Find end of list */
    while( node->next != NULL )
        node = node->next;

    /* Add item to end of list */
    node->next = new;
    new->next = NULL;
}

Sorry, what, exactly, did I need to check here with malloc error
handling? What's that you say, nothing at all, since there _is_ no
malloc involved in adding a node to the list? Ah, yes. Of course, we
need to allocate the node itself:

NODE *node;
LIST *list;

...

node = malloc( sizeof( *node ) );
if ( node == NULL )
{
    blah
}
else
{
    list_append( list, node );
    blah
}

Golly, how unbearably painful.

Oh yes. Easy. How many lines of code is that? And how
many lines of code will you get if you do it ten times?
That is, lines of untested code (no, I won't buy the
tales that you will test it. You won't)
Huh? A segfault is a result, generally, of one of two things: a runaway
pointer, or an allocation failure - you know, the very thing we're
suggesting you actually design your code to test for.

I'm saying that you got so many possible segfault places
that you won't test the one where it will actually segfault.
Really? Okay, fine. I've got allocated data buffers *with live data* in
793 different places in the code. The only way your "one place" is going
to be able to "do what you can there" and save the data is if every
single piece of data in the entire program is a freaking *global*, which
is *not* gonna happen.

Now, if I don't use this half-baked notion of "allocate or die", I can
report the failure to the caller, and then to its caller, and so on and
so forth, with each level doing whatever is appropriate for the data it
has in its care, *none* of which your method has *any* ability to do.

But hey, it's not like data matters, right? Who cares, data's worthless,
just crash the app.

Okay, okay. After we crashed the application and lost user data,
let's talk about that old boring main loop. Who is it going to
report errors to? After we failed to process an event in the user
code, user code returns an error to the events dispatcher. What
does that do? Pretends nothing happened? What do you do if signal
emission failed because you can't allocate memory required for
the emission? The callbacks must be called, otherwise you get
all sorts of funny results, including data loss. You just got
to quit the application, that's all you can do. Save the data
if you can, certainly.
Could be either. Point is, we know that _real world_ applications _are_
dying from this design principle, which means that while today it's "only
a browser", there's no telling what it will be tomorrow.

Err, if an application eats too much memory then the die-on-oom
simply doesn't work for it. It's the application bug if it eats
too much memory and dies. When gedit crashes because of malloc()
failure, *then* there will be a gedit bug. Hypothetical abort()
in gedit is not something I would worry about if I were a gedit
developer.
No, it's the same story. Application tried to allocate something,
couldn't, *died* and took the data with it. Depending on the app - and
the data - this would be grounds for anything from simply pulling hair
out to actually hunting down the developer with a baseball bat. Or a
lawyer.

So don't lose data? You can avoid losing data with glib, yes.
Just please stop talking this stuff about losing data. Data
must not be lost, period. An application may abort on malloc()
failure, period. These two *are* possible to combine.
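
One concrete way to combine the two, for what it's worth: abort() raises SIGABRT, and a handler installed for SIGABRT still runs, so even an abort-on-OOM library leaves room for a last-ditch save. This is a POSIX-level sketch, not anything glib documents as supported; only async-signal-safe calls (open/write, not stdio or malloc) belong in the handler, and the buffer and file name here are invented:

#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static char document[4096];  /* the user's unsaved work (toy example) */
static size_t document_len;

/* Runs even when some library calls abort() on malloc() failure,
 * because abort() raises SIGABRT. Async-signal-safe calls only. */
static void emergency_save(int sig)
{
    (void)sig;
    int fd = open("recovered.txt", O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd >= 0) {
        (void)write(fd, document, document_len);
        close(fd);
    }
    signal(SIGABRT, SIG_DFL);  /* restore default and finish dying */
    raise(SIGABRT);
}

int main(void)
{
    signal(SIGABRT, emergency_save);
    strcpy(document, "hours of unsaved work\n");
    document_len = strlen(document);
    /* ... run the application; an abort() deep inside a library now
     * triggers emergency_save before the process terminates ... */
    return 0;
}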
"So what" is that my data wasn't hooped by your brain-dead strategy of
simply aborting on error, that's what. I know you don't think users'
data actually *means* anything, but I can assure you, the *user* thinks
it does.

Sorry, I didn't say lose data. I said you can't continue working
as if nothing happened after that. "Recovery code will be able to run
successfully", right. Now read what you snipped, and *then* argue.
 

Randy Howard

BS. You can't display a message if you don't have memory. You could
reserve some, specially for the message, and *try* to display it.
But it won't show up anyway.

Sigh. A malloc() failure does not necessarily mean that you are out
of memory in the entire system, or even in your own process. It means
that a specific call to malloc() failed. It's strange that you say
this is "BS", when I have seen these sorts of messages appear on
terminals and displays periodically in various applications over the
last 30 years. Maybe they used magic fairy dust.
It's not necessary to dump the user's work. On the contrary,
you should try to save his work if you can. But I'll be glad
to see what you would do in a dictionary application.

Are you claiming that no dictionary application offers any recovery for
a malloc() failure other than an abort? If not, what are you trying to say?
If you believe that an mp3 player shouldn't
abort when it can't allocate memory for a playlist, it's a
fine opinion, and you simply shouldn't use mp3 players
based on glib.

On that we can agree.
Others still will (you know why? because it won't abort)

I can't parse this with any degree of confidence.

At any rate, it seems fairly clear that neither of us is going to move
the other, so it seems pointless to continue.
 

Kelsey Bjarnason

[snips]

Oh yes. Easy. How many lines of code is that?

Just about exactly the number required to handle the error without doing
something as abysmally bad as killing the application, tossing the user's
data, simply because you're too freaking lazy to write code to handle the
allocation failure.
And how many lines of code
will you get if you do it ten times?

Depends on the structure of the code, now don't it?
I'm saying that you got so many possible segfault places that you won't
test the one where it will actually segfault.

I see. It's better to simply kill the app and lose the data, than to
write error handling code which _might_ fail. Yes, I see. Given the
choice between no protection and some, you prefer none. Got it. And you
think this is sane, do you?
Okay, okay. After we crashed the application and lost user data,

No, that's *your* asinine approach. I prefer apps written by people who
actually give a damn about the data.
let's
talk about that old boring main loop. Who is it going to report errors
to?

What *are* you babbling about?
After we failed to process an event in the user code, user code
returns an error to the events dispatcher. What does that do?

No idea. If it's built to your standards, it probably kills the app
because it couldn't put a "mouse moved" notification into the queue.
Err, if an application eats too much memory

"Too much" according to whom? You? Some random spec nobody outside the
Gnome world has ever seen? Process limits?
doesn't work for it. It's an application bug if it eats too much memory
and dies.

Ah, I see. It's a *bug* if the app eats too much memory and dies. Hmm,
wait a second... that's *exactly* what glib does on allocation failures -
causes the app to die. Yes, we're agreed, it's a bug. A stupendously
bad one which should never have seen the light of day.

I'm glad we're in agreement. Of course, the proper thing to do there is
*not* to abort, but to return a notification of allocation failure. I'm
sure you'll report this to the Gnome team.
So don't lose data?

Right. Which means do *NOT* use retarded libraries like Malcolm's
xmalloc, or glib, which insists upon aborting the app on allocation
failure.
You can avoid losing data with glib, yes.

Really? Good. Please explain how, when one of the glib allocation
functions - the ones which _abort the app_ on failure - fails, you can
perform a sensible data save on exit. You know, data which may be strewn about in
variables visible only to the functions they're defined in, that sort of
thing.

Just
please stop talking this stuff about losing data.

As soon as this nonsense about aborting the app on allocation stops being
included in code, I'll stop talking about losing data. Let me know when
glib is fixed to work in a sane and safe manner.
Sorry, I didn't say lose data.

Take another look. Says right there in the docs, the app is aborted.
You know, *aborted*. Doesn't even get a returned NULL or other indicator
which would indicate "something wrong, save your data", it just gets
summarily nuked.

Yes, you did say "lose data".
 

Flash Gordon

The point is that they don't. If fopen() fails, there is nothing
to recover from. Though if all your application does is fopen(),
then you can safely abort() when fopen() fails.

As others have pointed out, that is a good way to stop people using your
apps.
So, you work with a list, and you append an element to it.
Now you do list = g_list_append(list, something); with malloc
error handling you'll have to test whether list_append()
succeeded. Not much more typing, no. A little bit more, huh?
C++ exceptions would be appropriate here, but manual error
checking in C code *is* much much more typing.

Not always. I've done it using structured programming in assembler
without implementing exceptions, and at each point I checked the status
and propagated the error until it could be handled. The handling
consisted of processing what it had memory for, giving degraded
performance instead of giving up, which would not have been acceptable.
It was also easy to do because I knew resources were limited and designed
the SW assuming that they could run out.
There are about five bazillion allocations, debugger won't do. Random
malloc() failures will do as a nice stress test, yes. But you still
won't be able to test it properly. At least not that piece of code
where it will segfault when user runs it (here I assume that user
will be able to see it, perhaps on windows). A better thing to do
is to test malloc() failure in one place, and possibly do what you
can do there, and abort.

You propagate the error upwards until it can be handled sensibly. So
most of your malloc failure tests are simply returning a failure code
after maybe some simple local recovery. Far better to spend a little
thought in design and implementation than to lose the user's hard work.
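
The shape being described is ordinary status-code plumbing. A hedged sketch, all names invented:

#include <stdlib.h>

/* Each layer reports failure to its caller instead of aborting;
 * only the top level has the context to decide what to do. */
typedef enum { OK, ERR_NOMEM } status;

static status add_record(char **slot, size_t n)
{
    char *p = malloc(n);
    if (p == NULL)
        return ERR_NOMEM;  /* "simple local recovery": just report */
    *slot = p;
    return OK;
}

static status import_rows(char *rows[], size_t count, size_t rowsz)
{
    for (size_t i = 0; i < count; i++) {
        if (add_record(&rows[i], rowsz) != OK) {
            while (i > 0)  /* undo what we did, then pass the error up */
                free(rows[--i]);
            return ERR_NOMEM;
        }
    }
    return OK;  /* the caller, which owns the UI, decides whether to
                   warn the user, degrade, or save and exit */
}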
Yeah, mozilla leaking too much. Or evolution leaking too much (?).
It would be nice to see something more substantial than "I know
for a fact" (debugged it, looked at the core file?). And would
be nice to hear about gedit or gnumeric crashing because of malloc()
failure.

Well, I have seen the Lotus Notes client report out-of-memory on
attempting to open a window. The Notes client did not crash, it reported
it and left the dialogue box there. When I could I closed down some
other applications (not immediately) and avoided losing partly typed
emails. I've also had VMWare report out-of-resource at times when the
only resource that was tight was memory, and again it gave me the chance
to recover the situation which saved me significant work because I had
two VMs running and the state between them was important and took time
setting up.
Avoiding losing your work in an emergency condition is just
a different story; say your application loses its
terminal or X connection, in those cases you can possibly do
something to save user's work. And that's something you can
(try to) do from inside g_malloc() when malloc() fails. It's
not necessary to write a g_list_append() which can fail for
that.

See above. I *use* applications that do not abort but instead allow me
to keep all my data and continue.
Yes I actually do (try xorg sources). But it's not quite relevant,
since I wasn't talking about X server. Though if X server dies then
Xlib will kill the application too.

If the X-server itself is passing the event to the application then it
is whether the X-server has the memory that is important. The
application can generate events too, but the application can do things
to recover itself.
So, you got an event, you need to put it into the event queue.
Either you allocate memory for that (and it fails), or you
preallocate memory, it is not enough, and you try to allocate
again (and it fails). What can you do apart from some emergency
action (saving important data or something) and exit? How
do you "recover"?

You increase the space allocated *before* you are out so that if it
fails you still have the resources to pop up a dialogue telling the user
and giving them a chance to do something about it.
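
That "grow while you still can" idea looks roughly like this; the event type and the RESERVE figure are invented for illustration:

#include <stdlib.h>

#define RESERVE 64  /* slots we insist on keeping free */

typedef struct { int type, x, y; } event;

static event *queue;
static size_t used, capacity;

/* Grow the queue while there are still RESERVE free slots, so that
 * when realloc() finally fails there is still room to enqueue the
 * "warn the user / save your work" events. Returns 0 on success. */
static int ensure_headroom(void)
{
    if (capacity - used > RESERVE)
        return 0;  /* plenty of slack left */
    size_t newcap = capacity ? capacity * 2 : 4 * RESERVE;
    event *bigger = realloc(queue, newcap * sizeof *bigger);
    if (bigger == NULL)
        return -1;  /* not fatal yet: the reserve is untouched */
    queue = bigger;
    capacity = newcap;
    return 0;
}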
Right, he didn't suggest anything.

He did not suggest the specifics because the specifics vary depending on
the situation. However he did say that you don't throw away the user's data.
Recovery code will be able to run successfully, so what?
The rest of the code still wants memory.

The recovery may be as simple as telling the user there is no memory to
open a new window and let the user continue with the existing ones (I've
*seen* this behaviour). Or it may give the user the option of retrying
or aborting. Or...
Again, no recovery process will be able to make your main loop
spin happily again. The recovery process can't get memory from
nowhere (unless by recovery you mean killing parts of application,
in which case it again doesn't make sense to proceed in normal way).

Well, I've seen GUI applications do sensible things, thus saving me
from losing the data. See above.
Perhaps. Except it's not "just". Again, I'll be delighted to see an
application which handles malloc() failure when it draws a menu label.

I've seen it done on attempting to open a window. Either it was not
enough memory to open the window or it was some component of the window,
either case is not hard to handle.
Preferably its code, to learn from. Oh, and see the code which works
with list allocated on heap, which handles every possible failure of
list_append() (no exceptions and alike please, don't we agree that
all we need is 'if (failed()) recover()'?).

The precise strategy varies. Fortunately the applications I have to use
*do* handle the problem in ways that avoid losing my data.
 

Nick Keighley

does g_list_append() allocate memory?

Oh yes. Easy. How many lines of code is that?

very few
And how
many lines of code will you get if you do it ten times?

still very few
That is, lines of untested code (no, I won't buy the
tales that you will test it. You won't)

it seems you're making assumptions about his development methodology


and the wrapper?
I'm saying that you got so many possible segfault places
that you won't test the one where it will actually segfault.

there won't be *any* segfault places. You ALWAYS test the return from
malloc().

Okay, okay. After we crashed the application and lost user data,

no no no! The point is you do something with the user data
*before* you terminate the application.
let's talk about that old boring main loop. Who is it going to
report errors to?

I'm not sure what your loop does.

After we failed to process an event in the user
code, user code returns an error to the events dispatcher. What
does that do?

if the user code saves the data the event loop can terminate.
Your crashing malloc() doesn't allow this option.
You make the decisions in the user code which understands the
application and not in the malloc library.

Pretends nothing happened? What do you do if signal
emission failed because you can't allocate memory required for
the emission? The callbacks must be called, otherwise you get
all sorts of funny results, including data loss. You just got
to quit the application, that's all you can do. Save the data
if you can, certainly.

but you can't do that in your code!

Err, if an application eats too much memory then the die-on-oom
simply doesn't work for it. It's the application bug if it eats
too much memory and dies. When gedit crashes because of malloc()
failure, *then* there will be a gedit bug. Hypothetical abort()
in gedit is not something I would worry about if I were a gedit
developer.

why not?

So don't lose data? You can avoid losing data with glib, yes.
Just please stop talking this stuff about losing data. Data
must not be lost, period. An application may abort on malloc()
failure, period. These two *are* possible to combine.

how? This makes no sense.

Sorry, I didn't say lose data. I said you can't continue working
as if nothing happened after that. "Recovery code will be able to run
successfully", right. Now read what you snipped, and *then* argue.

"recovery" includes saving data and terminating.
 

Kelsey Bjarnason

[snips]

It's unbearably painful for you,

No, actually, it's trivial for me, as this is such a standard approach to
such things you can practically boilerplate it.
because you wrote "blah" instead of the
error-handling code.

Clue for you: there's an app missing in the example, complete with
anywhere from a dozen to tens of thousands of functions, data structures
and the like. "blah" cannot be replaced with meaningful code unless and
until those structures, those functions, the basic application design are
in place.
In a sense you are right. A top-notch application will check malloc()
on every call, and apply appropriate, custom error-handling.
Bingo.

application put into a "default" state, and the user informed. It's just
that we don't believe that your code is genuinely of that quality.

This is the *norm* for proper development. It's *habit*. It's also a
matter of development practise: putting your code through regular
validations. You know, like going back tomorrow and double-checking that
you didn't type "13" when you meant "137", checking that you actually
validate allocations, file opens and the like.
It's
much more likely that it is a mess, with every malloc() checked

Err, yes, of course every malloc is checked. That's the whole point to
what we're saying to you: allocations _need to be validated_. Failures
_need to be handled_.
- you've
got the intelligence and consistency to remember that little rule, but
the so-called recovery code often not really recovering the error,
sometimes aborting, sometimes even putting the application into an
invalid state.

As I said elsewhere, if your allocation fails, and if trying to alert the
user fails, and if attempting to open or write a file to save the data
fails, at some point, yes, you have to simply give up and abort. Nobody
is denying this.

Choosing "give up and abort" as the *first* alternative, however, is just
so unbelievably bad I can't believe anyone seriously thinks this is a
viable option.
And it's hard to tease out because there is so much of
this error-handling logic. What's even worse is that it is hard to find
bugs in normal program flow control, because it is obscured by all the
malloc() error-condition code.

You know, such code is just not that difficult to write, nor that
complex, nor that messy.
There's one thing worse than lost data. That's corrupted data. Are you
sure that your program won't corrupt user data?

No, I'm not sure of that - any more than _any_ application can _ever_ be
sure of that. Quick: how often, after you write a file, do you read the
data back in, compare it to the original data and thus ensure that what
was written matches what should have been written? No, wait, you also
have to guarantee that you're not just seeing an OS file system buffer,
so you need to guarantee the data was actually flushed to disk, then the
buffer cleared, then the data read back off the disk, *then* compare.

You *do* do this for every single byte you write to a file, right? If
not, then you're not sure your program won't corrupt the user's data.

Of course you don't do that. There are limits on how far any method
should go in attempting to preserve data, where the cost of handling
those cases is not justifiable in terms of the value of the data being
handled.

That said, throwing up your hands in disgust and not even *trying*, at
the very first sign of a problem, is simply irresponsible.
Do you test the save
strategy, at every possible point of malloc() failure?

Depends on the design, now don't it?
How much time
does that add to your development costs? Are development costs important
to you? That's a real question. If you're writing software for
spacecraft the answer will be "no".

No? Hmm. NASA wants you to write a function which will check a sensor,
and if the value returned is outside a specified range, cause a light to
flash. The entire thing can be done in 10 lines of code. According to
you, they're not going to care if the development cost for that function
exceeds a hundred million dollars.

I think we've just about reached the end here; you are so completely
removed from anything remotely resembling the real world that further
discussion with you seems impossible.
 

ymuntyan

does g_list_append() allocate memory?

Yes. It's got to put the data somewhere.

You mean nine lines versus one is very few? OK.
still very few
OK.


it seems you're making assumptions about his development methodology


and the wrapper?

And a code parser which will find all malloc() calls,
and then a script generator which will generate all
possible code paths to make the debugger test those
places. I'd be glad to have such a tool, certainly.
there won't be *any* segfault places. You ALWAYS test the return from
malloc().

Except where you got a bug. In the place you won't
test.
no no no! The point is you do something with the user data
*before* you terminate the application.

So do it. You *can*. Glib doesn't do anything but abort()
by default, naturally, since it can't do anything else.
If you are fine with this, do nothing. If not, write code
which will save the user data.

And man, I didn't say we should lose the user data!
"recovery" includes saving data and terminating.

So do it? Who said you shouldn't? You can save data
and terminate. But you ought to terminate, with saving
or not, if you use glib. That's it. But I am told that
an application can do more. No it can't. If the main loop
can't push an event onto the event queue, then you're
screwed. No "application performs worse" or "some parts
are not working". (Note, it is not about all applications,
don't talk about failsafe mp3 players or about webservers,
those simply shouldn't use glib)

Yevgen
 

ymuntyan

(e-mail address removed) wrote, On 28/01/08 23:30:
[snip]
Right, he didn't suggest anything.

He did not suggest the specifics because the specifics vary depending on
the situation. However he did say that you don't throw away the user's data.

And I did say you should throw away the user's data,
huh? Sheesh, a "little thought", "losing user's work".
Bullshit. Or, politely, strawman. Failing to open a
window and continuing work is one thing, it may be
a very nice feature which is impossible with glib.
But it's different from not losing user data. You
can avoid losing user data with glib.

For the record (once more): you should not lose user's
data. Anything changed? Do I contradict myself?
Care to quote?

Yevgen
 

Kelsey Bjarnason

[snips]

And I did say you should throw away the user's data, huh? Sheesh, a
"little thought", "losing user's work". Bullshit. Or, politely, strawman.
Failing to open a window and continuing work is one thing, it may be a
very nice feature which is impossible with glib. But it's different from
not losing user data. You can avoid losing user data with glib.

Glib's documentation says, "If any call to allocate memory fails, the
application is terminated".

Note that: the application is terminated. Nothing about this remotely
suggests any sort of ability to trap that action and take whatever steps
may be appropriate, such as, oh, saving data before close, or choosing
not to close because, after all, there may be 10,000 other things we can
do rather than exit, no, it's just "the application is terminated".

Do feel free to explain how this ties in with a proper graceful shutdown
procedure saving all the available data.
 

Kelsey Bjarnason

[snips]

On Mon, 28 Jan 2008 15:53:59 -0800, ymuntyan wrote:
[snip]
That, however, does not excuse the whole notion of "Hey, first thing
we tried failed, so let's just abort."
Nobody said this, you can do stuff on malloc() failure.

Actually, that's pretty much the entire point to both the Malcolm and
the glib sides of the discussion; if allocation fails, abort the app.
No ifs ands or buts, it is allocate or die - which only makes sense if
you *cannot* do anything on allocation failure.

But it's not true, there are ifs and buts. Using glib, you *can* do
stuff when malloc() fails.

Obviously I can do things, even in a glib app, even when malloc fails. I
*cannot* do much of anything, according to glib's documentation, if any
_glib_ allocations fail.

I quote: "If any call to allocate memory fails, the application is
terminated. "

It's pretty black and white, you *cannot* do other stuff when the
allocation fails, the application is going to die. It further assures us
this is the case, by saying "This also means that there is no need to
check if the call succeeded." - "the call" here being the call involving
allocation. It succeeds, or the app dies. There is no "do stuff", not
in any useful fashion, because some designer thought the best possible
response to an out of memory condition was to kill the app, instead of
report the problem and let the caller sort out what to do.
Yes, using glib all you sensibly can do on
malloc() failure is some sort of emergency work and quit.

So we're agreed, it's broken beyond redemption. So far so good. Now
explain to us how I propagate the allocation failure notification through
all my functions such that the data visible only to them gets properly
saved.
If your
application requires more, then you don't want to use glib, that's it.
But I claim that for 'regular' desktop applications that is quite
enough.

Oh, of course. After all, if you have three documents open in a word
processor and try to open a fourth, the fact you run out of memory is
always a good excuse to abort, rather than simply report "Sorry, can't
load the new document, not enough memory" and continue about your
business as if nothing at all had happened. Yes, obviously this is a
perfectly sensible strategy.

Sorry, but that's crap. It is *not* a sensible strategy. Reporting that
there's insufficient memory to load the new document, that makes sense.
Reporting that there's insufficient resources to launch the spell
checker, that makes sense. Simply puking because some coder is too
blinking lazy to write good code, that does not make sense.
 

Malcolm McLean

Kelsey Bjarnason said:
Sorry, what, exactly, did I need to check here with malloc error
handling? What's that you say, nothing at all, since there _is_ no
malloc involved in adding a node to the list? Ah, yes. Of course, we
need to allocate the node itself:

NODE *node;
LIST *list;

...

node = malloc( sizeof( *node ) );
if ( node == NULL )
{
    blah
}
else
{
    list_append( list, node );
    blah
}

Golly, how unbearably painful.
It's unbearably painful for you, because you wrote "blah" instead of the
error-handling code.
In a sense you are right. A top-notch application will check malloc() on
every call, and apply appropriate, custom error-handling. If it fails to
allocate memory for an event the queue will be purged, the application put
into a "default" state, and the user informed. It's just that we don't
believe that your code is genuinely of that quality.
It's much more likely that it is a mess, with every malloc() checked -
you've got the intelligence and consistency to remember that little rule,
but the so-called recovery code often not really recovering the error,
sometimes aborting, sometimes even putting the application into an invalid
state. And it's hard to tease out because there is so much of this
error-handling logic. What's even worse is that it is hard to find bugs in
normal program flow control, because it is obscured by all the malloc()
error-condition code.
"So what" is that my data wasn't hooped by your brain-dead strategy of
simply aborting on error, that's what. I know you don't think users'
data actually *means* anything, but I can assure you, the *user* thinks
it does.
There's one thing worse than lost data. That's corrupted data. Are you sure
that your program won't corrupt user data? Do you test the save strategy, at
every possible point of malloc() failure? How much time does that add to
your development costs? Are development costs important to you? That's a
real question. If you're writing software for spacecraft the answer will be
"no".
 

santosh

There is nothing to recover because failure of fopen()
is a normal situation. Absolutely different from a failure
of malloc() when you are trying to allocate a structure
to push into the event queue to scroll text a bit later.

Strawman, you see. We were talking about malloc() and
you tell me my application will crash when fopen() fails.

Note that if your application can ask the user about something,
then it does more than fopen(). If it can take any
action when fopen() fails, then it does more. That's
what I meant. cat utility doesn't have much to do
if it can't open the file it's supposed to read.
A gui application doesn't have much to do if it
doesn't have memory to draw stuff it draws.

This is why a complex application needs more than one type of
out-of-memory handler. If a fairly large allocation fails, there is a
real chance that a much smaller allocation /can/ succeed, either for
continuing the program's usual work at a slower pace or to pop up a
message box for the user giving him the option of quitting or killing
some other program and retrying.

OTOH, if a small allocation, say a kilobyte or less, fails then we
probably have a serious resource crunch. There is /no/ point in
reducing the allocation size and trying again, nor is there much point
in trying to bring up a dialogue box (though you can try). Probably a
sensible strategy is to attempt saving open files (which may or may not
succeed, but it won't hurt to try), attempt to log a message to stderr
or whatever, and then exit with an error status.

But even in this severe case an abort() or a segfault or exiting
silently in a library routine is not a good strategy, IMHO.
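
That two-tier idea can be sketched as two allocators with different failure policies; both names, and the save_open_files() hook, are invented:

#include <stdio.h>
#include <stdlib.h>

/* Large, optional allocations: failure is survivable, so just let
 * the caller fall back or switch the feature off. */
static void *alloc_optional(size_t size)
{
    return malloc(size);  /* NULL simply means "do without" */
}

/* Small, essential allocations: if a few hundred bytes can't be had,
 * the crunch is serious - run the emergency path and exit cleanly. */
static void *alloc_critical(size_t size)
{
    void *p = malloc(size);
    if (p == NULL) {
        fprintf(stderr, "fatal: out of memory (%zu bytes)\n", size);
        /* save_open_files(); -- invented hook for the emergency save */
        exit(EXIT_FAILURE);
    }
    return p;
}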

BS. You can't display message if you don't have memory. You could
reserve some, specially for the message, and *try* to display it.
But it won't show up anyway.

Yes. This is why not all allocations in a program can meaningfully have
the same error handling logic. Some allocations may be totally
redundant. Say the program wants to draw a nice splash screen. If
allocation fails here, nothing needs to be done; the program can still
attempt to run. A splash screen is just eye-candy.

Similarly the program may try to allocate a large buffer for efficiency
and if this fails, it can try running with a much smaller buffer. Again
the error handling is different.
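
That fallback is just a halving loop, for instance:

#include <stdlib.h>

/* Ask for the comfortable size first, halving on failure down to a
 * minimum (min >= 1) the algorithm can still live with. On success
 * *got holds the size actually obtained; NULL only if min fails too. */
static void *alloc_shrinking(size_t want, size_t min, size_t *got)
{
    while (want >= min) {
        void *p = malloc(want);
        if (p != NULL) {
            *got = want;
            return p;
        }
        want /= 2;
    }
    return NULL;
}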

OTOH a failure of a few kilobytes is a pretty big blow and the only
sensible thing is to try and log a message, try to close your files and
resources and exit.

Yes, this is work, but the other option is to respond uniformly and
unintelligently to each and every resource acquisition failure, always
terminating abruptly on the user. I have used many such applications
and they are a /big/ annoyance.
It's not necessary to dump the user's work. On the contrary,
you should try to save his work if you can. But I'll be glad
to see what you would do in a dictionary application. Or in
an mp3 player. If you believe that an mp3 player shouldn't
abort when it can't allocate memory for a playlist, it's a
fine opinion, and you simply shouldn't use mp3 players
based on glib. Others still will (you know why? because
it won't abort)

No. An MP3 player can still run. Perhaps the stupid user tried to put
10000 titles into the playlist. Perhaps memory for this is unavailable
but enough memory is still available to continue running the current
track. As I said, error handling depends (or should depend) intimately
on the exact contextual state of the application and its environment.

A uniform strategy is hardly better than no error handling at all.

So, we got to crap finally. Good. Think what you do when you
next time run a shell, or python, or a perl script. Aren't you
afraid of using gui applications, by the way? Or can you
present an example of one which continues working after malloc()
fails? Source code please.

In the past I have used a 3D modelling application that would try to
continue the simulation under low memory conditions by turning off
other parts of the program and non-essential parts of the simulation
itself. For example it would switch off background and colour-filling
for the object.

Of course this is a case of a large allocation failure and not a
failure for a few hundred bytes, which would have been essentially
unrecoverable.

<snip>
 

Malcolm McLean

Flash Gordon said:
Not always. I've done it using structured programming in assembler without
implementing exceptions, and at each point I checked the status and
propagated the error until it could be handled. The handling consisted of
processing what it had memory for, giving degraded performance instead of
giving up, which would not have been acceptable. It was also easy to do
because I knew resources were limited and designed the SW assuming that
they could run out.
I've done this as well. It was adding so much complexity to the code, all
because of allocation failures that couldn't happen. Finally, within BabyX
(my X windows toolkit) there was no way I could think of to propagate the
error conditions back to the caller. Flow control is just too complex, with
the whole thing being held together by a network of function pointers. So I
decided BabyX would use xmalloc().
Then I realised that this freed something up for string handling: because we
know that those string functions can never return null, code using them is
so much more expressive and flexible.
 

Kelsey Bjarnason

[snips]

Yes, Kelsey, but so what? Read what he's saying: this is *not* a real
problem. It's only data - how could mere data possibly be important
enough to justify spending a little extra time working out how to
salvage it in the event of an allocation failure?

Yeah, I know, I have such unreasonable expectations of developers.
Writing error handling code is just so damned tiresome, and, as you point
out, it's just data, not like it's anything valuable or important. I
really shouldn't pick on the whole concept so much.

'Course, that said, this *is* going to drastically simplify my code. I'm
going to write some new wrapper functions. xfopen, for example, either
opens the file or aborts. xfgets, which either gets a line or aborts
(end of file is considered a failure condition). I'll use Malcolm's
xmalloc - fixed to use size_t - to handle allocations; either I get the
memory, or the app aborts.

Never another line of error handling! Life couldn't be simpler. Mind
you, I pity the poor bastard who uses my app, but hey, they should just
be grateful they get anything, right?
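
For the record, here is roughly what such a wrapper looks like - which is exactly the point being made, since its "simplicity" consists of doing nothing for the user:

#include <stdio.h>
#include <stdlib.h>

/* The abort-on-anything style being mocked above: no error ever
 * reaches the caller, because no caller survives one. */
FILE *xfopen(const char *path, const char *mode)
{
    FILE *fp = fopen(path, mode);
    if (fp == NULL) {
        fprintf(stderr, "xfopen: can't open %s\n", path);
        abort();  /* "never another line of error handling" */
    }
    return fp;
}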
You're a dinosaur who has failed to adapt to the Postmodernist school of
programming, where anything goes, any old garbage is acceptable, users
are worth nothing, and their data is worth rather less than nothing.
Forget best software engineering practice, because it's old hat nowadays
- if you want robustness, buy a Volvo.

It does seem that way, doesn't it? I wish I knew *why* things were
moving this way. It can only lead to a net degradation in the quality of
the apps we use, which doesn't seem to be a good thing.

This, too, shall pass. One hopes.
 
