Avoiding recursive stack overflow in C on Unix/Linux?

Casper H.S. Dik · May 8, 2011

Keith Thompson said:
I understand that overly optimistic allocation can make sense
for stack space; a program will likely never use as much stack
as it could. But I'm not convinced that the same thing applies
to malloc(). If a program calls malloc(), it's telling the OS that
it *will* use that memory (and probably soon). If the system can't
fulfill the request, it should refuse it in the first place.

There are many usage of anonymous memory in a system: stack,
heap but also the copy-on-write pages after a fork().

The system may not allocate the different types of memory
in a different way, but some make a difference between the
stack (no reservation is made) or fork/malloc (a reservation
is made). Specifically in the case of fork, there is an argument
to be made that that memory is unlikely ever used.

Do many actual programs malloc() much more memory than they're
actually going to use?

Not a lot; but a lot of fork()ing happens without ever touching
the copy-on-write pages. That is likely the most important reason
for overcommit.

Casper
--

Kenny McCormack · May 8, 2011

Actaully the OOM-killer kills *some* process, not necessarily the one
that just tried to use allocated memory.

Show me where this thing of which you speak, this "OOM-killer", is mentioned
in (any of) the C standards documents. Thank you.

P.S. Having just Googled it, I see that this "OOM-killer" is a
Linux-specific thing, so discussion of it is OT in both newsgroups to which
this is being posted.

James Kuyper · May 8, 2011

Of course there is. Do not make the call-tree big enough to do that,
and if there's no recursion involved, it's rather easy to derive a max
stack size and put it in your hardware requirements.

Imposing a hardware requirement is what makes that solution
non-portable. It's not a bad solution; but it cannot be ported to a
machine that doesn't meet those requirements. Some mechanism for
determining, within your program, whether a given function call would
require more stack than is currently available would make it possible to
write code which at least fails gracefully on such hardware, instead of
simply aborting (or worse).

So, basically, you're saying, since your cannot avoid it absolutely
anyway, all advice is useless, and it's OK to use tail-recursion to
skip through a 2 Mb file. Ok.

No, I'm saying that any precautions you need to take to make recursion
safe necessarily limit the portability of your code; the same is true of
making non-recursive function calls safe; it's just a little easier to
take the appropriate precautions with non-recursive code. As long as you
take those precautions, and consider the associated limitations on the
portability of your code acceptable, there's no other reason to avoid
recursion. And, where it's appropriate, there can be strong reasons to
favor a recursive approach.

....

You're like a mechanic telling me how brakes are useless since they
cannot guarantee that you won't run into anything, anyway. Even when
applying the brakes, you can still run over a pedestrian or a 20-ton
truck, so basically brakes are useless.

No, you should not confuse non-portable with useless.

....

Nope, it's mandated by the POSIX standard instead. But i'm getting
curious, pray describe the superior solution you have in mind.

I haven't come up with an alternative that is easy to use.
The simplest approach I can think of, from the point of view of the
implementor, would be two constructs. I would call them standard library
functions, and using function-call syntax might be the appropriate way
to specify them in the standard, but they must be implemented in such a
way that they can always be called safely, regardless of how little
stack there is left. One function determines how much memory is
currently available for objects with automatic storage duration (if the
exact amount is hard to determine, it may instead return a guaranteed
minimum amount).
The second function takes a function pointer as an argument, and reports
the amount of memory required for objects with automatic storage
duration needed to call that function. It would not include the memory
needed by function calls within the pointed-at function; avoiding stack
overflow due to those calls would be the responsibility of the function
that directly performs them.

There's a number of reasons why this simple approach would be inadequate
for C: VLAs and objects with automatic storage duration defined inside
conditionally executed blocks are the two biggest problems I can see;
I'm sure there are others. However, a C-like language without those
features could implement such constructs. It might be possible to
implement some variant of this idea in C itself.

The key advantage of this idea, over SIGSEGV, is that it allows a
program to avoid the problem, rather than merely react to the problem
after it has already happened.

Understand: I am not advocating the creation of such a mechanism; my
main point is that the problem that such a mechanism would solve, exists
whether or not the program in question uses recursion. You shouldn't
avoid recursion just because that problem is a little harder to deal
with in recursive code than in non-recursive code.

....

Ok. So you have different ways of dealing with a stack-overflow. Again
i'm curious how you do that. Portably, since you've stressed the
importance of that quality in your software.

I can't - because neither C, nor any other language that I'm familiar
with, has portable methods to avoid stack overflow. I use non-portable
methods that are probably not too different from yours. They basically
amount to making sure that it works on every machine I have available to
test it on, and hoping that it works elsewhere.

While my software is required to be available to the public, the only
machines it absolutely must run on are machines operated by my own
company, and I have development machines I can test my code on that are
(supposed to be) configured identically to the production machines.
But if a user is having trouble porting my code to their machine, and is
willing to give me access to that machine for testing, I would certainly
be glad to fix the problem. I have anecdotal evidence suggesting that
most users don't bother; many of them simply make the fix themselves,
without even bothering to complain about it. I found a bug last year
when we first ported our code to a 64-bit CentOS machine, and when I
announced my bug fix, three users wrote to tell me they'd already
noticed the problem and fixed it in their own copies of the code,
without bothering to tell me about the bug.

....

So your point applies, and i can't guarantee that a single function-
call won't crash the system. Mission impossible? Mission
irresponsable?

What alternatives do you have?

None - which is the substance of my complaint.

I'm getting curious again, what sort of killer-functioncall did you
have in mind? Something which allocates a 2Gb array on stack and
passes it around by value? If that's the case, your design stinks.

No, just a function that requires more total stack space than is
available; it doesn't all have to be allocated in a single object.
Virtual memory makes such a limit harder to exceed, but not all systems
have virtual memory. On some small machines, it could take a whole lot
less than 2GB to exceed that limit.

Kleuskes & Moos · May 8, 2011

Malcolm McLean wrote:

) The problem is that if a data structure is inherently recursive then
) it can be very difficult to write non-recursive code to traverse it.
) The usual solution is to use a stack. However this doesn't really
) solve the problem, it just means that the stack overflow occurs, if it
) occurs, in a different place.

In a place where you can catch the error and terminate cleanly.

One of the major problems with C is that you cannot reliably handle
stack overflows, unless you roll your own stack.

Exactly. And since implementing a stack mechanism with all the
goodities and niceties a demanding programmer wants, is in the range
of Computer Science 101, it's rather superfluous as a language
element.

The language C is so powerful and lean since it does put
responsability on the programmer. So take it.

Kleuskes & Moos · May 8, 2011

Imposing a hardware requirement is what makes that solution
non-portable.

That may be true, but still, there's no way around it. At leats your
requrements should include a fair estimate of memory needed to run the
program.

Besides, no software, however portable, will run on all systems. The
differences are simply too big. So wether you want to or not, hardware
requirements are a fact of life.

It's not a bad solution; but it cannot be ported to a
machine that doesn't meet those requirements.

Nope. That's the basic meaning of hardware requirements.

Some mechanism for determining, within your program, whether a given
function call would require more stack than is currently available would
make it possible to write code which at least fails gracefully on such
hardware, instead of simply aborting (or worse).

Some mechanism of checking RAM size and seeing wether or not that
meets the minimum requirements is much simpler. Then it's up to the
programmer to halt the program or inform the user that he's on his own
and the programmer can't guarantee safe execution.

No, I'm saying that any precautions you need to take to make recursion
safe necessarily limit the portability of your code; the same is true of
making non-recursive function calls safe; it's just a little easier to
take the appropriate precautions with non-recursive code. As long as you
take those precautions, and consider the associated limitations on the
portability of your code acceptable, there's no other reason to avoid
recursion. And, where it's appropriate, there can be strong reasons to
favor a recursive approach.

Still curious what "precautions" you are talking about, after first
telling me there's no safe and portable way. So please expound.

No, you should not confuse non-portable with useless.

Oh, trust me... I don't. Some of the most useful software i ever wrote
was totally and utterly non-portable. In a software design-sense, that
is. It even required a custom processor to run in the first place.

I haven't come up with an alternative that is easy to use.
The simplest approach I can think of, from the point of view of the
implementor, would be two constructs. I would call them standard library
functions, and using function-call syntax might be the appropriate way
to specify them in the standard, but they must be implemented in such a
way that they can always be called safely, regardless of how little
stack there is left.

If you're out of stack, you can't call anything, for lack of stack-
space to hold a return address. But still, you could reserve some
space for an "emergency stack".

One function determines how much memory is
currently available for objects with automatic storage duration (if the
exact amount is hard to determine, it may instead return a guaranteed
minimum amount).
Ok.

The second function takes a function pointer as an argument, and reports
the amount of memory required for objects with automatic storage
duration needed to call that function. It would not include the memory
needed by function calls within the pointed-at function; avoiding stack
overflow due to those calls would be the responsibility of the function
that directly performs them.

There's a number of reasons why this simple approach would be inadequate
for C: VLAs and objects with automatic storage duration defined inside
conditionally executed blocks are the two biggest problems I can see;
I'm sure there are others. However, a C-like language without those
features could implement such constructs. It might be possible to
implement some variant of this idea in C itself.

It might.

The key advantage of this idea, over SIGSEGV, is that it allows a
program to avoid the problem, rather than merely react to the problem
after it has already happened.

Tja... It seems to me such a solution would have a few problems. First
off, on many systems the amount of memory available to automatic
variables variable itself. Secondly it would introduce a shitload of
overhead, since before calling any function a check must be made. And
that's apart from problems you already mentioned.

Understand: I am not advocating the creation of such a mechanism; my
main point is that the problem that such a mechanism would solve, exists
whether or not the program in question uses recursion. You shouldn't
avoid recursion just because that problem is a little harder to deal
with in recursive code than in non-recursive code.

No. You should avoid it because it's fundamentally unsafe, unless you
and your customers are prepared to live with the consequences. In the
latter case, and this is quite frequent, recurse the hell out of your
system if it makes your software easier to read and maintain.

As i mentioned earlier, it _is_ the saving grace of recursion.

I can't - because neither C, nor any other language that I'm familiar
with, has portable methods to avoid stack overflow. I use non-portable
methods that are probably not too different from yours. They basically
amount to making sure that it works on every machine I have available to
test it on, and hoping that it works elsewhere.

Ok. I'm specialized i software where "hoping it doesn't crash" just
does not cut it. People get killed that way (no joke).

While my software is required to be available to the public, the only
machines it absolutely must run on are machines operated by my own
company, and I have development machines I can test my code on that are
(supposed to be) configured identically to the production machines.
But if a user is having trouble porting my code to their machine, and is
willing to give me access to that machine for testing, I would certainly
be glad to fix the problem. I have anecdotal evidence suggesting that
most users don't bother; many of them simply make the fix themselves,
without even bothering to complain about it. I found a bug last year
when we first ported our code to a 64-bit CentOS machine, and when I
announced my bug fix, three users wrote to tell me they'd already
noticed the problem and fixed it in their own copies of the code,
without bothering to tell me about the bug.

Perhaps they thought you already knew about it.

In many applications a crashing program isn't the end of the world,
and customers just shrug and try again. That is also my expirience.

None - which is the substance of my complaint.

Ok. In that case, try to avoid recursion, if at all possible,
especially if you suspect you stack might me overrun. Your customers
do not mind a crash every now and then, mine do and consider it a
matter of life and death or at least a substantial amount of cash.
They don't come complaining to me, but their lawyers start complaining
to my boss.

No, just a function that requires more total stack space than is
available; it doesn't all have to be allocated in a single object.
Virtual memory makes such a limit harder to exceed, but not all systems
have virtual memory. On some small machines, it could take a whole lot
less than 2GB to exceed that limit.

True. So take heed that does not happen.

Kleuskes & Moos · May 8, 2011

Using an alternate signal stack is able to catch this, but
don't be an a* about asking for "reliably and portability".

If it's unreliable then it's useless, so if you consider me an ass to
demand it to be reliable, then so be it. Portability is a major issue,
so calling me an ass for that falls under the same category.

But i'd still be interested in your non-portable way, and the above is
insufficient to give me an idea about what you have in mind.

Xavier Roche · May 8, 2011

Le 07/05/2011 20:43, Kleuskes & Moos a écrit :

overflows are impossible to catch. Please expound on your method to
reliably (and portably) catch stack-overflows in recursive algorithms.

To play devil's advocate (yes, having a stack overflow is a clear sign
that your algorithm sucks), you could in theory play with setjmp/longjmp
and a signal handler. This method is a big evil and gory, but, well,
using POSIX functions

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
#include <setjmp.h>
#include <sys/ucontext.h>
#include <assert.h>

static jmp_buf jmp_return;

static void handler(int code, siginfo_t *si, void *ctx) {
static const char message[] = "Caught SEGV\n";
(void) write(2, message, sizeof(message) - 1);
longjmp(jmp_return, 1);
}

/* Crappy recusrive function. */
static long recurse(long i) {
if (i > 3) {
return recurse(i - 1)*recurse(i - 2)*recurse(i - 3);
} else {
return 1;
}
}

static void *myFun(void *arg) {
long long value = *(long long*) arg;
struct sigaction sa;

printf("entering thread\n");

/* Install alternative stack. */
stack_t * const ss = calloc(1, sizeof(stack_t));
ss->ss_size = SIGSTKSZ;
ss->ss_sp = malloc(ss->ss_size);
ss->ss_flags = 0;
if (sigaltstack(ss, NULL) == -1) {
assert(0);
}

/* Install SEGV handler. */
if (sigemptyset(&sa.sa_mask) != 0) {
abort();
}
sa.sa_sigaction = handler;
sa.sa_flags = SA_SIGINFO | SA_ONSTACK;
if (sigaction(SIGSEGV, &sa, NULL) != 0) {
abort();
}

/* Save context. */
printf("executing function\n");
if (setjmp(jmp_return) == 0) {
const long result = recurse(value);
printf("result == %lu\n", result);
} else {
fprintf(stderr, "recurse failed\n");
}

printf("returning from thread\n");

return NULL;
}

int main(int argc, char **argv) {
pthread_t t;
long long value = atoll(argc > 1 ? argv[1] : "100000000000");
printf("main\n");
if (pthread_create(&t, NULL, myFun, &value) == 0) {
if (pthread_join(t, NULL) == 0) {
printf("success\n");
return EXIT_SUCCESS;
}
}
printf("error\n");
return EXIT_FAILURE;
}

Rainer Weikusat · May 8, 2011

Jorgen Grahn said:
Dr Nick said:

[...]

Where recursion becomes an issue is when you use it for every element of
a sequential data structure. For instance, if a parser's algorithmic
structure were something like:

read_first_token
process_token
recurse(rest of document)

it would probably run into a stack limit on most implementations when
processing any real input.

Most implementations where the compiler doesn't optimise tail recursion
away, anyway.

Click to expand...

Eh ... you do understand that 'compiler detects that programmer was a
crackpot and works around that automatically' implies that recursion
is probematic, do you?

Click to expand...

It's the other way around -- recursion is less problematic if it can
be compiled into something which runs efficiently.

For obvious reasons, recursion cannot cause anything detrimental if it
is automaticlly removed. But if there wasn't a problem to begin with,
there would be no reason for this automatic removal. 'Tail recursion'
is essentially something which shouldn't be used because the recursion
doesn't really help with anything in this case: The fact that it is
'at the tail' implies that the current processing step can be
completed before the next step and the most 'natural' represention of
that is a processing loop. Recursion is useful when the current
processing step cannot be completed before the result of the next
processing step is known, IOW, when some algorithm must first move to
the tail of something and then process data backwards from there,
utilizing the 'state information' that's stored in the environments of
the stacked subroutine invocations. In an imperative language, it is
essentially a trick which can be used to cause the compiler
generate the necessary state management code (at the expense of a
certain runtime overhead) automatically instead of writing it by
hand.

Compilers and interpreters for functional languages (Haskell, ML,
Lisp ...) where recursion replaces loops, have been doing this for
decades.

People who like express simple things in weird, complicated ways have
(for decades, presumably) relied on computer program to silently
undo the weirdness in order to maintain the illusion that what they
are doing actually makes any sense ...

Malcolm McLean · May 8, 2011

People who like express simple things in weird, complicated ways have
(for decades, presumably) relied on computer program to silently
undo the weirdness in order to maintain the illusion that what they
are doing actually makes any sense ...

The Lisp family of languages are fundamentally different to the C
family. I must admit I've never been able to get anything useful out
of them, but they I've only played about with them in my spare time.

Nobody · May 8, 2011

If it's unreliable then it's useless, so if you consider me an ass to
demand it to be reliable, then so be it. Portability is a major issue,
so calling me an ass for that falls under the same category.

Not really. Very few real programs are truly portable insofar as they
require nothing beyond what is mandated by the C standard. If you need to
use directories, child processes, networking, a GUI, etc, you have to
forego portability to get them. A lot of code forsakes portability simply
for performance reasons.

Nobody · May 8, 2011

Of course there is. Do not make the call-tree big enough to do that,
and if there's no recursion involved, it's rather easy to derive a max
stack size and put it in your hardware requirements.

Unless you're using any library functions which don't document their stack
requirements, which is almost certainly the case. I've only ever seen
(note: seen, not used) one library which documented its stack requirements.

Nobody · May 8, 2011

P.S. Having just Googled it, I see that this "OOM-killer" is a
Linux-specific thing, so discussion of it is OT in both newsgroups to which
this is being posted.

Only if you consider programming to be an intellectual curiosity with no
practical significance. To be of practical use, a program has to run on a
real system which is bound to have implementation-specific behaviour.

Måns Rullgård · May 8, 2011

Nobody said:
Unless you're using any library functions which don't document their stack
requirements, which is almost certainly the case. I've only ever seen
(note: seen, not used) one library which documented its stack requirements.

It is relatively simple to examine the code and determine the stack
requirements.

Keith Thompson · May 8, 2011

Kleuskes & Moos said:
So, basically, you're saying, since your cannot avoid it absolutely
anyway, all advice is useless, and it's OK to use tail-recursion to
skip through a 2 Mb file. Ok.

[...]

No, he didn't say anything like that.

I've found that debates like this are much more productive if each party
states his own opinions, rather than arguing against some exaggerated
parody of the other party's position.

Michael Press · May 8, 2011

Kleuskes & Moos said:
Where i work, code like that is likely to get you fired on the spot.

The Code of Hammurabi.

James Kuyper · May 9, 2011

No. You should avoid it because it's fundamentally unsafe,

I guess we'll just have to agree to disagree on that point; I don't wee
either of us making any progress in convincing the other.

Malcolm McLean · May 9, 2011

Kleuskes & Moos wrote:

But are you advising that if it is known or suspected that the C
implementation uses a stack for the operation of function calls, that
such an implementation detail alone is enough reason to avoid recursion
in mission-critical code, because C does not have a standard way to
detect proximity to overflow of said stack?

There's no way of implementing recursion without some sort of LIFO
structure, which almost always is a stack. C implementations that
don't use stacks usually don't support recursion.

The problem is that if the stack overflows you get undefined
behaviour, usually but not guaranteed to be a crash. There's a
possibility of malicious exploits. For many applications it doesn't
matter, but for many it does, notably safety-critical systems and
popular mass-market applications. In reality I think almost everyone
just sanity tests the depth of recursion.

Xavier Roche · May 9, 2011

To play devil's advocate (yes, having a stack overflow is a clear sign
that your algorithm sucks), you could in theory play with setjmp/longjmp
and a signal handler. This method is a big evil and gory, but, well,
using POSIX functions

Note: for the lurkers, in the sample, it would be better to replace
setjmp() by sigsetjmp(), and longjmp() by siglongjmp().

Kleuskes & Moos · May 9, 2011

I guess we'll just have to agree to disagree on that point; I don't wee
either of us making any progress in convincing the other.

That's a sensible outcome. It was a pleasure exchanging views with
you.

Kleuskes & Moos · May 9, 2011

So, basically, you're saying, since your cannot avoid it absolutely
anyway, all advice is useless, and it's OK to use tail-recursion to
skip through a 2 Mb file. Ok.

Click to expand...

[...]

No, he didn't say anything like that.

I've found that debates like this are much more productive if each party
states his own opinions, rather than arguing against some exaggerated
parody of the other party's position.

Every now and then it pays to be a little provocative. I call it,
"kicking the pear tree". And yes, the exaggeration was intentional and
designed to provoke a response. It payed off, i'm glad to say.

C/Ubuntu ][ Problem in Buffer Overflow logic	1	Dec 12, 2008
checking for stack overflow in recursive function	4	Sep 22, 2003
Recursion crash in STL on linux	14	Apr 27, 2010
The machine stack and the C language	122	Jan 12, 2008
C language now truly universal	0	Jan 1, 2011
Installing a module written in C on a UNIX without root access	5	Jan 3, 2009
In the Matter of Herb Schildt: a Detailed Analysis of "C: TheComplete Nonsense"	109	Apr 3, 2010
Coding for a USB joystick in C++ on a Redhat linux machine	2	May 21, 2007

Avoiding recursive stack overflow in C on Unix/Linux?

Casper H.S. Dik

Kenny McCormack

James Kuyper

Kleuskes & Moos

Kleuskes & Moos

Kleuskes & Moos

Xavier Roche

Rainer Weikusat

Malcolm McLean

Nobody

Nobody

Nobody

Måns Rullgård

Keith Thompson

Michael Press

James Kuyper

Malcolm McLean

Xavier Roche

Kleuskes & Moos

Kleuskes & Moos

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads