Avoiding recursive stack overflow in C on Unix/Linux?

Discussion in 'C Programming' started by boltar2003, May 5, 2011.

  1. boltar2003

    boltar2003 Guest

    Hello

    If a program recurses too deeply there's always a chance that it could
    run out of stack space and die with a SIGSEGV. Is there any API that can
    tell you how much stack space is left or some other method to prevent this
    in C/C++? I realise I could catch the signal but thats a pretty damn ugly
    way to do it.

    B2003
     
    boltar2003, May 5, 2011
    #1
    1. Advertisements

  2. boltar2003

    boltar2003 Guest

    Actually I suppose i could use getrlimit to get the stack size but is there
    a clean way of actually finding the current position of top of the stack? I
    could take the addresses of local variables but that seems a bit prone to
    error.

    B2003
     
    boltar2003, May 5, 2011
    #2
    1. Advertisements

  3. boltar2003

    Xavier Roche Guest

    There is no portable way to find the stack bottom/size ; I am not even
    sure what getrlimit(RLIMIT_STACK) returns exactly (current thread stack
    size ? main thread stack size ? default stack size ?)

    (And there is no portable way to find the stack top/bottom -- you'll
    have to play with an address on the current stack around main, and play
    with arithmetics with the page size)
     
    Xavier Roche, May 5, 2011
    #3
  4. The only proper way to avoid stack-overflows is to prevent it from
    happening in the first place. If at all possible (and frequently it
    is) avoid recursion, if it's unavoidable, as it sometimes is, make
    sure your design prevents those situations from arising in the first
    place.

    Trying to solve a design problem by imposing artificial limitations
    seems a bad idea.
     
    Kleuskes & Moos, May 5, 2011
    #4
  5. I find this answer interesting, mainly because I think is suggests a
    very different view about programming. I would not try to avoid
    recursion "if at all possible". In general I consider that it is up to
    the system to provide the resources needed by a program and this
    includes stack for a reasonable amount of recursion. These resources
    can run out of course, and the stack is special in that few languages
    provide any way to handle its exhaustion elegantly, but that does not
    seem enough reason to design out all recursion.

    There are special cases: some environments are severely resource limited
    but there's no indication that the OP is using such a system and,
    anyway, I don't think you can make general rules from specific situations.
    Isn't this what you are doing? Isn't the ban on recursion not an
    artificial limitation?
     
    Ben Bacarisse, May 5, 2011
    #5
  6. boltar2003

    James Kuyper Guest

    And how do you do that? If the stack size is sufficiently limited, even

    int main(int argc, char*argv[]) { return 0; }

    could overflow the stack.
    You can take measures to reduce the likelihood of overflow, but there's
    no easy way to know whether you've done enough. And doing enough on one
    system might be a complete waste of time on others. The systems I use
    most frequently allow me to put a gigabyte or more of data on the stack;
    other systems are far more limited.

    ....
    So what's the design solution you propose to "prevent overflow"?
     
    James Kuyper, May 6, 2011
    #6
  7. Hmm... only in the most contrived implementation. The standard mandates
    that an implementation must be able to translate and run at least one
    program that is very much more complex than this. The system would have
    to penalise simple, small programs deliberately!

    It's a detail. I agree with your point that almost any call can be the
    one that causes a stack overflow. Recursion need not be to blame.

    <snip>
     
    Ben Bacarisse, May 6, 2011
    #7
  8. boltar2003

    Nobody Guest

    It's the size of the region of address space allocated to the main
    thread's stack.
    If available, the value returned by alloca() will typically be the exact
    bottom of the stack.

    alloca() doesn't conform to any standard, but it's a common extension.
     
    Nobody, May 6, 2011
    #8

  9. The more conventional terminology would have alloca() returning
    something near the *top* of the stack - and the direction of stack
    growth is irrelevant. But so will taking the address of an automatic,
    which is much simpler. The approximate bottom of the stack could be
    stored at entry to main().
     
    robertwessel2, May 6, 2011
    #9
  10. boltar2003

    boltar2003 Guest

    Well its always theoretically possible to re-write recursive code in an
    iterative format but generally it leads to any reasonably complex recursive
    code becoming an unreadable mess.

    B2003
     
    boltar2003, May 6, 2011
    #10
  11. Recursion or not isn't really important. What matters is having a
    static bound on the amount of stack space required. Any design with
    potentially unbounded stack usage (determined by runtime input) is
    flawed since there is no way to reliably detect and recover from a stack
    overflow.
     
    Måns Rullgård, May 6, 2011
    #11

  12. Not inherent to the problem being solved or not part of the original,
    abstract, algorithm being implemented. But you're right, "artificial"
    is a weird word to use in the programming context.
     
    Kleuskes & Moos, May 6, 2011
    #12
  13. boltar2003

    boltar2003 Guest

    I'd not heard of that function before. Calling it with zero - alloca(0) -
    seems to work and return an address but the man page doesn't hint at
    whether this should work or work in the future. Would it be best to call it
    with a non zero value just in case the implementation changes?

    B2003
     
    boltar2003, May 6, 2011
    #13
  14. If you compare recursive to equivalent non-recursive algorithms, i
    think you will find recursion is usually slower and demands (much)
    more resources. So as a rule of thumb, don't recurse, unless you have
    to.

    And yes, i do have a background in embedded systems, where these
    considerations are more important than on modern PC's.
    well... It's bread and butter for the science guys and the practice is
    called "inference". Nevertheless, it's easy to show that recursive
    algorithms are outperformed my equivalent non-recursive ones.

    The saving grace of recursion is that recursive implementations are
    usually easier to understand. If it weren't for that, i'd ban the
    practice outright.
    Nope. It follows from the design of processors and a wish to avoid
    unneccesary overhead (passing arguments, stack-frame maintenance,
    pushing and popping return adresses) and that's without even
    considering any effect a call may have on things like branch-
    prediction, instruction pipe-lines and such. A 'call' instruction is
    (relative to a branch) quite expensive and prevents some optimisations
    like loop-unrolling.

    So no, that's not an artificial limitation, but a design
    consideration.
     
    Kleuskes & Moos, May 6, 2011
    #14
  15. Ah yes... Computers today are so fast and so big that sound design is
    superfluous. Unless of course you are dealing with huge amounts of
    data, since not only have memory sizes and processor speeds increased,
    the amount of data they are supposed to work on have also increased
    dramatically, thus cancelling out said improvements in size and speed.

    If you don't believe me, try importing planet.osm (see
    http://wiki.openstreetmap.org/wiki/Planet.osm)

    25 years ago i never imagined i would fill a 54 Gb partition (20 Mb
    was a pretty big HD those days), now i'm using a 1Tb drive and looking
    at options for an 10 Tb Raid-server.
     
    Kleuskes & Moos, May 6, 2011
    #15

  16. While still non-portable, just write a function like:

    char *getappxroximatestackpointer()
    {
    char c;
    return &c;
    }


    Call it once at the beginning of main(), and save that value. Call it
    again when you're interested, subtract the two (watch the order of the
    subtraction based on the direction of stack growth - or do an abs() ),
    and you'll have a pretty close measurement of the amount of stack
    space you're current using on many implementations.

    As I said, non-portable (not least, the stack doesn't have to be
    contiguous storage), but it will work on many implementations.
     
    robertwessel2, May 6, 2011
    #16
  17. boltar2003

    boltar2003 Guest

    I implemented a simple program that would check when it was getting near
    the limit of the stack but for some reason Linux sends it SEGV when
    (assuming my program is correct) there should be 10 or 20K of space left
    on the stack. Is this normal? I include the code below for someone to point
    out the stupid mistake I must have made somewhere. I changed all void * to
    char * just in case that was throwing the maths but it made no difference:

    #include <stdio.h>
    #include <stdlib.h>
    #include <alloca.h>
    #include <sys/time.h>
    #include <sys/resource.h>

    char *stack_top;
    unsigned int depth;

    void recurse()
    {
    unsigned long int dist;

    ++depth;
    dist = labs((long int)(stack_top - (char *)&dist));
    if (dist < 20000)
    {
    printf("Nearing limit: depth = %u, addr = %lu, dist to end = %lu
    \n",
    depth,&dist,dist);
    }
    recurse();
    }


    int main()
    {
    struct rlimit rl;
    long int mult;
    char *top;

    depth = 0;

    /* Find out which way stack is growing */
    top = (char *)alloca(0);
    stack_top = (char *)alloca(1);

    if (stack_top < top)
    {
    puts("Stack grows downwards");
    mult = -1;
    }
    else
    {
    puts("Stack grows upwards");
    mult = 1;
    }

    /* Now find where max extent of the stack by getting its max size */
    if (getrlimit(RLIMIT_STACK,&rl) == -1)
    {
    perror("getrlimit()");
    return 1;
    }
    stack_top += (rl.rlim_cur * mult);
    printf("Stack max size = %lu\n",rl.rlim_cur);
    printf("Stack max extent = %lu\n",stack_top);
    getchar();

    recurse();
    return 0;
    }


    B2003
     
    boltar2003, May 6, 2011
    #17
  18. That would be _very_ bad design, omitting to consider the basic
    processing needs of the system.
    Nope, there isn't. But then again, there's no easy way to produce any
    software. Designing software _is_ hard and it will remain hard. Anyone
    offering an "easy" solution to this kind of considerartions is either
    a fool or a fraud.
    True. But then again, considering the platform(s) your software will
    run on is an integral part of the design process.
    Nice. So what?
    Avoid recursion if at all possible, for instance by using things like
    finite state machines for parsing data. Recursion is quite costly. I'm
    surprised i have to explain this in this forum.

    And no, it's not _always_ possible to avoid recursion, not is it
    _always_ wise to do so, but as a rule of thumb, it's good.
     
    Kleuskes & Moos, May 6, 2011
    #18
  19. Bravo! That's he first sensible objection I've read in this thread,
    and you're right. Recursive algorithms are usually easier to grasp.
     
    Kleuskes & Moos, May 6, 2011
    #19
  20. You're not taking into account the stack space above the call to main(),
    which will include some amount of space used by Libc plus the command
    line and environment.
     
    Richard Kettlewell, May 6, 2011
    #20
    1. Advertisements

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments (here). After that, you can post your question and our members will help you out.