malloc

Discussion in 'C Programming' started by Joe keane, Feb 16, 2012.

  1. Joe keane

    Joe keane Guest

    Bugs of memory allocation will make you mad.

    Bugs *in* memory allocation will put you in the cuckoo people place.
     
    Joe keane, Feb 16, 2012
    #1

  2. Joe keane

    BGB Guest

    On 2/16/2012 2:21 PM, Devil with the China Blue Dress wrote:
    > In article<jhjr14$rjk$>, (Joe keane) wrote:
    >
    >> Bugs of memory allocation will make you mad.
    >>
    >> Bugs *in* memory allocation will put you in the cuckoo people place.

    >
    > Which is why they are exceedingly rare. Nearly all allocation problems are due
    > to the program storing outside array bounds.
    >


    which are annoyingly difficult to track down sometimes...


    it would be nice if compilers would offer an option (as a debugging
    feature) to optionally put in bounds checking for many operations (it is
    also possible to do this without adding fat pointers or fundamentally
    changing how the language works, although it would still introduce a
    run-time cost).

    simple idea:
    on an array access, identify the memory object pointed-to by the base
    pointer (heap-lookup);
    determine if the address being assigned to is the same object (possibly,
    another heap lookup, or checking the target against the first object);
    if not, blow up.

    now, if the person can't afford the cost, they don't have to use the
    feature.
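A minimal sketch of the heap-lookup check described above, assuming a small side table of live allocations. The names `checked_malloc` and `access_ok` and the fixed-size table are hypothetical, invented for illustration; a real runtime would use a faster index and would also handle `free`.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical registry of live heap objects. A flat array keeps the
   sketch short; a real runtime would want a tree keyed by address. */
#define MAX_OBJECTS 1024

struct heap_object { char *base; size_t size; };
static struct heap_object objects[MAX_OBJECTS];
static size_t object_count;

/* Allocate and record the object so later accesses can be checked. */
void *checked_malloc(size_t size)
{
    if (object_count == MAX_OBJECTS)
        return NULL;
    void *p = malloc(size);
    if (p != NULL) {
        objects[object_count].base = p;
        objects[object_count].size = size;
        object_count++;
    }
    return p;
}

/* The "heap lookup": find the recorded object containing addr. */
static const struct heap_object *lookup(const void *addr)
{
    uintptr_t a = (uintptr_t)addr;
    for (size_t i = 0; i < object_count; i++) {
        uintptr_t lo = (uintptr_t)objects[i].base;
        if (a >= lo && a - lo < objects[i].size)
            return &objects[i];
    }
    return NULL;
}

/* Return 1 if the len bytes at base + offset lie inside the same
   object that base points into, otherwise 0. */
int access_ok(const void *base, size_t offset, size_t len)
{
    const struct heap_object *obj = lookup(base);
    if (obj == NULL)
        return 0;
    size_t base_off = (size_t)((uintptr_t)base - (uintptr_t)obj->base);
    size_t remaining = obj->size - base_off;
    if (offset > remaining)
        return 0;
    return len <= remaining - offset;
}
```

An instrumented `a[i] = x` would then amount to calling `access_ok(a, i * sizeof a[0], sizeof a[0])` first and aborting when it returns 0.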


    some other use cases are a little harder, say:
    double *fp;
    fp=...
    while(fp) { *fp=*fp+1; fp++; }

    the issue would be that, potentially, the pointer could jump from one
    object to the next without the run-time checks noticing.

    a secondary defense could be to place "trip wires" in the heap between
    objects, and if a memory-write check determines that the pointer points
    to such a trip-wire, then an exception can be raised (one option for
    such trip-wires is to have them occupy the same physical space as
    memory-object headers or similar, and maybe also for several words
    following the end of a memory-object).
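The trip-wire idea can be sketched as guard words placed on both sides of each allocation. This is only an illustration under stated assumptions: the names `guarded_malloc`, `guarded_check`, and `guarded_free`, the guard value, and the guard width are all made up here, and the 32-byte header is assumed to preserve the alignment `malloc` provides on common 64-bit platforms.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_WORDS 3   /* assumed width of each trip wire, in 64-bit words */
#define GUARD_VALUE UINT64_C(0xDEADBEEFCAFEF00D)

/* Layout: [size][front guards][user bytes][rear guards]. */
struct guard_header {
    uint64_t size;
    uint64_t guard[GUARD_WORDS];
};

void *guarded_malloc(size_t size)
{
    size_t rear = GUARD_WORDS * sizeof(uint64_t);
    if (size > SIZE_MAX - sizeof(struct guard_header) - rear)
        return NULL;
    struct guard_header *h = malloc(sizeof *h + size + rear);
    if (h == NULL)
        return NULL;
    h->size = size;
    uint64_t g = GUARD_VALUE;
    unsigned char *user = (unsigned char *)(h + 1);
    for (int i = 0; i < GUARD_WORDS; i++) {
        h->guard[i] = g;
        /* The rear guard may be unaligned, so copy it bytewise. */
        memcpy(user + size + i * sizeof g, &g, sizeof g);
    }
    return user;
}

/* Return 1 if both trip wires are intact, 0 if something wrote on them. */
int guarded_check(const void *ptr)
{
    const struct guard_header *h = (const struct guard_header *)ptr - 1;
    const unsigned char *user = ptr;
    for (int i = 0; i < GUARD_WORDS; i++) {
        uint64_t g;
        if (h->guard[i] != GUARD_VALUE)
            return 0;
        memcpy(&g, user + h->size + i * sizeof g, sizeof g);
        if (g != GUARD_VALUE)
            return 0;
    }
    return 1;
}

/* Verify the trip wires before releasing the block. */
void guarded_free(void *ptr)
{
    if (ptr == NULL)
        return;
    if (!guarded_check(ptr))
        abort();   /* "blow up": an overrun or underrun was detected */
    free((struct guard_header *)ptr - 1);
}
```

Checking only at `guarded_free` time reports the damage late; a heap scan that calls `guarded_check` on every live block would narrow down when the overrun happened.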


    although, without the explicit testing or throwing (kind of a problem
    with a standard compiler), I have used similar before as a means to
    attempt to debug array overruns with several custom memory managers of
    mine (the memory managers will make some attempt to detect and
    diagnose/report these sorts of problems).


    or such...
     
    BGB, Feb 16, 2012
    #2

  3. Joe keane

    Eric Sosman Guest

    On 2/16/2012 4:21 PM, Devil with the China Blue Dress wrote:
    > In article<jhjr14$rjk$>, (Joe keane) wrote:
    >
    >> Bugs of memory allocation will make you mad.
    >>
    >> Bugs *in* memory allocation will put you in the cuckoo people place.

    >
    > Which is why they are exceedingly rare. Nearly all allocation problems are due
    > to the program storing outside array bounds.


    The only allocator bug I have personally encountered was with
    a malloc() implementation that never, never returned NULL. When it
    ought to have returned NULL, it crashed the program instead ...

    --
    Eric Sosman
     
    Eric Sosman, Feb 17, 2012
    #3
  4. Joe keane

    Goran Guest

    On Feb 17, 2:27 am, Eric Sosman <> wrote:
    > On 2/16/2012 4:21 PM, Devil with the China Blue Dress wrote:
    >
    > > In article<jhjr14$>, (Joe keane) wrote:

    >
    > >> Bugs of memory allocation will make you mad.

    >
    > >> Bugs *in* memory allocation will put you in the cuckoo people place.

    >
    > > Which is why they are exceedingly rare. Nearly all allocation problems are due
    > > to the program storing outside array bounds.

    >
    >      The only allocator bug I have personally encountered was with
    > a malloc() implementation that never, never returned NULL.  When it
    > ought to have returned NULL, it crashed the program instead ...


    If you don't provide a way to verify that, it didn't happen ;-), and
    there was a bug in your code.

    Goran.
     
    Goran, Feb 17, 2012
    #4
  5. Joe keane

    Goran Guest

    On Feb 16, 10:04 pm, (Joe keane) wrote:
    > Bugs of memory allocation will make you mad.
    >
    > Bugs *in* memory allocation will put you in the cuckoo people place.


    ... Provided you really encountered one. If you don't provide
    verifiable evidence that you did, you didn't.

    Goran.
     
    Goran, Feb 17, 2012
    #5
  6. Goran <> writes:
    > On Feb 17, 2:27 am, Eric Sosman <> wrote:
    >> On 2/16/2012 4:21 PM, Devil with the China Blue Dress wrote:
    >>
    >> > In article<jhjr14$>, (Joe keane) wrote:

    >>
    >> >> Bugs of memory allocation will make you mad.

    >>
    >> >> Bugs *in* memory allocation will put you in the cuckoo people place.

    >>
    >> > Which is why they are exceedingly rare. Nearly all allocation problems are due
    >> > to the program storing outside array bounds.

    >>
    >>      The only allocator bug I have personally encountered was with
    >> a malloc() implementation that never, never returned NULL.  When it
    >> ought to have returned NULL, it crashed the program instead ...

    >
    > If you don't provide a way to verify that, it didn't happen ;-), and
    > there was a bug in your code.


    I suspect he's referring to the malloc() implementation
    on typical Linux systems, which overcommits memory by default.
    It can allocate a large chunk of address space for which no actual
    memory is available. The memory isn't actually allocated until
    the process attempts to access it. If there isn't enough memory
    available for the allocation, the "OOM killer" kills some process
    (not necessarily the one that did the allocation).
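On Linux the policy described above is controlled by the `vm.overcommit_memory` setting (0 = heuristic overcommit, 1 = always overcommit, 2 = strict accounting). As a small sketch, a program could read the current mode as follows; the function name `overcommit_mode` is hypothetical, and the code assumes `/proc` is mounted.

```c
#include <stdio.h>

/* Return the Linux overcommit mode (0, 1 or 2), or -1 if the setting
   cannot be read, for example on a system that is not Linux. */
int overcommit_mode(void)
{
    FILE *f = fopen("/proc/sys/vm/overcommit_memory", "r");
    if (f == NULL)
        return -1;
    int mode;
    if (fscanf(f, "%d", &mode) != 1)
        mode = -1;
    fclose(f);
    return mode;
}
```

With mode 0 or 1, a non-null return from malloc() does not by itself guarantee that the pages can later be backed by real memory.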

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Will write code for food.
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Feb 17, 2012
    #6
  7. Joe keane

    Noob Guest

    Eric Sosman wrote:

    > The only allocator bug I have personally encountered was with
    > a malloc() implementation that never, never returned NULL. When it
    > ought to have returned NULL, it crashed the program instead ...


    Visual Studio 6 used to crash on free(NULL).
     
    Noob, Feb 17, 2012
    #7
  8. Joe keane

    Goran Guest

    On Feb 17, 11:26 am, Keith Thompson <> wrote:
    > Goran <> writes:
    > > On Feb 17, 2:27 am, Eric Sosman <> wrote:
    > >> On 2/16/2012 4:21 PM, Devil with the China Blue Dress wrote:

    >
    > >> > In article<jhjr14$>, (Joe keane) wrote:

    >
    > >> >> Bugs of memory allocation will make you mad.

    >
    > >> >> Bugs *in* memory allocation will put you in the cuckoo people place.

    >
    > >> > Which is why they are exceedingly rare. Nearly all allocation problems are due
    > >> > to the program storing outside array bounds.

    >
    > >>      The only allocator bug I have personally encountered was with
    > >> a malloc() implementation that never, never returned NULL.  When it
    > >> ought to have returned NULL, it crashed the program instead ...

    >
    > > If you don't provide a way to verify that, it didn't happen ;-), and
    > > there was a bug in your code.

    >
    > I suspect he's referring to the malloc() implementation
    > on typical Linux systems, which overcommits memory by default.
    > It can allocate a large chunk of address space for which no actual
    > memory is available.  The memory isn't actually allocated until
    > the process attempts to access it.  If there isn't enough memory
    > available for the allocation, the "OOM killer" kills some process
    > (not necessarily the one that did the allocation).


    That crossed my mind, but what he said doesn't correspond with what
    happens: malloc does return something and __doesn't__ crash the
    program. OOM killer kills the code upon an attempt to access that
    memory.

    But given the way he explained it, it's possible that he's affected by
    OOM killer, and he forgot, or never knew, what really happened.

    Goran.
     
    Goran, Feb 17, 2012
    #8
  9. Joe keane

    Eric Sosman Guest

    On 2/17/2012 2:40 AM, Goran wrote:
    > On Feb 17, 2:27 am, Eric Sosman<> wrote:
    >> On 2/16/2012 4:21 PM, Devil with the China Blue Dress wrote:
    >>
    >>> In article<jhjr14$>, (Joe keane) wrote:

    >>
    >>>> Bugs of memory allocation will make you mad.

    >>
    >>>> Bugs *in* memory allocation will put you in the cuckoo people place.

    >>
    >>> Which is why they are exceedingly rare. Nearly all allocation problems are due
    >>> to the program storing outside array bounds.

    >>
    >> The only allocator bug I have personally encountered was with
    >> a malloc() implementation that never, never returned NULL. When it
    >> ought to have returned NULL, it crashed the program instead ...

    >
    > If you don't provide a way to verify that, it didn't happen ;-), and
    > there was a bug in your code.


    Yeah, it was probably my code. Awfully nice of DEC to fix it
    for me by patching VMS' C library, don't you think?

    --
    Eric Sosman
     
    Eric Sosman, Feb 17, 2012
    #9
  10. Noob wrote:
    > Eric Sosman wrote:
    >
    >> The only allocator bug I have personally encountered was with
    >> a malloc() implementation that never, never returned NULL. When it
    >> ought to have returned NULL, it crashed the program instead ...

    >
    > Visual Studio 6 used to crash on free(NULL).


    Which seems OK, if malloc() never returns NULL.
    free() only needs to take what malloc() gives, doesn't it?

    Bye, Jojo

    PS: ;-)
     
    Joachim Schmitz, Feb 17, 2012
    #10
  11. Joe keane

    Noob Guest

    pete wrote:

    > No.
    > The standard says:
    > void free(void *ptr);
    > If ptr is a null pointer, no action occurs.


    You missed his post scriptum.
     
    Noob, Feb 17, 2012
    #11
  12. Joe keane

    James Kuyper Guest

    On 02/17/2012 11:05 AM, Joachim Schmitz wrote:
    > Noob wrote:
    >> Eric Sosman wrote:
    >>
    >>> The only allocator bug I have personally encountered was with
    >>> a malloc() implementation that never, never returned NULL. When it
    >>> ought to have returned NULL, it crashed the program instead ...

    >>
    >> Visual Studio 6 used to crash on free(NULL).

    >
    > Which seems OK, if malloc() never returns NULL.


    That would be non-conforming; if malloc() successfully allocates memory,
    "Each such allocation shall yield a pointer to an object disjoint from
    any other object". If a program calls malloc(SIZE_MAX) often enough
    without ever free()ing the memory, it must sooner or later either fail
    with a null return value, or return a pointer to memory that is NOT
    disjoint from the memory previously allocated, even in the very loose
    (and IMO, nonconforming) sense that over-committing malloc()s use for
    the term "allocate".

    > free() only needs to take what malloc() gives, doesn't it?


    No, there's an explicit requirement that passing a null pointer to
    free() has no effect. Crashing doesn't qualify.
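Because the standard defines free() on a null pointer as a no-op, cleanup code can release fields unconditionally. A short sketch, with a hypothetical `struct record` and helper names chosen only for illustration:

```c
#include <stdlib.h>
#include <string.h>

struct record {
    char *name;
    char *note;   /* optional, may stay NULL */
};

/* Release a record whose fields may or may not have been allocated. */
void record_destroy(struct record *r)
{
    if (r == NULL)
        return;
    free(r->name);   /* fine even when r->name is NULL */
    free(r->note);   /* likewise: free(NULL) does nothing */
    free(r);
}

/* Create a record with a copy of name and no note. */
struct record *record_create(const char *name)
{
    struct record *r = malloc(sizeof *r);
    if (r == NULL)
        return NULL;
    r->note = NULL;
    r->name = malloc(strlen(name) + 1);
    if (r->name == NULL) {
        record_destroy(r);   /* name is NULL here, which is still safe */
        return NULL;
    }
    strcpy(r->name, name);
    return r;
}
```

An implementation that crashes on `free(NULL)` would break this pattern, which is why the behaviour reported for that compiler counts as a library bug.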
    --
    James Kuyper
     
    James Kuyper, Feb 17, 2012
    #12
  13. Goran <> writes:
    > On Feb 17, 11:26 am, Keith Thompson <> wrote:

    [...]
    >> I suspect he's referring to the malloc() implementation
    >> on typical Linux systems, which overcommits memory by default.
    >> It can allocate a large chunk of address space for which no actual
    >> memory is available.  The memory isn't actually allocated until
    >> the process attempts to access it.  If there isn't enough memory
    >> available for the allocation, the "OOM killer" kills some process
    >> (not necessarily the one that did the allocation).

    >
    > That crossed my mind, but what he said doesn't correspond with what
    > happens: malloc does return something and __doesn't__ crash the
    > program. OOM killer kills the code upon an attempt to access that
    > memory.


    If you call malloc() and it overcommits, it won't crash the
    program until you access the allocated memory. (The rationale for
    overcommitting is that most programs don't actually use most of
    the memory they allocate. I find that odd.)

    > But given the way he explained it, it's possible that he's affected by
    > OOM killer, and he forgot, or never knew, what really happened.


    Given the post you're responding to, I would find it more likely that
    he knows exactly what happened and didn't mention it. Reading his
    more recent followup, apparently it was on VMS, not Linux, and was
    likely a bug in DEC's C library.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Will write code for food.
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Feb 17, 2012
    #13
  14. On Feb 17, 9:25 am, Keith Thompson <> wrote:
    > If you call malloc() and it overcommits, it won't crash the
    > program until you access the allocated memory.  (The rationale for
    > overcommitting is that most programs don't actually use most of
    > the memory they allocate.  I find that odd.)


    Really? The explanation that I'm most familiar with is that most fork
    calls are immediately followed by exec, and thus if you're low on
    memory, then a large process cannot spawn a new process without
    overcommit because the only process create primitive is fork, which
    "copies" the memory space of the parent process.

    I of course think this is a broken state of affairs for several
    reasons. 1- Just introduce a new create process primitive that creates
    a process from an executable file with options for specifying
    the env vars, the command line, the working dir, etc. Sure it lacks
    the full "power" of fork + exec, but it's a lot easier to use, and for
    most uses of fork, I expect this would suffice, and then we could
    avoid this nasty overcommit problem. 2- I'm annoyed at the ease at
    which resources can be leaked to child processes and the near
    impossibility to do anything portably about it.

    However, due to discussions on this board, I've come to learn that OOM
    situations are very hard to program for, and OOM in a common desktop
    just can't be handled gracefully.
     
    Joshua Maurice, Feb 17, 2012
    #14
  15. On Feb 17, 12:46 pm, Paavo Helde <> wrote:
    > BGB <> wrote in news:jhk2rh$19o$:
    >
    > > On 2/16/2012 2:21 PM, Devil with the China Blue Dress wrote:
    > >> In article<jhjr14$>, (Joe keane)
    > >> wrote:

    >
    > >>> Bugs of memory allocation will make you mad.

    >
    > >>> Bugs *in* memory allocation will put you in the cuckoo people place.

    >
    > >> Which is why they are exceedingly rare. Nearly all allocation
    > >> problems are due to the program storing outside array bounds.

    >
    > > which are annoyingly difficult to track down sometimes...

    >
    > > it would be nice if compilers would offer an option (as a debugging
    > > feature) to optionally put in bounds checking for many operations

    >
    > Yes it is, and they are doing that. Use e.g. MSVC with iterator checking
    > switched on (this is the default) and accessing e.g. a std::vector out of
    > bounds will generate a runtime error. With gcc one can use MALLOC_CHECK_
    > or -lmcheck, these also should catch some out-of-bounds access errors.
    >
    > For raw pointers it is more difficult. You can use something like
    > ElectricFence or Valgrind on Linux, but they make your program run many
    > times slower and consume much more memory, so this is just for
    > debugging.


    MSVC also has some debug options that try to catch writes past the
    ends of allocated regions as well. IIRC, they overallocate, and put
    special bit patterns at the start and end on allocation, and when
    freed they check to see if those bit patterns are intact, raising a
    fatal error or something if it finds a problem.
     
    Joshua Maurice, Feb 17, 2012
    #15
  16. Joe keane

    Kaz Kylheku Guest

    ["Followup-To:" header set to comp.lang.c.]
    On 2012-02-17, Keith Thompson <> wrote:
    > Goran <> writes:
    >> On Feb 17, 11:26 am, Keith Thompson <> wrote:

    > [...]
    >>> I suspect he's referring to the malloc() implementation
    >>> on typical Linux systems, which overcommits memory by default.
    >>> It can allocate a large chunk of address space for which no actual
    >>> memory is available.  The memory isn't actually allocated until
    >>> the process attempts to access it.  If there isn't enough memory
    >>> available for the allocation, the "OOM killer" kills some process
    >>> (not necessarily the one that did the allocation).

    >>
    >> That crossed my mind, but what he said doesn't correspond with what
    >> happens: malloc does return something and __doesn't__ crash the
    >> program. OOM killer kills the code upon an attempt to access that
    >> memory.

    >
    > If you call malloc() and it overcommits, it won't crash the
    > program until you access the allocated memory. (The rationale for
    > overcommitting is that most programs don't actually use most of
    > the memory they allocate. I find that odd.)


    Odd or not, it is borne out empirically. Applications are physically
    smaller than their virtual footprints.

    It may be the case that C programs that malloc something usually use
    the whole block.

    But overcommitting is not implemented at the level of malloc, but
    at the level of a lower level allocator like mmap.

    If the system maps a large block to give you a smaller one, that large
    block will not be all used immediately.

    Another example is thread stacks. If you give each thread a one megabyte
    stack and make 100 threads, that's 100 megs of virtual space. But that one
    megabyte is a worst case that few, if any, of the threads will hit.

    Programs with lots of threads on GNU/Linux have inflated virtual footprints
    due to the generous default stack size.
     
    Kaz Kylheku, Feb 17, 2012
    #16
  17. In comp.lang.c++ Joshua Maurice <> wrote:

    (snip)

    > Really? The explanation that I'm most familiar with is that most fork
    > calls are immediately followed by exec, and thus if you're low on
    > memory, then a large process cannot spawn a new process without
    > overcommit because the only process create primitive is fork, which
    > "copies" the memory space of the parent process.


    That was fixed about 20 years ago. Among others, there is vfork()

    "vfork - spawn new process in a virtual memory efficient way"

    A simple explanation is that vfork() tells the system that you
    expect to call exec() next, and it can optimize for that case.

    > I of course think this is a broken state of affairs for several
    > reasons. 1- Just introduce a new create process primitive that creates
    > a process from an executable file with options for specifying
    > the env vars, the command line, the working dir, etc.


    (snip)

    -- glen
     
    glen herrmannsfeldt, Feb 18, 2012
    #17
  18. Joe keane

    Kaz Kylheku Guest

    On 2012-02-18, glen herrmannsfeldt <> wrote:
    > In comp.lang.c++ Joshua Maurice <> wrote:
    >
    > (snip)
    >
    >> Really? The explanation that I'm most familiar with is that most fork
    >> calls are immediately followed by exec, and thus if you're low on
    >> memory, then a large process cannot spawn a new process without
    >> overcommit because the only process create primitive is fork, which
    >> "copies" the memory space of the parent process.

    >
    > That was fixed about 20 years ago. Among others, there is vfork()
    >
    > "vfork - spawn new process in a virtual memory efficient way"
    >
    > A simple explanation is that vfork() tells the system that you
    > expect to call exec() next, and it can optimize for that case.


    vfork is a dangerous hack which exposes the semantics of the optimization
    of fork to the program.

    A modern copy-on-write fork hides the semantics: the parent and child
    spaces are shared, but appear duplicated.

    A copy-on-write fork does have to account for the virtual space required
    to duplicate the private mappings of the parent process, because those
    will be copied physically if they are touched.

    If you have a 500 megabyte process, of which 400 megabytes are private
    mappings, and that process forks, the virtual layout of the system
    increases by 400 megabytes. If overcommit is not allowed, that means
    that the 400 megabytes has to be counted as physical memory.

    Joshua is completely right here: forking is one of the use cases for
    overcommit for this reason.
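The "shared, but appear duplicated" behaviour of fork can be observed directly. The following sketch assumes a POSIX system; the function name `parent_copy_survives_fork` is made up for illustration. The child overwrites a heap buffer, and the parent then checks that its own view is untouched.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork, let the child overwrite a private buffer, and report whether
   the parent's copy is unchanged: 1 = unchanged, 0 = changed,
   -1 = allocation, fork or wait failed. */
int parent_copy_survives_fork(size_t size)
{
    char *buf = malloc(size);
    if (buf == NULL)
        return -1;
    memset(buf, 'P', size);

    pid_t pid = fork();
    if (pid < 0) {
        free(buf);
        return -1;
    }
    if (pid == 0) {
        /* Child: writing touches the pages, so the kernel must give
           the child its own physical copies at this point. */
        memset(buf, 'C', size);
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0) {
        free(buf);
        return -1;
    }
    int unchanged = 1;
    for (size_t i = 0; i < size; i++) {
        if (buf[i] != 'P') {
            unchanged = 0;
            break;
        }
    }
    free(buf);
    return unchanged;
}
```

The child's `memset` is where the extra physical memory is actually needed, which is why strict accounting has to reserve for it at fork time while overcommit can defer the question.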
     
    Kaz Kylheku, Feb 18, 2012
    #18
  19. Joe keane

    BGB Guest

    On 2/17/2012 3:51 PM, Joshua Maurice wrote:
    > On Feb 17, 12:46 pm, Paavo Helde<> wrote:
    >> BGB<> wrote in news:jhk2rh$19o$:
    >>
    >>> On 2/16/2012 2:21 PM, Devil with the China Blue Dress wrote:
    >>>> In article<jhjr14$>, (Joe keane)
    >>>> wrote:

    >>
    >>>>> Bugs of memory allocation will make you mad.

    >>
    >>>>> Bugs *in* memory allocation will put you in the cuckoo people place.

    >>
    >>>> Which is why they are exceedingly rare. Nearly all allocation
    >>>> problems are due to the program storing outside array bounds.

    >>
    >>> which are annoyingly difficult to track down sometimes...

    >>
    >>> it would be nice if compilers would offer an option (as a debugging
    >>> feature) to optionally put in bounds checking for many operations

    >>
    >> Yes it is, and they are doing that. Use e.g. MSVC with iterator checking
    >> switched on (this is the default) and accessing e.g. a std::vector out of
    >> bounds will generate a runtime error. With gcc one can use MALLOC_CHECK_
    >> or -lmcheck, these also should catch some out-of-bounds access errors.
    >>
    >> For raw pointers it is more difficult. You can use something like
    >> ElectricFence or Valgrind on Linux, but they make your program run many
    >> times slower and consume much more memory, so this is just for
    >> debugging.

    >


    I was mildly aware of ElectricFence and Valgrind, and was also thinking
    mostly of malloc'ed raw pointers and C-style arrays (when passed as
    pointers). the idea would be if something like Valgrind were directly
    integrated into the compiler/runtime as a debug option. this is mostly
    what I was writing about.

    as for bounds-checked collection types, yes, I have a few of those as
    well, and also mostly use a custom memory manager (mostly for GC and
    dynamic type-checking), which could (sadly) create issues (likely not
    detect bounds violations) if this feature were available.


    > MSVC also has some debug options that try to catch writes past the
    > ends of allocated regions as well. IIRC, they overallocate, and put
    > special bit patterns at the start and end on allocation, and when
    > freed they check to see if those bit patterns are intact, raising a
    > fatal error or something if it finds a problem.


    this is partly what my memory allocators do, but may also under certain
    cases scan the heap, detect corruption, and attempt to diagnose it.

    a nifty tool I have used in some places is object-origin tracking, where
    every time the allocator is accessed, it records where it was called
    from, and will use this information (combined with some data-forensics)
    to try to make an educated guess as to "who done it" (or, IOW, around
    where the offending code might be).
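Object-origin tracking of this kind can be approximated by storing the call site in a header in front of each block. This is a hedged sketch, not the poster's allocator: `tracked_malloc`, `tracked_origin`, `tracked_free`, and the `TRACKED_MALLOC` macro are hypothetical names, and C11's `max_align_t` is assumed to be available.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Where a block was allocated from, and how large it is. */
struct origin {
    const char *file;
    int line;
    size_t size;
};

/* The union pads the header so the user block stays suitably aligned. */
union origin_header {
    struct origin o;
    max_align_t align;
};

void *tracked_malloc(size_t size, const char *file, int line)
{
    if (size > SIZE_MAX - sizeof(union origin_header))
        return NULL;
    union origin_header *h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->o.file = file;
    h->o.line = line;
    h->o.size = size;
    return h + 1;
}

/* Look up the recorded origin of a block from tracked_malloc. */
const struct origin *tracked_origin(const void *ptr)
{
    return &((const union origin_header *)ptr - 1)->o;
}

void tracked_free(void *ptr)
{
    if (ptr != NULL)
        free((union origin_header *)ptr - 1);
}

/* Record the caller's source position automatically. */
#define TRACKED_MALLOC(n) tracked_malloc((n), __FILE__, __LINE__)
```

When a corrupted block is found, the origin of the block just before it in memory is a reasonable first suspect for the overrun.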

    although, sadly, there is no good way to implement a HW write barrier
    for this (sadly, neither Windows nor Linux give enough control over the
    CPU to really make something like this all that workable, even then it
    would still likely be page-fault driven slowness).

    I have some stuff for "software write barriers" (mostly needed for other
    reasons), but not a lot of code uses them (unless forced into it),
    partly due to the added awkwardness and performance overheads.

    one option that could sort of work for larger arrays would be to put
    unused pages between memory objects, such that going out of bounds is
    more likely to trigger a page fault.
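The unused-page idea is roughly what ElectricFence does. A minimal sketch, assuming a POSIX system with `mmap`, `mprotect`, and `MAP_ANONYMOUS`; the name `page_guarded_alloc` is hypothetical, and releasing the mapping is left out for brevity. The object is placed so that it ends exactly at an inaccessible page, which means the returned pointer is not necessarily aligned for every type.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc */
#endif
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Return size writable bytes that end right before a PROT_NONE page,
   so writing one byte past the end faults immediately.
   Returns NULL on failure or when size is 0. */
void *page_guarded_alloc(size_t size)
{
    long ps = sysconf(_SC_PAGESIZE);
    if (ps <= 0 || size == 0)
        return NULL;
    size_t page = (size_t)ps;
    if (size > SIZE_MAX - 2 * page)
        return NULL;

    size_t data_pages = (size + page - 1) / page;
    size_t total = (data_pages + 1) * page;   /* data plus one guard page */

    unsigned char *base = mmap(NULL, total, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    unsigned char *guard = base + data_pages * page;
    if (mprotect(guard, page, PROT_NONE) != 0) {
        munmap(base, total);
        return NULL;
    }
    return guard - size;   /* last user byte sits just below the guard */
}
```

Each allocation costs at least two pages, which is why this approach only suits debugging or larger arrays.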


    or such...
     
    BGB, Feb 18, 2012
    #19
  20. On Feb 17, 7:04 pm, (Scott Lurndal) wrote:
    > Joshua Maurice <> writes:
    > >Really? The explanation that I'm most familiar with is that most fork
    > >calls are immediately followed by exec, and thus if you're low on
    > >memory, then a large process cannot spawn a new process without
    > >overcommit because the only process create primitive is fork, which
    > >"copies" the memory space of the parent process.

    >
    > You're confusing overcommit with copy-on-write.  Fork uses COW[*] in
    > which the parent and child share the physical pages until the child
    > writes to one - at that point, the child gets a copy (and an allocation
    > occurs which may fail at that point if memory and swap are exhausted).
    >
    > Overcommit was allowed to support sparse arrays which are common
    > with some workloads.

    [...]
    > [*] COW came into general use in the SVR3.2/SVR4 timeframe. Linux has always
    > used COW on fork. The only cost for the child is the page table
    > (which actually can be quite a bit for a 1TB virtual address space using
    > 4k pages - IIRC about 2GB just for page tables to map that much VA; makes
    > 1G pages much more attractive (drops the page table size to 8k)).


    So, I thought that if I turned overcommit off in Linux that if I tried
    to fork with a large process and low commit, then the fork would fail.
    (We're getting a little off topic, but I do not care.)

    > (p.s.  see 'posix_spawn').


    posix_spawn solves the "COW" and fork memory problem, but posix_spawn
    still has all of the same problems w.r.t. leaking resources to child
    processes because its defined semantics are "as if fork followed by
    exec".
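For reference, a minimal use of posix_spawn might look like the sketch below. The helper name `spawn_shell` is hypothetical, and it assumes an `sh` is reachable through `PATH`. No file actions are passed, so any descriptor not marked `FD_CLOEXEC` in the parent is inherited by the child, which is the leak described above.

```c
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Run `sh -c command` in a new process and return its exit status,
   or -1 if spawning or waiting failed or the child died abnormally. */
int spawn_shell(const char *command)
{
    char *argv[] = { "sh", "-c", (char *)command, NULL };
    pid_t pid;

    /* NULL file actions and attributes: the child inherits the
       parent's open descriptors, as it would after fork + exec. */
    if (posix_spawnp(&pid, "sh", NULL, NULL, argv, environ) != 0)
        return -1;

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Closing specific descriptors in the child is possible with `posix_spawn_file_actions_addclose`, but the caller has to know which descriptors exist, so the portability complaint still stands.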
     
    Joshua Maurice, Feb 18, 2012
    #20
