How much memory does malloc(0) allocate?

Discussion in 'C Programming' started by Lynn McGuire, Jul 26, 2013.

  1. Lynn McGuire

    Eric Sosman Guest

    See 7.22.3p1: "Each such allocation shall yield a pointer
    to an object disjoint from any other object."
     
    Eric Sosman, Jul 26, 2013
    #21

  2. Lynn McGuire

    James Kuyper Guest

    See above.

    The wrapper is not required to follow the same rules that malloc()
    itself is bound by. Allowing behavior that's different from that allowed
    for the wrapped function is one of the most common reasons for writing a
    wrapper.
     
    James Kuyper, Jul 26, 2013
    #22

  3. The poet and divine John Donne condemned the newfangled introduction of
    zero in a sermon.
    The null or empty case is often difficult. For instance if we are pasting
    the empty image into a larger image, obviously it should be a no-op.
    But what if we are stretching an image to 256x256 and displaying it?
    What should that routine do if passed the empty image? Should a 2x0 image
    compare as equal to a 0x3 image? There aren't any obvious answers and,
    unless null images are used extensively, code authors are unlikely to
    document what happens in the null case.
    malloc(0) not being properly defined just adds one more gotcha. It's
    unsatisfactory. Of course it's possible to implement correct behaviour on
    top of it, but at the cost of cluttering every allocation request with a
    test for zero.
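
    A minimal sketch of how that test for zero could be kept in one place
    rather than at every call site. The wrapper name alloc_nonzero is
    hypothetical, not part of any library; it assumes that rounding a
    zero-byte request up to one byte is acceptable to the caller.

```c
#include <stdlib.h>

/* Hypothetical wrapper (the name is an assumption, not a standard
 * function). A zero-byte request is rounded up to one byte, so a
 * null return always means the allocation failed, and every
 * successful call yields a distinct pointer that can be freed. */
void *alloc_nonzero(size_t size)
{
    if (size == 0)
        size = 1;
    return malloc(size);
}
```

    Callers then treat every null return as an error, whatever the size.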
     
    Malcolm McLean, Jul 27, 2013
    #23
  4. Lynn McGuire

    Joe Pfeiffer Guest

    To me, it seems like code shouldn't try to malloc(0). There have been
    examples in this thread of programs that attempt to allocate an empty
    image; it still seems that if you're trying to allocate 0 bytes by the
    time you get to the malloc() call, you've failed to adequately check
    your inputs.
     
    Joe Pfeiffer, Jul 27, 2013
    #24
  5. Lynn McGuire

    James Kuyper Guest

    On 07/26/2013 10:23 PM, Joe Pfeiffer wrote:
    ....
    Whether it's a "failure" or "inadequately checked" depends upon how it
    would be used. If, without having to write any special-case code, the
    simple fact that the amount of memory needed is zero guarantees that a
    given program will never dereference the pointer (which is fairly
    plausible), it's no problem if that pointer happens to be null. The
    program could either avoid calling malloc() when the size is zero, or
    avoid panicking when malloc returns 0 if the size is zero. I would favor
    avoiding the call, but I can't see anything horribly wrong with avoiding
    the panic, instead.
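
    A minimal sketch of the "avoid the panic" option. The helper name
    alloc_or_die is hypothetical; it assumes the caller never dereferences
    the result when the requested size is zero.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: give up only when a non-zero request fails.
 * A null return for size == 0 is permitted by the standard, so it
 * is passed through to the caller instead of being treated as an
 * out-of-memory condition. */
void *alloc_or_die(size_t size)
{
    void *p = malloc(size);
    if (p == NULL && size != 0) {
        fprintf(stderr, "out of memory (%zu bytes)\n", size);
        exit(EXIT_FAILURE);
    }
    return p;  /* may be null when size == 0; free(NULL) is a no-op */
}
```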
     
    James Kuyper, Jul 27, 2013
    #25
  6. Because a and v point to *different* blocks of memory that can hold 0
    bytes of data.

    There are two ways that malloc(0) can work:
    - It always fails (returns NULL)
    - It works exactly like malloc(X) for X>0 (so there is nothing special
    about allocating 0 bytes)
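
    A small sketch for checking which of the two behaviours a given
    implementation has. The function name probe_malloc_zero and its return
    codes are made up for illustration; the pointers are compared and
    freed but never dereferenced.

```c
#include <stdlib.h>

/* Returns 0 if both malloc(0) calls yield a null pointer,
 * 1 if they yield two distinct non-null pointers,
 * and -1 for anything else (for example, only one call failing). */
int probe_malloc_zero(void)
{
    void *a = malloc(0);
    void *v = malloc(0);
    int result;

    if (a == NULL && v == NULL)
        result = 0;
    else if (a != NULL && v != NULL && a != v)
        result = 1;
    else
        result = -1;

    free(a);  /* free() accepts a null pointer */
    free(v);
    return result;
}
```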

    Bart v Ingen Schenau
     
    Bart van Ingen Schenau, Jul 27, 2013
    #26
  7. Lynn McGuire

    Rosario1903 Guest

    i think malloc() function i use would return one address
    in response to malloc(0), each different each call of malloc(0)

    but why allocate all these different blocks of mem?
    only because standard say so?
     
    Rosario1903, Jul 27, 2013
    #27
  8. Lynn McGuire

    Kleuske Guest

    Ok. Almost anything. It remains tricky, though.
    Of course.
     
    Kleuske, Jul 27, 2013
    #28
  9. Lynn McGuire

    Kleuske Guest

    Thanks for that, it is enlightening.

    The rationale is "we do not wish to break existing code, reported to
    be in widespread use." (paraphrased). The committee doesn't seem to be
    very happy about that coding practice.

    The C89-committee was a bit wiser and refused to accept zero byte
    objects, resulting in a "quiet change" whenever programs rely on zero-
    byte allocations. Hence the resulting mishmash, which is, arguably, the
    worst of both worlds.

    I hope a future committee will be even wiser and force a bit of
    maintenance on programs that rely on zero-byte allocations.
     
    Kleuske, Jul 27, 2013
    #29
  10. Lynn McGuire

    Eric Sosman Guest

    That's all right, if the address returned is NULL.
    Why allow `short'? Only because standard say so?

    That is, "The Standard requires behavior X" is a very
    good reason for an implementation to behave X-ishly.
     
    Eric Sosman, Jul 27, 2013
    #30
  11. Lynn McGuire

    Eric Sosman Guest

    I think you're misreading the Rationale. No edition of the
    Committee -- for ANSI, C99, C11, or the various intermediate
    TC's and amendments -- supported the idea of a zero-size object.
    But all of them, starting with ANSI, allowed zero-byte allocations.
    The "quiet change" two-and-a-half decades ago amounted to

    - Code that assumes malloc(0)==NULL may break

    - Code that assumes malloc(0)!=NULL may break

    .... which wasn't really a "change" at all, since code that made
    either assumption might break when moved from one system to
    another.
    Just the way C99 "forced" variable-length arrays on, for
    example, Microsoft?

    If a Committee were to issue a Standard whose adoption would
    require reviewing and patching a billion lines of C (my off-the-
    cuff estimate, probably low) for no benefit beyond "It'll be
    cleaner when you're finished," how eager do you think anyone
    would be to adopt it? Such a Standard would be a dead letter.
     
    Eric Sosman, Jul 27, 2013
    #31
  12. Lynn McGuire

    Kleuske Guest

    You're right. Just live with the history and learn from it. It isn't C's
    only quirk.
     
    Kleuske, Jul 27, 2013
    #32
  13. I tried with 3 different compilers

    Borland Turbo C 2.0 and Borland C++ Builder 5 both return a NULL pointer.
    Here I did not try to write bytes to that pointer nor to free the pointer.

    Fedora 14 gcc returns some non-NULL pointer. I can write up to 5000 bytes
    and maybe more to that pointer without a crash. There's no crash when I
    have written at most 13 bytes to that pointer and then free it. Otherwise
    there is a crash on free.
     
    Heinrich Wolf, Jul 28, 2013
    #33
  14. Lynn McGuire

    James Kuyper Guest

    Keep in mind that overwriting the end of allocated memory can cause very
    serious problems that DO NOT have to include crashing your program. The
    single most common consequence is that if your program, either directly
    or indirectly, uses the malloc() family to allocate two or more
    different blocks of memory at the same time, overwriting the end of one
    will result in writing to one of the others. Since the behavior of such
    code is undefined, the compiler need not keep track of such overwrites,
    so data extracted from the other block might currently be copied to a
    register. As a result, it might appear to your program that no overwrite
    has occurred. It could be quite some time before the program needs to
    retrieve data from the actual overwritten memory, and even longer before
    that fact causes noticeable malfunctions (in some cases, the damage can
    be so subtle that it isn't noticed for years). That can make it very
    difficult to track down where the overwrite occurred.

    One of the two most common ways I know of for implementing the malloc()
    family requires storing heap-management information about each allocated
    block of memory in a location that is adjacent to that block. In that
    case, overwriting the end of one block will generally damage the
    heap-management information associated with the next block. That can
    cause the malloc() family of functions to malfunction wildly.

    Don't assume your code is safe just because it doesn't crash. When
    malloc(0) returns a non-null value, the only part of the returned memory
    you can portably assume is writable is the very first byte in that block.
     
    James Kuyper, Jul 28, 2013
    #34
  15. Lynn McGuire

    Eric Sosman Guest

    That's "portable" in the sense of "likely to work," not in the
    sense of "guaranteed to work." The Standard does not require that
    the zero-byte allocation be writable or even readable. If such an
    attempt is made the behavior is undefined:

    7.22.3p1: "If the size of the space requested is zero, [...]
    the returned pointer shall not be used to access an object."
     
    Eric Sosman, Jul 28, 2013
    #35
  16. Lynn McGuire

    Geoff Guest

    Is this another case of "trust the programmer"? The word "shall"
    implies that there is some kind of enforcement of this section. Is the
    compiler/runtime required to disallow access of a zero sized object or
    is the programmer required to guard against it?


    #include <stdio.h>
    #include <stdlib.h>
    #include <stdint.h>

    int main(int argc, char *argv[])
    {
        void *mem;
        size_t size = 0;

        mem = malloc(size);
        if (mem == NULL) {
            printf("malloc failed\n");
            return EXIT_FAILURE;
        }
        memset(mem, 'A', size);
        free(mem);

        return EXIT_SUCCESS;
    }
     
    Geoff, Jul 28, 2013
    #36
  17. In this case, "shall" doesn't imply enforcement. Violating a "shall"
    that's not part of a constraint means the program has undefined
    behavior.

    Section 4 of the C standard says:

    In this International Standard, "shall" is to be interpreted as a
    requirement on an implementation or on a program; conversely, "shall
    not" is to be interpreted as a prohibition.

    If a "shall" or "shall not" requirement that appears outside of a
    constraint or runtime-constraint is violated, the behavior is
    undefined. Undefined behavior is otherwise indicated in this
    International Standard by the words "undefined behavior" or by the
    omission of any explicit definition of behavior. There is no
    difference in emphasis among these three; they all describe
    "behavior that is undefined".
     
    Keith Thompson, Jul 28, 2013
    #37
  18. Lynn McGuire

    Eric Sosman Guest

    No, although you need more context than I quoted to make
    the determination. The quoted "shall" is not inside a constraint or
    runtime-constraint section, so the implementation is not required to
    detect violations or take any particular action if they occur. All
    you get is "the behavior is undefined" (4p2).
    Neither. The implementation can do whatever it pleases ("the
    behavior is undefined"). On the other hand, the programmer is not
    required to avoid undefined behavior! Perhaps the programmer knows
    how the implementation at hand will respond to a particular violation,
    and wants to elicit that response.
    I'm not sure what you're trying to demonstrate here. As has
    already been pointed out, the NULL return in this case need not
    indicate a "failure" of malloc(). Bailing out might be a good idea,
    though, because `memset(NULL, 'A', 0)' invokes undefined behavior
    (7.1.4p1; also, you need to #include <string.h> to declare memset).
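
    A sketch of how the quoted test might be reworked along those lines,
    written as a function so it is easy to reuse. The name fill_block and
    its return convention are assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>   /* declares memset() */

/* Hypothetical rework of the quoted test.
 * Returns 0 on success and -1 if a non-zero request fails. */
int fill_block(size_t size)
{
    void *mem = malloc(size);

    if (mem == NULL)
        return size == 0 ? 0 : -1;  /* null for size 0 is not a failure */

    if (size > 0)                   /* never touch a zero-byte block */
        memset(mem, 'A', size);

    free(mem);
    return 0;
}
```

    Skipping memset() when size is zero avoids passing it either a null
    pointer or a pointer that must not be used to access an object.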
     
    Eric Sosman, Jul 28, 2013
    #38
  19. Lynn McGuire

    James Kuyper Guest

    I'd forgotten about that clause when I wrote the above paragraph. The
    clause that I was thinking about corresponds to the "..." in the above
    citation, so I really had no very good excuse for thinking about one
    without the other, but that's what I did.
    There's only one circumstance in which the standard mandates rejection
    of a program: if it contains a correctly formatted #error directive that
    survives conditional compilation.
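
    A short illustration of such a directive. The 32-bit requirement and
    the function name high_bit are assumptions chosen only for the
    example.

```c
#include <limits.h>

/* Assumed requirement, for illustration only: this code wants
 * unsigned int to hold at least 32 bits. On an implementation
 * where that is false, the #error survives conditional
 * compilation and translation must fail. */
#if UINT_MAX < 0xFFFFFFFF
#error "unsigned int is narrower than 32 bits"
#endif

/* The shift is safe only because of the check above. */
unsigned int high_bit(void)
{
    return 1u << 31;
}
```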

    ISO has no enforcement arm, no authority to create one. A false claim of
    conformance to the C standard might be legally actionable in many
    countries, but no more so than any other kind of false advertising.

    The standard is, in effect, a contract. It never prohibits anything - it
    just specifies what a conforming implementation of C is (and is not)
    required to do when asked to translate and execute a program. An
    implementation isn't prohibited from violating those requirements, it
    just fails to be conforming if it does so. A program can have syntax
    errors, constraint violations, or undefined behavior, but you're not
    prohibited from writing such code. It's a bad idea to write such code,
    but not because it's prohibited - it's simply because the standard
    doesn't guarantee that such code will do what you want it to do.

    A program that violates a "shall" that occurs in a normative section of
    the C standard, but outside of a constraint section (as is the case
    here) has undefined behavior. That means the C standard imposes no
    requirements of any kind on how a conforming implementation may deal
    with it. Therefore, if there is anything, anything at all, that you
    don't want your program to do, then you should not write such code,
    because it's possible, at least in principle, that your program will do
    one of the things you don't want it to do.
     
    James Kuyper, Jul 28, 2013
    #39
  20. Lynn McGuire

    BGB Guest

    FWIW, in my custom allocator.

    previously:
    allocating 0-7 bytes resulted in a 16-byte (1 cell) allocation, and 8-23
    would allocate 32 bytes (2 cells).

    currently:
    0 will allocate 16 bytes, 1-15 will allocate 32, 16-31 will allocate 48
    bytes, ...

    the first 8/16 bytes are actually the object header.
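
    A rough sketch that reproduces only the "currently" figures quoted
    above (0 -> 16, 1-15 -> 32, 16-31 -> 48). The function name
    block_bytes and the formula are a guess fitted to those three ranges;
    the allocator's actual rule, and what happens beyond 31 bytes, are not
    stated here.

```c
#include <stddef.h>

#define CELL_SIZE 16u   /* cell size quoted in the post */

/* Hypothetical reconstruction of the quoted table only:
 *   request 0      -> 16 bytes (one cell)
 *   request 1..15  -> 32 bytes (two cells)
 *   request 16..31 -> 48 bytes (three cells)
 * Continuing in 16-byte steps past 31 is an assumption. */
size_t block_bytes(size_t request)
{
    if (request == 0)
        return CELL_SIZE;
    return (request / CELL_SIZE + 2) * CELL_SIZE;
}
```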


    the change was mostly due to allowing larger allocations, and also the
    ability to track source location (file name, line number), ...

    previously, there was no support for source file/line tracking, and
    allocations were limited to 1GB (now 64PB, on 64-bit targets).

    another expansion is now there are (theoretically) up to 16M unique
    object types, though the current limit is still a bit smaller (type IDs
    are currently hash-indices, and 16M entries would be *huge* for a
    hash-table). the current limit may go away if I switch over to a
    hash-chained-array or similar.

    the headers also contain a small check-value (header hash), used to
    detect corrupted object headers.


    note:
    unlike "malloc()", memory objects have an associated type-name, which is
    usable for things like run-time type-checks for pointers (among other
    things).

    the use of source-file/line information is mostly for things like
    debugging and leak-detection (IOW: help trying to figure out where
    memory is being leaked from).


    allocator statistics have generally shown that small objects (< 4kB)
    represent a good portion of the total memory use, *1, but currently with
    a big spike at 32kB (one of the major subsystems allocates a lot of 32kB
    arrays).

    *1: roughly forming a Gaussian distribution centered on 0, with millions
    of small objects.

    currently, typically, it is dealing with heap-sizes of around 500MB to 2GB.
     
    BGB, Jul 28, 2013
    #40
