How to choose buffer size?

K

kj

In this program I'm writing I want to pre-allocate space for an
indeterminate (but potentially large) number of instances of a
given structure. And if this amount of memory proves insufficient,
I'll add another large chunk with realloc. Etc.

So somewhere in my code I'll have an initial pre-allocation that
looks like (very schematically):

size_t size = INITIAL_SIZE;
foo *pile_o_foo;
int next_slot = 0;

pile_o_foo = (foo *)malloc(size);
/* error check */

/* ... */

while (size <= next_slot * sizeof (foo)) {
size *= 2;
pile_o_foo = (foo *)realloc(pile_o_foo, size);
/* error check */
}

/* ... */

Is there a way to choose a good value for INITIAL_SIZE? Should I
just pick a number out of a hat? 101? 1001? 5091? Or should I
pick a "nice power of 2" out of a hat, like 1024 or 4096? Or is
there a more clueful way to go about this?

TIA!

kynnjo
 
K

Keith Thompson

Eric Sosman said:
Counts that produce power-of-two sizes should probably be
avoided (unless you have information that the "indeterminate"
number is likely to be such a value). The reasoning goes

- Many malloc() implementations inflate each allocation by
a few bytes, and store some of their own metadata in the
extra space, and

- Some malloc() implementations use fundamental "block sizes"
that are themselves powers of two.

If your malloc() has both of these characteristics, a power-of-two
request is very nearly the worst thing you can ask for.
[...]

Yes, but do any real malloc implementations behave that way?

I just tried an experiment, calling malloc four times with a size of
1048576 bytes (1 megabyte, or 1 mebibyte if you want to be picky). On
three different systems, the resulting addresses differed consistently
by 2**20 plus a relatively small overhead; on three different systems,
the overheads were 4096 bytes, 65536 bytes, and 8 bytes. (The systems
were Ubuntu, Cygwin under Windows XP, and Solaris.)

And yes, the code that determined the differences was non-portable.
 
A

Antoninus Twink

Thing *ptr;
size_t count = (1024 * 1024 - 16) / sizeof *ptr;
ptr = malloc(count * sizeof *ptr);

16 is a magic number, Eric. How did you choose it?
 
C

cr88192

Keith Thompson said:
Eric Sosman said:
Counts that produce power-of-two sizes should probably be
avoided (unless you have information that the "indeterminate"
number is likely to be such a value). The reasoning goes

- Many malloc() implementations inflate each allocation by
a few bytes, and store some of their own metadata in the
extra space, and

- Some malloc() implementations use fundamental "block sizes"
that are themselves powers of two.

If your malloc() has both of these characteristics, a power-of-two
request is very nearly the worst thing you can ask for.
[...]

Yes, but do any real malloc implementations behave that way?

I just tried an experiment, calling malloc four times with a size of
1048576 bytes (1 megabyte, or 1 mebibyte if you want to be picky). On
three different systems, the resulting addresses differed consistently
by 2**20 plus a relatively small overhead; on three different systems,
the overheads were 4096 bytes, 65536 bytes, and 8 bytes. (The systems
were Ubuntu, Cygwin under Windows XP, and Solaris.)

And yes, the code that determined the differences was non-portable.

but, alas, this is c.l.c where it does not matter so much whether something
is real so much as it matters people imagining that it is real...

so, nevermind that most practical implementations would realize that a
buddy-allocator would suck as a malloc implementation, people can still
imagine the possibility of a buddy allocator being used as the main
malloc...


it is like non-8-bit bytes and amazon-woman style cultures...
 
C

cr88192

Antoninus Twink said:
16 is a magic number, Eric. How did you choose it?

well, it doesn't matter, because it is wrong...

in fact, the correct calculation goes more like this:
42, this is the number that is the root of everything;
10, this number means completeness.

42*10=420

so, yes, 420 is the answer, yep, seems about right...
 
K

kj

Keith Thompson said:
I just tried an experiment, calling malloc four times with a size of
1048576 bytes (1 megabyte, or 1 mebibyte if you want to be picky). On
three different systems, the resulting addresses differed consistently
by 2**20 plus a relatively small overhead; on three different systems,
the overheads were 4096 bytes, 65536 bytes, and 8 bytes. (The systems
were Ubuntu, Cygwin under Windows XP, and Solaris.)
And yes, the code that determined the differences was non-portable.

I'd very much appreciate it if you'd share this code.

kynn
--
 
K

kj

Eric Sosman said:
kj said:
In this program I'm writing I want to pre-allocate space for a but
an indeterminate (but potentially large) number of instances of a
given structure. And if this amount of memory proves insufficient,
I'll add another large chunk with realloc. Etc.

[... malloc(), double, realloc(), lather, rinse, repeat ...]

Is there a way to choose a good value for INITIAL_SIZE?
Not without information about the "indeterminate" number.
If you have some idea about the likely distribution of that
number, you could pick an initial size that's somewhere on
the high side of the median, the idea being that "most of the
time" you'll avoid the expense of reallocating and copying.
But if the number is really, truly unknown ...
Counts that produce power-of-two sizes should probably be
avoided (unless you have information that the "indeterminate"
number is likely to be such a value). The reasoning goes
- Many malloc() implementations inflate each allocation by
a few bytes, and store some of their own metadata in the
extra space, and
- Some malloc() implementations use fundamental "block sizes"
that are themselves powers of two.
If your malloc() has both of these characteristics, a power-of-two
request is very nearly the worst thing you can ask for. Consider
what happens if you ask for exactly one megabyte. malloc() starts
by inflating the request to one megabyte plus a smidgen, and then
looks for a block that will hold the total. How big is that block,
if only power-of-two blocks are in use? Two megabytes.

I think that's true only if the "fundamental block size" (call it
FBS) that this malloc uses happens to be exactly 1 megabyte. If
instead FBS were, for example, 4096 bytes, and the smidgen were <=
FBS, then in the example you cite the block that malloc would
use would be only 1MB + 4096 bytes, which is significantly less
than 2MB.

kynn

--
 
E

Eric Sosman

kj said:
Eric Sosman said:
[...]
- Some malloc() implementations use fundamental "block sizes"
that are themselves powers of two.
If your malloc() has both of these characteristics, a power-of-two
request is very nearly the worst thing you can ask for. Consider
what happens if you ask for exactly one megabyte. malloc() starts
by inflating the request to one megabyte plus a smidgen, and then
looks for a block that will hold the total. How big is that block,
if only power-of-two blocks are in use? Two megabytes.

I think that's true only if the "fundamental block size" (call it
FBS) that this malloc uses happens to be exactly 1 megabyte. If
instead FBS were, for example, 4096 bytes, and the smidgen were <=
FBS, then in the example you cite the block that malloc would
use would be only 1MB + 4096 bytes, which is significantly less
than 2MB.

.... and is also not a power of two. A "pure" buddy system
allocator can only manage power-of-two-sized blocks, and can
subdivide them only by halving into smaller powers of two.
Such an allocator cannot cope with a 1MB+4KB block, nor with
its 1MB-4KB "buddy," since the whole point of the buddy system
is that (block address) XOR (block size) = (buddy address).

How many pure buddy system allocators are there? Beats
me. For short-ish blocks, the buddy system has the advantage
that the neighbor of a freed block can be found without a
search. But for large-ish blocks, the vulnerability to
internal fragmentation (2MB for 1MB+4KB) is troublesome. I
suspect (without empirical evidence) that buddy system
allocators in practical use probably use a mixed strategy,
using pure buddy system for block sizes below some threshold
and a different method above that threshold.

Still, one never knows what's actually going on in the
malloc() of a system one hasn't seen. I think it prudent to
avoid what might -- *might* -- be a pathological case, just
in, well, just in pathological case.

(Research topic: Does anyone know of any "Fibonacci buddy"
allocators in actual use? See TAOCP exercise 2.5-31.)
 
K

kj

Eric Sosman said:
kj said:
Eric Sosman said:
[...]
- Some malloc() implementations use fundamental "block sizes"
that are themselves powers of two.
If your malloc() has both of these characteristics, a power-of-two
request is very nearly the worst thing you can ask for. Consider
what happens if you ask for exactly one megabyte. malloc() starts
by inflating the request to one megabyte plus a smidgen, and then
looks for a block that will hold the total. How big is that block,
if only power-of-two blocks are in use? Two megabytes.

I think that's true only if the "fundamental block size" (call it
FBS) that this malloc uses happens to be exactly 1 megabyte. If
instead FBS were, for example, 4096 bytes, and the smidgen were <=
FBS, then in the example you cite the block that malloc would
use would be only 1MB + 4096 bytes, which is significantly less
than 2MB.
... and is also not a power of two.

Aah, I misunderstood what you meant by "fundamental block size".
I see what you mean.

kynn
--
 
K

Keith Thompson

kj said:
I'd very much appreciate it if you'd share this code.

Actually, the C program itself didn't compute the differences; it just
printed the addresses using "%p", and I computed the differences
manually using a separate calculator program. (I'd had an earlier
version that did the computation, but decided it was easier to do it
myself.)

#include <stdio.h>
#include <stddef.h>
#include <stdlib.h>
#define MEGABYTE (1U<<20)
int main(void)
{
    char *p0 = malloc(MEGABYTE);
    char *p1 = malloc(MEGABYTE);
    char *p2 = malloc(MEGABYTE);
    char *p3 = malloc(MEGABYTE);
    printf("%p\n%p\n%p\n%p\n", p0, p1, p2, p3);
    return 0;
}

There are several sloppy things that I would normally have fixed
before posting the code. In particular, it doesn't use <stddef.h>,
(1U<<20) can invoke undefined behavior if unsigned int is narrower
than 21 bits, and I didn't bother checking whether malloc succeeded.
The latter doesn't really matter, since "%p" will still work. And
"%p" expects void* rather than char*, but the standard (almost)
guarantees that it will still work.

Here's a more elaborate version that does the computations itself and
avoids UB except in one place where it's unavoidable:

#include <stdio.h>
#include <stddef.h>
#include <stdlib.h>
#define MEGABYTE ((size_t)1 << 20)
#define COUNT 4
int main(void)
{
    char *p[COUNT];
    int i;

    for (i = 0; i < COUNT; i ++) {
        p[i] = malloc(MEGABYTE);
        if (p[i] == NULL) {
            fprintf(stderr, "Allocation failure\n");
            exit(EXIT_FAILURE);
        }
    }

    for (i = 1; i < COUNT; i ++) {
        ptrdiff_t delta = p[i] - p[i-1]; /* UNDEFINED BEHAVIOR! */
        if (delta < 0) delta = -delta;
        printf("delta = %lu, overhead = %lu\n",
               (unsigned long)delta,
               (unsigned long)(delta - MEGABYTE));
    }

    return 0;
}

Note that the program indirectly assumes that the chunks of memory
will be allocated sequentially. It won't fail if the assumption is
violated, but the output won't make much sense.
 
G

Guest

In this program I'm writing I want to pre-allocate space for an
indeterminate (but potentially large) number of instances of a
given structure.  And if this amount of memory proves insufficient,
I'll add another large chunk with realloc.  Etc.

So somewhere in my code I'll have an initial pre-allocation that
looks like (very schematically):

  pile_o_foo = (foo *)malloc(size);

the cast is unnecessary and may hide an error (forgetting to include
<stdlib.h>)

  while (size <= next_slot * sizeof (foo)) {
    size *= 2;
    pile_o_foo = (foo *)realloc(pile_o_foo, size);
    /* error check */

again, no need for the cast. If realloc() fails then pile_o_foo
has lost the old block of memory. Use a temporary

foo *tmp = realloc(pile_o_foo, size);
if (tmp == NULL) {
    /* handle error; pile_o_foo still points to the old block */
} else {
    pile_o_foo = tmp;
}
Is there a way to choose a good value for INITIAL_SIZE?  Should I
just pick a number out of a hat?  101?  1001?  5091?  Or should I
pick a "nice power of 2" out of a hat, like 1024 or 4096?  Or is
there a more clueful way to go about this?

We'd need to know more about foo and what your application
does with it. It might be installation dependent (number
of users might vary from a literal handful to thousands).
One possibility is to read the system size from some external
source (ini file, database, environment variable, etc.)

--
Nick Keighley

Server rooms should be unfriendly! They should look dangerous,
they should be uncomfortably cold, they should be inexplicably noisy.
If you can arrange for the smell of burning transistors, that's good
too.
If anything, I'd rather see rackmounted servers designed in dark
foreboding
colors with lots of metal spikes sticking out of them.
Mike Sphar (on news//alt.sysadmin.recovery)
 
J

James Kuyper

Dominik said:
for (i = 1; i < COUNT; i ++) {
    ptrdiff_t delta = p[i] - p[i-1]; /* UNDEFINED BEHAVIOR! */
    if (delta < 0) delta = -delta;
    printf("delta = %lu, overhead = %lu\n",
           (unsigned long)delta,
           (unsigned long)(delta - MEGABYTE));
}

return 0;
}

<snip>

Hi,
Could you explain why that is UB? Isn't ptrdiff_t meant to hold
"distances" between pointers? Or is it valid only when the difference
is taken within the same array/chunk of memory?


"When two pointers are subtracted, both shall point to elements of the
same array object, or one past the last element of the array object ..."
(6.5.6p9).
 
K

kj

Keith Thompson said:
#include <stdio.h>
#include <stddef.h>
#include <stdlib.h>
#define MEGABYTE ((size_t)1 << 20)
#define COUNT 4
int main(void)
{
    char *p[COUNT];
    int i;
    for (i = 0; i < COUNT; i ++) {
        p[i] = malloc(MEGABYTE);
        if (p[i] == NULL) {
            fprintf(stderr, "Allocation failure\n");
            exit(EXIT_FAILURE);
        }
    }

    for (i = 1; i < COUNT; i ++) {
        ptrdiff_t delta = p[i] - p[i-1]; /* UNDEFINED BEHAVIOR! */
        if (delta < 0) delta = -delta;
        printf("delta = %lu, overhead = %lu\n",
               (unsigned long)delta,
               (unsigned long)(delta - MEGABYTE));
    }

    return 0;
}
Note that the program indirectly assumes that the chunks of memory
will be allocated sequentially. It won't fail if the assumption is
violated, but the output won't make much sense.

Thanks for posting this. I found it instructive.

kynn
 
R

Richard Bos

Malcolm McLean said:
A power of two suggests to your reader that "this is a decent buffer size".
Powers of ten do the same thing, but not so much in C, since it is a
low-level language.

Magic numbers like 1008 (1024-16) leave the reader wondering why the buffer
needs to be 1008 places long.

That's what manifest constants are for. _Commented_ manifest constants.

Richard
 
