Buffer growing strategy?

websnarf · Sep 18, 2009

There's a simple rule for optimisation: Don't optimise unless you have
measured it and found that optimising is needed.

Indeed, but that's just a corollary for what H.L. Mencken said: "For
every complex problem, there is an answer that is clear, simple--and
wrong." And you definitely got that last part as I will explain:

[...] In my test, I did one
billion realloc's, with the size growing from one byte to one billion
byte, and the time was 55 ns per realloc (time printed every 1 million
iterations, so it could easily be that a few reallocs took
significantly longer), with no apparent growth as the block size got
bigger.

Not surprising; most memory allocator strategies will behave well in
view of trivial scenarios. But if you do a *sequence* of mallocs and
then try to resize them at random, then tell me what kind of
performance you get.

Anyway, if _you_ can find a clever strategy to choose sizes for
allocations, why shouldn't the author of the library?

The library author doesn't know which trade off the application
developer has in mind. If the system library favors performance then
it *cannot* favor memory tightness, and does not allow the application
developer to work around the system's implementation. On the other
hand if the system library provider provides a memory efficient
implementation, then either of Jef Driesen's first two strategies
works very well no matter what kind of realloc implementation has been
provided. In other words, the app developer can deal with performance
on his own.

[...] He's in a much better position.

How can a position of ignorance be a better position? The system
library provider can never know which trade off the application
developer wants.

[...] For example, he can allocate more memory than needed
for a malloc block to make realloc faster, but cut away from that
block when needed.

That is rarely relevant. You just have to implement a next block
pointer and see if the adjacent block is unallocated and contains
enough space to be extended into. That's not the issue. The issue is
how often the next block is really going to be allocated or not. With
tight allocation strategies, which are the norm, you can never know.
If you preallocate too much space, then you will just run out of space
faster (as if you had less ram in your system than you actually paid
for and installed). That's not something that can ever be worked
around, and I would never want to use a system that implemented such a
trade off.

Even as an app developer, if you want to use a strategy where you are
willing to eat excess memory to improve allocation performance, you
are going to want to use a pool based system, which is a totally
different strategy (that is very bad for re-use but good for
allocation speed) and needs to be custom developed by the app
developer anyways.

websnarf · Sep 18, 2009

If you don't know the size of your buffer already, checking the allocated
space is the wrong way to find out.

If you got it from malloc, you typically only know the lower bound of
what the real size is. The real size is known only to the
implementation.

William Hughes · Sep 18, 2009

void foo(int x, int y, int z=0) { ... }
void bar(int x, int y, int z=1) { ... }

typedef void (*func_taking_3_ints)(int, int, int);

func_taking_3_ints p1 = rand()%1 ? foo : bar;
(*p1)(0, 0);

What happens? Does the caller have to figure out which function is being
called, and provide the default value for the missing optional argument? Or
do you just make optional arguments mandatory when calling a function through
a pointer? And how would you do that, since even a plain function name
evaluates as a pointer? There's no such thing as a non-pointer function call,
so you'll have to invent one.

Or does the default value figure into the function's type, making it illegal
to assign foo or bar to p1? How could I declare a pointer to a function with
default argument? Could be something like this I suppose:

void (*p1)(int, int, int=0) = foo; /* OK */
p1 = bar; /* Mismatch! */

Possible, but your function pointers are going
to get *very* big, especially if you allow default values
to be expressions.

The simplest solution is to make
calling a function with default arguments in
such a way that the compiler
cannot determine which function is
being called
undefined behavior.

(This is not much of a problem. Most of the time
if you need to use a run time determined
function pointer to a function with optional
arguments you can use a simple wrapper)

- William Hughes

Keith Thompson · Sep 18, 2009

void foo(int x, int y, int z=0) { ... }
void bar(int x, int y, int z=1) { ... }

typedef void (*func_taking_3_ints)(int, int, int);

func_taking_3_ints p1 = rand()%1 ? foo : bar;
(*p1)(0, 0);

What happens? Does the caller have to figure out which function is being
called, and provide the default value for the missing optional argument? Or
do you just make optional arguments mandatory when calling a function through
a pointer? And how would you do that, since even a plain function name
evaluates as a pointer? There's no such thing as a non-pointer function call,
so you'll have to invent one.

Or does the default value figure into the function's type, making it illegal
to assign foo or bar to p1? How could I declare a pointer to a function with
default argument? Could be something like this I suppose:

void (*p1)(int, int, int=0) = foo; /* OK */
p1 = bar; /* Mismatch! */

Off the top of my head, I'd say that function types that differ
only in default arguments should be compatible, so ``p1 = bar;''
is legal, and it's the default arguments in the function type
actually being used in a call that affect the call. Which means
that, given an existing function without default arguments, you
could provide function pointers that provide different defaults.

Or we could just follow the C++ rules, which apparently don't allow
default arguments other than in function declarations.

Jef Driesen · Sep 18, 2009

Richard said:
In each case the size will be the initial block size times a
power of the factor. #3 repeats each work that was done in
computing #2.

You are right that the second loop requires less iterations. But that
was not really the point I wanted to show. Anyway, it's trivial to
rewrite the #3 to take advantage of the previously calculated value.

newsize = (oldsize ? oldsize : init);
while (n > newsize)
newsize *= FACTOR;

Replace "init" with the requested number of bytes "n" to get #2, or a
fixed value (for instance a power of two) to get #3.

In some contexts it makes for cache line misses.

Allocators often use powers of two size blocks. The catch is
that they chip some of that space off for their own use. You may
get a lot more space instead.

I didn't think of that.

It sounds as though you could do with a wrapper between you and
malloc that creates a buffer that looks like this:

{free space | space in use | free space }

An append eats the beginning of the trailing free space and
leaves the buffer pointer alone; a prepend eats the end of the
leading free space and moves the buffer pointer back. You only
need to resize if you run out of free space at one end or the
other.

That's indeed what I'm doing, except that in my implementation there is
only one block of free space, located at either the beginning or the
end. In my application I always need to append or prepend a series of
data packets to the buffer, but not a mixture of appending and
prepending. Thus having a single block of free space (at the right end)
is more space efficient. The free space only needs to move when
switching from appending to prepending (or vice versa), but that does
not happen in my code.

Jef Driesen · Sep 18, 2009

websnarf said:
Indeed, this retains the guarantee that the number of reallocs is O(log
(FACTOR, MAXLENGTH)) and its a finite cost to calculate the size of
the new size itself, so long as you never *shrink* the memory size of
the buffer. This strategy may save your maximum memory usage if your
largest allocation comes before any allocations at least half its
size.

In practice this does just as well as above, its just that a few cases
where you do a few too many multiplies. It also doesn't deal with the
start up case. But its all window dressing. This strategy will tend
to have fewer overall calls to realloc if your memory allocations tend
to all be roughly the same size.

The startup case could be done as follows:

newsize = (oldsize ? oldsize : n);

The data is received from an external device and the entire data stream
is usually transmitted in many smaller packets with a fixed length. Thus
all chunks of data that are added to the buffer have usually exactly the
same size.

Oh -- this performs shrinks, because it doesn't retain the oldsize as
a lower bound. The problem is that depending on your implementation
of realloc() this can either cause your system to thrash (see exercise
below) or it will behave as well as the two good solutions you pose
above. The number of real data moves depends on whether or not the
size oscillates between values that are in different FACTOR power
ranges. The solutions above actually guarantee that the number of
data moves is finite and uses no more than twice (assuming FACTOR = 2)
your memory bandwidth for the largest allocation.

In my implementation it doesn't shrink, because the buffer only supports
appending/prepending data packets. Thus the requested size 'n' is always
larger than the old size (n = oldsize + packet_size).

But I think it is indeed better to use #1 or #2, because in that case
the new size is based on the prior size, not on some arbitrary value.

For instance suppose I want to add chunks of 10 bytes. When doubling the
old size, the size grows as 10, 20, 40 and my 10 byte chunks will always
fit in the free space (if there is free space). But with powers of two
the size will grow as 16, 32, 64 and there is always some space that
can't be used. If my chunks happens to be powers of two, I get powers of
two with both methods anyway.

Between the first two is a matter of personal preferences. You may be
able to see a slight advantage of one over the other depending on the
characteristics of your application, but both are suitable no matter
what. The third strategy is an example of what not to do.

Thanks for the info.

Giorgos Keramidas · Sep 26, 2009

Richard Harter a écrit :

[snip discussion about reallocation strategies]

The real trouble is that the allocator has no way of
knowing that you have a resizable array.

Click to expand...

And why not?

Why do we have a malloc that can't receive a set of flags that
would indicate it what use will be done for the allocated
memory?

If we could tell the malloc/realloc/free system

#define ALLOC_FLAG_RESIZABLE 1 // Can be resized
#define ALLOC_FLAG_STATIC 2 // Never resized
#define ALLOC_FLAG_NOALIGN 4 // No alignment requirement
#define ALLOC_FLAG_FILLZERO 8
#define ALLOC_FLAG_PERMANENT 16 // Never freed

There are specialized allocators that support these and even more
tunables. libumem and the slab allocator of Solaris kernels supports
some of these; the zone allocator of the BSD kernels supports some of
these too; and so on.

Since it is quite feasible to `port' one of the special allocators to
almost any platform, the motivation for modifying malloc() is rather
small and it *does* risk breaking a lot of existing code.

Tim Rentsch · Sep 30, 2009

Possible rules.
Optional arguments have to come last and must have default
values assigned. Optional arguments and "..." cannot be used
in the same function. A function for which you cannot tell
from the prototype how many non optional arguments
there are cannot have optional arguments.
Non-optional arguments must be passed.
Optional arguments are detected by count and any optional
argument that is not passed gets the default value.

Click to expand...

void foo(int x, int y, int z=0) { ... }
void bar(int x, int y, int z=1) { ... }

typedef void (*func_taking_3_ints)(int, int, int);

func_taking_3_ints p1 = rand()%1 ? foo : bar;
(*p1)(0, 0);

What happens? [other questions]

The call through p1 can be implemented easily and still
reasonably cheaply.

Or does the default value figure into the function's type, making it illegal
to assign foo or bar to p1? [also how to declare the types]

The rule that naturally suggests itself, at least to me, is that
if the two sets of parameter types match, then it's allowed to
convert function pointer type F1 to function pointer type F2 as
long as F2 has at least as many non-optional parameters as F1 has
non-optional parameters, and F1 has at least as many parameters
as F2 has parameters. (As above, optional arguments are allowed
only at the right end, and not allowed with '...' functions.) So
for example, if we have

void one_plus_two( int x, int y = 1, int z = 2 ) { ... }

void (*f13)( int, int =, int = );
void (*f12)( int, int = );
void (*f11)( int );

void (*f23)( int, int, int = );
void (*f22)( int, int );

void (*f33)( int, int, int );

then these would all be legal

f13 = one_plus_two;
f12 = one_plus_two;
f11 = one_plus_two;
f23 = one_plus_two;
f22 = one_plus_two;
f33 = one_plus_two;

as would

f12 = f13;
f11 = f13;
f23 = f13;
f22 = f13;
f33 = f13;

f11 = f12;
f22 = f12;

f22 = f23;
f33 = f23;

but none of the other conversions between fxx's would be allowed
(ie, doing one of those conversions would be undefined behavior).

This scheme can be implemented easily and still reasonably
cheaply (and should allow efficient code generation when calling
functions directly rather than through a pointer, but I haven't
thought through any of the details there carefully).

Tim Rentsch · Sep 30, 2009

William Hughes said:
Possible, but your function pointers are going
to get *very* big, especially if you allow default values
to be expressions.

Nonsense. The call through function pointers like
p1 above can be implemented with function pointers
no larger than they are now.

Tim Rentsch · Sep 30, 2009

Keith Thompson said:
Off the top of my head, I'd say that function types that differ
only in default arguments should be compatible, so ``p1 = bar;''
is legal, and it's the default arguments in the function type
actually being used in a call that affect the call. Which means
that, given an existing function without default arguments, you
could provide function pointers that provide different defaults.

That's an interesting idea. But it seems pretty useful to have
a single type that uses the default argument(s) of the actual
function definition (outlined briefly in my response to OP).
Not clear how these two ideas should (or can?) be combined.

reading the source of calling a singly-linked list	4	Feb 12, 2010
type-punning?	15	May 6, 2008
What is "the instance structure must be manually initialized"?	2	Jan 15, 2014
Weird Behavior with Rays in C and OpenGL	4	Feb 13, 2024
Increase memory buffer with realloc or malloc?	7	Sep 7, 2009
Lexical Analysis on C++	1	Oct 31, 2023
unit test strategy	9	Sep 15, 2012
SENTINEL CONTROL LOOP WHEN DEALING WITH TWO ARRAYS	1	Oct 26, 2023

Buffer growing strategy?

websnarf

websnarf

William Hughes

Keith Thompson

Jef Driesen

Jef Driesen

Giorgos Keramidas

Tim Rentsch

Tim Rentsch

Tim Rentsch

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads