malloc vs. realloc

Pushkar Pradhan · Nov 25, 2003

I've a code in which I don't know how many elements an array may
contain, BUT I know the max. no. of values it may have. So I do this
malloc(MAXLEN), however I can use realloc(...) each time I add a new
element to this array.

Now my question is: will doing realloc everytime slow down the code very
much? My project is concerned about high performance.
Pushkar Pradhan

Sheldon Simms · Nov 25, 2003

I've a code in which I don't know how many elements an array may
contain, BUT I know the max. no. of values it may have. So I do this
malloc(MAXLEN), however I can use realloc(...) each time I add a new
element to this array.

Now my question is: will doing realloc everytime slow down the code very
much?

Yes, but that's not a problem. Don't realloc for every element. Each time
you run out of elements, use realloc to double the size of the array.

nobody · Nov 25, 2003

Sheldon Simms said:
Yes, but that's not a problem. Don't realloc for every element. Each time
you run out of elements, use realloc to double the size of the array.

< not on-topic, but maybe to the point>
But if all elements will have to be in memory soon or later,
why not malloc for all of them at the beginning (possibly
with realloc after initial assumption about size won't hold)?
While later on there may not be enough space to allocate
bigger continuous chunk, it can be on the beginning. If it is
not, why bother to start crunching? All later subsequent
allocations (for other stuff) could be smaller and "fit" into
possibly fragmented memory.
</ not on-topic, but maybe to the point>

Jack Klein · Nov 25, 2003

I've a code in which I don't know how many elements an array may
contain, BUT I know the max. no. of values it may have. So I do this
malloc(MAXLEN), however I can use realloc(...) each time I add a new
element to this array.

Now my question is: will doing realloc everytime slow down the code very
much? My project is concerned about high performance.
Pushkar Pradhan

The C standard does not specify performance of anything. Much depends
on your particular compiler and operating system, which you can either
test or ask about in a group that supports your system.

There is a good chance that at least some reallocation calls will be
expensive in terms of time. Perhaps a better memory allocation
strategy would be in order.

If you really need to shrink the memory block, is it possible that you
can do it after all elements are added, instead of after each one?

If that is not possible, you could start out with an allocation for
some fraction of the maximum size, for example 25%. If you used that
up you could allocate another 25%, and another and another.

Other possibilities are to pick some likely size and always double it
or multiply it by 1.5 each time you need to grow the block.

You should investigate the typical needs of the program, if possible.
The maximum number of elements might be 1000, but perhaps most of the
time it will need 100 or less.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++ ftp://snurse-l.org/pub/acllc-c++/faq

Dan Pop · Nov 25, 2003

In said:
I've a code in which I don't know how many elements an array may
contain, BUT I know the max. no. of values it may have. So I do this
malloc(MAXLEN), however I can use realloc(...) each time I add a new
element to this array.

Now my question is: will doing realloc everytime slow down the code very
much? My project is concerned about high performance.

If you know the maximum number of elements, try to allocate memory for
all of them. When you know the exact number, call realloc to release
the unused memory. This way, the whole job takes one malloc and one
realloc call. This is important if performance is a concern, because
malloc and friends tend to be expensive in terms of CPU cycles.

Dan

Ekkehard Morgenstern · Nov 26, 2003

Sheldon Simms said:
Yes, but that's not a problem. Don't realloc for every element. Each time
you run out of elements, use realloc to double the size of the array.

That's actually very good strategy with which I made good experiences.

If he uses a structure type, he can write a set of functions to handle the
case universally, like

#include <stddef.h>
#include <string.h>
#include <stdlib.h>
#include <limits.h>
typedef struct { void* buf; size_t elsz; size_t size; size_t used; }
dynamic_array_t;
typedef dynamic_array_t* dynamic_array_pointer_t;
dynamic_array_pointer_t create_dynamic_array( size_t elsz, size_t
initial ) {
dynamic_array_pointer_t da = (dynamic_array_pointer_t) malloc(
sizeof(dynamic_array_t) );
if ( da == 0 ) return 0;
da->buf = malloc( elsz * initial ); da->elsz = elsz; da->size =
initial; da->used = 0; return da;
}
void delete_dynamic_array( dynamic_array_pointer da ) { free( da.buf );
free( da ); }
void* address_dynamic_array_element_readonly( dynamic_array_pointer da,
size_t index ) {
if ( index >= da->used ) return 0;
{ char* p = (char*) da->buf; return p+( da->elsz * index ); }
}
void* address_dynamic_array_element_readwrite( dynamic_array_pointer da,
size_t index ) {
if ( index >= da->used ) {
if ( index >= da->size ) {
size_t newsz; void* newbf;
if ( index >= UINT_MAX / da->elsz ) return 0;
if ( da->size >= UINT_MAX / 2U ) newsz = UINT_MAX /
da->elsz; else newsz = da->size * 2U;
if ( index >= newsz ) newsz = index + 1U;
newbf = malloc( da->elsz * newsz ); if ( newbf == 0 ) return
0;
if ( da->used ) memcpy( newbf, da->buf, da->used *
da->elsz );
free( da->buf ); da->buf = newbf; da->size = newsz;
}
da->used = index + 1U;
}
{ char* p = (char*) da->buf; return p+( da->elsz * index ); }
}

Here, the functions create_dynamic_array() and delete_dynamic_array() handle
array creation / destruction, and address_dynamic_array_element_readonly()
and address_dynamic_array_element_readwrite() access the array for reading
and writing and return a pointer to an indexed array element. If during
write indexing, the index goes beyond the array size, it is automatically
resized to either twice the size, or a size including the indexed element.
This way, you can have arbitrary access to the array and can use it for many
kinds of purposes.

Ekkehard Morgenstern · Nov 26, 2003

Ekkehard Morgenstern said:
void delete_dynamic_array( dynamic_array_pointer da ) { free( da.buf );
free( da ); }

of course this should read " free( da->buf ) " not " free( da.buf ); ".

Dan Pop · Nov 26, 2003

In said:
void* address_dynamic_array_element_readonly( dynamic_array_pointer da,
void* address_dynamic_array_element_readwrite( dynamic_array_pointer da,

1 2 3
12345678901234567890123456789012345

Not even C99 guarantees that many significant initial characters in
an external identifier. Apart from being brain dead from both a stylistic
and a practical point of view, such identifiers are not even portable.

6 Any identifiers that differ in a significant character are
different identifiers. If two identifiers differ only in
nonsignificant characters, the behavior is undefined.

Dan

Ekkehard Morgenstern · Nov 26, 2003

Dan Pop said:
da,
1 2 3
12345678901234567890123456789012345

Not even C99 guarantees that many significant initial characters in

Not true. The IEC 9899:1999 (C99) standard states that identifiers are
unlimited in length:

(6.4.2.1.2) "There is no specific limit on the maximum length of an
identifier."

Alex · Nov 27, 2003

Not true. The IEC 9899:1999 (C99) standard states that identifiers are
unlimited in length:

(6.4.2.1.2) "There is no specific limit on the maximum length of an
identifier."

No, but there is a limit on the /significant/ characters
in the identifier. The relevant text is:

5.2.4.1 Translation limits

[#1] The implementation shall be able to translate and
execute at least one program that contains at least one
instance of every one of the following limits:13)

...

-- 63 significant initial characters in an internal
identifier or a macro name (each universal character
name or extended source character is considered a
single character)

-- 31 significant initial characters in an external
identifier (each universal character name specifying a
short identifier of 0000FFFF or less is considered 6
characters, each universal character name specifying a
short identifier of 00010000 or more is considered 10
characters, and each extended source character is
considered the same number of characters as the
corresponding universal character name, if any)14)

Alex

Severian · Nov 27, 2003

Not true. The IEC 9899:1999 (C99) standard states that identifiers are
unlimited in length:

(6.4.2.1.2) "There is no specific limit on the maximum length of an
identifier."

5.2.4.1.1 The implementation shall be able to translate and execute at
least one program that contains at least one instance of every one of
the following limits:
[...]
- 31 significant initial characters in an external identifier...

6.4.2.1.5 As discussed in 5.2.4.1, an implementation may limit the
number of significant initial characters in an identifer; the limit
for an external name [...] may be more restrictive than for an
internal name [...]. The number of significant characters in an
identifer is implementation-defined.

- Sev

CBFalconer · Nov 27, 2003

Ekkehard said:
That's actually very good strategy with which I made good
experiences.

That is often a good strategy, but not always. I use it in
hashlib when expanding the hash table size, but not in ggets when
expanding an interactive (usually) input buffer, when I expand by
a constant increment. I consider the future expectations to be
widely different. Both available on my site below.

Gordon Burditt · Nov 27, 2003

Yes, but that's not a problem. Don't realloc for every element. Each time

That's actually very good strategy with which I made good experiences.

You may wish to have a failure-backoff algorithm in place. It may
be very annoying that the program fails (rather than slowing down)
trying to process 1,200,000 elements because it doesn't have enough
room for 2,048,000 elements. This can get especially annoying on
implementations that, due to other uses of dynamic memory, really
needs to have 3x memory to grow from x to 2x because of fragmentation.

Then again, it may be that if the program is that tight on memory
loading the data, it's going to fail soon anyway doing something
with it, so, depending on the application, it may not matter.

Gordon L. Burditt

glen herrmannsfeldt · Nov 27, 2003

Gordon said:
You may wish to have a failure-backoff algorithm in place. It may
be very annoying that the program fails (rather than slowing down)
trying to process 1,200,000 elements because it doesn't have enough
room for 2,048,000 elements. This can get especially annoying on
implementations that, due to other uses of dynamic memory, really
needs to have 3x memory to grow from x to 2x because of fragmentation.

I usually do exponential growth with a constant less than 2, maybe 5/3
or 3/2. new=(5*old)/3;

A compromise on the amount of excess memory needed, and the time to
reallocate it.

-- glen

Dan Pop · Nov 27, 2003

Not true. The IEC 9899:1999 (C99) standard states that identifiers are
unlimited in length:

(6.4.2.1.2) "There is no specific limit on the maximum length of an
identifier."

You may want to read the standard before starting quoting from it.
You may also want to engage your brain while reading the quotes I have
posted (there was no mention of the maximum length of an identifier
there, was it? and why do you think I've stopped counting before reaching
the end of your identifiers?).

Otherwise, all you can achieve is making a fool of yourself.

Dan

Ekkehard Morgenstern · Nov 28, 2003

Dan Pop said:
You may also want to engage your brain while reading the quotes I have
posted (there was no mention of the maximum length of an identifier
there, was it? and why do you think I've stopped counting before reaching
the end of your identifiers?).

If the significance of identifiers is a problem at link time, I can choose
to use a different linker for which all characters in an identifier are
significant, or modify the source code accordingly.

To limit the identifier length to make code more unreadable is not the
wisest of choices.

Perhaps you should read a programming style guide, or something.

In that particular case, it would be sufficient to move the key words
distinguishing the purpose of the functions to the beginning of the
identifier name, like "access_readwrite_" and "access_readonly_". It would
not be a good idea to make the identifiers short just for the purpose of
gratifying a particular linker.

Otherwise, all you can achieve is making a fool of yourself.

Why, thank you.

Dan Pop · Nov 28, 2003

If the significance of identifiers is a problem at link time, I can choose

Too late, you have already invoked undefined behaviour. That is, assuming
you understand the concept.

to use a different linker for which all characters in an identifier are
significant, or modify the source code accordingly.

To limit the identifier length to make code more unreadable is not the
wisest of choices.

To use identifiers longer than 31 characters is the most idiotic choice.

Perhaps you should read a programming style guide, or something.

Take your own advice. Very long identifiers impair the source code
readability, even if the implementation can handle them. They also
impair the code's portability, since there is no guarantee that another
implementation will support them, too. And if it doesn't, it's not even
require to warn you about the fact.

In that particular case, it would be sufficient to move the key words
distinguishing the purpose of the functions to the beginning of the
identifier name, like "access_readwrite_" and "access_readonly_". It would
not be a good idea to make the identifiers short just for the purpose of
gratifying a particular linker.

You really have no clue.

Dan

Irrwahn Grausewitz · Nov 28, 2003

Ekkehard Morgenstern said:
If the significance of identifiers is a problem at link time, I can choose
to use a different linker for which all characters in an identifier are
significant, or modify the source code accordingly.

Which will unnecessarily limit the portability of your code.

To limit the identifier length to make code more unreadable is not the
wisest of choices.

Trading portability for readability is definitely worse.

Perhaps you should read a programming style guide, or something.

You may want to follow your own advice, too.

In that particular case, it would be sufficient to move the key words
distinguishing the purpose of the functions to the beginning of the
identifier name, like "access_readwrite_" and "access_readonly_".

Then why didn't you do so in the first place?

It would
not be a good idea to make the identifiers short just for the purpose of
gratifying a particular linker.

And instead make them long and protect a vast number of linkers from
being able to process your code correctly? Sorry, but that sounds
odd to me.

It's a Bad Idea[tm] to unnecessarily constrain portability to
implementations where linkers are available that guarantee more
significant characters than required by the standard. Gaining
portability for free is IMO a Good Idea[tm].

Regards

Ekkehard Morgenstern · Nov 28, 2003

Dan Pop said:
You really have no clue.

Well, definitely more than you. I've been programming in C since 1986.
Perhaps you've been playing with lego's then, if you were born at all.

Most C compilers nowadays come in C/C++ compiler packages. A C++ compiler
generates names that are up to 256 characters in size, and the linker can
handle them, of course.

In C++ there's a requirement for identifiers to be significant for all
characters, which usually is 256 max. or unlimited.

Have you ever looked at the mangled identifier names generated by C++
compilers? They're almost all longer than 31 characters. If a C/C++ linker
couldn't handle that, you couldn't use C++ with it.

So whatever archaic C compiler you're using, it's none of our times.

When I began programming in C, identifiers were significant only to 6
characters. I'm glad that such limits don't exist anymore. You're talking
about pink elephants, I know of no linker that can handle only 31 characters
of significance.

Ekkehard Morgenstern · Nov 28, 2003

Irrwahn Grausewitz said:
[copy of Dan Pop's post snipped]

What I said to him also applies to you. No C/C++ package linker has the
limits you talked about. So there's no "undefined behaviour".

C program: memory leak/ segmentation fault/ memory limit exceeded	0	Nov 12, 2022
malloc realloc and pointers	22	Nov 29, 2007
Malloc question	9	Nov 30, 2010
using realloc for a dynamically growing array	14	Jun 27, 2008
realloc causing segfault (under Electric Fence)	3	Aug 10, 2006
Problems with realloc()	7	Jul 22, 2007
malloc and realloc problem	1	May 19, 2008
Realloc and pointer arithmetics	6	Apr 18, 2008

malloc vs. realloc

Pushkar Pradhan

Sheldon Simms

nobody

Jack Klein

Dan Pop

Ekkehard Morgenstern

Ekkehard Morgenstern

Dan Pop

Ekkehard Morgenstern

Alex

Severian

CBFalconer

Gordon Burditt

glen herrmannsfeldt

Dan Pop

Ekkehard Morgenstern

Dan Pop

Irrwahn Grausewitz

Ekkehard Morgenstern

Ekkehard Morgenstern

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads