Internal implementation of realloc

K

Kislay

How is realloc implemented internally ? If there is not enough memory
in place to allocate , is new memory allocated somewhere else and the
2 regions linked via a pointer , OR , is the old region copied to a
place where there is enough memory to fit both old and new region ? In
the second case , wouldn't this method be inefficient if the old
region was large ?
 
E

ediebur

Pseudo code

get block size
get space after block
if(blocksize + space < new blocksize)
  adjust block
else
    allocate new block
    copy data
    free old block

You can implement a rather inefficient realloc() with simply a
getblocksize() function. If you can peek into the internals of the malloc(0
system, you can extend the blocks if possible, which avoids the copy.

It has been a long, long time since I looked at the code but I believe
that some sort of realloc is the basic mechanism for buffer managment
in emacs-- I could be wrong
 
R

Richard Tobin

In the second case , wouldn't this method be inefficient if the old
region was large ?
[/QUOTE]
It could be.
It could make expanding an array to N elements,
into an O(N*N) operation.

If you can find any real implementation with this behaviour,
I'll be very surprised.

-- Richard
 
G

Gene

Richard said:
I was thinking of a situation where the array becomes
expanded as the data comes in, ultimately to N elements;
not a situation where realloc is called only once.

That's the situation with the typical use of
Richard Heathfield's fgetline function athttp://www.cpax.org.uk/prg/writings/fgetline.c

That's also the situation with the typical use of my get_line functionhttp://www.mindspring.com/~pfilandr/C/get_line/get_line.c

and also with the evil Dr. Sosman's getline function,http://www.cpax.org.uk/prg/portable/c/libs/sosman/index.php

It makes sense to me that as the array grows,
each subsequent realloc should take proportionally longer.
And that when you have K*N allocations,
the total building of the N element array
becomes an O(N*N) operation.

This is why nearly all code for expanding an array "on demand"
_multiplies_ the size of the array by a factor greater than 1 (2 being
very popular) rather than adding a constant. It isn't hard to prove
that the amortized cost of copying with this method is O(N) for an
array that has grown to size N.
 
R

Richard Tobin

If you can find any real implementation with this behaviour,
I'll be very surprised.
[/QUOTE]
I was thinking of a situation where the array becomes
expanded as the data comes in, ultimately to N elements;
not a situation where realloc is called only once.

So was I.

You would get your N^2 behaviour if every realloc() required copying,
or if a fixed fraction of them did. But a sensible implementation
will not work like that. It will allocate in such a way that there is
more spare space with large allocations.

If the block sizes it uses increase by a constant factor (power-of-two
is a special case of this), then the copying required will be such
that it's still O(N).

-- Richard
 
S

Stephen Sprunk

Eric said:
It's true that realloc() *could* be implemented this way,
but I've never seen such an implementation. Market forces
seem to favor a higher QoI.

That's how I assumed realloc() was normally implemented; logically,
that's what it appears to do...

While probably not what you're thinking of, I've seen a simple realloc()
implementation in a book that _always_ relocated the block. The purpose
was to help diagnose intermittent errors that were caused by callers
forgetting that realloc() _could_ relocate the block because, in some
circumstances, it is a rare occurrence.

S
 
K

Keith Thompson

Eric Sosman said:
You've overlooked a fairly important use of realloc():

char *ptr = malloc(1073741824);
...
char *new = realloc(ptr, 1);

Your implementation uses a gigabyte (or so) to store one byte
of "payload." No others I've seen have been so wasteful.

Malcolm's pseudo code was:

get block size
get space after block
if(blocksize + space < new blocksize)
adjust block
else
allocate new block
copy data
free old block

I assumed the "adjust block" would include shrinking the allocated
space, making the unused portion available.
 
G

Gene

There are other tricks. For instance if memory is paged it might be possible
to remap the page addresses without physically copying data. However that
can't be done in standard C, and not on every platform.

This occurred to me, too.

If you think about it, though, it won't work unless malloc()
cooperates. The expansion takes place in _virtual memory space_ not
physical. I.e. if malloc() hands out blocks that are contiguous in
virtual space, remapping won't help because when you try to grow a
block, malloc() has probably already given up the virtual addresses
immediately above.

Now 64-bit addresses open a new possibility. If every malloc (at
least up up to 4 billion of them) returns an address with the lower
(say) 32-bits zero, then every block can be grown to 4gb by remapping
pages.

With 32-bit addresses, the possibilities are less exciting: e.g. up to
64K of mallocs() that can each be expanded up to 64kb. Not good for
the general case, but perhaps for special ones...
 
C

CBFalconer

Stephen said:
.... snip ...

While probably not what you're thinking of, I've seen a simple
realloc() implementation in a book that _always_ relocated the
block. The purpose was to help diagnose intermittent errors that
were caused by callers forgetting that realloc() _could_ relocate
the block because, in some circumstances, it is a rare occurrence.

To the contrary, because moving the storage implies moving the
data, which may be long and time consuming, some malloc packages go
to special efforts to avoid any moves, and thus modification of the
pointer to the block. You can see the source of such a package at:

<http://cbfalconer.home.att.net/download/nmalloc.zip>

written for the DJGPP system. Very little need be done to adapt it
to most systems.
 
C

Chris Torek

... if memory is paged it might be possible to [implement realloc()
via] remap[ping] the page addresses without physically copying data.

This occurred to me, too.

If you think about it, though, it won't work unless malloc()
cooperates.

Well, certainly; but malloc() and realloc() must always cooperate.
(This is one of the pitfalls of attempting to substitute in a
different malloc(). Even if you handle malloc()+free()+realloc(),
you may not realize that you also had to gimmick the __vmalloc()
and __pagealloc() functions, which are attempting to cooperate with
malloc() and which are called directly from, e.g., the stdio
routines, for I/O-via-page-swapping.)
The expansion takes place in _virtual memory space_ not
physical. I.e. if malloc() hands out blocks that are contiguous in
virtual space, remapping won't help because when you try to grow a
block, malloc() has probably already given up the virtual addresses
immediately above.

This is not a problem, because the caller must write:

original_region = malloc(original_size); /* called O and S below */
... do appropriate work with it ...
new_region = realloc(original, new_size); /* called N and T below */

The realloc() call can obtain any available (and suitable, if there
are restrictions) set of virtual addresses, then ask the OS to move
the physical pages from the old address-space -- the region underlying
the virtual address range in [O..O+S-1] -- to the new range [N..N+T-1]
(with "new" pages added at the end if needed, or "old" pages removed
if T < S; and of course S and T are page-rounded and O and N must be
page-aligned).

If the pages are *moved* (as opposed to simply multiply-mapped),
this also gives you a Feature: subsequent access to original_region
will fail ("segfault" or "bus error" or whatever), which will help
the programmer find any "stale" pointers. This technique is quite
useful for debugging, and hence doing page-moving into "fresh"
virtual space on *every* memory allocation -- even those that can
be done without such a move -- can be valuable. Similarly, one
can unmap the page(s) backing any region that has been free()d.

(I used this trick to find a bug in the 4.3BSD kernel once.)
Now 64-bit addresses open a new possibility. If every malloc (at
least up up to 4 billion of them) returns an address with the lower
(say) 32-bits zero, then every block can be grown to 4gb by remapping
pages.

If you parcel out virtual addresses on 4GB boundaries, you can
expand in-place even *without* page-remapping. (Well, assuming
the physical page size is less than 4GB, at least. :) )
 
C

CBFalconer

Chris said:
Gene said:
Malcolm McLean said:
... if memory is paged it might be possible to [implement realloc()
via] remap[ping] the page addresses without physically copying data.

This occurred to me, too. If you think about it, though, it
won't work unless malloc() cooperates.

Well, certainly; but malloc() and realloc() must always cooperate.
(This is one of the pitfalls of attempting to substitute in a
different malloc(). Even if you handle malloc()+free()+realloc(),
you may not realize that you also had to gimmick the __vmalloc()
and __pagealloc() functions, which are attempting to cooperate with
malloc() and which are called directly from, e.g., the stdio
routines, for I/O-via-page-swapping.)

What __vmalloc and __pagealloc functions? No such thing in std C.
 
C

CBFalconer

Eric said:
CBFalconer said:
Chris said:
... if memory is paged it might be possible to [implement
realloc() via] remap[ping] the page addresses without
physically copying data.

This occurred to me, too. If you think about it, though, it
won't work unless malloc() cooperates.

Well, certainly; but malloc() and realloc() must always cooperate.
(This is one of the pitfalls of attempting to substitute in a
different malloc(). Even if you handle malloc()+free()+realloc(),
you may not realize that you also had to gimmick the __vmalloc()
and __pagealloc() functions, which are attempting to cooperate with
malloc() and which are called directly from, e.g., the stdio
routines, for I/O-via-page-swapping.)

What __vmalloc and __pagealloc functions? No such thing in std C.

Typos: Chris actually meant _vmalloc() and _pagealloc().

I still don't find those names, with or without leading '_', in the
standard.
 
I

Ian Collins

CBFalconer said:
Eric said:
CBFalconer said:
Chris Torek wrote:

... if memory is paged it might be possible to [implement
realloc() via] remap[ping] the page addresses without
physically copying data.
This occurred to me, too. If you think about it, though, it
won't work unless malloc() cooperates.
Well, certainly; but malloc() and realloc() must always cooperate.
(This is one of the pitfalls of attempting to substitute in a
different malloc(). Even if you handle malloc()+free()+realloc(),
you may not realize that you also had to gimmick the __vmalloc()
and __pagealloc() functions, which are attempting to cooperate with
malloc() and which are called directly from, e.g., the stdio
routines, for I/O-via-page-swapping.)
What __vmalloc and __pagealloc functions? No such thing in std C.
Typos: Chris actually meant _vmalloc() and _pagealloc().

I still don't find those names, with or without leading '_', in the
standard.
Isn't it obvious from the context what they do? It's hard to tell if
you are being dense or obtuse.
 
K

Keith Thompson

CBFalconer said:
Eric said:
CBFalconer said:
Chris Torek wrote:

... if memory is paged it might be possible to [implement
realloc() via] remap[ping] the page addresses without
physically copying data.

This occurred to me, too. If you think about it, though, it
won't work unless malloc() cooperates.

Well, certainly; but malloc() and realloc() must always cooperate.
(This is one of the pitfalls of attempting to substitute in a
different malloc(). Even if you handle malloc()+free()+realloc(),
you may not realize that you also had to gimmick the __vmalloc()
and __pagealloc() functions, which are attempting to cooperate with
malloc() and which are called directly from, e.g., the stdio
routines, for I/O-via-page-swapping.)

What __vmalloc and __pagealloc functions? No such thing in std C.

Typos: Chris actually meant _vmalloc() and _pagealloc().

I still don't find those names, with or without leading '_', in the
standard.

I'm 99.99% certain that Chris Torek is aware of that -- and that's
probably an underestimate.

Replacing one or more of the standard library functions causes
undefined behavior. Chris was illustrating one way in which that
undefined behavior can manifest itself as things going badly wrong
rather than as things working as one might naively expect. Think of
_vmalloc() and _pagealloc() as *examples* of things within the
internals of a standard library implementation that can cause problems
when you start mucking around with replacements for standard library
functions.
 
C

CBFalconer

Keith said:
CBFalconer said:
Eric said:
CBFalconer wrote:
Chris Torek wrote:

... if memory is paged it might be possible to [implement
realloc() via] remap[ping] the page addresses without
physically copying data.

This occurred to me, too. If you think about it, though, it
won't work unless malloc() cooperates.

Well, certainly; but malloc() and realloc() must always cooperate.
(This is one of the pitfalls of attempting to substitute in a
different malloc(). Even if you handle malloc()+free()+realloc(),
you may not realize that you also had to gimmick the __vmalloc()
and __pagealloc() functions, which are attempting to cooperate with
malloc() and which are called directly from, e.g., the stdio
routines, for I/O-via-page-swapping.)

What __vmalloc and __pagealloc functions? No such thing in std C.

Typos: Chris actually meant _vmalloc() and _pagealloc().

I still don't find those names, with or without leading '_', in the
standard.

I'm 99.99% certain that Chris Torek is aware of that -- and that's
probably an underestimate.

Replacing one or more of the standard library functions causes
undefined behavior. Chris was illustrating one way in which that
undefined behavior can manifest itself as things going badly wrong
rather than as things working as one might naively expect. Think of
_vmalloc() and _pagealloc() as *examples* of things within the
internals of a standard library implementation that can cause problems
when you start mucking around with replacements for standard library
functions.

If you qualify it like that, fine. There are simpler illustrations
of the dangers of replacing malloc. For example, it may be used in
the initialization code, and prelinked there.
 
R

Richard

Ian Collins said:
CBFalconer said:
Eric said:
CBFalconer wrote:
Chris Torek wrote:

... if memory is paged it might be possible to [implement
realloc() via] remap[ping] the page addresses without
physically copying data.
This occurred to me, too. If you think about it, though, it
won't work unless malloc() cooperates.
Well, certainly; but malloc() and realloc() must always cooperate.
(This is one of the pitfalls of attempting to substitute in a
different malloc(). Even if you handle malloc()+free()+realloc(),
you may not realize that you also had to gimmick the __vmalloc()
and __pagealloc() functions, which are attempting to cooperate with
malloc() and which are called directly from, e.g., the stdio
routines, for I/O-via-page-swapping.)
What __vmalloc and __pagealloc functions? No such thing in std C.
Typos: Chris actually meant _vmalloc() and _pagealloc().

I still don't find those names, with or without leading '_', in the
standard.
Isn't it obvious from the context what they do? It's hard to tell if
you are being dense or obtuse.

Both. It's called being "chuckish".
 
K

Kenny McCormack

Isn't it obvious from the context what they do? It's hard to tell if
you are being dense or obtuse.

Both. It's called being "chuckish".[/QUOTE]

By the way, who is older: CBF or John McCain?
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Similar Threads


Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,579
Members
45,053
Latest member
BrodieSola

Latest Threads

Top