realloc, copy and VM

J

Jean-Claude Arbaut

When you realloc for more size, it may be necessary to allocate a new block,
copy memory and deallocate the old, for example if there is not enough free
space after the original block (maybe the only example ?). But, is it really
necessary to _copy_ bytes ? When in a virtual memory environment (off-topic
here, be I still think the question is worth), you don't access directly
physical memory, but allocated pages, so would it be possible on some
implementations, to just change page allocation ? Some kind of immobile
trip. It could permit optimizations even when not knowing beeforehand the
exact amount of memory needed.
 
G

Gordon Burditt

When you realloc for more size, it may be necessary to allocate a new block,
copy memory and deallocate the old, for example if there is not enough free
space after the original block (maybe the only example ?). But, is it really
necessary to _copy_ bytes ?
When in a virtual memory environment (off-topic
here, be I still think the question is worth), you don't access directly
physical memory, but allocated pages, so would it be possible on some
implementations, to just change page allocation ?

*PAGE* allocation. In a typical virtual memory setup, you can re-arrange
PAGES by re-mapping. You don't get to re-map individual bytes that way.
So if the new memory and the old memory aren't lined up right, you may
be doing copying anyway. And it would seem that the chances of having
them lined up right might be so low that it's not worth checking for.

This presumes that you're not in the situation where your pages are
so small, or you've got so much memory, that malloc() aligns EVERYTHING
to a page boundary, wasting a lot of virtual address space, and possibly
physical memory in the process. Then you could do what you suggest.
Some kind of immobile
trip. It could permit optimizations even when not knowing beeforehand the
exact amount of memory needed.

Gordon L. Burditt
 
J

Jean-Claude Arbaut

*PAGE* allocation. In a typical virtual memory setup, you can re-arrange
PAGES by re-mapping. You don't get to re-map individual bytes that way.
So if the new memory and the old memory aren't lined up right, you may
be doing copying anyway. And it would seem that the chances of having
them lined up right might be so low that it's not worth checking for.

This presumes that you're not in the situation where your pages are
so small, or you've got so much memory, that malloc() aligns EVERYTHING
to a page boundary, wasting a lot of virtual address space, and possibly
physical memory in the process. Then you could do what you suggest.

It would be possible to align on page when allocating large blocks. I'm not
sure if MacOSX does that (pages are 4K). And if the block is not well
aligned, it's still cheaper to copy parts from 1 page (the first), than
copying all the others. There is a little hole after that (less than 1 page)
but, for large blocks, it's much faster.
 
A

Anonymous 7843

When you realloc [...] would it be possible on some
implementations, to just change page allocation ? Some kind of immobile
trip. It could permit optimizations even when not knowing beeforehand the
exact amount of memory needed.

That's a nifty idea!

Actually, I think to some degree it was an old multics trick.
If you had a few precious items that you knew were going to
to grow and shrink drastically, you could put them into
their own individual segment(s) and resize the segments
as needed. You'd be limited to the segment size (often
128KB back then) or at least must have had sufficient sequential
segments set aside. But in any case, you could in effect
resize the object w/o having to move it, and 128KB was a
lot of memory back then.

A secondary point is that 64 bit machines are becoming
more commonplace...some operating systems will let you
skip the brk/sbrk stuff and just put something at
byte 72036854775808 or wherever: just some implementation-
defined place beyond the heap and below the stack.
It wouldn't be too much trouble to make malloc/realloc
put large items, far apart, in 64-bit space.
 
A

Anonymous 7843

When you realloc [...] would it be possible on some
implementations, to just change page allocation ?

It wouldn't be too much trouble to make malloc/realloc
put large items, far apart, in 64-bit space.

Now I remember...I experimented with something like this
using mmap of /dev/zero at a carefully chosen location.

It was on one big old 64-bit honker of a box with...
wait for it...4G of RAM installed.
 
K

Keith Thompson

Jean-Claude Arbaut said:
When you realloc for more size, it may be necessary to allocate a new block,
copy memory and deallocate the old, for example if there is not enough free
space after the original block (maybe the only example ?).
[...]

Others have addressed your main question. There are cases other than
not having enough free space where it could make sense to copy the
block. For example, if the object is between two large chunks of free
space, the implementation might use a realloc() call as an opportunity
to move the object somewhere else and coalesce the free chunks into a
single larger chunk.

For that matter, the C99 standard actually says that realloc()
allocates a new chunk of memory, copies the old to the new, and
deallocates the old. Leaving the object where it is and skipping the
copy is merely an allowed optimization. (The C90 standard meant to
say the same thing, but the wording was poor.)

As a user of realloc(), you should always assume that the old address
may be invalid after a realloc() call (unless the realloc() fails, in
which case it leaves the object alone).
 
J

Jean-Claude Arbaut

Jean-Claude Arbaut said:
When you realloc for more size, it may be necessary to allocate a new block,
copy memory and deallocate the old, for example if there is not enough free
space after the original block (maybe the only example ?).
[...]

Others have addressed your main question. There are cases other than
not having enough free space where it could make sense to copy the
block. For example, if the object is between two large chunks of free
space, the implementation might use a realloc() call as an opportunity
to move the object somewhere else and coalesce the free chunks into a
single larger chunk.

For that matter, the C99 standard actually says that realloc()
allocates a new chunk of memory, copies the old to the new, and
deallocates the old. Leaving the object where it is and skipping the
copy is merely an allowed optimization. (The C90 standard meant to
say the same thing, but the wording was poor.)

As a user of realloc(), you should always assume that the old address
may be invalid after a realloc() call (unless the realloc() fails, in
which case it leaves the object alone).

Thanks. I didn't know, but I would never have used old address without
checking. Crazy, but not too much ;-)
 
M

Michael Wojcik

When you realloc for more size, it may be necessary to allocate a new block,
copy memory and deallocate the old, for example if there is not enough free
space after the original block (maybe the only example ?).

There are other possible cases. Not all allocators simply grab the
next available suitable free chunk. Some group objects into size
classes, for example, and your realloc may change which class the
object fits in.

The allocator scheme is not specified by the standard. It can be
*anything*, as long as the required semantics are preserved.
But, is it really
necessary to _copy_ bytes ? When in a virtual memory environment (off-topic
here, be I still think the question is worth), you don't access directly
physical memory, but allocated pages, so would it be possible on some
implementations, to just change page allocation ?

That *is* extending the original block.

Take the typical case: the implementation uses a linear virtual
address space for all allocated objects, and the contents of C object
pointers are addresses as used by the virtual memory manager. If an
object occupies the page at address n*pagesize, and you attempt to
extend it with realloc, then:

- There may be no page mapped at address (n+1)*pagesize, and the
implementation could request that the VMM map a new page at that
address. Aside from performing its internal housekeeping, the
realloc is done; no copying needs to take place.

- There may be a page mapped at address (n+1)*pagesize. If it's part
of some other object, then obviously the implementation cannot steal
that address for the object being resized, because pointers into the
other object will refer to that portion of the virtual address space.
The realloc'd object will have to be moved.

- Some other case may apply (eg the implementation is unable to
secure additional memory).
Some kind of immobile
trip. It could permit optimizations even when not knowing beeforehand the
exact amount of memory needed.

The malloc implementations I'm familiar with are already capable of
doing this. (Actually, for many of them it happens automatically
in many cases, via lazy allocation and copy-on-write.)

In short: yes, it's a good idea, for implementations where it's
appropriate; but it's one they typically already use.

On another off-topic note: in a large (eg 64-bit) virtual address
space, it's usually possible to space objects sparsely, so that they
nearly always can be extended this way. That's one of the advantages
of a large address space.
 
C

CBFalconer

Michael said:
.... snip ...

Take the typical case: the implementation uses a linear virtual
address space for all allocated objects, and the contents of C object
pointers are addresses as used by the virtual memory manager. If an
object occupies the page at address n*pagesize, and you attempt to
extend it with realloc, then:

- There may be no page mapped at address (n+1)*pagesize, and the
implementation could request that the VMM map a new page at that
address. Aside from performing its internal housekeeping, the
realloc is done; no copying needs to take place.

- There may be a page mapped at address (n+1)*pagesize. If it's part
of some other object, then obviously the implementation cannot steal
that address for the object being resized, because pointers into the
other object will refer to that portion of the virtual address space.
The realloc'd object will have to be moved.

There are a lot of possibilities. My nmalloc for DJGPP, available
on my site, mallocs into a linear virtual memory map and trys to
handle all the various possibilities of adjacent free space. The
objective is to avoid memory copies, which it can do when expanding
into free space above. It also checks for free space below, which
may avoid having to assign a new larger junk (and possibly
failing). At any rate it is a simple model, yet has many possible
cases. Available at:

<http://cbfalconer.home.att.download/nmalloc.zip>

--
"I'm a war president. I make decisions here in the Oval Office
in foreign policy matters with war on my mind." - GWB 2004-2-8
"If I knew then what I know today, I would still have invaded
Iraq. It was the right decision" - G.W. Bush, 2004-08-02
"This notion that the United States is getting ready to attack
Iran is simply ridiculous. And having said that, all options
are on the table." - George W. Bush, Brussels, 2005-02-22
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Similar Threads


Members online

Forum statistics

Threads
473,787
Messages
2,569,629
Members
45,332
Latest member
LeesaButts

Latest Threads

Top