boost alternative to realloc

Aaron Graham

I was wondering if there is a boost wrapper or alternative that
essentially provides malloc/realloc/free in a RAII wrapper. Any ideas?
 
Aaron Graham

Why don't you just use std::vector instead of trying to mix C and C++
techniques?

/Leigh

For several reasons:
- std::vector initializes the entire buffer. This is bad, not only
because of the time it takes to initialize the buffer, but also
because it requires the OS to map all pages into physical memory even
if they never end up being used.
- You can't shrink the memory footprint of a std::vector without
copying the data.

I'm aware of std::vector, I love it and I use it all the time, but
it's not appropriate for this situation. It's a video application on
an embedded system, and it's one of those areas of the code that I've
profiled and determined that I need maximum performance with minimal
memory overhead. With a few minor changes, std::vector could fit the
bill. But it doesn't.

This has taken more time than it would have taken to write my own RAII
wrapper, so I guess I'll do that. Thanks anyway.
 
Alf P. Steinbach

* Leigh Johnston:
Yes I know how reserve() works thanks. I meant to say resize() but said
reserve() as I was thinking about malloc/realloc. It is UB to access
the allocated but uninitialized parts of a std::vector.

If I understand you correctly you meant resize + no-construction-op because you
thought just doing a reserve would have to yield UB when accessing the elements.

Happily that's not the case for POD elements. C++03 guaranteed a contiguous
buffer for std::vector, and C++0x will additionally do so for std::string (in
practice you already have also that). Using uninitialized values is UB, as ever,
but you can store data into those elements, and then read from them.

For a std::vector v, v[i] is defined to have the same semantics as
*(v.begin()+i). This can theoretically be troublesome with some pedantically
correct C++ implementation, and if used with i >= v.size() it may well be UB or
implementation defined, I haven't checked (e.g. think range checking). But since
it's a contiguous buffer you can do (&v[0])[i], with [i] here being the built-in
indexing operator, which is well-defined for the complete raw array.


Cheers,

- Alf
 
Alf P. Steinbach

* Leigh Johnston:
That doesn't help as it is UB to access the uninitialized memory so the
memory still needs to be initialized before you can use it hence
std::vector is not suitable for the OP.

See my reply to you else-thread, but in short: that's overly pessimistic.


Cheers & hth.,

- Alf
 
Gert-Jan de Vos

For a std::vector v, v[i] is defined to have the same semantics as
*(v.begin()+i). This can theoretically be troublesome with some pedantically
correct C++ implementation, and if used with i >= v.size() it may well be UB or
implementation defined, I haven't checked (e.g. think range checking). But since
it's a contiguous buffer you can do (&v[0])[i], with [i] here being the built-in
indexing operator, which is well-defined for the complete raw array.


I use such an implementation (MSVC2005 and _SECURE_SCL). resize()
initializes the entire vector<T> with T(). I often need large arrays of
POD, that's why I made a dynamic_array<T> class with an interface very
close to std::vector but not using allocators. I don't think an allocator
can support uninitialized POD allocation like new T[] does. The nice thing
about new T[] is that it does not initialize PODs while it does default
construct true classes.

G-J
 
Alf P. Steinbach

* Leigh Johnston:
Alf P. Steinbach said:
* Leigh Johnston:
On 04/16/2010 11:20 PM, Leigh Johnston wrote:
A combination of reserve() and a custom allocator whose construct is a
no-op might work although std::vector will probably still emit a
construct loop.

That's not how std::vector works.

reserve() (including the implicit reserving which happens when the
vector grows) allocates *uninitialized* memory. Even if you do a
reserve(100000000) (on an empty vector), no constructors will be called
*at all* (and consequently no loop is performed). Only those elements
will be constructed which are actually added to the vector.


Yes I know how reserve() works thanks. I meant to say resize() but
said reserve() as I was thinking about malloc/realloc. It is UB to
access the allocated but uninitialized parts of a std::vector.

If I understand you correctly you meant resize + no-construction-op
because you thought just doing a reserve would have to yield UB when
accessing the elements.

Happily that's not the case for POD elements. C++03 guaranteed a
contiguous buffer for std::vector, and C++0x will additionally do so
for std::string (in practice you already have also that). Using
uninitialized values is UB, as ever, but you can store data into those
elements, and then read from them.

For a std::vector v, v[i] is defined to have the same semantics as
*(v.begin()+i). This can theoretically be troublesome with some
pedantically correct C++ implementation, and if used with i >=
v.size() it may well be UB or implementation defined, I haven't
checked (e.g. think range checking). But since it's a contiguous
buffer you can do (&v[0])[i], with [i] here being the built-in
indexing operator, which is well-defined for the complete raw array.


std::vector<T>::operator[](i) is UB if i >= std::vector<T>::size()


It may be.

If you were thinking of backing me up on that possibility you'd better cite
chapter and verse from the standard instead of plain asserting.

Anyway, it isn't relevant to what you're replying to; did you read it?

- do
not even think about writing such crap code, this is about the correct
use of containers of any element type not just POD so POD is a red
herring

I can't comment on the details of whatever it is you're imagining here; did you
read what you responded to?

(and I believe the only valid operation on an uninitialized POD
is to initialize it via assignment).

Or by other means such as placement new or memcpy, but in the end they all
reduce to machine code level assignments, so with a sufficiently abstract notion
of "assignment" you're right about that.

Did you read what you responded to?


Cheers & hth.,

- Alf
 
Öö Tiib

I was wondering if there is a boost wrapper or alternative that
essentially provides malloc/realloc/free in a RAII wrapper. Any ideas?

Lightest might be to use boost::shared_ptr. Then you achieve RAII with
your malloc'ed pointer since shared_ptr can be set to call custom
deleter in destructor (free on your case).
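
A minimal sketch of that idea, using std::shared_ptr from C++11 in place of boost::shared_ptr (the constructor taking a custom deleter has the same shape in both). The helper name make_malloced_ints is hypothetical.

```cpp
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>

// Hypothetical helper: allocate `count` ints with malloc and hand ownership
// to a shared_ptr whose deleter calls std::free instead of delete.
std::shared_ptr<int> make_malloced_ints(std::size_t count) {
    int* raw = static_cast<int*>(std::malloc(count * sizeof(int)));
    if (!raw) throw std::bad_alloc();
    return std::shared_ptr<int>(raw, [](int* p) { std::free(p); });
}
```

The buffer is freed when the last copy of the shared_ptr goes away. Note that this gives shared rather than unique ownership and adds a reference-count allocation, and it offers no direct way to realloc the block, so it only covers the malloc/free half of the request.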
 
Alf P. Steinbach

* Leigh Johnston:
Alf P. Steinbach said:
If you were thinking of backing me up on that possibility you'd better
cite chapter and verse from the standard instead of plain asserting.

I cannot find an explicit reference to it being UB in the draft
standard, it might be covered by description of iterators, i.e. creating
an iterator past end() or dereferencing end() being UB (it equates
operator[]() semantics to being equivalent to *(a.begin() + n)).

Yes, I think that's where you'd have to look.

From MSDN std::vector reference:

"If the position specified is greater than the size of the container,
the result is undefined."

MSDN needs to be consulted very carefully. It's more of an implementation's
documentation than a language reference.

SGI website asserts that 0 <= n < a.size() is a precondition for a[n].
Violating a precondition would be UB in my book.

Writing such code would still be an extremely retarded thing to do even
if the standard didn't explicitly state that it was UB.

"retarded" is probably too strong. It all depends.

But since you are stating this POV in reply to me I'm wondering how the idea of
using std::vector::operator[] with i > size popped into your mind?

Are you next going to very heatedly warn me about the dangers of dereferencing
null pointers (outside of typeid expressions), if I casually mention that that
might not be a good idea, or e.g. reply to me stating that integer division by
zero is "retarded", if I should ever say that it may be UB, or what?


Cheers & hth.,

- Alf
 
Alf P. Steinbach

* Leigh Johnston:
* Alf P. Steinbach:
"retarded" is probably too strong. It all depends.

But since you are stating this POV in reply to me I'm wondering how
the idea of using std::vector::operator[] with i > size popped into
your mind?

Are you next going to very heatedly warn me about the dangers of
dereferencing null pointers (outside of typeid expressions), if I
casually mention that that might not be a good idea, or e.g. reply to
me stating that integer division by zero is "retarded", if I should
ever say that it may be UB, or what?


You said:

"If I understand you correctly you meant resize + no-construction-op
because you thought just doing a reserve would have to yield UB when
accessing the elements.

Happily that's not the case for POD elements."


Yes. And I added more detailed discussion. See that discussion for practical
details, see below for less informal reasoning.

This is incorrect

Happily that's incorrect.

and retarded.

I'm sorry, that's empty name calling.

Only accessing the first size()
elements of a std::vector is well defined and correct.

Happily that's incorrect, see below.

The fact that
you can access the "raw array" allocated by std::vector does not make it
a correct thing to do

That's right, in the sense that the fact that you can jump up and down doesn't
always make it the correct thing to do.

, even if it works on all known implementations.

I'm sorry, but that's very misleading: it's guaranteed to work with a conforming
compiler. This was much of the point of the C++03 change in wording to require a
contiguous buffer, namely to support direct access of the buffer as a raw array.

The standard could have been more clear, though: with the wording in e.g. the
2008 draft (n2008) the only /explicit/ guarantee for direct buffer access is
that it's well-defined for the range 0 through size-1 (not capacity-1) -- this
is almost a defect in the draft, if it's still that way, since it's less of a
guarantee than the one provided implicitly by the behavior discussed below.

The wording that lets you go on to capacity (i.e. that says that there's only
one buffer, and that that buffer is at &v[0]) is in n2800 §23.2.4.2/5, namely
that (1) after reserve you have at least the requested capacity, (2)
reallocation invalidates all iterators, pointers & references, and (3) that an
insert can only cause reallocation when it would make the size of the vector
exceed the current capacity.

This means that reserve() can't allocate a buffer on the side deferring the
replacement of the current buffer, because such later buffer replacement would
invalidate pointers and references at a time when that isn't allowed.

Hence, after any reallocation you have that >=capacity-size buffer at &v[0].
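
The part of this reasoning that is not in dispute, namely that insertions which stay within the reserved capacity cannot reallocate, can be checked with a small sketch. The function name buffer_is_stable is hypothetical. The sketch only touches elements below size(), so it does not by itself settle the question of writing past size().

```cpp
#include <cstddef>
#include <vector>

// Returns true if &v[0] stays the same while size() grows up to the
// reserved capacity, i.e. no reallocation happened during the push_backs.
bool buffer_is_stable(std::size_t n) {
    std::vector<int> v;
    v.reserve(n);
    v.push_back(0);
    const int* first = &v[0];
    for (std::size_t i = 1; i < n; ++i) {
        v.push_back(static_cast<int>(i));  // size() <= capacity(): must not reallocate
    }
    return first == &v[0] && v.capacity() >= n;
}
```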

In passing, about "less than clear": also §23.2.4.2/5 has a direct defect, at
least in n2800, namely, it refers to "the size specified in the most recent call
to reserve()" when it should say "the capacity established by the most recent
call to reserve". For it's possible to specify a capacity less than the current,
in which case reserve is guaranteed to not reduce the capacity. I.e. the n2800
wording allows insert() to reallocate after a call to reserve(0), which is
unintended and incorrect.

I don't know whether it's wise to mention such defects here, but it might help
you to understand that the standard is an evolving document, not 100% perfect.

Only std::vector is "allowed" to access the unused allocated memory it
owns.

Happily that's also incorrect. But there is a subtle issue once discussed in
great detail, about whether the buffer starting at &v[0] necessarily is
zero-terminated after a .c_str() call; it was Andrei who raised the issue. I
can't recall the conclusion, sorry again.

UB can "silently work" but it is still UB.

That's right.


Cheers, & hth.,

- Alf
 
Alf P. Steinbach

* Leigh Johnston:
* Alf P. Steinbach:
<snip>

Unless I am mistaken you are basically advocating that the following
code is correct according to the standard:

void foo()
{
std::vector<int> v;
v.reserve(2);
v.push_back(41);
*(&v[0]+1) = 42;
}

The above code is plain wrong.

No, it's meaningless, but it's not wrong.

After the reserve(2) you are guaranteed a contiguous buffer of at least 2 ints,
and you're guaranteed that that buffer is the one accessible via &v[0]; see the
posting you replied to for the formal details.

If I understand this correctly what you're reacting to is that you've been
proven wrong.

If you disagree then your position is untenable.

On the contrary, your only argument for your position is that you maintain that
you're right, reasserting that again and again as you've done with other issues,
and as before you disregard and snip all facts and arguments to the contrary.

It's now clear that that is a habit you have.

I already regret wasting time providing you with references and reasoning, only
to have that snipped and replaced, as now usual, with childish assertions.


- Alf
 
Kai-Uwe Bux

Leigh said:
<snip>

Unless I am mistaken you are basically advocating that the following code
is correct according to the standard:

void foo()
{
std::vector<int> v;
v.reserve(2);
v.push_back(41);
*(&v[0]+1) = 42;
}

The above code is plain wrong.

Please define your terms: what do you take "correct" to mean, and what is
"plain wrong"? Does the code have UB according to the standard?
If you disagree then your position is
untenable. Anyone with an ounce of common sense would use std::vector in
the ways it was designed for and not abuse it like you are suggesting.

Keep the context in mind: Alf was pondering this option in the context of
the OP's request: If the alternative is to use malloc(), free(), and
realloc() manually, why not use the above? Or, why not do something like the
above inside the implementation of a little wrapper that, to the outside,
looks like the wrapper for malloc(), free(), and realloc() that the OP was
looking for?


Best

Kai-Uwe Bux
 
Alf P. Steinbach

* Kai-Uwe Bux:
Leigh said:
<snip>

Unless I am mistaken you are basically advocating that the following code
is correct according to the standard:

void foo()
{
std::vector<int> v;
v.reserve(2);
v.push_back(41);
*(&v[0]+1) = 42;
}

The above code is plain wrong.

Please define your terms: what do you take "correct" to mean, and what is
"plain wrong"? Does the code have UB according to the standard?
If you disagree then your position is
untenable. Anyone with an ounce of common sense would use std::vector in
the ways it was designed for and not abuse it like you are suggesting.

Keep the context in mind: Alf was pondering this option in the context of
the OP's request: If the alternative is to use malloc(), free(), and
realloc() manually, why not use the above? Or, why not do something like the
above inside the implementation of a little wrapper that, to the outside,
looks like the wrapper for malloc(), free(), and realloc() that the OP was
looking for?

Well, I was pondering this not in the context of the OP's request but in the
context of Leigh's explanation for his statement

"A combination of reserve() and a custom allocator whose construct is a no-op
might work although std::vector will probably still emit a construct loop."

namely

"I meant to say resize() but said reserve() as I was thinking about
malloc/realloc. It is UB to access the allocated but uninitialized parts of a
std::vector."

There's no UB for writing to and subsequently reading those parts when the
element type is POD. It's just a raw array. And it's guaranteed there.

On the other hand, in the context of the OP's request, I don't think that
there's any guarantee that reserve() won't touch all parts of the buffer by e.g.
zeroing, and that could transform it from virtual address space to actually
allocated memory, but with a quality implementation that would be unlikely.


Cheers,

- Alf
 
Alf P. Steinbach

* Leigh Johnston:
Kai-Uwe Bux said:
Leigh said:
<snip>

Unless I am mistaken you are basically advocating that the following
code is correct according to the standard:

void foo()
{
std::vector<int> v;
v.reserve(2);
v.push_back(41);
*(&v[0]+1) = 42;
}

The above code is plain wrong.

Please define your terms: what do you take "correct" to mean, and what is
"plain wrong"? Does the code have UB according to the standard?
If you disagree then your position is
untenable. Anyone with an ounce of common sense would use std::vector in
the ways it was designed for and not abuse it like you are suggesting.

Keep the context in mind: Alf was pondering this option in the context of
the OP's request: If the alternative is to use malloc(), free(), and
realloc() manually, why not use the above? Or, why not do something like the
above inside the implementation of a little wrapper that, to the outside,
looks like the wrapper for malloc(), free(), and realloc() that the OP was
looking for?


Best

Kai-Uwe Bux

It is wrong in the sense that it is bad practice irrespective of whether
it is UB or not. std::vector is a container, it is not a general
purpose memory allocator. The correct solution is to write a class
designed for the specific requirements the OP had and I am glad to say
that the OP had sufficient common sense to realize this straight away,
common sense that Alf seems to lack.

Leigh, let me remind you that the suggestion of using a std::vector was *yours*,
not mine:

<quote author="Leigh">
Why don't you just use std::vector instead of trying to mix C and C++
techniques?

...

A combination of reserve() and a custom allocator whose construct is a
no-op might work although std::vector will probably still emit a construct
loop.
</quote>

I have only corrected your misguided belief (or perhaps after the fact creative
rationalization) about what's well-defined and undefined behavior in this.

So your criterion for lack of common sense, one who advocates using std::vector
as a solution to the OP's problem, applies to you, not me; see your own
statements quoted above.


Cheers & hth.,

- Alf

PS: Please stop misrepresenting me and others.
 
Kai-Uwe Bux

Alf said:
* Kai-Uwe Bux:
Leigh said:
<snip>

Unless I am mistaken you are basically advocating that the following
code is correct according to the standard:

void foo()
{
std::vector<int> v;
v.reserve(2);
v.push_back(41);
*(&v[0]+1) = 42;
}

The above code is plain wrong.

Please define your terms: what do you take "correct" to mean, and what is
"plain wrong"? Does the code have UB according to the standard?
If you disagree then your position is
untenable. Anyone with an ounce of common sense would use std::vector in
the ways it was designed for and not abuse it like you are suggesting.

Keep the context in mind: Alf was pondering this option in the context of
the OP's request: If the alternative is to use malloc(), free(), and
realloc() manually, why not use the above? Or, why not do something like
the above inside the implementation of a little wrapper that, to the
outside, looks like the wrapper for malloc(), free(), and realloc() that
the OP was looking for?

Well, I was pondering this not in the context of the OP's request but in
the context of Leigh's explanation for his statement

"A combination of reserve() and a custom allocator whose construct is a
no-op might work although std::vector will probably still emit a construct
loop."

namely

"I meant to say resize() but said reserve() as I was thinking about
malloc/realloc. It is UB to access the allocated but uninitialized parts
of a std::vector."

There's no UB for writing to and subsequently reading those parts when the
element type is POD. It's just a raw array. And it's guaranteed there.

Sorry, I realized that soon after I posted and I was about to write a
retraction. But you stepped in before.
On the other hand, in the context of the OP's request, I don't think that
there's any guarantee that reserve() won't touch all parts of the buffer
by e.g. zeroing, and that could transform it from virtual address space to
actually allocated memory, but with a quality implementation that would be
unlikely.

Another thing I wondered about is the guarantees of reallocation. If you
want std::vector<> to emulate realloc(), you might try:

std::vector< int > v;
v.reserve( 100 );
int * array = &v[0];
// realloc:
v.reserve( 200 ); // hm...

Offhand, I don't see a way to make sure that the first 100 elements will be
copied.

So for implementing the wrapper class the OP was looking for, this won't do
either.


Best

Kai-Uwe Bux
 
Alf P. Steinbach

* Leigh Johnston:
You are not considering the fact that UB can be hierarchical, i.e.
something which is UB at one level of abstraction is not UB when you
simply consider its constituent parts at a lower level.

Sorry, I can't see the relevance. But anyway, what you thought was UB is not. I
don't think it's very relevant to the OP's problem, but it is relevant in some
other context, e.g. it's pretty handy when you call an API function where you
need a raw buffer, and this is down at a call level where efficiency matters.

If the standard
does not make it clear that v[v.size()] is UB then I would consider that
a defect (omission).

This is a different UB, perhaps this is what you meant by abstraction level.

We discussed this particular UB & its specification earlier, and I tried to hint
a little, since it's Work to find the details.

But after reviewing that discussion a good place to start in the C++98 standard
might be §24.1/6 and then §24.1.5.

There is no section in the standard (current draft)
for std::vector's element access functions whilst there is for
std::basic_string's - this looks like an omission to me.

In the C++98 standard you find the element access functions specified in
§23.1.1/12 and §23.1.1/13.

But I share your sentiment that that spec should have been explicitly referenced
from each relevant place, e.g. from the std::vector specification, especially
considering that it's a forward reference...

It is, unfortunately, by historical accident (in the evolution TCPL -> TCPPPL ->
std), a spaghetti standard. On the bright side, that complexity ("complexity" is
a euphemism for "spaghetti") allows room for much entertaining discussion. We've
not yet reached the level where all C++ programmers subscribe to the religion of
the Flying Spaghetti Monster, but we do refer to the Holy Standard. ;-)


Cheers & hth.,

- Alf
 
Howard Hinnant

For several reasons:
- std::vector initializes the entire buffer.  This is bad, not only
because of the time it takes to initialize the buffer, but also
because it requires the OS to map all pages into physical memory even
if they never end up being used.
- You can't shrink the memory footprint of a std::vector without
copying the data.

I'm aware of std::vector, I love it and I use it all the time, but
it's not appropriate for this situation. It's a video application on
an embedded system, and it's one of those areas of the code that I've
profiled and determined that I need maximum performance with minimal
memory overhead.  With a few minor changes, std::vector could fit the
bill. But it doesn't.

This has taken more time than it would have taken to write my own RAII
wrapper, so I guess I'll do that.  Thanks anyway.

Good move. So what I'm about to say is probably too late for your
needs. But I'm posting it anyway because it might help the next guy.

The new C++0X std::unique_ptr might be a good tool to use in this
situation. You might use it like:

std::unique_ptr<int[]> v(new int[N]);
v[0] = ...
v[1] = ...
....

If you really need to work with malloc/free, that can be done with:

std::unique_ptr<int[], void(*)(void*)>
v((int*)std::malloc(N*sizeof(int)), std::free);
v[0] = ...
v[1] = ...
....

You need to keep track of N yourself. std::unique_ptr (and N) might
be handy members of that wrapper.

Advantages of unique_ptr:

* It maintains unique ownership of the pointer (enforced at compile
time). It can't be copied (it can be moved in C++0x).
* You still get the handy array access syntax.
* You control the allocation and initialization of the buffer.
* The version using new has an overhead of 1 word. The version using
malloc/free has an overhead of 2 words.
* The malloc version can work with realloc:

v.reset((int*)std::realloc(v.release(), new_size));

Though if realloc fails (returns 0), the above will leak. If that is
a danger that you want to guard against on your platform, an extra
temp should do it:

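std::unique_ptr<int[], void(*)(void*)> tmp((int*)std::realloc(v.get(),
new_size), std::free);
if (!tmp) {
    // deal with out of memory; v still owns the original buffer
} else {
    v.swap(tmp);     // v now owns the reallocated buffer
    tmp.release();   // the old pointer was already consumed by realloc
}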
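
Putting the pieces together, a wrapper that keeps the unique_ptr and N as members might be sketched like this. The names int_buffer and free_deleter are hypothetical. The sketch uses a small function object as the deleter instead of the void(*)(void*) function pointer shown above, which is a variation rather than what the post describes, and it takes sizes as element counts rather than bytes.

```cpp
#include <cstddef>
#include <cstdlib>
#include <memory>

// Hypothetical stateless deleter that releases memory with std::free.
struct free_deleter {
    void operator()(void* p) const { std::free(p); }
};

// Hypothetical wrapper: a malloc'ed int array plus its element count.
class int_buffer {
public:
    explicit int_buffer(std::size_t n)
        : data_(static_cast<int*>(std::malloc(n * sizeof(int)))),
          size_(data_ ? n : 0) {}

    // Resizes via realloc. Returns false and leaves the buffer untouched
    // if realloc fails. n must be nonzero (realloc with size 0 is not portable).
    bool resize(std::size_t n) {
        if (n == 0) return false;
        void* p = std::realloc(data_.get(), n * sizeof(int));
        if (!p) return false;
        (void)data_.release();               // realloc already took over the old pointer
        data_.reset(static_cast<int*>(p));
        size_ = n;
        return true;
    }

    int& operator[](std::size_t i) { return data_[i]; }
    std::size_t size() const { return size_; }

private:
    std::unique_ptr<int[], free_deleter> data_;
    std::size_t size_;
};
```

Calling get() before realloc and release() only after it succeeds is what avoids the leak on failure: if realloc returns null, the unique_ptr still owns the original block.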

You probably don't have std::unique_ptr available in your toolbox at
the moment. I believe there's a boost version that will work fine
today.

-Howard
 
Juha Nieminen

Gert-Jan de Vos said:
I don't think an allocator can support uninitialized POD allocation
like new T[] does.

On the contrary. std::allocator() always returns uninitialized memory
which you then have to explicitly initialize (by either calling
std::allocator::construct() or placement new).
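
A small sketch of that separation between allocation and construction follows. It goes through std::allocator_traits (C++11) rather than calling std::allocator::construct() directly, since that member was deprecated in C++17 and removed in C++20. The function name construct_one_of is hypothetical, and n is assumed to be at least 1.

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Allocates raw storage for `n` strings, constructs only the first one,
// and returns its length. The other n-1 slots are never initialized.
std::size_t construct_one_of(std::size_t n) {
    typedef std::allocator<std::string> alloc_type;
    typedef std::allocator_traits<alloc_type> traits;
    alloc_type alloc;

    std::string* raw = traits::allocate(alloc, n);  // uninitialized storage
    traits::construct(alloc, raw, "hello");         // explicit construction of element 0
    std::size_t len = raw[0].size();
    traits::destroy(alloc, raw);                    // explicit destruction of element 0
    traits::deallocate(alloc, raw, n);
    return len;
}
```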
 
Juha Nieminen

Leigh Johnston said:
That doesn't help as it is UB to access the uninitialized memory so the
memory still needs to be initialized before you can use it hence std::vector
is not suitable for the OP.

Why would you want to access uninitialized memory? As I said: If you want
to add elements to the vector, then do so. Only those elements will be
initialized, not the entire internal buffer which was reserved.

So if you want to be prepared to handle, for example 10000 elements,
then do a reserve(10000). Then start adding elements with push_back()
or whatever. I don't see the problem.
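
To illustrate, here is a sketch that counts constructor calls around a reserve(). The type Counted and the function name are made up for the example, and emplace_back assumes C++11.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical element type that counts every construction.
struct Counted {
    static int constructions;
    int value;
    explicit Counted(int v) : value(v) { ++constructions; }
    Counted(const Counted& other) : value(other.value) { ++constructions; }
};
int Counted::constructions = 0;

// Reserves room for `capacity` elements, adds `used` of them, and returns
// how many Counted objects were constructed in total. Returns -1 if
// reserve() itself constructed anything.
int constructions_after_reserve(std::size_t capacity, int used) {
    Counted::constructions = 0;
    std::vector<Counted> v;
    v.reserve(capacity);                 // allocates storage, constructs nothing
    if (Counted::constructions != 0) return -1;
    for (int i = 0; i < used; ++i) {
        v.emplace_back(i);               // constructs exactly one element in place
    }
    return Counted::constructions;
}
```

With used no larger than capacity there is no reallocation, so the count equals the number of elements actually added.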
 
Gert-Jan de Vos

Gert-Jan de Vos said:
I don't think an allocator can support uninitialized POD allocation
like new T[] does.

  On the contrary. std::allocator() always returns uninitialized memory
which you then have to explicitly initialize (by either calling
std::allocator::construct() or placement new).

Of course you are right. What I meant is that an allocator in combination
with an STL container cannot support uninitialized POD allocation.

new T[] has the nice property of not initializing PODs but default
initializing other objects. If you want this (for efficiency in
allocating very large POD arrays), then you cannot do it by using, e.g.,
vector. Even a custom allocator cannot do it.

When I say "using vector", I mean that you can rely on the normal vector
semantics: size() returns the true size and begin()/end() return the true
range.

An example of when you may need a large uninitialized array is a char
array for use with istream::read(). I regularly need to pass large arrays
to C functions that will fill the array with PODs. Sometimes, vector's
default initialization gets in the way then.
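
The istream::read() case can be sketched as follows, with a plain new char[n] held in a std::unique_ptr (C++11) standing in for the dynamic_array class, which is not shown in this thread. The function name read_block is hypothetical.

```cpp
#include <cstddef>
#include <istream>
#include <memory>
#include <sstream>
#include <string>

// Reads up to `n` bytes from `in` into a freshly allocated, uninitialized
// char array and returns the bytes actually read as a string.
std::string read_block(std::istream& in, std::size_t n) {
    std::unique_ptr<char[]> buf(new char[n]);  // default-initialized: chars are not zeroed
    in.read(buf.get(), static_cast<std::streamsize>(n));
    // gcount() tells how many bytes read() actually stored in the buffer.
    return std::string(buf.get(), static_cast<std::size_t>(in.gcount()));
}
```

Only the bytes that read() reports through gcount() are ever looked at, so the uninitialized tail of the buffer is never read.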
 
