Calling destructor on fundamental types and other stuff about placement new


Francesco S. Carta

Hi there,
in order to get a better grip on the stuff about "operator new", "new
operator", "placement new" and so forth I went back to the relevant
sections of TC++PL and of the FAQ, then I've implemented a simple Vector
template trying to follow the implementation of std::vector that I found
in my implementation (ehm... OK, you know what I mean despite that
repetition).

So, among other things, I've created allocate(), deallocate(), create()
and destroy() private methods within my template, and everything went
fine, more or less.

The destroy() function consists of just this:

void destroy(value_type* ptr) {
ptr->~value_type();
}

...where value_type is the type parameter of my template (well, a
typedef of it).

I've created an "item" struct and I added the various ctor, cctor,
assignment and dtor to it.

Then I instantiated the template with something like this:

Vector<item> vec(10);

...I fiddled with some other operations (push_back(), reserve(), clear()
etc.) and I ran the program to verify that the various methods of "item",
dtor included, were being called the right number of times and at the
right time.

At some point I tested it with a fundamental type:

Vector<int> vec(10);

...and the template instantiation went fine, the program compiled and
ran as expected.

But then I thought: wait, I am calling the equivalent of "ptr->~int()"
within my destroy(), but if I write it directly, say something like this:

int* pi = new int;
pi->~int();

...then the compiler - as I expected - rejects it.

My conclusion is that the template mechanism is able to understand that
I'm going to do something useless with fundamental types and simply
ignores the "ptr->~value_type()" line when instantiating the template
with "value_type == some fundamental type".

Any comment or further insight on the above?

Sooner or later I'll post here the complete implementation of my Vector
to get some advice (and some corrections, very likely), but in the
meantime I'm most concerned with the basic storage management internals
that I'm pasting below, so please have a look and point out anything
wrong or silly that I could be doing:

value_type* allocate(size_t n) {
return
reinterpret_cast<value_type*> (
new char[sizeof(value_type) * n]
);
}

void deallocate(void* ptr) {
delete[] reinterpret_cast<char*>(ptr);
}

value_type* create(void* ptr) {
return reinterpret_cast<value_type*>(new(ptr) value_type());
}

value_type* create(void* ptr, const value_type& t) {
return reinterpret_cast<value_type*>(new(ptr) value_type(t));
}

void destroy(value_type* ptr) {
ptr->~value_type();
}

Thank you very much for your attention.
 

Johannes Schaub (litb)

Francesco said:
Hi there,
in order to get a better grip on the stuff about "operator new", "new
operator", "placement new" and so forth I went back to the relevant
sections of TC++PL and of the FAQ, then I've implemented a simple Vector
template trying to follow the implementation of std::vector that I found
in my implementation (ehm... OK, you know what I mean despite that
repetition).

So, among other things, I've created allocate(), deallocate(), create()
and destroy() private methods within my template, and everything went
fine, more or less.

The destroy() function consists of just this:

void destroy(value_type* ptr) {
ptr->~value_type();
}

...where value_type is the type parameter of my template (well, a
typedef of it).

I've created an "item" struct and I added the various ctor, cctor,
assignment and dtor to it.

Then I instantiated the template with something like this:

Vector<item> vec(10);

...I fiddled with some other operations (push_back(), reserve(), clear()
etc) and I run the program to verify that the various methods of "item",
dtor included, were being called the right number of times and at the
right time.

At some point I tested it with a fundamental type:

Vector<int> vec(10);

...and the template instantiation went fine, the program compiled and
ran as expected.

But then I thought: wait, I am calling the equivalent of "ptr->~int()"
within my destroy(), but if I write it directly, say something like this:

int* pi = new int;
pi->~int();

...then the compiler - as I expected - rejects it.

My conclusion is that the template mechanism is able to understand that
I'm going to do something useless with fundamental types and simply
ignores the "ptr->~value_type()" line when instantiating the template
with "value_type == some fundamental type".

Any comment or further insight on the above?

The ~int is invalid because "int" is a simple-type-specifier, as opposed to
being a "type-name", syntactically. The following will work:

typedef int Int;
(0).~Int();

Because syntactically, "Int" is a type-name. Likewise, in C++0x this will
work:

(0).~decltype(0)();

Both are invocations of a pseudo-destructor-name. These are defined to be
no-ops and evaluate to "void".
 

Gennaro Prota

Hi there,
in order to get a better grip on the stuff about "operator new", "new
operator", "placement new" and so forth I went back to the relevant
sections of TC++PL and of the FAQ, then I've implemented a simple Vector
template trying to follow the implementation of std::vector that I found
in my implementation (ehm... OK, you know what I mean despite that
repetition).

[...]
At some point I tested it with a fundamental type:

Vector<int> vec(10);

....and the template instantiation went fine, the program compiled and
ran as expected.

But then I thought: wait, I am calling the equivalent of "ptr->~int()"
within my destroy(), but if I write it directly, say something like this:

int* pi = new int;
pi->~int();

....then the compiler - as I expected - rejects it.

My conclusion is that the template mechanism is able to understand that
I'm going to do something useless with fundamental types and simply
ignores the "ptr->~value_type()" line when instantiating the template
with "value_type == some fundamental type".

Well, no, not the template mechanism. It's more like Johannes
said :)

Note the note (!) in [class.dtor]/15, too (who said the C++
standard doesn't have a rationale? It's just "inline" :))
Any comment or further insight on the above?

Sooner or later I'll post here the complete implementation of my Vector
to get some advice (and some corrections, very likely), but in the mean
time I'm most concerned with the basic storage management internals that
I'm pasting here below, so please have a look and point out any wrong or
silly thing I could be doing:

I had a quick look. The abundance of reinterpret_casts is what
jumped out at me most. (Well, duh... they were invented to
stand out. I meant that you don't need them.)
value_type* allocate(size_t n) {
return
reinterpret_cast<value_type*> (
new char[sizeof(value_type) * n]
);
}

I think the best option for this function, if you want to have
it, is to just return a void *, which is in general the way to
warn/inform the reader that he is dealing with (yet) "untyped"
memory:

void *
allocate( size_t n )
{
return new char[ n * sizeof... ] ; // or even just
// operator new( n * sizeof...)
// which I would prefer.
}

And I'd use unsigned char, although in C++ char probably works
too. (I have never dug into whether it really does. IIRC char may
have trap representations in C, and, this being the part I've not
dug into, perhaps it cannot have them in C++. Since unsigned
char is the de facto standard to signal "raw bytes", I just use
unsigned char and move on. In fact, the last time I tried to dig,
I seemed to find several oddities in the standard, so why
risk it?)

Back to your exercise, I'd recommend mentally separating the
places where you can assume that there is an object constructed
in the buffer from the ones where you just have raw memory. And
code the details from there, with some private functions being
in fact the "transitions" between these two kinds of states.

I'll be back on this in a while.
void deallocate(void* ptr) {
delete[] reinterpret_cast<char*>(ptr);
}

I'd call the operator delete[] function. No need to
reinterpret_cast.
value_type* create(void* ptr) {
return reinterpret_cast<value_type*>(new(ptr) value_type());
}

This is a placement new *expression*: you already get a
value_type * (so this is another reinterpret_cast that goes
away). And do you need to return anything from the function?
value_type* create(void* ptr, const value_type& t) {
return reinterpret_cast<value_type*>(new(ptr) value_type(t));
}
Likewise.

void destroy(value_type* ptr) {
ptr->~value_type();
}

For your first experiments it is probably easier to use a
non-heap array, with one object. Just something like a

unsigned char m_buffer[ sizeof( T ) ]

member in your class template (which, given that it contains at
most one object, you'd probably no longer call "vector" :)).
That would get rid of allocate() and deallocate() and perhaps
even show more clearly that you are not doing any "placement
delete" (see [lib.new.delete.placement]). Then, for create() and
destroy() I think I'd just have:

void construct_object( T const & ) ;
void destroy_object() ;

And for their implementation, off the top of my head:

void *
address() // NOTE: depending on what you do, you might need a
// const version of this, too
{
return m_buffer ;
}

template< typename T >
void
...::construct_object( T const & t ) // PRE: first call, or first call
{ // after destroy_object
new ( address() ) T( t ) ;
}

template< typename T > // PRE: an object exists (its lifetime
void // has not ended)
...::destroy_object()
{
T * p( static_cast< T * >( address() ) ) ;
p->T::~T() ;
}

Note the separation I was talking about: before entering
construct_object you have to assume that there's no object in
the buffer (call it twice in a row and you have a problem :)).
At its return though, you can assume that *there is* an object
(leaving the function with an exception doesn't count as a
"return", of course). destroy_object() does the opposite
transition: it must be entered when there is an object (and so
you can static_cast) and at its return you have raw memory.
Note, too, that if address() returned a char *, you couldn't
static_cast it to T * directly.
 

Gennaro Prota

For your first experiments it is probably easier to use a
non-heap array, with one object. Just something like a

unsigned char m_buffer[ sizeof( T ) ]

member in your class template

Ah, of course you'd have to align it. Something like this in the
class template definition should be enough:

union
{
max_aligned dummy ;
unsigned char m_buffer[ sizeof( T ) ] ;
} ;

where, just to be on the safe side, max_aligned is:

class one_class ;
union max_aligned
{
char c ;
short s ;
int i ;
long l ;
float f ;
double d ;
long double ld ;
void * p ;
int ( * pf)() ;
int one_class::* pm ;
int (one_class::* pmf)() ;
} ;

Wow, I didn't have this handy for copy/paste. It was a lot of
typing :)
 

Francesco S. Carta

on 27/08/2010 02:19:34 said:
Francesco S. Carta wrote:


The ~int is invalid because "int" is a simple-type-specifier, as opposed to
being a "type-name", syntactically. The following will work:

typedef int Int;
(0).~Int();

Because syntactically, "Int" is a type-name. Likewise, in C++0x this will
work:

(0).~decltype(0)();

Both are invocations of a pseudo-destructor-name. These are defined to be
no-ops and evaluate to "void".

Thank you very much for the explanation Johannes, when I read your code
examples I was surprised - just looking at them, due to the "(0).", I
thought "/that/ shouldn't compile!", well, I had to try it to convince
myself, it looked too much like some interpreted language's syntax ;-)
 

Francesco S. Carta

Hi there,
in order to get a better grip on the stuff about "operator new", "new
operator", "placement new" and so forth I went back to the relevant
sections of TC++PL and of the FAQ, then I've implemented a simple Vector
template trying to follow the implementation of std::vector that I found
in my implementation (ehm... OK, you know what I mean despite that
repetition).
[...]
At some point I tested it with a fundamental type:

Vector<int> vec(10);

....and the template instantiation went fine, the program compiled and
ran as expected.

But then I thought: wait, I am calling the equivalent of "ptr->~int()"
within my destroy(), but if I write it directly, say something like this:

int* pi = new int;
pi->~int();

....then the compiler - as I expected - rejects it.

My conclusion is that the template mechanism is able to understand that
I'm going to do something useless with fundamental types and simply
ignores the "ptr->~value_type()" line when instantiating the template
with "value_type == some fundamental type".

Well, no, not the template mechanism. It's more like Johannes
said :)

Note the note (!) in [class.dtor]/15, too (who said the C++
standard doesn't have a rationale; it just "inline" :))

I see now, I shall force myself to refer to the standard before asking
this kind of question here :-(
Any comment or further insight on the above?

Sooner or later I'll post here the complete implementation of my Vector
to get some advice (and some corrections, very likely), but in the mean
time I'm most concerned with the basic storage management internals that
I'm pasting here below, so please have a look and point out any wrong or
silly thing I could be doing:

I had a quick look. The abundance of reinterpret_casts is what
jumped most to my eye. (Well, duh... they have been invented to
stand out. I meant that you don't need them.)
value_type* allocate(size_t n) {
return
reinterpret_cast<value_type*> (
new char[sizeof(value_type) * n]
);
}

I think the best option for this function, if you want to have
it, is to just return a void *, which is in general the way to
warn/inform the reader that he is dealing with (yet) "untyped"
memory:

I forgot to make it explicit that those functions are private members of
my Vector template - I also forgot to make them static, but I'll fix it,
although it shouldn't change much. The purpose of this exercise is to
create a Vector as self-contained as possible.

By centralizing the conversion from void* to value_type* I can simply write:

start = allocate(n);

...in the implementation of other methods such as reserve(), where
"start" is a private value_type* data member of Vector, pointing to the
beginning of the allocated space.
void *
allocate( size_t n )
{
return new char[ n * sizeof... ] ; // or even just
// operator new( n * sizeof...)
// which I would prefer.
}

And I'd use unsigned char, although in C++ char probably works
too (I have never dug into whether it really does. IIRC char may
have trap representations in C. And --this is the part I've not
dug into-- it perhaps cannot have them in C++. Since unsigned
char is the de facto standard to signal "raw bytes", I just use
unsigned char and go on. In fact, the last time I tried to dig,
I seemed to find several oddities in the standard, so why
risking.)

I'm not so sure I really need to switch to unsigned char: the standard
gives explicit examples with plain char for this technique, so I think
it's required to work in all cases.
Back to your exercise, I'd recommend to mentally separate the
places where you can assume that there is an object constructed
in the buffer from the ones where you just have raw memory. And
code the details from there, with some private functions being
in fact the "transitions" between these two kinds of states.

I'll be back on this in a while.
void deallocate(void* ptr) {
delete[] reinterpret_cast<char*>(ptr);
}

I'd call the operator delete[] function. No need to
reinterpret_cast.

Uh... actually, before adding that cast, I wrote:

delete[] ptr;

...and the compiler warned about "deleting void* is undefined"... it did
not cross my mind that I could explicitly call operator delete[] as you
suggest (it didn't just because I didn't know that!).

Now my code reads:

operator delete[] (ptr);

...in that line. I wonder why the compiler doesn't resolve "delete[]
ptr" to "operator delete[] (ptr)" and get rid of the warning... I need
more study to understand the issue, any further insight will be more
than welcome.
This is a placement new *expression*: you already get a
value_type * (so this is another reinterpret_cast that goes
away). And do you need to return anything from the function?


Likewise.

Eh, of course I don't need to cast them... silly me... but thanks for
pointing it out.

About the return value, I used it for a check at the calling place,
something like this:

assert(ptr == create(ptr));

...because at some point I thought that the new expression could
actually mangle the address to align it properly... now I'm almost sure
that such a thing could never happen.
void destroy(value_type* ptr) {
ptr->~value_type();
}

For your first experiments it is probably easier to use a
non-heap array, with one object. Just something like a

unsigned char m_buffer[ sizeof( T ) ]

member in your class template (which, given that it contains at
most one object, you'd probably no longer call "vector" :)).
That would get rid of allocate() and deallocate() and perhaps
even show more clearly that you are not doing any "placement
delete" (see [lib.new.delete.placement]). Then, for create() and
destroy() I think I'd just have:

void construct_object( T const& ) ;
void destroy_object() ;

And for their implementation, off the top of my head:

void *
address() // NOTE: depending on what you do, you might need a
// const version of this, too
{
return m_buffer ;
}

template< typename T>
void
...::construct_object( T const& t ) // PRE: first call, or first call
{ // after destroy_object
new ( address() ) T( t ) ;
}

template< typename T> // PRE: an object exists (its lifetime
void // has not ended)
...::destroy_object()
{
T * p( static_cast< T *>( address() ) ) ;
p->T::~T() ;
}

Note the separation I was talking about: before entering
construct_object you have to assume that there's no object in
the buffer (call it twice in a row and you have a problem :)).
At its return though, you can assume that *there is* an object
(leaving the function with an exception doesn't count as a
"return", of course). destroy_object() does the opposite
transition: it must be entered when there is an object (and so
you can static_cast) and at its return you have raw memory.
Note, too, that if address() returned a char *, you couldn't
static_cast it to T * directly.

The Vector implementation is already advanced enough to need dynamic
management of the storage: I already have working reserve(),
push_back() and clear(), and now I'm working on insert() and erase(). I
think I will stop when I reach begin() and end(), without implementing
reverse_iterator. More about this (and about its rationale) in a
further post, where I'll show my complete implementation for the public
delight && dissection - assuming short-circuit behavior at that "logical
and" ;-)

Thanks a lot for your notes Gennaro.
 

Francesco S. Carta

For your first experiments it is probably easier to use a
non-heap array, with one object. Just something like a

unsigned char m_buffer[ sizeof( T ) ]

member in your class template

Ah, of course you'd have to align it. Something like this in the
class template definition should be enough:

union
{
max_aligned dummy ;
unsigned char m_buffer[ sizeof( T ) ] ;
} ;

where, just to be on the safe side, max_aligned is:

class one_class ;
union max_aligned
{
char c ;
short s ;
int i ;
long l ;
float f ;
double d ;
long double ld ;
void * p ;
int ( * pf)() ;
int one_class::* pm ;
int (one_class::* pmf)() ;
} ;

Wow, I didn't have this handy for copy/paste. It was a lot of
typing :)

Luckily I don't need to worry about alignment, as the new expression is
handling it for me, but your note completes the digression you made in
the previous post - and actually comes in handy for ensuring correct
alignment of non-dynamic buffers in some other project. Thanks for
pointing out this technique, Gennaro.
 

Gennaro Prota

Hi there,
in order to get a better grip on the stuff about "operator new", "new
operator", "placement new" and so forth I went back to the relevant
sections of TC++PL and of the FAQ, then I've implemented a simple Vector
template trying to follow the implementation of std::vector that I found
in my implementation (ehm... OK, you know what I mean despite that
repetition).
[...]
At some point I tested it with a fundamental type:

Vector<int> vec(10);

....and the template instantiation went fine, the program compiled and
ran as expected.

But then I thought: wait, I am calling the equivalent of "ptr->~int()"
within my destroy(), but if I write it directly, say something like this:

int* pi = new int;
pi->~int();

....then the compiler - as I expected - rejects it.

My conclusion is that the template mechanism is able to understand that
I'm going to do something useless with fundamental types and simply
ignores the "ptr->~value_type()" line when instantiating the template
with "value_type == some fundamental type".

Well, no, not the template mechanism. It's more like Johannes
said :)

Note the note (!) in [class.dtor]/15, too (who said the C++
standard doesn't have a rationale; it just "inline" :))

I see now, I shall force myself to refer to the standard before asking
such kind of questions here :-(
Any comment or further insight on the above?

Sooner or later I'll post here the complete implementation of my Vector
to get some advice (and some corrections, very likely), but in the mean
time I'm most concerned with the basic storage management internals that
I'm pasting here below, so please have a look and point out any wrong or
silly thing I could be doing:

I had a quick look. The abundance of reinterpret_casts is what
jumped most to my eye. (Well, duh... they have been invented to
stand out. I meant that you don't need them.)
value_type* allocate(size_t n) {
return
reinterpret_cast<value_type*> (
new char[sizeof(value_type) * n]
);
}

I think the best option for this function, if you want to have
it, is to just return a void *, which is in general the way to
warn/inform the reader that he is dealing with (yet) "untyped"
memory:

I forgot to make it explicit that those functions are private members of
my Vector template

I understood that. And that this is a throw-away experiment. It
was more of a general point. Or rather, two general points. One
point is that there's a reader of the private parts too. The
reader may be you or another person, now or (for things that you
don't throw away) five years from now, but regardless of who the
person is, the more you can do to communicate the better. The
other point is that the more you let the compiler help with the
type system the better. A T * that doesn't actually point to a T
(because you haven't constructed it yet) is usable in limited
ways but the compiler won't help you with limiting those ways.
- I also forgot to make them static, but I'll fix it,
although it shouldn't change much. The purpose of this exercise is to
create a Vector as self-contained as possible.

By centralizing the conversion from void* to value_type* I can simply write:

start = allocate(n);

....in the implementation of other methods such as reserve(), where
"start" is a private value_type* data member of Vector, pointing to the
beginning of the allocated space.

Then, at least, do the conversion with static_cast. You are
using reinterpret_cast because you have a char *, but if you
call an operator new function you can use static_cast.
void *
allocate( size_t n )
{
return new char[ n * sizeof... ] ; // or even just
// operator new( n * sizeof...)
// which I would prefer.
}

And I'd use unsigned char, although in C++ char probably works
too (I have never dug into whether it really does. IIRC char may
have trap representations in C. And --this is the part I've not
dug into-- it perhaps cannot have them in C++. Since unsigned
char is the de facto standard to signal "raw bytes", I just use
unsigned char and go on. In fact, the last time I tried to dig,
I seemed to find several oddities in the standard, so why
risking.)

I'm not so sure I really need to switch to unsigned char, the standard
makes explicit examples with "raw" char for this technique, I think it's
required to work in all cases.

You probably don't "need" to switch, as I said.
Back to your exercise, I'd recommend to mentally separate the
places where you can assume that there is an object constructed
in the buffer from the ones where you just have raw memory. And
code the details from there, with some private functions being
in fact the "transitions" between these two kinds of states.

I'll be back on this in a while.
void deallocate(void* ptr) {
delete[] reinterpret_cast<char*>(ptr);
}

I'd call the operator delete[] function. No need to
reinterpret_cast.

Uh... actually, before adding that cast, I wrote:

delete[] ptr;

....and the compiler warned about "deleting void* is undefined"... it did
not cross my mind that I could explicitly call operator delete[] as you
suggest (it didn't just because I didn't know that!).

Now my code reads:

operator delete[] (ptr);

....in that line.

OK. I just forgot that I had given a "non-matching"
recommendation for the allocate() function though.

To match:

void *
allocate( size_t n )
{
return operator new[]( n * sizeof... ) ;
}
I wonder why the compiler doesn't resolve "delete[]
ptr" to "operator delete[] (ptr)" and get rid of the warning... I need
more study to understand the issue, any further insight will be more
than welcome.

Oh, I hadn't realized that you had tried the delete expression on a
void pointer.

When the compiler sees your

delete[] ptr ;

it doesn't know what you really wanted to delete, and warns. So,
if you wanted to destroy objects (calling destructors) you'll
fix the type of ptr; if you just wanted to release storage
you'll call an operator delete function, directly.

It would really be a bad thing if the compiler did the sort of
dubious transformations that you suggest.
Eh, of course I don't need to cast them... silly me... but thanks for
pointing it out.

About the return value, I used it for a check at the calling place,
something like this:

assert(ptr == create(ptr));

....because at some point I thought that the new expression could
actually mangle the address to align it properly... now I almost sure
that such a thing could never happen.

You mean a difference between the address returned by the
operator new function and the address yielded by the new
expression?

In general there may be a difference for the array forms. Note
that if you do

operator new( n * sizeof( T ) )

you are not using an array form.

When you use array new through a new expression:

p = new T[ n ] ;

the compiler will often require something more than n * sizeof(
T ) and the additional space will be used for runtime
bookkeeping.

This difference may vary from one allocation to another (and be
zero for some of them --or all of them).

For arrays of char or unsigned char the difference is
constrained by the requirement in [expr.new]/10 (C++03), so that
to place your T's, you can use a new expression

new unsigned char[ ... ]

as an alternative to the obvious

operator new[](...)

(This must be either a case where they didn't want to introduce
a subtle difference, or where they have noticed that a lot of
code used the former. I'll call my favorite C++ historian, here.
James? :))

The fact that requesting additional space is allowed only for
the array forms is spelled out in the already cited
[expr.new]/10:

A new-expression passes the amount of space requested to the
allocation function as the first argument of type std::size_t.
That argument shall be no less than the size of the object
being created; it may be greater than the size of the object
being created only if the object is an array.

This is confirmed by two notes: the one in bullet 14 and note
211 in clause 18.
void destroy(value_type* ptr) {
ptr->~value_type();
}

For your first experiments it is probably easier to use a
non-heap array, with one object. Just something like a

unsigned char m_buffer[ sizeof( T ) ]

member in your class template (which, given that it contains at
most one object, you'd probably no longer call "vector" :)).
That would get rid of allocate() and deallocate() and perhaps
even show more clearly that you are not doing any "placement
delete" (see [lib.new.delete.placement]). Then, for create() and
destroy() I think I'd just have:

void construct_object( T const& ) ;
void destroy_object() ;

And for their implementation, off the top of my head:

void *
address() // NOTE: depending on what you do, you might need a
// const version of this, too
{
return m_buffer ;
}

template< typename T>
void
...::construct_object( T const& t ) // PRE: first call, or first call
{ // after destroy_object
new ( address() ) T( t ) ;
}

template< typename T> // PRE: an object exists (its lifetime
void // has not ended)
...::destroy_object()
{
T * p( static_cast< T *>( address() ) ) ;
p->T::~T() ;
}

Note the separation I was talking about: before entering
construct_object you have to assume that there's no object in
the buffer (call it twice in a row and you have a problem :)).
At its return though, you can assume that *there is* an object
(leaving the function with an exception doesn't count as a
"return", of course). destroy_object() does the opposite
transition: it must be entered when there is an object (and so
you can static_cast) and at its return you have raw memory.
Note, too, that if address() returned a char *, you couldn't
static_cast it to T * directly.

The Vector implementation is already advanced enough to need dynamic
management of the storage, as I already have working reserve(),
push_back() and clear(), now I'm working on insert() and erase(), but I
think I will stop when I'll reach begin() and end() without implementing
reverse_iterator - more about this (and about its rationale) in a
further post, where I'll show my complete implementation

Well, consider that it may be difficult to get good commenting
(or even any commenting) if the amount of code is high.
for the public
delight && dissection - assuming short-circuit behavior at that "logical
and" ;-)

Dissection of who? :p
Thanks a lot for your notes Gennaro.

No problem. I'm over the amount of time that I can devote to
Usenet. Otherwise I'd have commented more.
 

Francesco S. Carta

On 27/08/2010 1.43, Francesco S. Carta wrote:
Hi there,
in order to get a better grip on the stuff about "operator new", "new
operator", "placement new" and so forth I went back to the relevant
sections of TC++PL and of the FAQ, then I've implemented a simple Vector
template trying to follow the implementation of std::vector that I found
in my implementation (ehm... OK, you know what I mean despite that
repetition).

[...]
At some point I tested it with a fundamental type:

Vector<int> vec(10);

....and the template instantiation went fine, the program compiled and
ran as expected.

But then I thought: wait, I am calling the equivalent of "ptr->~int()"
within my destroy(), but if I write it directly, say something like this:

int* pi = new int;
pi->~int();

....then the compiler - as I expected - rejects it.

My conclusion is that the template mechanism is able to understand that
I'm going to do something useless with fundamental types and simply
ignores the "ptr->~value_type()" line when instantiating the template
with "value_type == some fundamental type".

Well, no, not the template mechanism. It's more like Johannes
said :)

Note the note (!) in [class.dtor]/15, too (who said the C++
standard doesn't have a rationale; it just "inline" :))

I see now, I shall force myself to refer to the standard before asking
such kind of questions here :-(
Any comment or further insight on the above?

Sooner or later I'll post here the complete implementation of my Vector
to get some advice (and some corrections, very likely), but in the mean
time I'm most concerned with the basic storage management internals that
I'm pasting here below, so please have a look and point out any wrong or
silly thing I could be doing:

I had a quick look. The abundance of reinterpret_casts is what
jumped most to my eye. (Well, duh... they have been invented to
stand out. I meant that you don't need them.)

value_type* allocate(size_t n) {
    return reinterpret_cast<value_type*>(
        new char[sizeof(value_type) * n]
    );
}

I think the best option for this function, if you want to have
it, is to just return a void *, which is in general the way to
warn/inform the reader that he is dealing with (yet) "untyped"
memory:

I forgot to make it explicit that those functions are private members of
my Vector template

I understood that. And that this is a throw-away experiment. It
was more of a general point. Or rather, two general points. One
point is that there's a reader of the private parts too. The
reader may be you or another person, now or (for things that you
don't throw away) five years from now, but regardless of who the
person is, the more you can do to communicate the better. The
other point is that the more you let the compiler help with the
type system the better. A T * that doesn't actually point to a T
(because you haven't constructed it yet) is usable in limited
ways but the compiler won't help you with limiting those ways.
- I also forgot to make them static, but I'll fix it,
although it shouldn't change much. The purpose of this exercise is to
create a Vector as self-contained as possible.

By centralizing the conversion from void* to value_type* I can simply write:

start = allocate(n);

...in the implementation of other methods such as reserve(), where
"start" is a private value_type* data member of Vector, pointing to the
beginning of the allocated space.

Then, at least, do the conversion with static_cast. You are
using reinterpret_cast because you have a char *, but if you
call an operator new function you can use static_cast.

OK, I think I'm beginning to understand what you're telling me, but I'll
post my further questions as a separate discussion - this one is heavily
messed up due to my poor understanding of the matter.
void *
allocate( size_t n )
{
    return new char[ n * sizeof... ] ; // or even just
                                       // operator new( n * sizeof...)
                                       // which I would prefer.
}

And I'd use unsigned char, although in C++ char probably works
too. (I have never dug into whether it really does. IIRC char may
have trap representations in C, and it perhaps cannot have them
in C++; that is the part I haven't dug into. Since unsigned
char is the de facto standard to signal "raw bytes", I just use
unsigned char and move on. In fact, the last time I tried to dig,
I seemed to find several oddities in the standard, so why
risk it?)

I'm not so sure I really need to switch to unsigned char, the standard
makes explicit examples with "raw" char for this technique, I think it's
required to work in all cases.

You probably don't "need" to switch, as I said.

Sorry, I badly expressed myself: as I put it, it seemed that you were
telling me I "needed" to switch, which in fact you weren't.
Back to your exercise, I'd recommend to mentally separate the
places where you can assume that there is an object constructed
in the buffer from the ones where you just have raw memory. And
code the details from there, with some private functions being
in fact the "transitions" between these two kinds of states.

I'll be back on this in a while.

void deallocate(void* ptr) {
    delete[] reinterpret_cast<char*>(ptr);
}

I'd call the operator delete[] function. No need to
reinterpret_cast.

Uh... actually, before adding that cast, I wrote:

delete[] ptr;

...and the compiler warned about "deleting void* is undefined"... it did
not cross my mind that I could explicitly call operator delete[] as you
suggest (it didn't just because I didn't know that!).

Now my code reads:

operator delete[] (ptr);

...in that line.

OK. I just forgot that I had given a "non-matching"
recommendation for the allocate() function though.

To match:

void *
allocate( size_t n )
{
    return operator new[]( n * sizeof... ) ;
}
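A minimal sketch of the matched pair, written as free functions rather than as the private members of the original Vector. The Item struct and demo_storage are hypothetical, added only to exercise the pair. The point is that operator new[] hands back a void *, so the conversion to the element type needs nothing stronger than static_cast, and the storage is released with the matching operator delete[].

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical free-function versions of allocate()/deallocate().
template <typename T>
void* allocate(std::size_t n) {
    return operator new[](n * sizeof(T));   // raw storage only, no constructors run
}

inline void deallocate(void* ptr) {
    operator delete[](ptr);                 // releases storage only, no destructors run
}

// Hypothetical element type for the demonstration.
struct Item {
    int value;
    explicit Item(int v) : value(v) {}
};

// Builds two Items in raw storage, sums them, then tears everything down.
int demo_storage() {
    void* raw = allocate<Item>(2);
    Item* start = static_cast<Item*>(raw);  // void* to Item*: static_cast is enough

    new (start) Item(7);                    // construct each element in place
    new (start + 1) Item(8);
    int sum = start[0].value + start[1].value;

    start[1].~Item();                       // destroy in reverse order of construction
    start[0].~Item();
    deallocate(raw);
    return sum;
}
```

Keeping the void * until the moment an object is actually constructed is the "untyped memory" signal discussed earlier in the thread.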
I wonder why the compiler doesn't resolve "delete[]
ptr" to "operator delete[] (ptr)" and get rid of the warning... I need
more study to understand the issue, any further insight will be more
than welcome.

Oh, I hadn't sensed that you tried the delete expression on a
void pointer.

When the compiler sees your

delete[] ptr ;

it doesn't know what you really wanted to delete, and warns. So,
if you wanted to destroy objects (calling destructors) you'll
fix the type of ptr; if you just wanted to release storage
you'll call an operator delete function, directly.

It would really be a bad thing if the compiler did the sort of
dubious transformations that you suggest.
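To make the difference concrete, here is a small sketch (Tracked and demo_delete_forms are hypothetical names, not from the thread). A delete expression on a typed pointer destroys the object and then releases the storage; a direct call to the operator delete function only releases storage, so the destructor has to be called by hand beforehand.

```cpp
#include <cassert>
#include <new>

// Hypothetical type that counts how many times its destructor runs.
struct Tracked {
    static int dtor_calls;
    ~Tracked() { ++dtor_calls; }
};
int Tracked::dtor_calls = 0;

// Returns the total number of destructor calls observed.
int demo_delete_forms() {
    Tracked::dtor_calls = 0;

    // delete expression on a typed pointer: destroys, then releases.
    Tracked* a = new Tracked;
    delete a;
    assert(Tracked::dtor_calls == 1);

    // Raw storage plus placement new: the two steps are done by hand.
    void* raw = operator new(sizeof(Tracked));
    Tracked* b = new (raw) Tracked;
    b->~Tracked();                       // destroy the object explicitly
    assert(Tracked::dtor_calls == 2);
    operator delete(raw);                // release storage only
    assert(Tracked::dtor_calls == 2);    // no further destructor call happened

    return Tracked::dtor_calls;
}
```

With a void * the compiler has no type from which to pick a destructor, which is why the delete expression draws a warning there.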

Sorry, once more, my poor understanding of the matter made the whole
discussion cloudy, I didn't catch the difference between calling a plain
"delete" and calling "operator delete". I know, the FAQ explains it, but
one can read the explanation again and again and still fail to catch the
point.
You mean a difference between the address returned by the
operator new function and the address yielded by the new
expression?

If I got your question right, the answer is negative.

What I meant to say is that I was afraid that the following assert could
fail:

value_type* place = start;
value_type* result = new(place) value_type;
assert(place == result);

But reading again the requirements for placement new I see that (among
other things) I'm expected to pass a pointer properly aligned for
value_type - by inference, that should mean that placement new will not
mangle the address and the assert should never fail.
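A quick sketch of that check for the non-array form (Value and demo_placement_address are hypothetical names). The storage comes from operator new, which returns memory suitably aligned for Value, and the library's non-allocating placement form returns the pointer it was given.

```cpp
#include <cassert>
#include <new>

// Hypothetical element type.
struct Value {
    double d;
    Value() : d(1.5) {}
};

// Returns true if placement new yielded the address it was handed.
bool demo_placement_address() {
    void* raw = operator new(sizeof(Value));   // suitably aligned for Value
    Value* place = static_cast<Value*>(raw);

    Value* result = new (place) Value;         // single-object placement new
    bool same = (static_cast<void*>(result) == raw);

    result->~Value();
    operator delete(raw);
    return same;
}
```

This covers only the single-object form; the array forms are a separate matter.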
In general there may be a difference for the array forms. Note
that if you do

operator new( n * sizeof( T ) )

you are not using an array form.

When you use array new through a new expression:

p = new T[ n ] ;

the compiler will often require something more than n * sizeof(
T ) and the additional space will be used for runtime
bookkeeping.

This difference may vary from one allocation to another (and be
zero for some of them --or all of them).

For arrays of char or unsigned char the difference is
constrained by the requirement in [expr.new]/10 (C++03), so that
to place your T's, you can use a new expression

new unsigned char[ ... ]

as an alternative to the obvious

operator new[](...)

(This must be either a case where they didn't want to introduce
a subtle difference, or where they have noticed that a lot of
code used the former. I'll call my favorite C++ historian, here.
James? :))

The fact that requesting additional space is allowed only for
the array forms is spelled out in the already cited
[expr.new]/10:

A new-expression passes the amount of space requested to the
allocation function as the first argument of type std::size_t.
That argument shall be no less than the size of the object
being created; it may be greater than the size of the object
being created only if the object is an array.

This is confirmed by two notes: the one in bullet 14 and note
211 in clause 18.
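One way to observe the extra request, sketched here with a hypothetical Widget class that supplies its own operator new[] just to record the size it is asked for. Whether the difference is nonzero, and how large it is, depends on the implementation, so the sketch reports it instead of assuming a value.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical class that records the size passed to its array allocation function.
struct Widget {
    int payload;
    ~Widget() {}   // user-provided destructor, so delete[] has destructors to run

    static std::size_t last_request;

    static void* operator new[](std::size_t bytes) {
        last_request = bytes;               // what the new expression asked for
        return ::operator new[](bytes);
    }
    static void operator delete[](void* p) {
        ::operator delete[](p);
    }
};
std::size_t Widget::last_request = 0;

// Returns how many bytes beyond n * sizeof(Widget) the new expression requested.
std::size_t demo_array_overhead(std::size_t n) {
    Widget* p = new Widget[n];              // array new through a new expression
    delete[] p;
    return Widget::last_request - n * sizeof(Widget);
}
```

Calling operator new( n * sizeof( T ) ) directly never goes through this path, so the request is exactly what you wrote.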
void destroy(value_type* ptr) {
    ptr->~value_type();
}

For your first experiments it is probably easier to use a
non-heap array, with one object. Just something like a

unsigned char m_buffer[ sizeof( T ) ]

member in your class template (which, given that it contains at
most one object, you'd probably no longer call "vector" :)).
That would get rid of allocate() and deallocate() and perhaps
even show more clearly that you are not doing any "placement
delete" (see [lib.new.delete.placement]). Then, for create() and
destroy() I think I'd just have:

void construct_object( T const& ) ;
void destroy_object() ;

And for their implementation, off the top of my head:

void *
address() // NOTE: depending on what you do, you might need a
          // const version of this, too
{
    return m_buffer ;
}

template< typename T >
void
...::construct_object( T const& t ) // PRE: first call, or first call
{                                   //      after destroy_object
    new ( address() ) T( t ) ;
}

template< typename T > // PRE: an object exists (its lifetime
void                   //      has not ended)
...::destroy_object()
{
    T * p( static_cast< T * >( address() ) ) ;
    p->T::~T() ;
}

Note the separation I was talking about: before entering
construct_object you have to assume that there's no object in
the buffer (call it twice in a row and you have a problem :)).
At its return though, you can assume that *there is* an object
(leaving the function with an exception doesn't count as a
"return", of course). destroy_object() does the opposite
transition: it must be entered when there is an object (and so
you can static_cast) and at its return you have raw memory.
Note, too, that if address() returned a char *, you couldn't
static_cast it to T * directly.
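Putting the pieces together, a self-contained sketch of such a one-object holder might look like the following. Several things here are my assumptions and not part of the outline above: the class name Holder, the m_has_object flag (used only to check the preconditions with assert), the get() accessor, and the use of C++11 alignas and "= delete" to keep the buffer aligned and the class non-copyable. The destructor call is spelled p->~T() so that the sketch also accepts scalar types.

```cpp
#include <cassert>
#include <new>

// Hypothetical one-object holder built on an in-class raw buffer (needs C++11).
template <typename T>
class Holder {
public:
    Holder() : m_has_object(false) {}
    ~Holder() { if (m_has_object) destroy_object(); }

    Holder(Holder const&) = delete;             // copying raw bytes would be wrong
    Holder& operator=(Holder const&) = delete;

    // PRE: no object in the buffer. POST (on normal return): an object exists.
    void construct_object(T const& t) {
        assert(!m_has_object);
        new (address()) T(t);                   // raw memory -> object
        m_has_object = true;                    // not reached if T's copy ctor throws
    }

    // PRE: an object exists. POST: the buffer is raw memory again.
    void destroy_object() {
        assert(m_has_object);
        T* p = static_cast<T*>(address());      // void* -> T*: static_cast suffices
        p->~T();                                // object -> raw memory
        m_has_object = false;
    }

    // PRE: an object exists.
    // Since C++17, I believe strictly conforming code would pass this pointer
    // through std::launder; it is left out to stay close to the outline above.
    T& get() {
        assert(m_has_object);
        return *static_cast<T*>(address());
    }

private:
    void* address() { return m_buffer; }

    alignas(T) unsigned char m_buffer[sizeof(T)];   // raw, suitably aligned storage
    bool m_has_object;                              // assumption: tracks the current state
};
```

A use would be along the lines of: Holder<int> h; h.construct_object(5); then h.get(), then h.destroy_object(). After that the buffer is raw again and construct_object may be called a second time.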

The Vector implementation is already advanced enough to need dynamic
management of the storage, as I already have working reserve(),
push_back() and clear(), now I'm working on insert() and erase(), but I
think I will stop when I reach begin() and end() without implementing
reverse_iterator - more about this (and about its rationale) in a
further post, where I'll show my complete implementation

Well, consider that it may be difficult to get good commenting
(or even any commenting) if the amount of code is high.
for the public
delight && dissection - assuming short-circuit behavior at that "logical
and" ;-)

Dissection of who? :p

Well, you are kidding but I'm starting to feel that somebody could
actually think about doing me "serious bodily harm" - I'm having a hard
time wrapping my head around this stuff. As I said above, I'll post a
completely new thread and I'll forget about this one :)
No problem. I'm over the amount of time that I can devote to
Usenet. Otherwise I'd have commented more.

Don't worry, I think my next thread will be more concise and sharp,
and hopefully I'll dissipate all my doubts about it in a very short time :)

Thanks a lot, once more.
 
