Class invariants and implicit move constructors (C++0x)


Scott Meyers

Consider a class with two containers, where the sum of the sizes of the
containers is cached. The class invariant is that as long as the cache is
claimed to be up to date, the sum of the sizes of the containers is accurately
cached:

class Widget {
public:
    ...
private:
    std::vector<int> v1;
    std::vector<int> v2;
    mutable std::size_t cachedSumOfSizes;
    mutable bool cacheIsUpToDate;

    void checkInvariant() const
    { assert(!cacheIsUpToDate || cachedSumOfSizes == v1.size()+v2.size()); }
};

Assume that checkInvariant is called at the beginning and end of every public
member function. Further assume that the class declares no copy or move
operations, i.e., no copy or move constructor, no copy or move assignment
operator.

Suppose I have an object w where v1 and v2 have nonzero sizes, and
cacheIsUpToDate is true. Hence cachedSumOfSizes == v1.size()+v2.size(). If w
is copied, the compiler-generated copy operation will be fine, in the sense that
w's invariant will remain fulfilled. After all, copying w does not change it in
any way.

But if w is moved, the compiler-generated move operation will "steal" v1's and
v2's contents, setting their sizes to zero. That same compiler-generated move
operation will copy cachedSumOfSizes and cacheIsUpToDate (because moving
built-in types is the same as copying them), and as a result, w will be left
with its invariant unsatisfied: v1.size()+v2.size() will be zero, but
cachedSumOfSizes won't be, and cacheIsUpToDate will remain true.

When w is destroyed, the assert inside checkInvariant will fire when it's called
from w's destructor. That means that the compiler-generated move operation for
Widget broke the Widget invariant, even though the compiler-generated copy
operations for Widget left it intact.
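
The scenario can be reproduced with a small sketch. The constructor and the non-asserting invariantHolds() are additions for illustration only (the post elides the public interface), and the outcome assumes that a moved-from std::vector is left empty, which is what mainstream standard library implementations do.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Widget as in the post, plus two illustrative additions: a constructor that
// fills the vectors and primes the cache, and a public invariantHolds() that
// reports the invariant instead of asserting it.
class Widget {
public:
    Widget(std::size_t n1, std::size_t n2)
        : v1(n1), v2(n2), cachedSumOfSizes(n1 + n2), cacheIsUpToDate(true) {}

    // No copy or move operations are declared; the compiler generates them.

    bool invariantHolds() const {
        return !cacheIsUpToDate || cachedSumOfSizes == v1.size() + v2.size();
    }

private:
    std::vector<int> v1;
    std::vector<int> v2;
    mutable std::size_t cachedSumOfSizes;
    mutable bool cacheIsUpToDate;
};

// Copying does not modify the source, so both objects satisfy the invariant.
bool sourceValidAfterCopy() {
    Widget w(3, 4);
    Widget copy(w);
    return w.invariantHolds() && copy.invariantHolds();
}

// The generated move empties v1 and v2 (assumed) but copies the cache fields,
// so the source is left claiming an up-to-date cache of 7 for 0 elements.
bool sourceValidAfterMove() {
    Widget w(3, 4);
    Widget target(std::move(w));
    return w.invariantHolds();
}
```

Under the stated assumption, sourceValidAfterCopy() returns true and sourceValidAfterMove() returns false.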

The above scenario suggests that compiler-generated move operations may be
unsafe even when the corresponding compiler-generated copy operations are safe.
Is this a valid analysis?

Scott
 

Alf P. Steinbach /Usenet

* Scott Meyers, on 15.08.2010 06:07:
Consider a class with two containers, where the sum of the sizes of the
containers is cached. The class invariant is that as long as the cache
is claimed to be up to date, the sum of the sizes of the containers is
accurately cached:

class Widget {
public:
...
private:
std::vector<int> v1;
std::vector<int> v2;
mutable std::size_t cachedSumOfSizes;
mutable bool cacheIsUpToDate;

void checkInvariant() const
{ assert(!cacheIsUpToDate || cachedSumOfSizes == v1.size()+v2.size()); }
};

Assume that checkInvariant is called at the beginning and end of every
public member function. Further assume that the class declares no copy
or move operations, i.e., no copy or move constructor, no copy or move
assignment operator.

Suppose I have an object w where v1 and v2 have nonzero sizes, and
cacheIsUpToDate is true. Hence cachedSumOfSizes == v1.size()+v2.size().
If w is copied, the compiler-generated copy operation will be fine, in
the sense that w's invariant will remain fulfilled. After all, copying w
does not change it in any way.

But if w is moved, the compiler-generated move operation will "steal"
v1's and v2's contents, setting their sizes to zero. That same
compiler-generated move operation will copy cachedSumOfSizes and
cacheIsUpToDate (because moving built-in types is the same as copying
them), and as a result, w will be left with its invariant unsatisfied:
v1.size()+v2.size() will be zero, but cachedSumOfSizes won't be, and
cacheIsUpToDate will remain true.

When w is destroyed, the assert inside checkInvariant will fire when
it's called from w's destructor. That means that the compiler-generated
move operation for Widget broke the Widget invariant, even though the
compiler-generated copy operations for Widget left it intact.

The above scenario suggests that compiler-generated move operations may
be unsafe even when the corresponding compiler-generated copy operations
are safe. Is this a valid analysis?

As far as it goes, it seems so, yes, at least wrt. n3090 which is the draft I have.

A move from an object X leaves X as a zombie where there's logically nothing left
to destroy, and the class needs to be designed to deal properly with that state.

And so, given that & my general wait-n-see ignorance of C++0x, I'm surprised to
hear that a move constructor is implicitly generated (n3090 §12.8/10). It breaks
existing code. It should not have been done.

HOWEVER, it does not seem to be a problem with actual compilers for Windows.

A simpler example than yours, with explicitly declared move constructor:


<code>
#include <assert.h>
#include <string>
#include <iostream>

class Blah
{
private:
    std::string blah;
public:
    Blah(): blah( "blah" ) {}
    Blah( Blah&& other ): blah( std::move( other.blah ) )
    {
        std::cout << "[" << other.blah << "]" << std::endl;
    }
    ~Blah() { assert( blah == "blah" ); }
};

void foo( Blah&& a )
{
    Blah* p = new Blah( std::move( a ) );
    delete p;
}

int main()
{
    foo( Blah() );
}
</code>


<example>
C:\test> g++ --version
g++ (TDM-2 mingw32) 4.4.1
Copyright (C) 2009 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.


C:\test> g++ -std=c++0x x.cpp

C:\test> a
[blah]

C:\test> cl
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 16.00.30319.01 for 80x86
Copyright (C) Microsoft Corporation. All rights reserved.

usage: cl [ option... ] filename... [ /link linkoption... ]

C:\test> cl /nologo /EHsc /GR /Zc:forScope,wchar_t x.cpp
x.cpp

C:\test> x
[]
Assertion failed: blah == "blah", file x.cpp, line 15

C:\test> _
</example>


From the above, evidently MSVC 10.0 implements move constructors, while g++
4.4.1 does not, or it doesn't have an optimized std::string.

But when the move constructor definition is commented out, the assert does not
kick in, indicating that a move constructor is not automatically generated:


<example>
C:\test> cl /nologo /EHsc /GR /Zc:forScope,wchar_t x.cpp
x.cpp

C:\test> x

C:\test> _
</example>


So perhaps the standard could just be amended to reflect current practice? <g>


Cheers & hth.,

- Alf
 

Scott Meyers

Alf said:
A move from an object X leaves X as zombie where there's logically
nothing left to destroy, and class needs to be designed to deal with it,
in order to deal properly with it.

Moves don't leave objects in "a zombie state." They must leave objects in a
state where their invariants are satisfied. Note that not all objects being
moved from will be destroyed immediately thereafter. Consider:

std::thread t1([]{ doSomething(); });
std::thread t2(std::move(t2)); // move t2's state to t1

At this point, t1 has been moved from, but it still exists and can still be
used. It will be destroyed at the end of its scope, as usual, but until then,
it can have any std::thread member function invoked on it. I wouldn't call t1 a
zombie, but I agree that whatever state it has, the class must be designed to
deal with it.

> HOWEVER, it does not seem to be a problem with actual compilers for
> Windows.

I don't know of any compiler for Windows that implements the part of draft C++0x
pertaining to the implicit generation of move operations, but many draft C++0x
features are currently available in gcc 4.5 and MSVC 10. Check out
http://www.aristeia.com/C++0x/C++0xFeatureAvailability.htm .

> A simpler example than yours, with explicitly declared move constructor:

I think the fundamental problem here is that a copy operation, because it
doesn't modify its source, can't invalidate a class invariant, but a move
operation can. Unless, of course, I'm overlooking something.

Scott
 

Scott Meyers

Scott said:
Consider:

std::thread t1([]{ doSomething(); });
std::thread t2(std::move(t2)); // move t2's state to t1

Arghh, typing too fast. I meant:

std::thread t2(std::move(t1)); // move t1's state to t2

Scott
 

SG

[...]
The above scenario suggests that compiler-generated move operations may be
unsafe even when the corresponding compiler-generated copy operations are safe.
Is this a valid analysis?

Yes, I think so. Thanks for pointing out this example. Actually, I've
recently written a class with a member of type vector<deque<float> >
where I also keep track of the minimum deque length and how many
deques have this minimum length. But I don't check this invariant
prior to destruction so it would only fail if the moved-from object is
still used without any kind of "re-initialization".

I guess the important question is: Are these tolerable corner cases?

Under the current rules, a solution for your example doesn't require
writing your own move operations or deleting them. It could look like
this:

class Widget {
public:
    ...
private:
    std::vector<int> v1;
    std::vector<int> v2;
    mutable std::size_t cachedSumOfSizes;
    mutable replace_on_move<bool,false> cacheIsUpToDate;
    ...
};

where replace_on_move<bool,false> is a class that wraps a bool and
sets it to false when it is moved from. I guess such a class template
would come in handy.
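
A minimal sketch of what such a wrapper might look like follows. replace_on_move is not a standard or existing library template; everything below, including the Widget constructor and the two query functions, is a hypothetical illustration of the idea.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical wrapper: holds a T and resets the source to ResetValue
// whenever it is moved from. Copying leaves the source untouched.
template <typename T, T ResetValue>
class replace_on_move {
public:
    replace_on_move(T v = ResetValue) : value_(v) {}

    replace_on_move(const replace_on_move&) = default;
    replace_on_move& operator=(const replace_on_move&) = default;

    replace_on_move(replace_on_move&& other) noexcept : value_(other.value_) {
        other.value_ = ResetValue;
    }
    replace_on_move& operator=(replace_on_move&& other) noexcept {
        T tmp = other.value_;       // read first so self-move keeps the value
        other.value_ = ResetValue;
        value_ = tmp;
        return *this;
    }

    operator T() const { return value_; }

private:
    T value_;
};

// Widget relies entirely on the compiler-generated copy and move operations.
class Widget {
public:
    Widget(std::size_t n1, std::size_t n2)
        : v1(n1), v2(n2), cachedSumOfSizes(n1 + n2), cacheIsUpToDate(true) {}

    bool invariantHolds() const {
        return !cacheIsUpToDate || cachedSumOfSizes == v1.size() + v2.size();
    }
    bool cacheValid() const { return cacheIsUpToDate; }

private:
    std::vector<int> v1;
    std::vector<int> v2;
    mutable std::size_t cachedSumOfSizes;
    mutable replace_on_move<bool, false> cacheIsUpToDate;
};

// After a move, the source's flag is false, so its invariant holds no matter
// what state the moved-from vectors are in.
bool moveKeepsBothInvariants() {
    Widget w(3, 4);
    Widget target(std::move(w));
    return w.invariantHolds() && !w.cacheValid()
        && target.invariantHolds() && target.cacheValid();
}
```

The design choice here is that the fix lives in the member's type, so the enclosing class needs no hand-written move operations.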

Cheers!
SG
 

Alf P. Steinbach /Usenet

* Scott Meyers, on 15.08.2010 10:00:
Alf said:
A move from an object X leaves X as zombie where there's logically
nothing left to destroy, and class needs to be designed to deal with
it, in order to deal properly with it.

Moves don't leave objects in "a zombie state." They must leave objects
in a state where their invariants are satisfied. Note that not all
objects being moved from will be destroyed immediately thereafter.
Consider:

std::thread t1([]{ doSomething(); });
std::thread t2(std::move(t2)); // move t2's state to t1

At this point, t1 has been moved from, but it still exists and can still
be used. It will be destroyed at the end of its scope, as usual, but
until then, it can have any std::thread member function invoked on it. I
wouldn't call t1 a zombie, but I agree that whatever state it has, the
class must be designed to deal with it.

Zombie does not mean indeterminate. It means non-usable after being logically
destroyed. You have that state for most Java objects (and in general for objects
in garbage collected languages). You can view it as the desired class invariant
D augmented with "or zombie" Z, so that the effective class invariant is D||Z.
The augmented class invariant is satisfied, it's just very impractical because
it has to be checked for at each operation; ideally you'd want just D.

Thus, move semantics does not come at zero cost unless the class in question has
a natural "empty" state for objects as part of D.

std::string and std::vector are simple containers and have such state, other
classes may not have it naturally, and then after being moved from such an
object is zombie (satisfying only the augmented class invariant D||Z).

I don't know of any compiler for Windows that implements the part of
draft C++0x pertaining to the implicit generation of move operations,
but many draft C++0x features are currently available in gcc 4.5 and
MSVC 10. Check out
http://www.aristeia.com/C++0x/C++0xFeatureAvailability.htm .


I think the fundamental problem here is that a copy operation, because
it doesn't modify its source, can't invalidate a class invariant, but a
move operation can. Unless, of course, I'm overlooking something.

Huh. Pardon my french, but the above does not parse.


Cheers & hth.,

- Alf
 

SG

* Scott Meyers, on 15.08.2010 10:00:
Moves don't leave objects in "a zombie state." They must leave objects
in a state where their invariants are satisfied. Note that not all
objects being moved from will be destroyed immediately thereafter.
[...]

Zombie does not mean indeterminate. It means non-usable after being logically
destroyed.

"destroyed" is a strong word that is typically associated with the
destruction of an object (ending its life-time). In my opinion neither
"non-usable" nor "logically destroyed" are appropriate terms here.
Either, you're not familiar with the idea of how move semantics is
supposed to work or you have a funny definition of "non-usable" and
"logically destroyed" in mind.

Take a string class as example:

class string {
    char *begin_;
    char *end_;
    char *capacity_;
public:
    ...
};

with the invariant

(1) all pointers are zero
OR (2) all pointers point to elements of the same array
with begin_ < end_ <= capacity_ and the array
is owned by the object.

Empty string values can be represented with (1) and non-empty string
values with (2). A reasonable implementation of a move constructor
just copies the pointers and sets the pointers in the object we "move
from" to zero, so it represents an empty string value. This object is
still in a "usable" state and I wouldn't describe this state as
"logically destroyed". It has obviously been mutated -- still obeying
its invariant but nothing happened that would qualify as "destruction"
in terms of the object's life-time.
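
The move constructor described above might be sketched as follows. This is an illustration under assumptions: the class is renamed simple_string to avoid clashing with std::string, the constructor from a C string and the size()/empty() accessors are invented for the example, and copying is deleted only to keep the sketch short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

class simple_string {
public:
    // State (1): all pointers are null, representing the empty string.
    simple_string() : begin_(nullptr), end_(nullptr), capacity_(nullptr) {}

    // State (2) for non-empty input: the object owns a freshly allocated array.
    explicit simple_string(const char* s)
        : begin_(nullptr), end_(nullptr), capacity_(nullptr) {
        const std::size_t n = std::strlen(s);
        if (n != 0) {
            begin_ = new char[n];
            std::memcpy(begin_, s, n);
            end_ = capacity_ = begin_ + n;
        }
    }

    // Move: copy the pointers, then put the source into state (1).
    simple_string(simple_string&& other) noexcept
        : begin_(other.begin_), end_(other.end_), capacity_(other.capacity_) {
        other.begin_ = other.end_ = other.capacity_ = nullptr;
    }

    // Copying is left out of this sketch.
    simple_string(const simple_string&) = delete;
    simple_string& operator=(const simple_string&) = delete;

    ~simple_string() { delete[] begin_; }

    bool empty() const { return begin_ == end_; }
    std::size_t size() const { return static_cast<std::size_t>(end_ - begin_); }

private:
    char* begin_;
    char* end_;
    char* capacity_;
};

// The moved-from object is an ordinary empty string that can still be queried.
bool movedFromIsEmptyAndUsable() {
    simple_string a("blah");
    simple_string b(std::move(a));
    return a.empty() && a.size() == 0 && b.size() == 4;
}
```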

> You have that state for most Java objects (and in general for object
> in garbage collected languages). You can view it as the desired class invariant
> D augmented with "or zombie" Z, so that the effective class invariant is D||Z.
> The augmented class invariant is satisfied, it's just very impractical because
> it has to be checked for at each operation; ideally you'd want just D.
>
> Thus, move semantics does not come at zero cost unless the class in question has
> a natural "empty" state for objects as part of D.
>
> std::string and std::vector are simple containers and have such state, other
> classes may not have it naturally, and then after being moved from such an
> object is zombie (satisfying only the augmented class invariant D||Z).

Then, you've written a very bad move constructor. You're not supposed
to let objects deteriorate in a way that makes them useless and only
ready for actual destruction!

> Huh. Pardon my french, but the above does not parse.

A compiler-generated copy construction can, of course, invalidate
invariants as well. In general, we call this a bug. See the example
from above. Without a user-defined copy constructor the last part of
the invariant is violated for both objects (source and target),
namely, "the [pointed-to] array is owned by the object". So, we could
reasonably say Scott's example class is just buggy in the sense that
"the rule of three" could be renamed to "the rule of five".

From what I can tell it really boils down to a trade-off. I like
implicitly generated move operations when they're correct. Obviously,
the current rules could break old classes and they could introduce
some gotchas. The question is, are these "tolerable corner cases" or
do we need more restrictive rules? For example, avoiding implicitly
generated move operations in case there are private data members. But
this additional restriction would require users to explicitly
"default" move operations in many classes. That's also undesirable in
my opinion.

Cheers!
SG
 

Alf P. Steinbach /Usenet

* SG, on 15.08.2010 13:49:
* Scott Meyers, on 15.08.2010 10:00:
Moves don't leave objects in "a zombie state." They must leave objects
in a state where their invariants are satisfied. Note that not all
objects being moved from will be destroyed immediately thereafter.
[...]

Zombie does not mean indeterminate. It means non-usable after being logically
destroyed.

"destroyed" is a strong word that is typically associated with the
destruction of an object (ending its life-time).

No, you need to get out of C++98 in order to discuss wider language features.

"Destroy" is a common term and method name for logical destruction in languages
where that operation has been necessary (it hasn't been in C++ until now).

I'm including Java and C#; you might care to look it up.

In my opinion neither
"non-usable" nor "logically destroyed" are appropriate terms here.
Either, you're not familiar with the idea of how move semantics is
supposed to work or you have a funny definition of "non-usable" and
"logically destroyed" in mind.

Hm, either you're not familiar with the point of move semantics or you have
a funny idea of usability and are not grasping the idea of logical destruction.

Moving means pilfering resources. That's the point, it's all about efficiency.
Moving without efficiency is pointless, extra work for no gain. And not all
classes can support pilfering of resources (the efficiency aspect that is the
reason for move semantics) and leave usable objects around. For at least some
classes instances need resources in order to be usable, in the sense of
operations succeeding and actually doing things.

The best that can be done in the general case is an artificial nullstate. You
can artificially define an object in an artificial nullstate as "usable" --
well-defined errors on nearly all ops, like a std::whateverstream in error state
-- but that's just wordplay. It's then a zombie.

That does not mean that I'm advocating zombies.

I'm just pointing out that that's a logical consequence of supporting move
semantics where the natural class invariant lacks a natural empty state.

There is no way around a logical consequence.

Take a string class as example:

class string {
char *begin_;
char *end_;
char *capacity_;
public:
...
};

with the invariant

(1) all pointers are zero
OR (2) all pointers point to elements of the same array
with begin_< end_<= capacity_ and the array
is owned by the object.

A string class is a class with a natural empty state as part of its normal class
invariant.

What you should look for is an example that doesn't have that.

Like a fixed size matrix implemented in terms of std::vector, say.
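
A rough sketch of that kind of class: Matrix3x3 and its members are hypothetical names chosen for illustration, and the result assumes the usual library behavior of leaving a moved-from std::vector empty.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical fixed-size matrix. Its natural invariant D is "exactly nine
// cells"; there is no natural empty state.
class Matrix3x3 {
public:
    Matrix3x3() : cells_(9, 0.0) {}

    bool invariantHolds() const { return cells_.size() == 9; }

    // Unchecked element access; only meaningful while D holds.
    double& at(std::size_t row, std::size_t col) { return cells_[row * 3 + col]; }

private:
    std::vector<double> cells_;
};

// The implicitly generated move constructor steals the vector's storage, so
// the source no longer has nine cells (assuming the vector is left empty).
bool sourceSatisfiesDAfterMove() {
    Matrix3x3 m;
    Matrix3x3 n(std::move(m));
    return m.invariantHolds();
}

bool targetSatisfiesDAfterMove() {
    Matrix3x3 m;
    Matrix3x3 n(std::move(m));
    return n.invariantHolds();
}
```

Under that assumption the target satisfies D and the source does not, which is the state the post calls a zombie: calling at() on the source would index into an empty vector.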

Empty string values can be represented with (1) and non-empty string
values with (2). A reasonable implementation of a move constructor
just copies the pointers and sets the pointers in the object we "move
from" to zero, so it represents an empty string value. This object is
still in a "usable" state and I wouldn't describe this state as
"logically destroyed". It has obviously been mutated -- still obeying
its invariant but nothing happened that would qualify as "destruction"
in terms of the object's life-time.

Right, it's a class that has a natural empty state.

That is not the general case.

It's not the droid you should be looking for.

Then, you've written a very bad move constructor. You're not supposed
to let objects deteriorate in a way that makes them useless and only
ready for actual destruction!

Or ready for re-initialization, that's a common technique in GC languages where
such objects abound.

You're right that that's Bad.

Wrt. move semantics you might classify classes as

A) Trivially supporting move semantics, an automatically generated move
constructor does the job.

B) Needing explicit support. This was the case with Scott's example class
with caching. It just needed a defined move constructor.

C) Not naturally supporting move semantics: no natural empty state, move
semantics support would introduce artificial zombie state.

I hope you're not proposing that all designs should be limited to classes that
are naturally compatible with move semantics, (A) and (B).
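
For category (B), explicit support for the caching Widget could look roughly like this. It is a sketch, not code from the thread: the constructor and invariantHolds() are illustrative additions, and the copy operations are defaulted explicitly because declaring move operations would otherwise suppress them.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

class Widget {
public:
    Widget(std::size_t n1, std::size_t n2)
        : v1(n1), v2(n2), cachedSumOfSizes(n1 + n2), cacheIsUpToDate(true) {}

    Widget(const Widget&) = default;
    Widget& operator=(const Widget&) = default;

    // Hand-written move: steal the vectors, carry the cache over, and mark
    // the source's cache as stale so its invariant holds trivially.
    Widget(Widget&& other)
        : v1(std::move(other.v1)), v2(std::move(other.v2)),
          cachedSumOfSizes(other.cachedSumOfSizes),
          cacheIsUpToDate(other.cacheIsUpToDate) {
        other.cacheIsUpToDate = false;
    }

    Widget& operator=(Widget&& other) {
        v1 = std::move(other.v1);
        v2 = std::move(other.v2);
        cachedSumOfSizes = other.cachedSumOfSizes;
        cacheIsUpToDate = other.cacheIsUpToDate;
        other.cacheIsUpToDate = false;  // on self-move this just drops the cache
        return *this;
    }

    bool invariantHolds() const {
        return !cacheIsUpToDate || cachedSumOfSizes == v1.size() + v2.size();
    }

private:
    std::vector<int> v1;
    std::vector<int> v2;
    mutable std::size_t cachedSumOfSizes;
    mutable bool cacheIsUpToDate;
};

bool invariantsHoldAfterMoveConstruction() {
    Widget w(3, 4);
    Widget target(std::move(w));
    return w.invariantHolds() && target.invariantHolds();
}

bool invariantsHoldAfterMoveAssignment() {
    Widget w(3, 4);
    Widget target(1, 1);
    target = std::move(w);
    return w.invariantHolds() && target.invariantHolds();
}
```

Because the source's flag is cleared, the result does not depend on what state the moved-from vectors end up in.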

The presence of (C)-like classes in many/most designs is why the draft's (the
one I looketh at) /automatically generated/ move constructor is a very bad idea.
It's a trade-off that the programmer should explicitly have to make, because an
automatically generated move constructor means that those objects need to be
handled with extreme care. The default is the wrong way.

You'll probably agree with that when you've thought about it a little.

Huh. Pardon my french, but the above does not parse.

A compiler-generated copy construction can, of course, invalidate
invariants as well. In general, we call this a bug. See the example
from above. Without a user-defined copy constructor the last part of
the invariant is violated for both objects (source and target),
namely, "the [pointed-to] array is owned by the object". So, we could
reasonably say Scott's example class is just buggy in the sense that
"the rule of three" could be renamed to "the rule of five".

From what I can tell it really boils down to a trade-off. I like
implicitly generated move operations when they're correct. Obviously,
the current rules could break old classes and they could introduce
some gotchas.

Yes.


The question is, are these "tolerable corner cases"

No, they're intolerable common cases, and as you note above, they're in
/existing code/.

Recompile with C++0x and you break things.

or
do we need more restrictive rules? For example, avoiding implicitly
generated move operations in case there are private data members. But
this additional restriction would require users to explicitly
"default" move operations in many classes. That's also undesirable in
my opinion.

Why? As Scott's example shows, default move semantics does the Wrong Thing for
non-trivial classes, for classes that have class invariants. I would not be
surprised if that means for most classes.


Cheers & hth.,

- Alf
 

Howard Hinnant

Consider a class with two containers, where the sum of the sizes of the
containers is cached.  The class invariant is that as long as the cache is
claimed to be up to date, the sum of the sizes of the containers is accurately
cached:

The above scenario suggests that compiler-generated move operations may be
unsafe even when the corresponding compiler-generated copy operations are safe.
  Is this a valid analysis?

I believe so.

I don't have much to add that others haven't already said, except to
add a reference link for those who would like to dive deeper into how
we got here:

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3053.html

-Howard
 

SG


As for not getting the move semantics concept: I'm well aware of the
rules of the language in that regard. The rest is just a matter of
terminology and point of view (me not liking your zombie term). Of
course, as a class author, you're free to define any semantics you
want. The trouble I see here is in the word "use" or "non-usable".
It's not exactly clear what you consider a "use" and what not. I'm
guessing that operator= would not qualify as "use" but as some kind of
"re-initialization" that establishes again the "natural invariant" (in
case the source was not a "zombie").

What I don't like about this is the "double standard". You're making
the distinction between "natural invariant" (that is not necessarily
valid in case of a "zombie state") and the actual class invariant
(including the "zombie state") that is supposed to be valid during the
whole object's life-time. I see no point in this distinction between
"natural null state" and "zombie null state". For all I care there is
a single invariant that includes the state you refer to as "zombie
state" or "null state". Now, if you define some operations to be
illegal in this state or not is up to you as a class designer. At the
very least this state should not make any differences in terms of
assignability (which could arguably be considered a "use").

>> A compiler-generated copy construction can, of course, invalidate
>> invariants as well. In general, we call this a bug. See the example
>> from above. Without a user-defined copy constructor the last part of
>> the invariant is violated for both objects (source and target),
>> namely, "the [pointed-to] array is owned by the object". So, we could
>> reasonably say Scott's example class is just buggy in the sense that
>> "the rule of three" could be renamed to "the rule of five".
>>
>> From what I can tell it really boils down to a trade-off. I like
>> implicitly generated move operations when they're correct. Obviously,
>> the current rules could break old classes and they could introduce
>> some gotchas.
>
> Yes.
>
>> The question is, are these "tolerable corner cases"
>
> No, they're intolerable common cases, and as you note above, they're in
> /existing code/.

Existing code, yes.
"intolerable common cases", I don't know.

Because, obviously, I like not having to write

myclass(myclass&&) = default;
myclass& operator=(myclass&&) = default;

in almost all my classes to make them "move-enabled".

> As Scott's example shows, default move semantics does the Wrong Thing for
> non-trivial classes, for classes that have class invariants. I would not be
> surprised if that means for most classes.

I would be surprised if that means "for most classes". The compiler
WON'T implicitly declare a move ctor for classes with user-defined
copy ctor as it is. I don't think there are lots of classes without
user-defined copy ctor where the implicitly generated move ctor would
break the program. I don't know for sure. But I think you're
overestimating the severity of this a little.

Cheers!
SG
 

Howard Hinnant

I would be surprised if that means "for most classes". The compiler
WON'T implicitly declare a move ctor for classes with user-defined
copy ctor as it is. I don't think there are lots of classes without
user-defined copy ctor where the implicitly generated move ctor would
break the program. I don't know for sure. But I think you're
overestimating the severity of this a little.

<nitpick>
Fwiw, a "user declared" copy constructor is sufficient to inhibit the
implicitly defined move constructor. I.e.

Widget(const Widget&) = default;

is sufficient. (the above is considered a declaration, not a
definition).
</nitpick>
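
The nitpick can be checked with a small experiment. The sketch below uses made-up types (Tracker, NoDeclaredCopy, DefaultedCopy) whose only purpose is to record whether a member was copy- or move-constructed.

```cpp
#include <cassert>
#include <utility>

// Records how each instance came into existence.
struct Tracker {
    enum Origin { Fresh, Copied, Moved };
    Origin origin;

    Tracker() : origin(Fresh) {}
    Tracker(const Tracker&) : origin(Copied) {}
    Tracker(Tracker&&) : origin(Moved) {}
};

// Declares no copy operations: a move constructor is implicitly generated.
struct NoDeclaredCopy {
    Tracker t;
};

// A defaulted copy constructor still counts as user-declared, so no move
// constructor is implicitly generated for this class.
struct DefaultedCopy {
    Tracker t;
    DefaultedCopy() = default;
    DefaultedCopy(const DefaultedCopy&) = default;
};

// Constructs a T from an rvalue and reports how its Tracker member was made.
template <typename T>
Tracker::Origin originAfterMove() {
    T source;
    T target(std::move(source));
    return target.t.origin;
}
```

With NoDeclaredCopy the member is moved; with DefaultedCopy the rvalue binds to the copy constructor's const reference and the member is copied.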

--

I've been exploring the ramifications of Scott's example:

When Widget is used in generic std::code, such as vector<Widget> and
std::algorithm(sequence of Widgets), I believe it will continue to
work as in C++03, only faster /almost/ all of the time.

Explanation: The following members of Widget are implicitly defined:

Widget();
Widget(const Widget&);
Widget& operator=(const Widget&);
Widget(Widget&&);
Widget& operator=(Widget&&);
~Widget();

Implicitly defined members will not perform the invariant check at
their beginning and end. In almost all std-defined generic code, a
moved-from Widget will have exactly one of the following operations
applied to it:

Widget& operator=(const Widget&);
Widget& operator=(Widget&&);
~Widget();

Except for ~Widget(), these operations will restore the invariant. In
the case of ~Widget(), the invariant likely won't matter. However if
~Widget() is defined with an assertion check, then we have a noisy
error that can be corrected.

Example: One can insert into, and erase from vector<Widget>. Though
there will be intermediate stages of broken invariants of Widget,
these will not be exposed unless the insert throws. After a normally-
ending erase and insert, all Widgets in the vector will have satisfied
invariants. Similarly for all std::algorithms except for remove and
remove_if. And even in those, the typical idiomatic usage won't
expose a Widget with a broken invariant:

// All Widgets in v still have intact invariants
v.erase(remove(v.begin(), v.end(), w), v.end());

In summary: Yes, there is breakage here. But I think it will be
uncommonly rare. Not only do you need a class like Scott's Widget,
but you also need to do something with a moved-from value besides
destruct it or assign it a new value. And this would have to happen
without an explicit move (since move doesn't exist in C++03):

vector<Widget>::iterator i = remove(v.begin(), v.end(), w);
if (i != v.end())
i->checkInvariant(); // bang, you're dead!
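
Both usages can be compared in a self-contained sketch. The Widget constructor, firstSize(), invariantHolds() and the isSizeOne predicate are invented for the example (remove_if with a predicate stands in for remove with a value, since the post's Widget has no operator==), and the outcome assumes that a moved-from std::vector is left empty after move assignment.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

class Widget {
public:
    explicit Widget(std::size_t n)
        : v1(n), v2(n), cachedSumOfSizes(2 * n), cacheIsUpToDate(true) {}

    std::size_t firstSize() const { return v1.size(); }

    bool invariantHolds() const {
        return !cacheIsUpToDate || cachedSumOfSizes == v1.size() + v2.size();
    }

private:
    std::vector<int> v1;
    std::vector<int> v2;
    mutable std::size_t cachedSumOfSizes;
    mutable bool cacheIsUpToDate;
};

std::vector<Widget> makeWidgets() {
    std::vector<Widget> v;
    v.push_back(Widget(1));
    v.push_back(Widget(2));
    v.push_back(Widget(3));
    return v;
}

bool isSizeOne(const Widget& w) { return w.firstSize() == 1; }

// remove_if alone: the elements in [i, end) have been move-assigned from and
// are still in the vector, so their state can be observed.
bool tailIsIntactAfterRemove() {
    std::vector<Widget> v = makeWidgets();
    std::vector<Widget>::iterator i =
        std::remove_if(v.begin(), v.end(), isSizeOne);
    for (; i != v.end(); ++i)
        if (!i->invariantHolds()) return false;
    return true;
}

// Idiomatic erase-remove: the moved-from tail is destroyed before anyone
// can look at it.
bool allIntactAfterEraseRemove() {
    std::vector<Widget> v = makeWidgets();
    v.erase(std::remove_if(v.begin(), v.end(), isSizeOne), v.end());
    for (std::size_t k = 0; k != v.size(); ++k)
        if (!v[k].invariantHolds()) return false;
    return v.size() == 2;
}
```

Under the stated assumption, tailIsIntactAfterRemove() returns false while allIntactAfterEraseRemove() returns true.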

Thus I don't find Scott's example a show stopper, considering the
benefits that special move members bring (everything is an engineering
tradeoff). After all, even the addition of features as innocuous (and
as necessary) as decltype and static_assert have the potential to
break existing code (all existing code using those spellings).

I continue to believe the benefits outweigh the risks for special move
members. But no doubt, when upgrading to C++0X, thorough testing is
warranted, just as it is for any other change in your development
environment.

-Howard
 

Scott Meyers

SG said:
A compiler-generated copy construction can, of course, invalidate
invariants as well.

Yes, but we have two decades of experience in identifying classes where the
compiler-generated copy operations would not behave correctly. To me, the
interesting thing isn't that implicitly-generated copy operations can be
incorrect (that's old news), but that implicitly-generated move operations can
be incorrect even when implicitly-generated copy operations are fine. In other
words, our intuition from C++98 wrt compiler-generated functions is insufficient
in the world of C++0x.

We have developed good guidelines regarding when users need to define their own
copy operations, e.g., when classes have pointer data members, when classes have
user-defined destructors, etc. What are the corresponding guidelines for when
users should define their own move operations? Saying "when the
compiler-generated versions would be incorrect" is hardly helpful.

Given that C++0x now supports defaulted special functions, I'm inclined to think
that a potentially useful rule is simply "Always declare copy and move
operations." If the compiler-generated versions would be okay, just say so:

class Widget {
public:
    Widget(const Widget&) = default;
    Widget(Widget&&) = default;
    Widget& operator=(const Widget&) = default;
    Widget& operator=(Widget&&) = default;
    ...
};

This involves a certain amount of syntactic noise for simple classes, but it has
the advantage that both human readers and source code analysis tools can verify
that the person writing the class has thought about and verified that the
compiler-generated versions do the right thing. It also addresses the
longstanding C++ question about whether users should declare their own copy
operations even when the compiler-generated versions would do the right thing.
With this rule, the answer is yes.

To some degree, the syntactic noise can be muted via macro:

class Widget {
DEFAULT_COPY_AND_MOVE_ARE_OKAY(Widget);
...
};
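
The post does not define the macro. One possible definition, offered purely as an assumption, is sketched below. Two side effects are worth noting: the expansion switches the access level to public for whatever follows it, and declaring a copy constructor suppresses the implicit default constructor, so the example class declares one.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical definition: declares all four operations as defaulted.
// The missing final semicolon is supplied at the point of use.
#define DEFAULT_COPY_AND_MOVE_ARE_OKAY(T)      \
    public:                                     \
        T(const T&) = default;                  \
        T(T&&) = default;                       \
        T& operator=(const T&) = default;       \
        T& operator=(T&&) = default

class Widget {
    DEFAULT_COPY_AND_MOVE_ARE_OKAY(Widget);

public:
    Widget() : value_(0) {}

private:
    int value_;
};

static_assert(std::is_copy_constructible<Widget>::value, "copy construction");
static_assert(std::is_move_constructible<Widget>::value, "move construction");
static_assert(std::is_copy_assignable<Widget>::value, "copy assignment");
static_assert(std::is_move_assignable<Widget>::value, "move assignment");

// Exercises each of the four declared operations once.
bool allFourOperationsUsable() {
    Widget a;
    Widget b(a);             // copy construction
    Widget c(std::move(a));  // move construction
    b = c;                   // copy assignment
    c = std::move(b);        // move assignment
    return true;
}
```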

Scott
 

Scott Meyers

Howard said:
Example: One can insert into, and erase from vector<Widget>. Though
there will be intermediate stages of broken invariants of Widget,
these will not be exposed unless the insert throws. After a normally-
ending erase and insert, all Widgets in the vector will have satisfied
invariants.

Unfortunately, any temporaries created for the insert may not. Consider:

std::vector<Widget> vw;

vw.push_back(Widget(x, y, z)); // asserts during destruction of temporary

> the case of ~Widget(), the invariant likely won't matter. However if
> ~Widget() is defined with an assertion check, then we have a noisy
> error that can be corrected.

I'm not sure how to correct this noisy error. We don't want to remove the
invariant check from ~Widget, because presumably it's there to detect corrupted
objects. I suppose we could do

vw.push_back((Widget&)Widget(x, y, z)); // force copy instead of move

but that seems pretty obscure.

> In summary: Yes, there is breakage here. But I think it will be
> uncommonly rare.

I agree that the problem is likely to be quite uncommon. But given the size of
the body of existing code that the committee trots out as its standard argument
for avoiding introducing breaking changes (especially silent ones), it seems odd
that the decision was made to ask developers to verify that all their old code
will continue to work rather than to have them manually add defaulted move
operations to new code. Yes, I realize that this would mean that old classes
would not magically benefit from move semantics when compiled with a C++0x compiler.

On the plus side, I suppose, static analysis tools can identify classes where
move operations will be automatically generated, so at least developers have a
way to find out where they need to check things.

Scott
 

Alf P. Steinbach /Usenet

* SG, on 15.08.2010 17:42:

As for not getting the move semantics concept: I'm well aware of the
rules of the language in that regard. The rest is just a matter of
terminology and point of view (me not liking your zombie term).

You've lost the discussion already when it is the terminology you object to.

"Zombie" is a somewhat derogatory term.

That's because it is an undesirable feature: it's a good term.

Of
course, as a class author, you're free to define any semantics you
want. The trouble I see here is in the word "use" or "non-usable".
It's not exactly clear what you consider a "use" and what not. I'm
guessing that operator= would not qualify as "use" but as some kind of
"re-initialization" that establishes again the "natural invariant" (in
case the source was not a "zombie").

You're guessing correctly. No one designs objects for the purpose of being
assignable. If an object is assignable then that is in support of whatever the
object's purpose is.

What I don't like about this is the "double standard". You're making
the distinction between "natural invariant" (that is not necessarily
valid in case of a "zombie state") and the actual class invariant
(including the "zombie state") that is supposed to be valid during the
whole object's life-time.

Right, it's undesirable.

I see no point in this distinction between
"natural null state" and "zombie null state".

That's pretty stupid, sorry.

One can't discuss things without acknowledging their existence.

For example, you provided an example where the distinction did not matter.
That's a fallacy if you understood it. Now you snipped a more relevant example,
and that's a fallacy because there's no chance that you haven't understood it by
now.

For all I care there is
a single invariant that includes the state you refer to as "zombie
state" or "null state".

Then you end up with Microsoft-like code where you don't know whether your
window object has been initialized or not yet. You fail to make the critical
distinction, what has to be checked for at every operation, but make an
irrelevant technical distinction instead, as if a class invariant is something
to be determined after-the-fact of design, that MS classes with nullstates have
good class invariants. They do not have good class invariants. In short, your
point of view here is to avert your eyes from the most relevant aspect, and
that's, again, just plain stupid as a real view.

However, it can make sense as argumentative technique, muddying the waters.

And I suspect that's what you're doing, given the quoting and snipping.

Now, if you define some operations to be
illegal in this state or not is up to you as a class designer. At the
very least this state should not make any differences in terms of
assignability (which could arguably be considered a "use").

I'm sorry but that's silly. Do you /like/ to check at every operation whether
your object is in usable state? Do you like the resulting bugs? Have you any
experience at all dealing with such objects? Your argument is just gibberish.

Huh. Pardon my french, but the above does not parse.

A compiler-generated copy construction can, of course, invalidate
invariants as well. In general, we call this a bug. See the example
from above. Without a user-defined copy constructor the last part of
the invariant is violated for both objects (source and target),
namely, "the [pointed-to] array is owned by the object". So, we could
reasonably say Scott's example class is just buggy in the sense that
"the rule of three" could be renamed to "the rule of five".

From what I can tell it really boils down to a trade-off. I like
implicitly generated move operations when they're correct. Obviously,
the current rules could break old classes and they could introduce
some gotchas.

Yes.

The question is, are these "tolerable corner cases"

No, they're intolerable common cases, and as you note above, they're in
/existing code/.

Existing code, yes.
"intolerable common cases", I don't know.

You should know.

Because, obviously, I like not having to write

myclass(myclass&&) = default;
myclass& operator=(myclass&&) = default;

in almost all my classes to make them "move-enabled".

I doubt that almost all your classes are assignable.

If they are then your designs are pretty uncommon ones.

Anyway, having to explicitly enable an /optimization/ that may be /incorrect/
for your class, is good, not bad.

And having it applied as default is bad, not good.

Correctness is more important than micro-efficiency, especially when said
micro-efficiency can be added by the programmer who has determined that it's safe.

I would be surprised if that means "for most classes". The compiler
WON'T implicitly declare a move ctor for classes with user-defined
copy ctor as it is.

User-defined copy constructors are not necessarily as common as you think.

Classes built from parts that are smart about copying (including smart pointers)
don't need them.
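To make the two cases in this exchange concrete, here is a small sketch. The class names HandCopied and BuiltFromParts are invented for illustration. The first has a user-declared copy constructor, so no move constructor is implicitly declared and a "move" quietly copies. The second is built only from a part that manages its own copying, so it gets an implicit move constructor that empties the source.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical class with a user-declared copy constructor: no move
// constructor is implicitly declared, so std::move falls back to copying.
struct HandCopied {
    std::vector<int> data;
    HandCopied() : data(3, 7) {}
    HandCopied(const HandCopied& other) : data(other.data) {}
};

// Hypothetical class with no user-declared copy operations or destructor:
// the compiler generates a move constructor that moves the vector.
struct BuiltFromParts {
    std::vector<int> data;
    BuiltFromParts() : data(3, 7) {}
};

// Number of elements left in the source after move-constructing from it.
template <typename T>
std::size_t source_size_after_move()
{
    T source;
    T target(std::move(source));
    (void)target;
    return source.data.size();
}
```

With a conforming compiler, source_size_after_move<HandCopied>() should be 3 (the source was copied) and source_size_after_move<BuiltFromParts>() should be 0 (the source vector was moved from).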

I don't think there are lots of classes without
user-defined copy ctor where the implicitly generated move ctor would
break the program.

You snipped one simple example.

Such examples abound.

Averting your eyes does not make them go away.

I don't know for sure. But I think you're
overestimating the severity of this a little.

Perhaps. It doesn't matter. The standard's default is the wrong way anyhow,
whether the code that's broken by it is 10% or 50%.


Cheers & hth.,

- Alf
 
SG

Howard said:
Example:  One can insert into, and erase from vector<Widget>.  Though
there will be intermediate stages of broken invariants of Widget,
these will not be exposed unless the insert throws.  After a normally-
ending erase and insert, all Widgets in the vector will have satisfied
invariants.

Unfortunately, any temporaries created for the insert may not.
[...]

Here, Howard was assuming that the destructor was also compiler-
generated.

Maybe, the rules could be just a tad more restrictive. For example,
also inhibiting the generation of move operations in case there is a
user-declared DESTRUCTOR seems like a reasonable idea to me at the
moment.
Consider:
   std::vector<Widget> vw;
   vw.push_back(Widget(x, y, z));  // asserts during destruction of temporary

This piece of code would be okay with the above rule change.
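Here is a sketch of how that rule would play out, assuming a compiler that suppresses implicit move operations when a destructor is user-declared (as far as I know, that is what the rules finally adopted for C++11 specify). The class name Checked and the invariantHolds member are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical Widget-like class whose destructor is user-declared.
class Checked {
public:
    Checked() : v1(2), v2(3), cachedSum(5), cacheIsUpToDate(true) {}
    ~Checked() { assert(invariantHolds()); }  // the noisy check stays in place

    bool invariantHolds() const
    { return !cacheIsUpToDate || cachedSum == v1.size() + v2.size(); }

private:
    std::vector<int> v1;
    std::vector<int> v2;
    std::size_t cachedSum;
    bool cacheIsUpToDate;
};

// Under the rule sketched above, the "move" below is really a copy, so the
// source keeps its vectors and its invariant.
bool source_intact_after_move()
{
    Checked source;
    Checked target(std::move(source));  // no implicit move ctor: copies
    return source.invariantHolds() && target.invariantHolds();
}
```

Under that assumption source_intact_after_move() returns true and neither destructor assertion fires.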
 > the case of ~Widget(), the invariant likely won't matter.  However if
 > ~Widget() is defined with an assertion check, then we have a noisy
 > error that can be corrected.

I'm not sure how to correct this noisy error. We don't want to remove the
invariant check from ~Widget, because presumably it's there to detect corrupted
objects.

The easy fix is to write your own move constructor that sets the
mutable cacheIsUpToDate member from the source to false. As
alternative, I suggested to replace the bool type with a class that
wraps a bool and is automatically set to false if you move from it
(see replace_on_move<bool,false>).
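A rough sketch of the second option follows. This is only one guess at what a replace_on_move<T, Value> wrapper could look like, not a reference implementation, and the Widget constructor and invariantHolds member are added purely for the example. The wrapper copies normally, but resets the source to Value when moved from, so the compiler-generated Widget move constructor leaves the source with cacheIsUpToDate == false.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical sketch: holds a T and resets the source to Value whenever
// it is moved from. Copying behaves like copying a plain T.
template <typename T, T Value>
class replace_on_move {
public:
    replace_on_move(T v = Value) : val(v) {}
    replace_on_move(const replace_on_move& o) : val(o.val) {}
    replace_on_move(replace_on_move&& o) : val(o.val) { o.val = Value; }
    replace_on_move& operator=(const replace_on_move& o)
    { val = o.val; return *this; }
    replace_on_move& operator=(replace_on_move&& o)
    {
        T tmp = o.val;  // read before resetting, so self-move keeps the value
        o.val = Value;
        val = tmp;
        return *this;
    }
    operator T() const { return val; }
private:
    T val;
};

class Widget {
public:
    Widget(std::size_t n1, std::size_t n2)
        : v1(n1), v2(n2), cachedSumOfSizes(n1 + n2), cacheIsUpToDate(true) {}

    bool invariantHolds() const
    { return !cacheIsUpToDate || cachedSumOfSizes == v1.size() + v2.size(); }

private:
    std::vector<int> v1;
    std::vector<int> v2;
    mutable std::size_t cachedSumOfSizes;
    mutable replace_on_move<bool, false> cacheIsUpToDate;
};

// Widget declares no copy or move operations; the implicit move constructor
// moves each member, and the wrapper resets the flag in the source.
bool moved_from_widget_keeps_invariant()
{
    Widget w(2, 3);
    Widget moved(std::move(w));
    return w.invariantHolds() && moved.invariantHolds();
}
```

The first option (a hand-written move constructor that clears the flag in the source) achieves the same effect, at the cost of having to list every member by hand.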

Cheers!
SG
 
Howard Hinnant

I agree that the problem is likely to be quite uncommon.  But given the size of
the body of existing code that the committee trots out as its standard argument
for avoiding introducing breaking changes (especially silent ones), it seems odd
that the decision was made to ask developers to verify that all their old code
will continue to work rather than to have them manually add defaulted move
operations to new code.  Yes, I realize that this would mean that old classes
would not magically benefit from move semantics when compiled with a C++0x compiler.

I'm disappointed that instead of following the links I supplied to
explain the motivation for this language feature you instead made
assumptions and then based on those incorrect assumptions disparaged
the hard work and long hours of many volunteers who have been doing
nothing more than try to improve the industry in which you make a
living.

The special move members were introduced to solve a correctness/
backwards compatibility problem. Not to get an automatic
optimization.

Here's the link again:

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3053.html

The references N3053 derives from are just as important.

-Howard
 
Alf P. Steinbach /Usenet

* Howard Hinnant, on 15.08.2010 23:00:
enum PositionState { empty, nought, cross };

class TicTacToeBoard
{
private:
    std::vector< PositionState > positions_;
public:
    TicTacToeBoard(): positions_( 9 ) {}

    // Operations.
};
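To spell out what goes wrong with this class: its invariant is that positions_ always holds nine entries, yet the compiler-generated move constructor moves the vector out of the source. The sketch below adds a hypothetical size() accessor, purely so the effect can be observed. It is not part of the original class.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

enum PositionState { empty, nought, cross };

class TicTacToeBoard
{
private:
    std::vector< PositionState > positions_;
public:
    TicTacToeBoard(): positions_( 9 ) {}

    // Hypothetical accessor, added only to observe the invariant.
    std::size_t size() const { return positions_.size(); }
};

// Number of positions left in a board after another board has been
// move-constructed from it via the implicit move constructor.
std::size_t size_after_being_moved_from()
{
    TicTacToeBoard board;
    TicTacToeBoard other( std::move( board ) );
    (void)other;
    return board.size();
}
```

A freshly constructed board reports nine positions, while size_after_being_moved_from() reports zero, so any operation that indexes positions_ without checking would be working on a board that no longer satisfies its own invariant.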

I'm disappointed that instead of following the links I supplied to
explain the motivation for this language feature you instead made
assumptions and then based on those incorrect assumptions disparaged
the hard work and long hours of many volunteers who have been doing
nothing more than try to improve the industry in which you make a
living.

Motivations good, final result bad. It's like Obama being Thorbjørned[1] into
getting the Nobel Peace Prize. Intentions of committee very good, I assure you.

The special move members were introduced to solve a correctness/
backwards compatibility problem. Not to get an automatic
optimization.

Sorry, that statement is logically inconsistent, and to boot addresses only
motivations, not final result (which result is very bad). Inconsistencies: (1)
there would be no correctness problem without moves and (2) there would be no
move semantics except for the motivation of efficiency. Plus as mentioned (3),
the motivation only matters for evaluating and choosing a fix, not for
discussing whether there is a problem, which there is: it breaks code.

Here's the link again:

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3053.html

The references N3053 derives from are just as important.

I think it's irrelevant, and that what is relevant is that people apparently
have painted themselves into position corners, unable to see solutions.

I'm pretty sure that resistance against fixing has to do with simple fear that
the benefit of efficiency will not be conferred on existing code to the degree
that it could have been, said fear based on inability to see solution to that.

One simple solution is how C++ has worked until now: each compiler vendor
provides options to let the programmer decide to apply optimizations that break
general standard C++ code.

For example, as default for direct invocation Microsoft's compiler applies the
optimization of not supporting RTTI (dynamic_cast, typeid) and not supporting
exception stack unwinding. These aspects can be turned on by compiler switches.
In Microsoft's IDE it's opposite: the language features are there by default,
and can be turned off by clicking here and there in suitable places.

Given that this programmer-decides-by-tool-invocation approach, while
problematic, has worked for practical programming work until now, the same
approach for letting the programmer decide to apply move semantics by default to
old code should work in practice. Especially when done in a better way than
backward compatibility has forced Microsoft to (wrong defaults for direct
compiler invocation, there not even supporting standard 'main'). On the other
hand, forcing it on the programmer by language rules is another matter; new
language rules that break existing code in spades are just Very Very Bad.


Cheers & hth.,

- Alf


Notes:
[1] The term "Thorbjørned" was introduced by the New York Times, after the
Norwegian politician Thorbjørn Jagland, former PM and member of the Nobel Peace
Prize committee, who pushed strongly for Obama receiving the prize, which Obama
did not want. It's sort of a play on Norwegian "bjørnetjeneste", a bear trying
to help a squirrel getting rid of a fly on the squirrel's nose, by swatting the
fly. Good intention, good motivation, bad result...
 
Scott Meyers

Howard said:
The special move members were introduced to solve a correctness/
backwards compatibility problem. Not to get an automatic
optimization.

Here's the link again:

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3053.html

I've read the explanatory part of N3053. (I skipped the proposed normative
language.) My table at
http://www.aristeia.com/C++0x/C++0xFeatureAvailability.htm even refers to it.
I've also read the equivalent parts of N3044, N2904, N2855 (and possibly others
-- it's hard to keep track after a while). I really do try to do my homework
before posting.

As regards the correctness/backwards compatibility problem, I don't know what
problem you are referring to. Fundamentally, the move operations can be special
or not. In the CD (N2798), they were not. In the FCD (N3092), they are. I
would be grateful if you would point me to an explanation of what
correctness/backwards compatibility problem in the CD draft is solved by making
the move operations special.

Scott
 
Howard Hinnant

One simple solution is how C++ has worked until now: each compiler vendor
provides options to let the programmer decide to apply optimizations that break
general standard C++ code.

For example, as default for direct invocation Microsoft's compiler applies the
optimization of not supporting RTTI (dynamic_cast, typeid) and not supporting
exception stack unwinding. These aspects can be turned on by compiler switches.
In Microsoft's IDE it's opposite: the language features are there by default,
and can be turned off by clicking here and there in suitable places.

Given that this programmer-decides-by-tool-invocation approach, while
problematic, has worked for practical programming work until now, the same
approach for letting the programmer decide to apply move semantics by default to
old code should work in practice. Especially when done in a better way than
backward compatibility has forced Microsoft to (wrong defaults for direct
compiler invocation, there not even supporting standard 'main'). On the other
hand, forcing it on the programmer by language rules is another matter; new
language rules that break existing code in spades are just Very Very Bad.

If you would like to officially propose a solution, contact me
privately (with draft in hand) and I will help you get a paper
published.

-Howard
 
