vector swap time complexity


Francesco S. Carta

Francesco said:
[..]
Just as a side note, Victor, I understand the points you mentioned
about the members, their size and their layout in memory, and I
understand that such cheats aren't entirely on-topic here.

Oh, don't get me wrong.  The approach you used and the discussion that
flows from it are on topic.  One's being wrong (or one's technique's
being non-portable or causing undefined behaviour) does not make one's
post off-topic.  Actually, au contraire.  I would rather have a chance
to discuss the approach and its portability, provide my point of
view, and have my opinion questioned, than have a sterile newsgroup with
only "ethical" code posted.

My fault; as I said in my previous post, I didn't mean "ethical"
in its social sense.

I didn't consider that such tricks can actually be legitimate
in real-life coding, hence I assumed it wasn't appropriate to speak about
them in this newsgroup, given its important role as an educational
reference.

I thought about digging deeper into the "reading others' memory" subject,
but I changed my mind.
The drift has gone far enough; we'll have further occasions, maybe
in an ad-hoc thread.
I just like to fiddle with things to get better insights. [..]

And that's good!

:)

Cheers,
Francesco
____________________________
Francesco S. Carta, hobbyist
http://fscode.altervista.org
 

Michael Doubez

Jiří Paleček wrote:
Michael Doubez wrote:
for simple "int" structure, the time complexity of "swap(&int,&int)" is simply O(1);
Actually, for any type T vector<T>.swap(othervector) has complexity of O(1) - the data storages and other fields are exchanged between the objects, no data outside of the actual 'vector<T>' object moves.
Well, I am not sure if we are on the same page about the meaning of O(1). I suspect it can be O(n), where "n" refers to the size of the type T (such as a vector).
Of course, it's always O(1) with regard to the whole object T.

From §23.1/5, I imagine the swap operation could be o(N) - because of the infamous Note A - but the requirements of §23.1/10 (i.e. swap doesn't invalidate pointers, references and iterators to elements) make it impossible in practice.

Why? The pointer is not invalidated, it just points to an element in
another container (and so does a reference or an iterator). Consider:
[...]

My point was that a swap in o(n) is not likely to be possible (unless
you use some trick to copy the data and keep the old data somewhere to
be flushed on the next invalidating function call).

Why? The "old data" is kept in the other vector (the one swapped with).

Hmmm. Is my English that bad?

IIUC, the question from the OP was whether swap() was guaranteed
to be O(1). My answer is that, although it is not guaranteed by
a "must" in the standard, the constraints on vector<>::swap() don't
allow for any other complexity (like O(n)) to exist.
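
As an aside, here is a minimal sketch of why the usual implementation technique gives O(1). It is not taken from any real standard library - toy_vector and its member names are made up for illustration - but it shows the idea behind "the data storages and other fields are exchanged" above: only the three internal pointers are swapped, so no element is touched and the cost is independent of size().

#include <algorithm>   // std::swap in C++03
#include <utility>     // std::swap in C++11 and later

template <typename T>
class toy_vector {
public:
    toy_vector() : begin_(0), end_(0), cap_(0) {}

    void swap(toy_vector& other) {
        std::swap(begin_, other.begin_);  // pointer to the first element
        std::swap(end_,   other.end_);    // pointer one past the last element
        std::swap(cap_,   other.cap_);    // pointer one past the allocated storage
        // No element is copied, moved or destroyed, so the cost does not
        // depend on size(); the elements stay where they are and simply
        // belong to the other object afterwards - which is also why
        // pointers, references and iterators to them remain valid.
    }

private:
    T* begin_;
    T* end_;
    T* cap_;
};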
 

Victor Bazarov

Michael said:
Jiří Paleček wrote:
Michael Doubez wrote:
for simple "int" structure, the time complexity of "swap(&int,&int)" is simply O(1);
Actually, for any type T vector<T>.swap(othervector) has complexity of O(1) - the data storages and other fields are exchanged between the objects, no data outside of the actual 'vector<T>' object moves.
Well, I am not sure if we are on the same page about the meaning of O(1). I suspect it can be O(n), where "n" refers to the size of the type T (such as a vector).
Of course, it's always O(1) with regard to the whole object T.

From §23.1/5, I imagine the swap operation could be o(N) - because of the infamous Note A - but the requirements of §23.1/10 (i.e. swap doesn't invalidate pointers, references and iterators to elements) make it impossible in practice.

Why? The pointer is not invalidated, it just points to an element in
another container (and so does a reference or an iterator). Consider:
[...]

My point was that a swap in o(n) is not likely to be possible (unless
you use some trick to copy the data and keep the old data somewhere to
be flushed on the next invalidating function call).

Why? The "old data" is kept in the other vector (the one swapped with).

Hmmm. Is my English that bad?

IIUC, the question from the OP was whether swap() was guaranteed
to be O(1). My answer is that, although it is not guaranteed by
a "must" in the standard, the constraints on vector<>::swap() don't
allow for any other complexity (like O(n)) to exist.

I think the confusion is coming from your (rather strange) use of proof
by negation. Supposedly, there are some conditions that make O(n)
pretty much *impossible*, *therefore* only O(1) is possible. That just
doesn't sound right.

No conditions can exist that would make it impossible to implement a
slower algorithm while allowing for a faster one. I can always add a
superficial loop to repeat *any* algorithm N times, which would
contribute nothing to the result, and render the resulting algorithm N
times slower.

Or is my understanding of algorithm complexity that bad?

V
 

Alf P. Steinbach

* Victor Bazarov:
Michael said:
Jiří Paleček wrote:
On Tue, 15 Sep 2009 16:32:48 +0200, Michael Doubez

Michael Doubez wrote:
for simple "int" structure, the time complexity of "swap(&int,&int)" is simply O(1);
Actually, for any type T vector<T>.swap(othervector) has complexity of O(1) - the data storages and other fields are exchanged between the objects, no data outside of the actual 'vector<T>' object moves.
Well, I am not sure if we are on the same page about the meaning of O(1). I suspect it can be O(n), where "n" refers to the size of the type T (such as a vector).
Of course, it's always O(1) with regard to the whole object T.

From §23.1/5, I imagine the swap operation could be o(N) - because of the infamous Note A - but the requirements of §23.1/10 (i.e. swap doesn't invalidate pointers, references and iterators to elements) make it impossible in practice.

Why? The pointer is not invalidated, it just points to an element in
another container (and so does a reference or an iterator). Consider:
[...]

My point was that a swap in o(n) is not likely to be possible (unless
you use some trick to copy the data and keep the old data somewhere to
be flushed on the next invalidating function call).

Why? The "old data" is kept in the other vector (the one swapped with).

Hmmm. Is my English that bad?

IIUC, the question from the OP was whether swap() was guaranteed
to be O(1). My answer is that, although it is not guaranteed by
a "must" in the standard, the constraints on vector<>::swap() don't
allow for any other complexity (like O(n)) to exist.

I think the confusion is coming from your (rather strange) use of proof
by negation. Supposedly, there are some conditions that make O(n)
pretty much *impossible*, *therefore* only O(1) is possible. That just
doesn't sound right.

No conditions can exist that would make it impossible to implement a
slower algorithm while allowing for a faster one.

Requiring a fast one is such a constraint. :)

I can always add a
superficial loop to repeat *any* algorithm N times, which would
contribute nothing to the result, and render the resulting algorithm N
times slower.

Or is my understanding of algorithm complexity that bad?

No, I think Michael is referring to the constraints on swap in §23.1/10, as
mentioned above, which, if they were absolute constraints, would make it
impossible for swap to allocate memory, since swap wouldn't be allowed to
throw (and allocation can fail).

With respect to std::vector those constraints are not exactly clear, however.

§23.1/10 states "Unless otherwise specified (see ... 23.2.4.3) ...", but in
§23.2.4.3, which is about std::vector modifiers, one finds nothing about swap.

This seems like a defect in the 1998 standard.

I haven't checked whether it's been corrected or whether there is an active issue.


Cheers,

- Alf
 

Jiří Paleček

Jiří Paleček wrote:
Michael Doubez wrote:
for simple "int" structure, the time complexity of "swap(&int,&int)" is simply O(1);
Actually, for any type T vector<T>.swap(othervector) has complexity of O(1) - the data storages and other fields are exchanged between the objects, no data outside of the actual 'vector<T>' object moves.
Well, I am not sure if we are on the same page about the meaning of O(1). I suspect it can be O(n), where "n" refers to the size of the type T (such as a vector).
Of course, it's always O(1) with regard to the whole object T.

From §23.1/5, I imagine the swap operation could be o(N) - because of the infamous Note A - but the requirements of §23.1/10 (i.e. swap doesn't invalidate pointers, references and iterators to elements) make it impossible in practice.

Why? The pointer is not invalidated, it just points to an element in
another container (and so does a reference or an iterator). Consider:
[...]

My point was that a swap in o(n) is not likely to be possible (unless
you use some trick to copy the data and keep the old data somewhere to
be flushed on the next invalidating function call).
Why? The "old data" is kept in the other vector (the one swapped with).

Hmmm. Is my English that bad?

No, your English is good. The misunderstanding came from something else.
IIUC, the question from the OP was whether swap() was guaranteed
to be O(1). My answer is that, although it is not guaranteed by
a "must" in the standard, the constraints on vector<>::swap() don't
allow for any other complexity (like O(n)) to exist.

You answered "I imagine the swap operation could be o(N) ... but
requirements ... make it impossible in practice." In that sentence, you
used little-oh, which means roughly "strictly less than c*n", but you meant
big-oh (O(n)), which means roughly "less than _or_ _equal_ to c*n". With
this shift of meaning, the sentence takes on a totally different sense.
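
For reference, the usual formal definitions behind that gloss, written out in standard asymptotic notation (a quick aside, not part of the original exchange):

% f is O(n): bounded above by some constant multiple of n, eventually.
f(n) \in O(n) \iff \exists\, c > 0 \;\exists\, n_0 \;\forall\, n \ge n_0 : \; f(n) \le c \cdot n
% f is o(n): eventually below every constant multiple of n.
f(n) \in o(n) \iff \forall\, c > 0 \;\exists\, n_0 \;\forall\, n \ge n_0 : \; f(n) \le c \cdot n

So o(n) is the strictly stronger claim: an o(n) swap would have to become cheaper and cheaper relative to n, whereas O(n) merely caps the growth.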

Also, your point along the lines of "manipulating the data is
impossible, so it must be O(1)" is not really valid. First, nobody can
prevent me from creating an algorithm as slow as I want - even without
violating any restrictions. Second, there may be vector implementations
whose swap doesn't have complexity O(n), or O(2^n), or anything
else that depends solely on n. Imagine an implementation of vector that keeps
track of its iterators (by maintaining a list, or maybe an intrusive list
of them, and storing a reference to the vector in the actual iterator) for
debugging purposes or something. When you swap two of these vectors, all
of the iterators currently pointing into them have to be relabeled to keep
the info consistent. This will take time proportional to the number of
iterators you have, which is definitely not O(1).
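
To make that concrete, here is a hedged sketch of such a container. It is not any real library's debug mode - checked_vector and all its names are invented for illustration - but it shows how registering live iterators makes swap cost proportional to the number of iterators rather than to size().

#include <cstddef>
#include <set>
#include <vector>

template <typename T>
class checked_vector {
public:
    class iterator {
    public:
        iterator(checked_vector* owner, std::size_t index)
            : owner_(owner), index_(index) { owner_->track(this); }
        iterator(const iterator& other)
            : owner_(other.owner_), index_(other.index_) { owner_->track(this); }
        ~iterator() { owner_->untrack(this); }

        T& operator*() const { return owner_->data_[index_]; }

        // Relabelling hook used by swap below; not part of a real iterator
        // interface (copy assignment and the usual operators are omitted).
        void rebind(checked_vector* new_owner) { owner_ = new_owner; }

    private:
        checked_vector* owner_;   // back-reference used for debug checks
        std::size_t index_;
    };

    iterator begin() { return iterator(this, 0); }
    void push_back(const T& value) { data_.push_back(value); }

    void swap(checked_vector& other) {
        data_.swap(other.data_);  // O(1): the element storage is exchanged
        live_.swap(other.live_);  // O(1): the iterator registries follow the elements
        relabel(*this);           // O(number of iterators that now belong to *this)
        relabel(other);           // O(number of iterators that now belong to other)
    }

private:
    void track(iterator* it)   { live_.insert(it); }
    void untrack(iterator* it) { live_.erase(it); }

    static void relabel(checked_vector& v) {
        for (iterator* p : v.live_)   // this walk is what breaks the O(1) bound
            p->rebind(&v);
    }

    std::vector<T> data_;
    std::set<iterator*> live_;        // every live iterator into this vector
};

With two such vectors holding many live iterators, v1.swap(v2) still does O(1) work on the element storage, but O(i) work relabelling, where i is the number of live iterators - a cost that depends on something other than n.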

Regards
Jiri Palecek
 

Jerry Coffin

[ ... ]
I think the confusion is coming from your (rather strange) use of
proof by negation. Supposedly, there are some conditions that make
O(n) pretty much *impossible*, *therefore* only O(1) is possible.
That just doesn't sound right.

No conditions can exist that would make it impossible to implement a
slower algorithm while allowing for a faster one. I can always add a
superficial loop to repeat *any* algorithm N times, which would
contribute nothing to the result, and render the resulting algorithm N
times slower.

Technically, you're right -- if somebody wanted it badly enough, they
could create a swap that had linear (or quadratic, cubic, etc.)
complexity.
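
For what it's worth, a deliberately silly sketch of that point (slow_swap is a made-up name, not anything from the standard library): a perfectly good O(1) swap wrapped in a do-nothing loop that makes the call linear in size() without changing the result.

#include <cstddef>
#include <vector>

template <typename T>
void slow_swap(std::vector<T>& a, std::vector<T>& b)
{
    volatile std::size_t sink = 0;            // keeps the loop from being optimised away
    for (std::size_t i = 0; i != a.size(); ++i) {
        sink = sink + i;                      // contributes nothing to the result
    }
    a.swap(b);                                // the actual work is still O(1)
}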

I suppose you could argue that there might (conceivably) even be a
practical application. For example, consider if you had a string
class that allocated string space from a garbage collected pool. When
you swapped two vectors of strings, you might also decide to invoke
the garbage collector for the string pool space. In this case, the
complexity of the swap might be non-constant, and completely
unrelated to the size of either vector.

Of course, somebody might want to know why you'd invoke GC in
response to swapping two vectors of strings. I don't have a good
answer for that. While I can conceive of the possibility, I'm not
sure I have a good justification.
 

Joshua Maurice

No, never.  C-style casts are notorious for being hard to spot.  Use
'reinterpret_cast' or 'static_cast'; at least whoever gets to maintain
your code can use "find in files" to identify questionable (potentially
dangerous) areas or techniques.

Also, never use C-style casts, because sometimes they're static_casts
and sometimes they're reinterpret_casts, and they can change silently
based on context - for example, if you have a fully defined type in scope,
but someone changes the full definition to a forward declaration somewhere
completely separate from your code, in a header included from a header
included from a header.
// could be a reinterpret_cast or a static_cast, depending on what's visible
A* a = (A*) b;
You almost never want a reinterpret_cast, and C-style casts intended as
static_casts can silently change meaning and break your code, all
without any compiler warnings or errors.
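
A hedged sketch of how that can happen (B1, B2 and D are invented types, and FULL_DEFINITION_VISIBLE is just an illustrative switch): with the full definition of D visible, the C-style cast behaves as a static_cast and adjusts the pointer for the second base; with only a forward declaration visible, the very same line degrades to a reinterpret_cast and yields a wrong address.

struct B1 { int x; };
struct B2 { int y; };

#ifdef FULL_DEFINITION_VISIBLE
struct D : B1, B2 { };   // full definition: the cast below adjusts the address
#else
struct D;                // forward declaration only: the cast just reinterprets the bits
#endif

B2* to_b2(D* d)
{
    return (B2*) d;      // same source text, two very different casts
}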
 

Jerry Coffin

[ ... ]
No, I think Michael is referring to the constraints on swap in
§23.1/10, as mentioned above, which, if they were absolute
constraints, would make it impossible for swap to allocate memory,
since swap wouldn't be allowed to throw (and allocation can fail).

With respect to std::vector those constraints are not exactly clear, however.

I think they are.
§23.1/10 states "Unless otherwise specified (see ... 23.2.4.3) ...", but in
§23.2.4.3, which is about std::vector modifiers, one finds nothing about swap.

The constraints in §23.1/10 also talk about (for example) push_back,
and 23.2.4.3 has some other specifications about push_back, which
modify the constraints on push_back when applied to a vector. Since
there is no other specification about swap for a vector, the
constraints in §23.1/10 are absolute for vector::swap().
 

Alf P. Steinbach

* Jerry Coffin:
[ ... ]
No, I think Michael is referring to the constraints on swap in
§23.1/10, as mentioned above, which, if they were absolute
constraints, would make it impossible for swap to allocate memory,
since swap wouldn't be allowed to throw (and allocation can fail).

With respect to std::vector those constraints are not exactly clear, however.

I think they are.
§23.1/10 states "Unless otherwise specified (see ... 23.2.4.3) ...", but in
§23.2.4.3, which is about std::vector modifiers, one finds nothing about swap.

The constraints in §23.1/10 also talk about (for example) push_back,
and 23.2.4.3 has some other specifications about push_back, which
modify the constraints on push_back when applied to a vector. Since
there is no other specification about swap for a vector, the
constraints in §23.1/10 are absolute for vector::swap().

Ah, thanks.

Conclusion: std::swap is O(1) guaranteed for vector (in terms of vector size),
*unless* the implementation adds in a do-nothing loop of higher complexity just
to be mean. :)


Cheers,

- Alf
 

Jerry Coffin

[ ... ]
Conclusion: std::swap is O(1) guaranteed for vector (in terms of
vector size), *unless* the implementation adds in a do-nothing loop
of higher complexity just to be mean. :)

... or somebody thought it was a convenient place to do something
(almost) entirely unrelated.
 

Michael Doubez

Jiří Paleček wrote:
for simple "int" structure, the time complexity of "swap(&int,&int)" is simply O(1);
Actually, for any type T vector<T>.swap(othervector) has complexity of O(1) - the data storages and other fields are exchanged between the objects, no data outside of the actual 'vector<T>' object moves.
Well, I am not sure if we are on the same page about the meaning of O(1). I suspect it can be O(n), where "n" refers to the size of the type T (such as a vector).
Of course, it's always O(1) with regard to the whole object T.
From §23.1/5, I imagine the swap operation could be o(N) - because of the infamous Note A - but the requirements of §23.1/10 (i.e. swap doesn't invalidate pointers, references and iterators to elements) make it impossible in practice.
Why? The pointer is not invalidated, it just points to an element in another container (and so does a reference or an iterator). Consider:
[...]
My point was that a swap in o(n) is not likely to be possible (unless you use some trick to copy the data and keep the old data somewhere to be flushed on the next invalidating function call).
Why? The "old data" is kept in the other vector (the one swapped with).
Hmmm. Is my English that bad?

No, your English is good. The misunderstanding came from something else.
IIUC, the question from the OP was whether swap() was guaranteed
to be O(1). My answer is that, although it is not guaranteed by
a "must" in the standard, the constraints on vector<>::swap() don't
allow for any other complexity (like O(n)) to exist.

You answered "I imagine the swap operation could be o(N) ... but
requirements ... make it impossible in practice." In that sentence, you
used little-oh, which means roughly "strictly less than c*n", but you meant
big-oh (O(n)), which means roughly "less than _or_ _equal_ to c*n". With
this shift of meaning, the sentence takes on a totally different sense.

Also, your point along the lines of "manipulating the data is
impossible, so it must be O(1)" is not really valid. First, nobody can
prevent me from creating an algorithm as slow as I want - even without
violating any restrictions. Second, there may be vector implementations
whose swap doesn't have complexity O(n), or O(2^n), or anything
else that depends solely on n. Imagine an implementation of vector that keeps
track of its iterators (by maintaining a list, or maybe an intrusive list
of them, and storing a reference to the vector in the actual iterator) for
debugging purposes or something. When you swap two of these vectors, all
of the iterators currently pointing into them have to be relabeled to keep
the info consistent. This will take time proportional to the number of
iterators you have, which is definitely not O(1).

Point taken. I was definitely unclear.
 

James Kanze

I disagree, if we understand that the goal is to investigate
what a particular implementation actually does, and not
something to be used in production code.

The only question I have to pose here is the use of size_t. In
practice, it's probably an adequate choice for almost any
implementation, but formally, anything but a character type
results in undefined behavior.

I have the following in my library, expressly for such cases:

// Dump:
// =====
//
//! This is a simple class which can be used to generate a hex
//! dump of any object. (It generates the hex dump all on a
//! single line, as a sequence of two-digit hex values, separated
//! by spaces.) In almost all cases, the actual instance of this
//! class will be a temporary, generated as the return value of
//! the function template dump(), so that template type deduction
//! can be used.
//!
//! \warning
//!     This class does <strong>not</strong> save a copy of the
//!     object, but only a reference to it. This means that it
//!     will contain a dangling reference if the object ceases to
//!     exist before the instance of the class. On the other
//!     hand, it also means that it is possible to dump the memory
//!     of objects which cannot be copied, a signaling NaN, for
//!     example. It is, however, strongly recommended that only
//!     temporary instances of this class be used (and that no
//!     instance be bound to a reference, other than as a function
//!     argument); the rules concerning the lifetime of temporaries
//!     make this always safe.
// ---------------------------------------------------------------------------
template< typename T >
class Dump : Gabi::Util::IOStreamOperators< Dump< T > >
{
public:
    explicit            Dump( T const& obj ) ;
    void                print( std::ostream& dest ) const ;

    //!@cond implementation
private:
    unsigned char const* myObj ;
    //!@endcond
} ;

//! Allows the use of template type deduction, i.e.:
//! \code
//!     std::cout << dump( someObject ) ;
//! \endcode
//! rather than
//! \code
//!     std::cout << Dump< ObjectType >( someObject ) ;
//! \endcode
//!
//! \param obj
//!     The object for which an instance of Dump<T> is desired.
//!
//! \return
//!     The instance of Dump<T>.
// ---------------------------------------------------------------------------
template< typename T >
inline Dump< T >
dump(
    T const&            obj )
{
    return Dump< T >( obj ) ;
}

template< typename T >
Dump< T >::Dump(
    T const&            obj )
    :   myObj( reinterpret_cast< unsigned char const* >( &obj ) )
{
}

template< typename T >
void
Dump< T >::print(
    std::ostream&       dest ) const
{
    IOSave              saver( dest ) ;
    dest.fill( '0' ) ;
    dest.setf( std::ios::hex, std::ios::basefield ) ;
    char const*         baseStr = "" ;
    if ( (dest.flags() & std::ios::showbase) != 0 ) {
        baseStr = "0x" ;
        dest.unsetf( std::ios::showbase ) ;
    }
    unsigned char const* const
                        end = myObj + sizeof( T ) ;
    for ( unsigned char const* p = myObj ; p != end ; ++ p ) {
        if ( p != myObj ) {
            dest << ' ' ;
        }
        dest << baseStr << std::setw( 2 )
             << (unsigned int)( *p ) ;
    }
}

(Note that the class template IOStreamOperators generates the <<
and >> operators automatically, and IOSave, of course, saves the
various formatting information, and restores it in its
destructor.)

Again, I don't think I've ever used it in production code, but
it's often useful for checking assumptions about the
implementation (big endian or little, are doubles IEEE, etc.).
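
In case it helps to see the idea without the Gabi dependencies, here is a stripped-down, self-contained sketch of the same technique (the name dump_bytes and the formatting details are mine, not from the library above): the object's bytes are read through unsigned char const*, which is the access the standard permits, and the output immediately reveals byte order.

#include <cstddef>
#include <iomanip>
#include <iostream>

template< typename T >
void dump_bytes( T const& obj )
{
    unsigned char const* p = reinterpret_cast< unsigned char const* >( &obj ) ;
    for ( std::size_t i = 0 ; i != sizeof( T ) ; ++ i ) {
        if ( i != 0 ) {
            std::cout << ' ' ;
        }
        // note: unlike Dump above, this doesn't restore the stream's formatting state
        std::cout << std::hex << std::setw( 2 ) << std::setfill( '0' )
                  << static_cast< unsigned int >( p[ i ] ) ;
    }
    std::cout << '\n' ;
}

int main()
{
    dump_bytes( 0x01020304 ) ;   // "04 03 02 01" on a little-endian machine,
                                 // "01 02 03 04" on a big-endian one
    dump_bytes( 1.0 ) ;          // lets you eyeball whether double looks like IEEE 754
}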

[...]
BTW, should I have preferred a C-style cast such as
"(size_t*)pv" instead of reinterpret_cast, in such a cheat?

Absolutely not. The purpose of reinterpret_cast is to allow
such cheats. And to make them visible as such.
 

Francesco S. Carta

I disagree, if we understand that the goal is to investigate
what a particular implementation actually does, and not
something to be used in production code.

I missed that line of Victor's before... I suppose he meant to speak
strictly about my original code - which, I admit, is not exactly good.
You're both right. It's good for investigating and it's bad because it
makes assumptions about an object without verifying them in the code (a
simple "sizeof(vector)" and a suitable loop, and the job would have been done).

Anyway, such things are better restricted to research code - as we all
agree :)
Unfortunately I still have no "production code" to worry about - but
I'd like to.
The only question I have to pose here is the use of size_t. In
practice, it's probably an adequate choice for almost any
implementation, but formally, anything but a character type
results in undefined behavior.

Well, I didn't say that before, when I replied to Victor: that was
less than a guess, 'twas a try ;-)

I knew that I could use chars - I did it with them the first time -
but since I would have had to explicitly group, shift and add them up
in order to get an address to compare, I decided to use size_t
directly, without worrying about anything at all.

I'm using reinterpret_cast<uint8_t*> now in my code.
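
For the record, here is a hedged sketch of the two approaches (fake_vector is a made-up stand-in for "an object whose first member is a pointer", and it assumes a little-endian machine where a pointer fits in a size_t): the character-type walk that groups, shifts and adds the bytes, versus reinterpreting the storage as size_t, which happens to work on common implementations but is formally undefined behaviour.

#include <cstddef>
#include <cstdio>

struct fake_vector {          // stand-in for an implementation whose first member is a pointer
    int* first;
    int* last;
    int* end_of_storage;
};

int main()
{
    int storage[4] = { 1, 2, 3, 4 };
    fake_vector v = { storage, storage + 4, storage + 4 };

    // Character-type access: always allowed for inspecting object bytes.
    unsigned char const* bytes = reinterpret_cast<unsigned char const*>(&v);
    std::size_t reassembled = 0;
    for (std::size_t i = 0; i != sizeof(int*); ++i) {
        // group, shift and add the bytes (assuming little-endian byte order)
        reassembled |= static_cast<std::size_t>(bytes[i]) << (8 * i);
    }

    // size_t access: shorter to write, but formally undefined behaviour.
    std::size_t direct = *reinterpret_cast<std::size_t const*>(&v);

    std::printf("reassembled = %zx\ndirect      = %zx\n", reassembled, direct);
}

On a typical implementation both lines print the address of storage; only the byte-wise read is something the standard actually lets you rely on.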

Thanks for the heads up, to both.

Best regards,
Francesco
____________________________
Francesco S. Carta, hobbyist
http://fscode.altervista.org
 
