vector.clear() and vector copying

Jess

Hello,

I tried to clear a vector "v" using "v.clear()". If "v" contains
those objects that are non-built-in (e.g. string), then "clear()" can
indeed remove all contents. However, if "v" contains built-in types
(e.g. int), then "clear()" doesn't remove anything at all. Why does
"clear()" have this behaviour?

Also, when I copy one vector "v1" from another vector "v2", with "v1"
longer than "v2" (e.g. "v1" has 2 elements and "v2" has one element),
then depending on the type of their contents, I found out:

1. if the type is built-in, then v1[0] has the value of v2[0],
but v1[1] still has the old value of v1[1].
2. if the type is not built-in, then v1[0] has the value of v2[0], but
v1[1] has some garbage value.

This makes me think vector copying and clearing (clear()) have
different effects on vectors, depending on the type. What is going on
exactly?

Thanks!
 
Erik Wikström

Most likely your test is wrong, not the implementation of the vector. A
quick guess is that you did something like this:

std::vector<int> v;
v.push_back(1);
v.push_back(2);
v.push_back(3);

v.clear();

std::cout << v[2] << std::endl;

and got 3 printed. However, the above code is not correct. The []
operator is defined so that v[n] is equivalent to *(v.begin() + n), but
after clear() the vector is empty, so v.begin() + n no longer refers to
a valid element and just about anything can happen. Try to reproduce
your results using v.at() instead of v[] and see if you still get the
same result.
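
A minimal sketch of that suggestion, assuming a C++11 compiler (for std::to_string). The names describe_at and after_clear are made up for this illustration and are not from the original test. The point is only that at() must check the index and throw std::out_of_range, whereas v[n] on an empty vector is undefined.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: reports what v.at(n) does instead of letting the
// exception escape. Unlike v[n], at() is required to check the index.
std::string describe_at(const std::vector<int>& v, std::size_t n)
{
    try {
        return "value " + std::to_string(v.at(n));
    } catch (const std::out_of_range&) {
        return "out of range";
    }
}

// Rebuilds the guessed test: three push_backs, clear(), then index 2.
std::string after_clear()
{
    std::vector<int> v;
    v.push_back(1);
    v.push_back(2);
    v.push_back(3);
    v.clear();
    return describe_at(v, 2);   // the vector is empty, so at(2) throws
}
```

Calling after_clear() should give "out of range" on any conforming implementation, whatever the old memory happens to contain.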
 
Gavin Deane


Hello. Before I go on, have a look at the FAQ for advice on posting
questions about code that doesn't work.

http://www.parashift.com/c++-faq-lite/how-to-post.html#faq-5.8

The important point there is to post the code, not a description of
the code. You are having a problem in C++. You can describe that
problem precisely in C++, but only approximately in English. I am
having to guess a bit, which means I might be wrong.

For an example of what minimal means, see the code example below or
the one I posted in response to your question about returning
references to local variables. Enough to illustrate the point but
absolutely no more.

One final point. ALWAYS copy and paste DIRECTLY from your code editor
into your message. If you type code in to your message by hand, you
risk introducing other errors which distract from your actual problem.

Have a look at the rest of the FAQ too while you're there. Lots of
useful stuff.
I tried to clear a vector "v" using "v.clear()". If "v" contains
those objects that are non-built-in (e.g. string), then "clear()" can
indeed remove all contents. However, if "v" contains built-in types
(e.g. int), then "clear()" doesn't remove anything at all. Why does
"clear()" have this behaviour?

What makes you think nothing is removed?

#include <vector>
#include <iostream>
#include <string>
using namespace std;

int main()
{
vector<int> vi(2);
vector<string> vs(2);
cout << vi.size() << " " << vi.size() << "\n";

vi.clear();
vs.clear();
cout << vi.size() << " " << vi.size() << "\n";
}

If the above program prints out anything other than

2 2
0 0

then your compiler's broken.

Now, (and here I am speculating), maybe your development environment
allows you to look at the contents of the memory that, before you
called clear, was owned by the vector. Note that clear does not have
to *change* the contents of the memory, it just removes the elements
from the vector. As far as the vector is concerned, it is now empty.
What values happen to still exist in memory it no longer owns is of no
concern.
Also, when I copy one vector "v1" from another vector "v2", with "v1"
longer than "v2" (e.g. "v1" has 2 elements and "v2" has one element),
then depending on the type of their contents, I found out:

1. if the type is built-in then, then v1[0] has the value of v2[0],
but v1[1] still has the old value of v1[1].
2. if the type is not built-in, then v1[0] has the value of v2[0], but
v1[1] has some garbage value.

This time you'll have to show some code. There are lots of ways of
copying elements from one to another: assignment, calling insert,
std::copy, manually copying element by element. Those are just some
examples off the top of my head. Without knowing which technique you
are using it is impossible to know why you are seeing confusing
results. However, I'll have a guess again. Are you trying something
like this?

#include <vector>
#include <iostream>
using namespace std;

int main()
{
    // Create two different sized vectors
    // and give them values.
    vector<int> v1(1);
    vector<int> v2(2);

    v1[0] = 42;

    v2[0] = 100;
    v2[1] = 200;

    // Manually copy elements from v2 (size 2)
    // to v1 (size 1).
    v1[0] = v2[0]; // OK, now v1[0] == 100.
    v1[1] = v2[1]; // NOOOOOO!! BAD!!
}

The last line v1[1] = v2[1]; cannot work. v1 has size 1 so there is
no element with an index of 1 (which would be the second element) to
assign to. If you try to do something like that, the behaviour is
undefined. That means anything can happen. It can crash. It can appear
to work. It can behave differently for different contained types. It
can work in debug but not in release. It can stop working when you
change to a different compiler, or just change compiler settings. It
can work on Monday but not on Tuesday. Or anything else you can think
of.
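
For contrast, here is a rough sketch of two ways to copy that do not write past the end of the destination. The function names copy_by_assignment and copy_by_resize are invented for this example. Both make the destination the right size before any element is written.

```cpp
#include <algorithm>
#include <vector>

// Option 1: assignment. The destination is resized to match the source,
// whether it was shorter or longer beforehand.
void copy_by_assignment(std::vector<int>& dst, const std::vector<int>& src)
{
    dst = src;
}

// Option 2: make room first, then copy element by element with std::copy.
// Without the resize(), std::copy could write to elements that do not exist.
void copy_by_resize(std::vector<int>& dst, const std::vector<int>& src)
{
    dst.resize(src.size());
    std::copy(src.begin(), src.end(), dst.begin());
}
```

With v1 of size 1 and v2 holding 100 and 200, either call leaves v1 with size 2 and the same two values as v2.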
This makes me think vector copying and clearing (clear.()) have
different effects to vectors, depending on the type. what is going on
exactly?

Rest assured that the effect of any operation on a vector is
independent of the type of object contained (as long as it is a type
that meets the requirements, copy constructible and assignable, for
being stored in a vector).

If I haven't answered your question yet, post some code, following the
guidelines in the FAQ, that illustrates the problem.

Gavin Deane
 
Michael Ekstrand

The vector class doesn't actually shed its capacity for elements when
cleared, or when reduced by some other means (resize(), or assigning a
smaller vector). This is for efficiency: if it doesn't need to resize,
why waste the time allocating a new array, copying data, etc.? Also, if
it gave the memory back and you then expanded it again (e.g. with insert()
or push_back()), it would just need to re-allocate more space. The vector
keeps track of how many elements it's actually storing, as well as how
many elements it has allocated room for, so the vector length is correct
even if the allocated memory is larger.
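
The difference between the two numbers can be observed through size() and capacity(). A small sketch follows. The Counts struct and both function names are assumptions made up for this illustration, and the claim that capacity survives clear() reflects how the mainstream implementations behave rather than something checked against every library.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical record of the two numbers the vector keeps track of.
struct Counts {
    std::size_t size;      // elements actually stored
    std::size_t capacity;  // elements the allocated memory has room for
};

Counts counts_of(const std::vector<int>& v)
{
    Counts c = { v.size(), v.capacity() };
    return c;
}

// Builds a vector of n zeros, clears it, and reports both numbers.
Counts counts_after_clear(std::size_t n)
{
    std::vector<int> v(n, 0);
    v.clear();             // removes the elements, keeps the allocation
    return counts_of(v);
}
```

For counts_after_clear(100) the size is 0, while the capacity is typically still at least 100.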

With your v1/v2 example, yes, v1[1] may still appear to hold the old
value, but reading it is undefined behavior. You cannot count on it:
accessing v1[1] is illegal. It just happens to still work.

If you actually want to make sure that you get rid of unneeded capacity,
the idiomatic way to accomplish this (shown here for a vector<int>) is:

vector<int>(v1).swap(v1);

This creates a new vector, which will only allocate as much memory as is
needed to hold the current *contents* of v1. It then swaps this vector's
internal storage with the storage of v1 (a constant-time operation). v1's
old internal storage goes away, and v1 now has a stripped store.
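
Spelled out as functions, the idiom might look like the sketch below. The names shrink_by_swap and release_all are invented here, and how much capacity the temporary copy ends up with is up to the implementation. In C++11 and later, v.shrink_to_fit() asks for the same thing, as a non-binding request.

```cpp
#include <vector>

// Copy-and-swap shrink: the temporary copy holds only the current
// elements, then trades storage with v. The old, larger buffer is
// freed when the temporary is destroyed at the end of the statement.
void shrink_by_swap(std::vector<int>& v)
{
    std::vector<int>(v).swap(v);
}

// Variation on the same idea: swap with an empty temporary to drop the
// elements and the storage together.
void release_all(std::vector<int>& v)
{
    std::vector<int>().swap(v);
}
```

After shrink_by_swap(v) the elements are unchanged and capacity() is normally no larger than before.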

- Michael
 
BobR

Gavin Deane said:
The important point there is to post the code, not a description of
the code. [snip]
One final point. ALWAYS copy and paste DIRECTLY from your code editor
into your message. If you type code in to your message by hand, you
risk introducing other errors which distract from your actual problem.
[snip]
What makes you think nothing is removed?

#include <vector>
#include <iostream>
#include <string>
using namespace std;

<joking>
Mr. Deane!! When you *give* advice, you should also *use* it!
</joking> <G>
int main(){
vector<int> vi(2);
vector<string> vs(2);
// > cout << vi.size() << " " << vi.size() << "\n";
cout << vi.size() << " " << vs.size() << "\n";
vi.clear();
vs.clear();
// > cout << vi.size() << " " << vi.size() << "\n";
cout << vi.size() << " " << vs.size() << "\n";
}

If the above program prints out anything other than
2 2
0 0

then your compiler's broken.

Been there, done that! (way too many times! <G>).
 
Jess

Thanks for all the replies and sorry for not being too specific, here
is my code. :)

#include<iostream>
#include<vector>
#include<string>

using namespace std;

int main(){
    vector<int> v1;
    vector<int> v2;
    vector<string> v3;
    v1.push_back(1);
    v1.push_back(2);
    v2.push_back(3);
    v3.push_back("a");

    cout << v1.size() << endl; //output size is indeed 2
    cout << v2.size() << endl; //output size is indeed 1
    cout << v3.size() << endl; //output size is indeed 1
    v1.clear(); //clear v1
    cout << v1.size() << endl; //v1's size is indeed 0
    cout << v1[0] << endl; //output is 1, garbage?

    v3.clear();
    cout << v3.size() << endl; //size is indeed 0
    cout << v3[0] << endl; //null string on one machine, and "Segmentation fault" on another

    //vector copying
    v1.push_back(1); //put back the values again
    v1.push_back(2);

    v1 = v2; //copy v2 to v1
    cout << v1.size() << endl; //v1's size has been decreased to 1
    return 0;
}

If I used vector.at() as suggested by Erik for v1.at(0), then I got
"Abort". So, is vector.at() a much better way to access vector's
content than indices?

Thanks!
 
Erik Wikström

If I used vector.at() as suggested by Erik for v1.at(0), then I got
"Abort". So, is vector.at() a much better way to access vector's
content than indices?

Yes, it checks whether the element you are trying to access exists,
and if it doesn't, it throws an exception. However, depending on how
you use the vector, there might be situations where you know for sure
that the element is in the vector, and then the [] operator can be
used. One example of such a situation is looping over all members of
the vector:

for (int i = 0; i < v.size(); ++i)
    std::cout << v[i] << "\n";
 
Gavin Deane

<joking>
Mr. Deane!! When you *give* advice, you should also *use* it!
</joking> <G>

Oh but I did. I compiled it, I ran it *and* I checked the output. It's
just that the program I wrote happens to generate the same output as
the program I meant to write...

Thanks :)

Gavin Deane
 
Gavin Deane

Thanks for all the replies and sorry for not being too specific, here
is my code. :)

#include<iostream>
#include<vector>
#include<string>

using namespace std;

int main(){
vector<int> v1;
vector<int> v2;
vector<string> v3;
v1.push_back(1);
v1.push_back(2);
v2.push_back(3);
v3.push_back("a");

cout << v1.size() << endl; //output size is indeed 2
cout << v2.size() << endl; //output size is indeed 1
cout << v3.size() << endl; //output size is indeed 1
v1.clear(); //clear v1
cout << v1.size() << endl; //v1's size is indeed 0
cout << v1[0] << endl; //output is 1, garbbage?

The behaviour of the above statement, specifically the expression
v1[0] which attempts to access the first element of a vector with no
elements, is undefined. Any output is as valid or invalid as any other
output, and any other behaviour (e.g. a crash) is as valid or invalid
as outputting something.
v3.clear();
cout << v3.size() << endl; //size is indeed 0
cout << v3[0] << endl; //null string on one machine, and "Segmentation fault" on another

As before, the behaviour of v3[0] is undefined. Outputting a null
string is just as valid or invalid as a segmentation fault.
//vector copying
v1.push_back(1); //put back the values again
v1.push_back(2);

v1 = v2; //copy v2 to v1
cout << v1.size() << endl; //v1's size has been decreased to 1
return 0;

}

If I used vector.at() as suggested by Erik for v1.at(0), then I got
"Abort".

You are getting undefined behaviour with v1[0] because you are trying
to access an element that doesn't exist. v1[n] has undefined behaviour
for any n >= the number of elements in the vector (>= rather than >
because indices start at 0). If you are going to access elements of a
vector using v1[n] it is your responsibility as the programmer to
ensure that n is a valid index, i.e. n < the number of elements in the
vector.
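
As a rough sketch of what taking that responsibility can look like when the index is not known to be good, the helper below checks n < size() itself before using []. The name try_get and its out-parameter signature are hypothetical, chosen only for this illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: copies v[n] into `out` and returns true when n is
// a valid index. Returns false and leaves `out` untouched otherwise.
bool try_get(const std::vector<int>& v, std::size_t n, int& out)
{
    if (n >= v.size()) {
        return false;   // v[n] would be undefined behaviour here
    }
    out = v[n];         // safe: n < v.size() was just checked
    return true;
}
```

On a cleared vector, try_get(v, 0, out) returns false instead of reading memory the vector no longer treats as an element.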

However, vector being a sensibly designed class, a vector object knows
how many elements it has. The at() member function provides a
different way of accessing elements. When you do v1.at(n), inside the
at function, the vector checks for itself whether n is a valid index.
If so, it returns you the element just like v1[n], but this time if n
is not a valid index, at throws an exception (of type
std::out_of_range). The important distinction is that this behaviour
is well defined and reliable, unlike the behaviour of v1[n] for
invalid n. You saw an abort because your program did not catch the
exception so it terminated in the same way it would for any uncaught
exception.
So, is vector.at() a much better way to access vector's
content than indices?

Not necessarily. The undefined behaviour in your program was due to
bugs that could have been avoided (avoided easily now you know how
element access works for vectors). A program like yours controls the
size of the vector and controls the indices used (as opposed to, for
example, asking a user to type in sizes and indices) so an out of
range index will only ever be a programmer error. The solution is to
fix the error. Having said that, while you are learning to use
vectors, and presumably writing programs with them that are a little
more complex than your demo program here, you may find it useful to
use at() quite liberally to help you identify bugs. As you get the
hang of it, you will quickly identify situations where you can be
confident enough in the code that you don't need the extra protection
of at().

Much of the time I use [] to access vector elements it is in a loop
like

vector<int> v(42);
for (vector<int>::size_type index = 0; index < v.size(); ++index)
{
// Do something with v[index].
}

Because of the way the loop is written, I know that index must be
valid for every iteration [*] so at() doesn't offer me any advantage.
But more often I use iterators rather than indices to access elements
anyway.

[*] Assuming I don't do anything inside the loop that modifies the
size of the vector, but that's generally something to be very careful
with anyway because it makes the loop harder to get correct.
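
To make the comparison concrete, here is a small sketch of the same traversal written both ways. The function names are made up for the example, and summing is just a stand-in for "do something with each element". Neither loop changes the size of the vector.

```cpp
#include <vector>

// Iterator version: there is no index at all, so there is no index to
// get wrong. The loop stops when the iterator reaches v.end().
int sum_with_iterators(const std::vector<int>& v)
{
    int total = 0;
    for (std::vector<int>::const_iterator it = v.begin(); it != v.end(); ++it) {
        total += *it;
    }
    return total;
}

// Index version: index < v.size() holds on every iteration, so v[index]
// is always a valid access and at() would add nothing.
int sum_with_indices(const std::vector<int>& v)
{
    int total = 0;
    for (std::vector<int>::size_type index = 0; index < v.size(); ++index) {
        total += v[index];
    }
    return total;
}
```

Both functions return 0 for an empty vector, because neither loop body runs.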

Gavin Deane
 
