[newbie] strange pointer behavior

M

Markus Dehmann

I discovered a strange pointer behavior that screws my program. Below
is a mini code example.

Basically, I have a vector with data and want a function (getLongest)
to find one of these values. The function should return a pointer to
the value. In the function, everything works, it finds the correct
element, but the returned pointer points to an empty element (but not
NULL). It's very strange. Can someone explain what's wrong with the
example?

--------------------------------------
#include <string>
#include <vector>
#include <iostream>

using namespace std;

class Data{
public:
Data(string val){
data = val;
length = data.length();
}
inline int getLength(){return length;}
inline int getData(){return data;}
private:
string data;
int length;
};

Data *getLongest(vector<Data> data){
Data *longest;
for(int i=0; i < data.size(); i++){
if(data.getLength() > longest->getLength()){
longest = &data;
}
}
cerr << " Longest is " << longest->data << endl; // correct output
return longest;
}

int main(){
vector<Data> myData;
myData.push_back(Data("one"));
myData.push_back(Data("three"));
myData.push_back(Data("four"));

Data *longest = getLongest(myData);
cerr << "main(): Longest is " << longest->data << endl; // incorrect
}
--------------------------------------

I wanted that getLongest() returns a pointer, not the value itself, so
that I can be sure that, when I change that value later, the original
element in the vector is changed. Is that a correct assumption?

Another problem: To iterate over the vecor, I would like to use an
iterator, but that does not work:
for ( vector<Data>::iterator it = data.begin(); it != data.end();
++it ){
if(it->getLength() > longest->getLength()){
longest = it; // ??? or:
longest = *it; // what do I have to say here???
}
}

Thanks for every hint! (grrr, switching from Java to C++ is really
hard!!)
Markus
 
R

Rob Williscroft

Markus Dehmann wrote in
Data *getLongest(vector<Data> data){
Data *longest;

Data *longest = 0;
for(int i=0; i < data.size(); i++){
if(data.getLength() > longest->getLength()){


if( !longest || data.getLength() > longest->getLength()){

longest = &data;
}
}
cerr << " Longest is " << longest->data << endl; // correct output
return longest;
}


Notes
1) logical || shortcuts.
2) The above will return 0 (NULL) if data is empty (maybe you would
prefer to throw an exception.

HTH.

Rob.
 
A

Aggro

Markus said:
Basically, I have a vector with data and want a function (getLongest)
to find one of these values. The function should return a pointer to
the value. In the function, everything works, it finds the correct
element, but the returned pointer points to an empty element (but not
NULL). It's very strange. Can someone explain what's wrong with the
example?

The code you provided doesn't even compile. Did you test it before
sending it?
--------------------------------------
#include <string>
#include <vector>
#include <iostream>

using namespace std;

class Data{
public:
Data(string val){
data = val;
length = data.length();
}
inline int getLength(){return length;}
inline int getData(){return data;}

If you want to return string, then return string not int.

inline string getData(){return data;}
private:
string data;
int length;
};

Data *getLongest(vector<Data> data){
Data *longest;
for(int i=0; i < data.size(); i++){
if(data.getLength() > longest->getLength()){


Your program would crash or do some other undefined behaviour here,
because you are accessing the method of longest, which is not set to
point anywhere yet. You could assign a null value to it before you use
it, and then compare here

if( !longest || ( data.getLength() > longest->getLength() ) )
longest = &data;
}
}
cerr << " Longest is " << longest->data << endl; // correct output


You can't access private attributes of a class from outiside of the
class, that's the whole idea of attributes to be private.

cerr said:
return longest;
}

You are returning a pointer to a vector item which will be destroyed
with the vector after this function ends. I'm guessing that you didn't
know that when you pass the vector to function the way you are doing
here, a copy of it is created and used inside the function.

You could use something like:

Data *getLongest( vector<Data> &data )

To pass a reference of the vector to your function. Then it won't be
copied, but the same vector will be used, so the pointer should work also.
 
J

John Harrison

Markus Dehmann said:
I discovered a strange pointer behavior that screws my program. Below
is a mini code example.

Basically, I have a vector with data and want a function (getLongest)
to find one of these values. The function should return a pointer to
the value. In the function, everything works, it finds the correct
element, but the returned pointer points to an empty element (but not
NULL). It's very strange. Can someone explain what's wrong with the
example?

--------------------------------------
#include <string>
#include <vector>
#include <iostream>

using namespace std;

class Data{
public:
Data(string val){
data = val;
length = data.length();
}
inline int getLength(){return length;}
inline int getData(){return data;}
private:
string data;
int length;
};

Data *getLongest(vector<Data> data){

Your vector data is created here.
Data *longest;
for(int i=0; i < data.size(); i++){
if(data.getLength() > longest->getLength()){
longest = &data;
}
}
cerr << " Longest is " << longest->data << endl; // correct output
return longest;


And destroyed here, but you've returned a pointer to a destroyed vector.

You could pass in a reference to your vector, which would (a) work and (b)
be tons more efficient

Data *getLongest(vector<Data>& data){

But I would consider returning the index of the longest element. That avoids
all this trouble, and also allows you to use a const reference.

size_t getLongestIndex(const vector said:
}

int main(){
vector<Data> myData;
myData.push_back(Data("one"));
myData.push_back(Data("three"));
myData.push_back(Data("four"));

Data *longest = getLongest(myData);
cerr << "main(): Longest is " << longest->data << endl; // incorrect
}
--------------------------------------

I wanted that getLongest() returns a pointer, not the value itself, so
that I can be sure that, when I change that value later, the original
element in the vector is changed. Is that a correct assumption?

Yes. Except for the fact that you were finding the largest element in a COPY
of the vector you were really interested in.
Another problem: To iterate over the vecor, I would like to use an
iterator, but that does not work:
for ( vector<Data>::iterator it = data.begin(); it != data.end();
++it ){
if(it->getLength() > longest->getLength()){
longest = it; // ??? or:
longest = *it; // what do I have to say here???
}
}

Well you should make longest an iterator as well. But if you really want to
convert from an iterator to a pointer

longest = &*it;

I always think this looks silly, and it reminds me not to use pointers when
I should be using iterators.
Thanks for every hint! (grrr, switching from Java to C++ is really
hard!!)

In C++ everything is a value unless you specify otherwise by using a
reference. That seems to be the essential point you are missing.

john
 
K

Kevin Goodsell

Markus said:
I discovered a strange pointer behavior that screws my program. Below
is a mini code example.

Basically, I have a vector with data and want a function (getLongest)
to find one of these values. The function should return a pointer to
the value. In the function, everything works, it finds the correct
element, but the returned pointer points to an empty element (but not
NULL). It's very strange. Can someone explain what's wrong with the
example?

--------------------------------------
#include <string>
#include <vector>
#include <iostream>

using namespace std;

class Data{
public:
Data(string val){
data = val;
length = data.length();
}

A few comments:

1) Prefer member initialization lists.
2) length is an int -- not the most appropriate type for a string size.
string::size_type would be better.
3) Don't store the length separately. It's already part of the string.
4) For objects of class types, particularly those that may be large,
pass by const reference instead of by value. This saves a potentially
costly copy operation.

So:

Data(const string &val) : data(val) {}
inline int getLength(){return length;}

1) The 'inline' here is redundant.
2) Make the return type string::size_type.
3) Make this function const so that you can have useful const instances
of this class.

So:

string::size_type getLength() const { return data.length(); }
inline int getData(){return data;}

1) The 'inline' is redundant.
2) Return type is int, but you return a string. This should not even
compile.
3) Make this function const so that you can have useful const instances
of this class.

string getData() const { return data; }
private:
string data;
int length;
};

Data *getLongest(vector<Data> data){

Again, pass by const reference rather than by value:

const Data *getLongest(const vector<Data> &data)
{

The const on the return type will also be necessary, unless you want to
pass by non-const reference.
Data *longest;

Likewise, this will need to be const:

const Data *longest;
for(int i=0; i < data.size(); i++){

1) vector<Data>::size_type would be more appropriate than int.
2) Prefer pre-increment to post-increment. It may be faster, and is
probably not slower. In the case of built-in types it probably makes no
difference, but it's a good habit.
if(data.getLength() > longest->getLength()){


This is a very serious error. 'longest' has not been initialized, and
doesn't point to anything useful. Dereferencing it gives undefined behavior.
longest = &data;
}
}
cerr << " Longest is " << longest->data << endl; // correct output
return longest;


The other problem with this function is that you are returning a pointer
to something that ceases to exist as soon as the function exits. The
'data' parameter was being passed by value -- that is, a local copy of
it was made for the function, and that copy (along with all the items
contained in it) goes out of scope and disappears when the function
exits. As soon as the caller does anything with the returned pointer,
you get undefined behavior.
}

int main(){
vector<Data> myData;
myData.push_back(Data("one"));
myData.push_back(Data("three"));
myData.push_back(Data("four"));

Perhaps surprisingly, you don't need to say Data("the string") here.
Your single-argument constructor acts as an implicit conversion:

myData.push_back("one");
myData.push_back("three");
etc.

This behavior is sometimes desirable, sometimes not. If it is not, make
the constructor explicit:

explicit MyClassName(some_type some_arg);
Data *longest = getLongest(myData);
cerr << "main(): Longest is " << longest->data << endl; // incorrect
}
--------------------------------------

I wanted that getLongest() returns a pointer, not the value itself, so
that I can be sure that, when I change that value later, the original
element in the vector is changed. Is that a correct assumption?

A pointer, or an iterator, or a reference would do. But beware. They can
all be invalidated by operations performed on the vector (such as
inserting new elements). Even an index could do the job, and that is a
bit more difficult to invalidate.
Another problem: To iterate over the vecor, I would like to use an
iterator, but that does not work:
for ( vector<Data>::iterator it = data.begin(); it != data.end();
++it ){
if(it->getLength() > longest->getLength()){
longest = it; // ??? or:
longest = *it; // what do I have to say here???

This:

longest = it;

should be correct, assuming 'longest' is declared as an iterator.
Actually, with the other changes 'longest' and 'it' will need to be type
}
}

Thanks for every hint! (grrr, switching from Java to C++ is really
hard!!)

C++ is fairly hard (I can't make a meaningful comparison to Java, since
I know very little about Java), but there are many tricks and tools to
make it easier. You aren't doing too badly so far. Using standard
library types like vector and string help enormously, so it's good that
you are using them. One of your problems is that you are lacking const
correctness, something you should be familiar with from Java I think.
Another is that you need to pay attention to your arguments, and pass
using an appropriate mechanism (generally choosing from value,
reference, or const reference).

-Kevin
 
J

John Harrison

2) length is an int -- not the most appropriate type for a string size.
string::size_type would be better.

You're telling a java programmer to start using unsigned types! I don't
think we need revisit that topic, lets just say it's a matter of opinion.

john
 
D

David Harmon

On Fri, 9 Apr 2004 08:23:27 +0100 in comp.lang.c++, "John Harrison"
You could pass in a reference to your vector, which would (a) work and (b)
be tons more efficient

Data *getLongest(vector<Data>& data){

bool compare_length( Data const & a, Data const & B)
{
return a.getLength() < b.getLength();
}

vector<data>::iterator getLongest(vector<Data>& data)
{
return std::max_element(data.begin(), data.end(),
compare_length);
}
 
H

Howard

David Harmon said:
vector<data>::iterator getLongest(vector<Data>& data)

I'm still trying to grasp the nuances of vectors and iterators, and this
confuses me. Shouldn't that be:

vector<Data>::iterator getLongest(vector<Data>& data)

("Data" instead of "data" in the <>)...?

Was that just a typo? Or is there a reason to use the name of the variable
itself instead of its type here, such as the fact this is a function and not
a variable declaration? (Or does it even matter?)

Thanks for any clarification,
-Howard
 
D

David Harmon

On 09 Apr 2004 10:44:57 EDT in comp.lang.c++, "Howard"
I'm still trying to grasp the nuances of vectors and iterators, and this
confuses me. Shouldn't that be:

vector<Data>::iterator getLongest(vector<Data>& data)

("Data" instead of "data" in the <>)...?

Was that just a typo?

Yes. Thanks for the correction. Identifiers that differ only in the
case of one letter are a Nuisance, and I generally try to avoid them.
Content-free identifiers like "data" are also to be avoided.
 
K

Kevin Goodsell

John said:
You're telling a java programmer to start using unsigned types! I don't
think we need revisit that topic, lets just say it's a matter of opinion.

Does Java not have unsigned types?

But in any case, it /is/ the type that is provided for the purpose is
question, and guaranteed to work. You can probably get away with 'int'
for learning programs, though. And maybe even real programs, depending
on implementation details.

-Kevin
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,431
Messages
2,571,678
Members
48,796
Latest member
Greg L.

Latest Threads

Top