Using pointer or reference?

shuisheng

Dear All,

Choosing between a pointer and a reference always confuses me.
Would you please give me some suggestions? I appreciate your kind
help.

For example, I'd like to convert a string to an integer.

bool Convert(const string& str, int* pData);
bool Convert(const string& str, int& data);

Which one is better? Here the bool shows whether the conversion
succeeded.

Another example.

class Shape {};
class Sphere: public Shape {};
class Cube: public Shape {};

class Union: public Shape
{
private:
Shape *pShape1, *pShape2;
// Or Shape &shape1, &shape2;

public:
// Constructor.
Union(const Shape* pShape1, const Shape* pShape2); // or using references
};

Shape* UnionOperation(const Shape* pShape1, const Shape* pShape2);
or
Shape& UnionOperation(const Shape& shape1, const Shape& shape2);
or
Shape* UnionOperation(const Shape& shape1, const Shape& shape2);

Thanks a lot!

Shuisheng
 
Noah Roberts

shuisheng said:
Dear All,

The problem of choosing pointer or reference is always confusing me.
Would you please give me some suggestion on it.

Use reference unless you have to use a pointer for some reason.
 
Victor Bazarov

shuisheng said:
The problem of choosing pointer or reference is always confusing me.
Would you please give me some suggestion on it. I appreciate your kind
help.

For example, I'd like to convert a string to an integer.

bool Convert(const string& str, int* pData);
bool Convert(const string& str, int& data);

which one is better? Here bool is to show if the conversion is
successful.

Use the reference. You're going to provide an int no matter what,
right? So, why bother with the pointer? Pointers are only needed
if you for some reason might supply NULL (and the function would
have to check for it).
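
A minimal sketch of the reference version, assuming std::strtol does the parsing (the thread never says how Convert is implemented, so the body here is only an illustration):

```cpp
#include <cassert>
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <string>

// Returns true and writes to `data` only if the whole string is a decimal
// integer that fits in an int. `data` is left untouched on failure.
// Note: std::strtol skips leading whitespace, so " 42" is accepted.
bool Convert(const std::string& str, int& data) {
    if (str.empty()) return false;
    errno = 0;
    char* end = nullptr;
    const long value = std::strtol(str.c_str(), &end, 10);
    if (end == str.c_str() || *end != '\0') return false;  // no digits, or trailing junk
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX) return false;
    data = static_cast<int>(value);
    return true;
}
```

The caller writes "int n; if (Convert(text, n)) ..." and never has to think about a null destination.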
Another example.

class Shape {};
class Sphere: public Shape {};
class Cube: public Shape {};

class Union: public Shape
{
private:
Shape *pShape1, *pShape2;
// Or Shape &shape1, &shape2;

public:
// Constructor.
Union(const Shape* pShape1, const Shape* pShape2); // or using references

References, unless you intend to potentially supply NULL as one or
both arguments.
};

Shape* UnionOperation(const Shape* pShape1, const Shape* pShape2);
or
Shape& UnionOperation(const Shape& shape1, const Shape& shape2);
or
Shape* UnionOperation(const Shape& shape1, const Shape& shape2);

Return an object, not a pointer or reference.

V
 
Evan

shuisheng said:
Dear All,

The problem of choosing pointer or reference is always confusing me.
Would you please give me some suggestion on it. I appreciate your kind
help.

For example, I'd like to convert a string to an integer.

bool Convert(const string& str, int* pData);
bool Convert(const string& str, int& data);

which one is better? Here bool is to show if the conversion is
successful.

There are two schools of thought on this. The "C++ way" is to use a
reference. It simplifies the syntax a bit, makes it harder to pass a
completely bogus pointer, etc. However, there is a school of thought
that says that you should use pointers for "out" parameters, because
it's self-documenting *at the call sites* (without even looking at the
function signature) that the function could change the arguments. The
argument goes that if you see "foo(arg)", you can't tell whether foo
will change its argument, but if you see "foo(&arg)", you know it can.

I don't fully buy into this argument, but it's at least fairly widely
held. It's my suspicion that an argument somewhat along these lines is
why Java doesn't have pass-by-reference. C# solves this problem by
making you specify at call sites that something is being passed by
reference. For instance, if foo is declared as 'void foo(ref int a)',
when calling it you have to say 'foo(ref x)'. I think this is probably
a reasonable compromise, and might be better than either the C or the
C++ approach. I've somewhat taken up the habit of doing something
similar in C++: at call sites of functions that take a non-const
reference in order to modify the argument (rather than a const ref
taken for speed), I write foo(*&x). (This isn't always safe though.)

So I would summarize:

Arguments in favor of pointers:
- You can't pass by reference if you want it to be optional
- The call site itself documents that the function may modify the
argument

Arguments in favor of references:
- It does a slightly better job at enforcing that you give it a valid
object, because it decreases the amount you have to deal with pointers
at all
- The function itself is self-documenting that you must pass a valid
object, whereas you have to rely on either reading code or written
documentation to tell if you can pass a null pointer to a function that
takes pointers
- Somewhat cleaner syntax (IMO)
- It's the "C++ way"

In summary, I would go with what one of the other responses said, and
say use pointers if passing NULL makes sense for an optional parameter,
and use references if you have to be provided a valid object.

(Also note that if you're passing for efficiency -- you should
definitely always use a const reference rather than a pointer to
const. Here you don't WANT to give the impression that it can be
modified, 'cause it can't.)
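
To make the summary concrete, here is a small sketch. Divide and its parameters are hypothetical names, and nullptr assumes C++11 or later (NULL works the same way in older code):

```cpp
#include <cassert>

// `quotient` is a reference because the caller must always supply it.
// `remainder` is a pointer because passing nothing (nullptr) is meaningful:
// it says "I don't need the remainder".
bool Divide(int numerator, int denominator, int& quotient,
            int* remainder = nullptr) {
    if (denominator == 0) return false;
    quotient = numerator / denominator;
    if (remainder != nullptr) *remainder = numerator % denominator;
    return true;
}
```

Callers who want only the quotient write Divide(7, 2, q); those who also want the remainder write Divide(7, 2, q, &r).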
Another example.

class Shape {};
class Sphere: public Shape {};
class Cube: public Shape {};

class Union: public Shape
{
private:
Shape *pShape1, *pShape2;
// Or Shape &shape1, &shape2;

public:
// Constructor.
Union(const Shape* pShape1, const Shape* pShape2); // or using references

This falls under my "passing for efficiency use references" rule I
think
};

Shape* UnionOperation(const Shape* pShape1, const Shape* pShape2);
or
Shape& UnionOperation(const Shape& shape1, const Shape& shape2);
or
Shape* UnionOperation(const Shape& shape1, const Shape& shape2);

Unless UnionOperation acts like a += operator, mutating one of its
arguments and then returning it, you almost have to return by value
rather than by reference or pointer. I can tell that it's not acting
like += here because both arguments are const.

It's very rare that you should return a reference. The only common
exceptions I can think of are containers returning references to
elements and smart pointers returning a reference to elements, if
that's how you want your smart pointer to work. (Even in the second
case, I don't know that I know all the implications of returning by
reference; it could be that that would cause subtle problems. Providing
access to the referent by pointer is more common.)

For instance, the following code:
int& foo() { int a; return a; }
is wrong, because the returned reference can never be safely used.
(Automatic storage goes away as soon as the function returns.)

The same problem is in the following code:
int* foo() { int a; return &a; }
The pointer this returns is necessarily invalid.

To return a pointer or reference from a function, you need to refer to
data that it gets from somewhere else. A possibly incomplete list would
be:
-- For member functions of a class, data stored in the object (this
would be like the containers or smart pointers)
-- Global data
-- Static data
-- Parameters (this is like most of the str___ functions from the std
library; they return one of the parameters passed in)
-- Heap allocated storage

The last one deserves more comment. You can write this:
int* foo() { return new int; }
or even this (I think):
int& foo() { return *(new int); }
but both are probably a bad idea at best.

In the first case, the caller of foo has to call delete on whatever
pointer is returned. This isn't enforced; it's just an implicit part of
the contract. And it's of course prone to the caller forgetting to
delete it, etc. Finally, you have to document how to free it:
maybe the return from foo() was allocated with malloc, maybe new, maybe
some other allocator, and the caller has to pair it with the correct
deallocation routine. "Callee-allocated buffers" are not too uncommon,
at least in C code, so their use is not totally unreasonable, but it's
probably usually better to ask the user to allocate a buffer and pass a
pointer or reference to it which the function can then fill in.

The second case is even worse because it's possible to write stuff like
"int x=foo();" where you don't even have to realize that you're dealing
with an int& return. Even if you did realize that, it's so non-standard
(I mean standard here not in the sense of ISO 14882 but just what is
used in practice) that it would totally throw anyone using it off.
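
For what it's worth, later versions of the language give a way to put the ownership contract into the return type. The sketch below assumes C++11 or later (std::unique_ptr does not exist in the C++ this thread was written against), and the Shape, Cube and Union bodies are hypothetical stand-ins for the classes in the question:

```cpp
#include <cassert>
#include <memory>

struct Shape {
    virtual ~Shape() {}              // lets a Shape be deleted through a base pointer
    virtual int Parts() const = 0;   // hypothetical: number of primitive shapes
};

struct Cube : Shape {
    int Parts() const override { return 1; }
};

struct Union : Shape {
    Union(const Shape& a, const Shape& b) : parts_(a.Parts() + b.Parts()) {}
    int Parts() const override { return parts_; }
private:
    int parts_;
};

// The return type states that the caller owns the result, and the matching
// delete runs automatically when the unique_ptr goes out of scope.
std::unique_ptr<Shape> UnionOperation(const Shape& shape1, const Shape& shape2) {
    return std::unique_ptr<Shape>(new Union(shape1, shape2));
}
```

A pointer is still needed here rather than a plain object because the result is used polymorphically; returning Shape by value would slice it.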

Evan
 
andrewmcdonagh

right? So, why bother with the pointer? Pointers are only needed
if you for some reason might supply NULL (and the function would
have to check for it).

You could even use a reference in that case by passing a Null Object.
 
andrewmcdonagh

reference. It simplifies the syntax a bit, makes it harder to pass a
completely bogus pointer, etc. However, there is a school of thought
that says that you should use pointers for "out" parameters, because
it's self-documenting *at the call sites* (without even looking at the
function signature) that the function could change the arguments. The
argument goes that if you see "foo(arg)", you can't tell whether foo
will change its argument, but if you see "foo(&arg)", you know it can.

I don't fully buy into this argument, but it's at least fairly widely
held. It's my suspicion that an argument somewhat along these lines is
why Java doesn't have pass-by-reference. C# solves this problem by
making you specify at call sites that something is being passed by
reference. For instance, if foo is declared as 'void foo(ref int a)',
when calling it you have to say 'foo(ref x)'. I think this is probably
a reasonable compromise, and might be better than either the C or the
C++ approach. I've somewhat taken up the habit of doing something
similar in C++: at call sites of functions that take a non-const
reference in order to modify the argument (rather than a const ref
taken for speed), I write foo(*&x). (This isn't always safe though.)

So I would summarize:

Arguments in favor of pointers:
- You can't pass by reference if you want it to be optional

You can: use the Null Object pattern.
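
A rough sketch of the idea, with hypothetical names throughout (Logger, NullLogger, CountingLogger and Process are illustrations only, and override assumes C++11 or later):

```cpp
#include <cassert>
#include <string>

class Logger {
public:
    virtual ~Logger() {}
    virtual void Write(const std::string& message) = 0;
};

// The "null object": a perfectly valid Logger that deliberately does nothing.
class NullLogger : public Logger {
public:
    void Write(const std::string&) override {}
};

// A real implementation, here one that just counts the messages it receives.
class CountingLogger : public Logger {
public:
    void Write(const std::string&) override { ++count_; }
    int count() const { return count_; }
private:
    int count_ = 0;
};

// Takes a reference, so there is no null check. Callers who do not want
// logging pass a NullLogger instead of NULL.
int Process(int value, Logger& log) {
    log.Write("processing");
    return value * 2;
}
```

The trade-off is an extra class per interface, so it tends to pay off for objects with behaviour rather than for something like an int out-parameter.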

Andrew
 
Cy Edmunds

Evan said:
There are two schools of thought on this. The "C++ way" is to use a
reference. It simplifies the syntax a bit, makes it harder to pass a
completely bogus pointer, etc. However, there is a school of thought
that says that you should use pointers for "out" parameters, because
it's self-documenting *at the call sites* (without even looking at the
function signature) that the function could change the arguments. The
argument goes that if you see "foo(arg)", you can't tell whether foo
will change its argument, but if you see "foo(&arg)", you know it can.

[snip]

Evan-

Consider: void foo(int const *arg);

foo(&arg) sort of implies arg can be changed but certainly doesn't prove it.

OP-

The problem I have with pointers is that people may think they are supposed
to delete them when they are done. However, even using references can't
prevent that:

void dumb_implementation(Fred &f) {
delete &f; // ugh
}
I know that looks horrible but I have seen it done.

Overall I would say stick with references unless you have a pretty good
reason (need to test for NULL, legacy, C interface, etc.)

Cy
 
Evan

Cy said:
Evan-

Consider: void foo(int const *arg);

foo(&arg) sort of implies arg can be changed but certainly doesn't prove it.

Oh, I know. This is one reason why I don't really buy the "pointers are
useful because they are somewhat self-documenting" argument. Just
because you pass by pointer doesn't mean that the function is using it
as an out parameter.

Evan
 
Daniel T.

shuisheng said:
Dear All,

The problem of choosing pointer or reference is always confusing me.
Would you please give me some suggestion on it. I appreciate your kind
help.

For example, I'd like to convert a string to an integer.

bool Convert(const string& str, int* pData);
bool Convert(const string& str, int& data);

or
bool Convert( const string* str, int* pData );
bool Convert( const string* str, int& pData );
which one is better? Here bool is to show if the conversion is
successful.

First, let's review why references were added to the language. They
were added primarily to support operator overloading. In those
instances you must use a reference. In any other situation, if you can
use a reference, you could also use a pointer.

In Evan's post, he discussed a commonly heard argument about using
pointers for output parameters. However, the argument is bogus:

int i;
foo( &i );

Without seeing the declaration of 'foo' you simply can't know if it can
modify the value of 'i'. (foo may take a const int*.)
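
A two-function sketch of the point (Peek and Bump are hypothetical names): both are called with exactly the same syntax, yet only one of them is able to change the argument.

```cpp
#include <cassert>

// Read-only: the const in the parameter type prevents writing through p.
int Peek(const int* p) { return *p; }

// May modify the pointed-to int.
void Bump(int* p) { ++*p; }
```

Peek(&i) and Bump(&i) are indistinguishable at the call site; only the declarations tell them apart.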

So, you *could* just always use a pointer in every case (except in
op-overloads) and you would be fine. Personally, I prefer using this as
an opportunity to help document the code. So here is my algorithm for
deciding:

if passing in a C array, use a pointer.
else if the function is an operator overload, use a reference.
else if the caller can destroy the parameter immediately after the
function returns, use a reference.
else if the caller cannot destroy the parameter immediately after the
function returns, use a pointer.
else if the function would work fine with parameters passed by value
but the sizeof the parameter is greater than sizeof int, use a const
reference.
else pass by value.

I think that's about it.

So in your example above, the function would work fine with the 'str'
passed in by value, but the sizeof string is greater than sizeof int,
so pass 'str' by const reference. The caller can destroy the second
parameter immediately after the function returns (by deleting or forcing
the parameter out of scope) so use a reference. i.e.,

bool Convert( const string& str, int& data );
Another example.

class Shape {};
class Sphere: public Shape {};
class Cube: public Shape {};

class Union: public Shape
{
private:
Shape *pShape1, *pShape2;
// Or Shape &shape1, &shape2;

I can't think of the last time I used a reference as member-data.
public:
// Constructor.
Union(const Shape* pShape1, const Shape* pShape2); // or using references
};

If the two objects passed in are going to be assigned to the two
member-variables, then the caller cannot destroy them immediately after
the object is constructed (without leaving the object with a couple of
invalid pointers,) so I would use Shape*. BTW, in the situation I
describe, they could not be const pointers, because the
member-variables aren't const.
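
A sketch of that situation, assuming the Union only stores the addresses and does not own the shapes (the accessor names first and second are hypothetical, added only so the example can be checked):

```cpp
#include <cassert>

class Shape {
public:
    virtual ~Shape() {}
};
class Sphere : public Shape {};
class Cube : public Shape {};

class Union : public Shape {
public:
    // Pointer parameters signal that the Union keeps the addresses:
    // both shapes must outlive this object. Nothing is copied or owned.
    Union(Shape* pShape1, Shape* pShape2)
        : pShape1_(pShape1), pShape2_(pShape2) {}

    Shape* first() const { return pShape1_; }
    Shape* second() const { return pShape2_; }

private:
    // Note the second '*': "Shape* a, b;" would declare b as a Shape object,
    // not as a pointer.
    Shape *pShape1_, *pShape2_;
};
```

If the members were declared as const Shape* instead, the constructor could keep the const-qualified parameters from the original post.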
Shape* UnionOperation(const Shape* pShape1, const Shape* pShape2);
or
Shape& UnionOperation(const Shape& shape1, const Shape& shape2);
or
Shape* UnionOperation(const Shape& shape1, const Shape& shape2);

For return values:
if the function returns an array and the caller isn't supposed to
destroy the array, return by const pointer.
else if the function returns an array and the caller is supposed to
destroy the array, return by pointer.
else if the caller is supposed to destroy the object returned, return a
pointer.
else if the function would work just fine if it returned by value, but
sizeof the return type is greater than sizeof int, return by const
reference.
else if the caller is supposed to be able to modify the value returned,
return by reference (this should be very rare IMHO.)
else return by value.

Which of the above three "UnionOperation"s is appropriate depends on
what the function does with the parameters.
 
Ivan Novick

shuisheng said:
Dear All,

The problem of choosing pointer or reference is always confusing me.
Would you please give me some suggestion on it. I appreciate your kind
help.
Here is an attempt at a simpler list:

1) Use pointers if you may need to move to additional locations offset
from the address.
2) Use pointers if setting the pointer to 0 will indicate something
back to the caller
3) Use references if you want to protect against possibly dereferencing a
null pointer and core dumping

otherwise, use either.
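
One possible illustration of the three items, using hypothetical function names (nullptr assumes C++11 or later; 0 or NULL would do in older code):

```cpp
#include <cassert>
#include <cstddef>

// Item 1: a pointer can step through adjacent elements.
int Sum(const int* data, std::size_t count) {
    int total = 0;
    for (const int* p = data; p != data + count; ++p) total += *p;
    return total;
}

// Item 2: a null pointer tells the caller "not found".
const int* Find(const int* data, std::size_t count, int wanted) {
    for (std::size_t i = 0; i < count; ++i) {
        if (data[i] == wanted) return data + i;
    }
    return nullptr;
}

// Item 3: a reference cannot legally be null, so no check is needed here.
void Increment(int& value) { ++value; }
```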

-
Ivan
http://www.0x4849.net
 
Duane Hebert

In Evan's post, he discussed a commonly heard argument about using
pointers for output parameters. However, the argument is bogus:

int i;
foo( &i );

Without seeing the declaration of 'foo' you simply can't know if it can
modify the value of 'i'. (foo may take a const int*.)

How often do you use functions without looking at the
declaration?

This is actually more of a problem when using pointers.

foo(int *i) {
*i = new int;
}

Just looking at the declaration doesn't help much.
 
Duane Hebert

Here is an attempt at a simpler list:

1) Use pointers if you may need to move to additional locations offset
from the address.
2) Use pointers if setting the pointer to 0 will indicate something
back to the caller
3) Use references if you want to protect against possibly dereferencing a
null pointer and core dumping

otherwise, use either.

I prefer Stroustrup's advice. Use a reference
when you can. Use a pointer when you have to.
 
Evan

Duane said:
How often do you use functions without looking at the
declaration?

This is actually more of a problem when using pointers.

foo(int *i) {
*i = new int;
}

i = new int; you mean. ;-)


But to answer your question, I think the argument goes that it doesn't
help much when you're WRITING the code, but if you're thrown into some
system you probably don't want to check the signature of every function
that's used. (Especially if you're using an editor without good code
browsing abilities (right click, go to declaration).) Ideally then if
you see a call "foo(x)" you'd like to know at the call site if x could
change or if foo's a pure function. Even C falls flat on its face in
this regard, so I don't think that references hurt it too much.

Evan
 
Daniel T.

Duane said:
How often do you use functions without looking at the declaration?

Never and that's the point.
This is actually more of a problem when using pointers.

foo(int *i) {
*i = new int;
}

Just looking at the declaration doesn't help much.

The above doesn't compile and:

void foo( int* i ) {
i = new int;
}

Doesn't affect the calling code at all. You might want to re-think what
you are trying to get at.
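
To spell that out with a sketch (the names are hypothetical, and a global int stands in for "new int" so that nothing leaks): the pointer itself is passed by value, so assigning to it inside the function changes only the local copy. Changing the caller's pointer needs a reference (or pointer) to the pointer.

```cpp
#include <cassert>

static int g_other = 7;   // hypothetical object to point at

// Changes only the function's own copy of the pointer.
void ReseatCopy(int* p) {
    p = &g_other;
    (void)p;   // silences "set but not used" warnings
}

// Takes the pointer by reference, so the caller's pointer is changed.
void ReseatCaller(int*& p) { p = &g_other; }
```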
 
Gavin Deane

Evan said:
But to answer your question, I think the argument goes that it doesn't
help much when you're WRITING the code, but if you're thrown into some
system you probably don't want to check the signature of every function
that's used. (Especially if you're using an editor without good code
browsing abilities (right click, go to declaration).) Ideally then if
you see a call "foo(x)" you'd like to know at the call site if x could
change or if foo's a pure function. Even C falls flat on its face in
this regard, so I don't think that references hurt it too much.

As well as the pointer to const data example making the argument bogus,
there is also the fact that in the real world, functions are not called
"foo(x)", they are called things like "cosine(x)" or "increment(x)" and
the name itself should give a strong indication as to whether the
parameter is modified or not.

Even given a meaningful function name, you still cannot *know* whether
the parameter is modified or not. Only the interface documentation can
ever tell you that. But if meaningful names are used, the problem of
interpreting the code when all you see is the call site is not as hard
as it might appear to be from an example that only uses abstract names
like foo and bar for the function.

Gavin Deane
 
Daniel T.

Duane said:
I prefer Stroustrup's advice. Use a reference when you can. Use a
pointer when you have to.

Hmm, AFAIK the only time you must use a pointer is if you are passing
an array. What about passing by value?
 
Duane Hebert

Never and that's the point.
should be i = new int;
The above doesn't compile and:

void foo( int* i ) {
i = new int;
}

Doesn't affect the calling code at all. You might want to re-think what
you are trying to get at.

You seem to be saying to prefer pointers over references
because with references you can't know if the variable is
going to be modified by the function. Sorry if I didn't get
your point then...
 
Duane Hebert

Gavin Deane said:
As well as the pointer to const data example making the argument bogus,
there is also the fact that in the real world, functions are not called
"foo(x)", they are called things like "cosine(x)" or "increment(x)" and
the name itself should give a strong indication as to whether the
parameter is modified or not.

Even given a meaningful function name, you still cannot *know* whether
the parameter is modified or not. Only the interface documentation can
ever tell you that. But if meaningful names are used, the problem of
interpreting the code when all you see is the call site is not as hard
as it might appear to be from an example that only uses abstract names
like foo and bar for the function.

Well that's sort of my point. You need to know what the function
is doing when you call it. Making the arg a pointer doesn't change
this. My other point is that it's actually clearer without using
pointers. If it's a copy, it doesn't matter. If it's a const & it won't
be modified. If it's a non const &, just assume that it will be or
look at the documentation.
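
The three cases side by side, as a small sketch with hypothetical function names:

```cpp
#include <cassert>
#include <string>

// By value: the function works on its own copy, so the caller's string is safe.
std::string Shout(std::string text) {
    text += "!";
    return text;
}

// By const reference: no copy is made, and the argument cannot be modified.
std::size_t Length(const std::string& text) { return text.size(); }

// By non-const reference: assume the argument may be modified.
void Clear(std::string& text) { text.clear(); }
```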
 
Duane Hebert

Daniel T. said:
Hmm, AFAIK the only time you must use a pointer is if you are passing
an array. What about passing by value?

When passing by value, the variable can't be modified. I think
I've missed your point. As for when you need a pointer, you need
it if it can be null. You also need it when you want to reassign it.
 
