Pointers vs References: A Question on Style

Desmond Liu

I've read articles like Scott Meyers's EC++ (Item 22) that advocate the use
of references when passing parameters. I understand the reasoning behind
using references--you avoid the cost of creating a local object when you
pass an object by reference.

But why use a reference? Is there any inherent advantage of using a
reference over using a pointer? My workplace discourages the use of
pointers when it comes to parameter passing. They always prefer references.

I would argue one reason why we might consider using a pointer is to help
communicate intent. For example, suppose I do the following in C:

#include <stdio.h>

#include "foo.h"

int main(int argc, char *argv[])
{
    int x = 5;

    foo(x);

    printf("%d\n", x);

    return 0;
}

I can guarantee that the program will print '5' because C only supports
pass-by-value. If we're using C++, I can't make that guarantee because
we're allowed to pass-by-reference.

But if I use pointers in parameter-passing, I can use the address-of
operator to communicate intent to other programmers that the parameter
is meant to be modified. For example,

#include <iostream>

#include "A.h"
#include "X.h"

int main(int argc, char *argv[])
{
    A a;
    X x;

    x.foo(&a); // the intent of this member function is to modify 'a'
               // A non-const member function of 'a' will be called.

    x.bar(a);  // this member function will NOT modify 'a'. Only const
               // member functions of 'a' will be called.

    std::cout << a << std::endl;

    return 0;
}

If I'm having problems with object 'a', I see that member function bar()
will not change object 'a' (by convention), and that my problem is likely
in member function foo(). This is especially useful if I'm sifting through
a large amount of code to track down a problem. By using this convention, I
can save time not having to look at every member function in the class
declaration to see which ones might be the cause of my problem.

Of course, all programmers have to adhere to the convention in order for it
to be useful. So to make a long story short, is it better to do
this:

void X::foo(A* a); // foo() will modify 'a'
void X::bar(const A& a); // bar() will not modify 'a'

Or is it better to do what my workplace requires?

void X::foo(A& a); // foo() will modify 'a'
void X::bar(const A& a); // bar() will not modify 'a'

Am I wrong in suggesting the use of pointers for parameter-passing? Is
there a reason why a reference should always be preferred over a pointer?
My thinking is that this convention would make the C++ code more readable.
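
For anyone who wants to try the two conventions side by side, here is a minimal sketch. The classes `A` and `X` are hypothetical stand-ins for whatever A.h and X.h really declare, and the members `set`, `get`, `fooPtr`, `fooRef` and `bar` are invented purely for illustration.

```cpp
#include <cassert>

// Hypothetical stand-ins for the classes declared in A.h and X.h.
struct A {
    int value = 0;
    void set(int v) { value = v; }      // non-const: modifies the object
    int get() const { return value; }   // const: read-only access
};

struct X {
    void fooPtr(A* a) { a->set(42); }             // pointer convention: may modify
    void fooRef(A& a) { a.set(42); }              // reference convention: may modify
    int bar(const A& a) const { return a.get(); } // never modifies 'a'
};

// Exercises both conventions; returns true if they end in the same state.
bool conventionsAgree() {
    A a1, a2;
    X x;
    x.fooPtr(&a1);  // the '&' at the call site hints that a1 may change
    x.fooRef(a2);   // reads exactly like a by-value or const-reference call
    return x.bar(a1) == 42 && x.bar(a2) == 42;
}
```

Both versions leave the object in the same state. The only difference is what a reader can infer at the call site without opening the header.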
 
JKop

There are 2 and only 2 situations in which I choose a pointer over a
reference:

1) When it must be re-seated

2) When arrays are involved
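
A small sketch of what "re-seated" means in case 1, assuming nothing beyond plain `int` variables (the function name `reseatDemo` is made up for the example): assigning to a pointer makes it point elsewhere, while assigning to a reference writes through to the object it was bound to at initialization.

```cpp
#include <cassert>

// Returns true if the pointer was re-seated while the reference stayed bound.
bool reseatDemo() {
    int first = 1;
    int second = 2;

    int* p = &first;
    p = &second;        // re-seats the pointer: p now points at 'second'
    *p = 20;            // modifies 'second' only

    int& r = first;
    r = second;         // does NOT re-seat: copies second's value into 'first'

    return first == 20 && second == 20 && &r == &first && p == &second;
}
```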

Desmond Liu said:
I understand the reasoning behind using references--you avoid the cost of
creating a local object when you pass an object by reference.


Incorrect. The nature of today's computers (i.e. stack and registers) still
requires a hidden pointer. (This is, of course, where outline functions are
involved.)


-JKop
 
DaKoadMunky

There are 2 and only 2 situations in which I choose a pointer over a
reference:

1) When it must be re-seated

2) When arrays are involved

Another possibility is when a function argument is optional. Pointers can be
null whereas references cannot.
 
Phlip

DaKoadMunky said:
Another possibility is when a function argument is optional. Pointers can be
null whereas references cannot.

One good style rule is to never return null, and never accept null as
parameter.

Follow that rule by passing in a Null Object instead. Consider this code:

void funk(SimCity const * pCity)
{
    if (pCity)
        pCity->throwParade();
}

Now contrast with this:

class NullCity: public SimCity
{
public:
    // const, so it can be called through a SimCity const &
    /*virtual*/ void throwParade() const {}
};

void funk(SimCity const & aCity)
{
    aCity.throwParade();
}

The code simplifies because it pushes behavior behind an interface. The
interface simply promises to its caller that it is doing something, whether
or not it really is. Patterns like this are the heart of OO - programming to
the interface instead of the implementation.
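
Here is a compilable sketch of the Null Object idea, under some assumptions: the `SimCity` base class below is a hypothetical minimal version, and the `parades` counter and the helper `paradesAfterFunk` exist only so the effect can be observed. Calling through `SimCity const &`, as in the post, requires `throwParade()` to be a const member function; this sketch takes a non-const reference so the counter can change.

```cpp
#include <cassert>

// Hypothetical minimal SimCity; 'parades' exists only to make the effect visible.
class SimCity {
public:
    virtual ~SimCity() {}
    virtual void throwParade() { ++parades; }
    int parades = 0;
};

class NullCity : public SimCity {
public:
    void throwParade() override {}   // deliberately does nothing
};

void funk(SimCity& aCity) {
    aCity.throwParade();             // no null check needed
}

// Calls funk() and reports how many parades the city has thrown so far.
int paradesAfterFunk(SimCity& city) {
    funk(city);
    return city.parades;
}
```

The caller decides whether to hand in a real city or a `NullCity`; `funk()` never needs to know which one it received.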

Desmond said:
But why use a reference? Is there any inherent advantage of using a
reference over using a pointer? My workplace discourages the use of
pointers when it comes to parameter passing. They always prefer
references.

You need to pair-program with your colleagues, because if they know just
enough C++ to write that advice down, they probably know much more verbally.

The C++ keyword const instructs compilers to reject overt attempts to change
a variable's value. Covert attempts produce undefined behavior, meaning
anything could happen.

C++ functions can take arguments by copy, by address, or by reference.
Ideally, if an object passed into a function does not change, the object
should be passed by copy:

void foo(SimCity aCity);

That code is inefficient. In general, programmers should not stress about
efficiency until they have enough code to measure it and find the slow
spots. In this situation, however, the more efficient implementation costs
no more to write. When we pass by reference, our program spends no time
making a huge copy of an entire city:

void foo(SimCity &aCity);

Now if foo() won't change that city's value, the function should declare
that intention in its interface, using pass-by-constant-reference to
simulate pass-by-copy:

void foo(SimCity const &aCity);

That is the most efficient call syntax, cognitively and physically. It's
cognitively efficient because it gives foo() no copied object to foolishly
change and then discard. Statements inside foo() that might attempt to
change that city shouldn't compile. It's physically efficient because the
compiler produces opcodes that only give foo() a handle to an existing city,
without copying it.
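
To see the copy being avoided, here is a rough sketch that counts copy-constructor calls. The `SimCity` below is a hypothetical stand-in with an instrumented copy constructor, and the names `byValue`, `byConstRef` and `copiesMade` are invented for the example.

```cpp
#include <cassert>

// Hypothetical SimCity that counts how many times it has been copied.
struct SimCity {
    static int copies;
    int population = 0;
    SimCity() {}
    SimCity(const SimCity& other) : population(other.population) { ++copies; }
};
int SimCity::copies = 0;

int byValue(SimCity aCity)           { return aCity.population; } // copies the argument
int byConstRef(SimCity const& aCity) { return aCity.population; } // no copy

// Returns the number of copies made by one call of each kind.
int copiesMade() {
    SimCity city;
    SimCity::copies = 0;
    byValue(city);
    byConstRef(city);
    return SimCity::copies;
}
```

Only the by-value call triggers the copy constructor; the const-reference call reads the caller's object in place.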

C++ supports qualifications before their qualified types, such as "const
SimCity &". I try to write expressions with the most important part first.
There are also subtle technical reasons, in rare situations, to write
"SimCity const &", with the const after its type.
 
Dave Townsend

I prefer to use references over pointers when I can. One of my strongest
reasons is that a reference should always refer to a valid object, whereas
a pointer should always be checked for NULL before it is used. The code is
also a little less "noisy", since I don't have the -> or * dereferencing
to look at.

In situations where an object may be passed or not, or may be present
or not, a pointer is needed.

dave

 
Phlip

Dave said:
In situations where an object may be passed or not, or may be present
or not, a pointer is needed.

Read my post.

I think I might be able to extend the NullObject concept to say not using it
violates the Liskov Substitution Principle...
 
Dave Townsend

Agreed, but using a NullObject only relates to new classes, not
legacy ones I have to live with.
 
Phlip

Dave said:
Agreed, but using a NullObject only relates to new classes, not
legacy ones I have to live with.

Read /Working Effectively with Legacy Code/ by Mike Feathers.

Then think globally and act locally. One little NullObject at one interface
won't bring down the whole house of cards.
 
Desmond Liu

JKop said:
There are 2 and only 2 situations in which I choose a pointer over a
reference:

1) When it must be re-seated

2) When arrays are involved




Incorrect. The nature of today's computers (i.e. stack and registers)
still requires a hidden pointer. (This is, of course, where outline
functions are involved.)


-JKop

Whoops. That was a typo. That should have read "avoid the cost of creating
a local object when you pass an object by _value_", as per Scott Meyers's
advice. Sorry.

Desmond
 
Joe C

Phlip said:
C++ supports qualifications before their qualified types, such as "const
SimCity &". I try to write expressions with the most important part first.
There are also subtle technical reasons, in rare situations, to write
"SimCity const &", with the const after its type.

Hi Phlip, can you elaborate on this? I've been using "const datatype &data"
rather than "datatype const &data". What are the differences between these
two semantic styles?
 
DaKoadMunky

One good style rule is to never return null, and never accept null as
parameter.

Follow that rule by passing in instead a Null Object. Consider this code:

void funk(SimCity const * pCity)
{
if (pCity)
pCity->throwParade();
}

Now contrast with this:

class NullCity: public SimCity
{
public: /*virtual*/ void throwParade() {}
};

void funk(SimCity const & aCity)
{
aCity.throwParade();
}

Two questions...

What if calling SimCity::throwParade requires the caller to ensure certain
preconditions are met? Suppose ensuring those preconditions is expensive. It
seems that using a NullCity here and not checking for it (or a 0 value) could
result in unnecessary code being executed.

What if the caller of SimCity::throwParade writes code that is dependent upon
the postconditions associated with SimCity::throwParade? I am not sure how a
NullCity can be reasonably substituted in this case. Because it is a "do
nothing" object it doesn't seem correct to have it guarantee postconditions,
nor does it seem correct for calling code to respond as though an error was
present because postconditions were not met.

I am in the midst of reading about NullObject at
http://c2.com/cgi/wiki?NullObject. Maybe I will find the answer there. I am
curious as to your response though.

Thanks.
 
Phlip

Joe said:
Phlip wrote:

Hi Phlip, can you elaborate on this? I've been using "const datatype &data"
rather than "datatype const &data". What are the differences between these
two semantic styles?

It's only style - the base type should go first, so you read it first when
you scan a line.

The only technical reason I can think of is this:

#define datatype someType*
//typedef someType * datatype;

Only the post-fixed 'const' works the same for each of those two different
ways to declare datatype.
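
The difference can be checked at compile time. This is only a sketch: `datatype_macro` and `datatype_typedef` are renamed variants of `datatype` so that both forms can coexist in one file, and `someType` is an empty placeholder struct.

```cpp
#include <type_traits>

struct someType {};

#define datatype_macro someType*      // textual substitution
typedef someType* datatype_typedef;   // a real type alias

// Prefix const: the two declarations mean different things.
typedef const datatype_macro   PrefixMacro;    // expands to: const someType*  (pointer to const)
typedef const datatype_typedef PrefixTypedef;  // someType* const              (const pointer)

// Postfix const: both mean "const pointer to someType".
typedef datatype_macro const   PostfixMacro;   // expands to: someType* const
typedef datatype_typedef const PostfixTypedef; // someType* const

constexpr bool prefixAgrees  = std::is_same<PrefixMacro, PrefixTypedef>::value;
constexpr bool postfixAgrees = std::is_same<PostfixMacro, PostfixTypedef>::value;

static_assert(!prefixAgrees, "prefix const differs between macro and typedef");
static_assert(postfixAgrees, "postfix const agrees between macro and typedef");
```

With the macro, a leading `const` binds to `someType` rather than to the pointer, which is why only the trailing form behaves the same in both cases.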
 
Phlip

DaKoadMunky said:
Two questions...

What if calling SimCity::throwParade requires the caller to ensure certain
preconditions are met? Suppose ensuring those preconditions is expensive. It
seems that using a NullCity here and not checking for it (or a 0 value) could
result in unnecessary code being executed.

Uh, move those expensive things into delegates of SimCity. Call them
meetPreconditions(). Then give NullCity an empty implementation of that
method.

H. S. Lahman wrote this:

Responding to Phlip...
The Liskov Substitution Principle states (roughly) that users of polymorphic
types must not be required to detect which derived type responds to an
interface.

A thread on another newsgroup led me to suspect this violates LSP:

void funk(SimCity * pCity)
{
if (pCity != NULL)
pCity->throwParade();
}

By that estimation, NULL pointers in interfacial C++ are history.
NullObject - or better - is the way to go.

I agree with Wissler. Checking a NULL pointer is merely a check on
relationship conditionality, which is quite valid.

OTOH, one has to wonder why a NULL pointer is being passed to funk. If
anything funny is going on, the damage was already done in the caller,
such as:

void Client::doIt (City* pCity)
{
    SimCity* pSimCity; // subclass of City

    // dynamic_cast yields a null pointer if pCity is not a SimCity
    pSimCity = dynamic_cast<SimCity*>(pCity);
    funk (pSimCity);
}


FWIW, I think this is just another example of why it is a bad idea to
pass object references except as a setter for a referential attribute.
Conditionality in relationships presents enough problems to the client
without combining it with temporary relationship instantiation.

*************
There is nothing wrong with me that could
not be cured by a capful of Drano.

----8<-----------------------------------
What if the caller of SimCity::throwParade writes code that is dependent
upon the postconditions associated with SimCity::throwParade? I am not
sure how a NullCity can be reasonably substituted in this case. Because it
is a "do nothing" object it doesn't seem correct to have it guarantee
postconditions, nor does it seem correct for calling code to respond as
though an error was present because postconditions were not met.

You may want to post this to my thread on
I know I would continue to test for null-ness where needed, and would
attempt to push those sensitive things into a delegate that naturally goes
away when the NullObject is around. But like the "pimpl idiom", NullObject
is the target of an emergency refactor, not a design goal by itself. I think
your sensitive things already violated LSP before replacing the pointer with
the NullObject.
 
David Rubin

Typically, you should pass parameters *by* *address*

1. whenever the parameter is going to be modified
   e.g., int getValue(int *result, const my_Object& key);

2. whenever arrays are involved
   e.g., int bubbleSort(int *array);

3. whenever you need to express an optional argument
   e.g., my_Object::my_Object(int initialValue, an_Allocator *allocator = 0);

4. whenever you want to delegate ownership
   e.g., objectVector.push_back(new my_Object(22));
         void an_Allocator::deallocate(void *buffer);

5. when you want to pass a string
   e.g., int lookup(const char *name);

Phlip said:
One good style rule is to never return null, and never accept null as
parameter.

This is a naive rule. For example, you want to return a pointer when
it is possible for a result to be invalid, undefined, or unset:

const my_Object *lookup(const char *name);

my_Object *my_Object::singleton();

Obviously, you need to return by address when you return a dynamically
allocated object, as in a factory method.

As for *accepting* a null pointer argument, there is really no problem
with this as long as you *document* the expected behavior so that your
clients understand how to use your function:

int getValue(int *result, const my_Object& key);
// Load the value associated with the specified 'key'
// into the specified 'result'. Return 0 on success,
// and a non-zero value otherwise. The behavior is
// undefined unless 'result' is a valid pointer. Note
// that the value pointed to by 'result' is not altered
// if the function does not succeed.

Notice that "undefined" behavior means that you, the implementor, can
deallocate a null or invalid pointer, 'assert' that the parameter is
not null, or whatever.
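
As a rough illustration of that contract, here is one way `getValue` could look if the implementor chooses to 'assert'. Everything beyond the signature and the comment is an assumption made for the sketch: `my_Object` is simply aliased to `std::string`, and the `table()` helper with its single "beans" entry is invented.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical key type and backing store, invented for this sketch.
typedef std::string my_Object;

static const std::map<my_Object, int>& table() {
    static const std::map<my_Object, int> t = {{"beans", 7}};
    return t;
}

int getValue(int *result, const my_Object& key)
    // Load the value associated with the specified 'key' into the
    // specified 'result'.  Return 0 on success, and a non-zero value
    // otherwise.  The behavior is undefined unless 'result' is a valid
    // pointer; '*result' is not altered if the function does not succeed.
{
    assert(result);   // one way to use the "undefined behavior" latitude
    std::map<my_Object, int>::const_iterator it = table().find(key);
    if (it == table().end()) {
        return -1;    // failure: '*result' is left untouched
    }
    *result = it->second;
    return 0;
}
```

A call then reads `getValue(&numBeans, "beans")`, with the `&` marking the argument that gets written.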

Phlip said:
Follow that rule by passing in instead a Null Object. Consider this code:

void funk(SimCity const * pCity)
{
    if (pCity)
        pCity->throwParade();
}

Now contrast with this:

class NullCity: public SimCity
{
public: /*virtual*/ void throwParade() {}
};

void funk(SimCity const & aCity)
{
    aCity.throwParade();
}

This may be completely unreasonable when working with third-party or
legacy code. Also, it makes it impossible to detect a usage violation
without doing a dynamic_cast on an argument with a polymorphic type.
For example, if it is not valid to pass a null SimCity object to
'funk', you should pass by address and 'assert' that the parameter is
not null. If it is not *possible* to pass a null SimCity object to
'funk', you should pass by reference, and catch errors at compile
time.

Phlip said:
C++ supports qualifications before their qualified types, such as "const
SimCity &". I try to write expressions with the most important part first.
There are also subtle technical reasons, in rare situations, to write
"SimCity const &", with the const after its type.

What technical reasons? Style is not "technical"...

/david
 
Phlip

One good style rule is to never return null, and never accept null as
parameter.

David Rubin did not know his peril when he uttered:
This is a naive rule. For example, you want to return a pointer when
it is possible for a result to be invalid, undefined, or unset:

All rules are naive. Calling that rule naive is naive.

David Rubin said:
const my_Object *lookup(const char *name);

my_Object *my_Object::singleton();

Obviously, you need to return by address when you return a dynamically
allocated object, as in a factory method.

That is non-obvious. If the factory throws when it can't allocate, it can't
return NULL, hence the reason for pointing goes away. It could honestly
return a reference.

Rule 1: Prefer references to pointers unless you need pointers' special
abilities.

Rule 2: Avoid the need for pointers' special abilities.

I recently wrote a dialog box, in WTL (on MS Windows) that stores customer
names in a list box, and names and address in edit fields. The edit field
for the State is a combo box. The dialog box stores the name list in XML
(via MSXML via COM), and the list box displays tabs correctly as columns.
The edit fields link to the XML via MVC. The dialog box can localize to
Sanskrit, and can display a cancellable progress bar based on a window
timer.

The only * in the program are for passing constant strings into low-level

David Rubin said:
As for *accepting* a null pointer argument, there is really no problem
with this as long as you *document* the expected behavior so that your
clients understand how to use your function:

int getValue(int *result, const my_Object& key);
// Load the value associated with the specified 'key'
// into the specified 'result'. Return 0 on success,
// and a non-zero value otherwise. The behavior is
// undefined unless 'result' is a valid pointer. Note
// that the value pointed to by 'result' is not altered
// if the function does not succeed.

Comments suck. int result should be a full-fledged object that enforces
those behaviors, if they are important.

David Rubin said:
This may be completely unreasonable when working with third-party or
legacy code.

It may be the only salvation for such code. Obviously you can't change
someone's opaque function.

When un-f***ing-up legacy code, the ability to slip new polymorphic
behaviors into its tangled code is priceless. Introducing the ability to
polymorph a SimCity carries many more benefits than just introducing
NullObjects.

(Read /working effectively with legacy code/ by Mike Feathers.)

David Rubin said:
Also, it makes it impossible to detect a usage violation
without doing a dynamic_cast on an argument with a polymorphic type.
For example, if it is not valid to pass a null SimCity object to
'funk', you should pass by address and 'assert' that the parameter is
not null. If it is not *possible* to pass a null SimCity object to
'funk', you should pass by reference, and catch errors at compile
time.

That sounds like an argument for NullObject.

Compile time checking only catches some errors. All a NULL pointer needs to
accidentally become an undefinable reference is a single dereferencing star
* in the wrong spot.

If you are that frantic about them, use wall-to-wall unit tests. BTW my
Sanskrit dialog was written via test-first on every single feature. The test
cases can also record screen shots of the dialog's various locale skins, and
can record animations of the progress bar.

David Rubin said:
What technical reasons? Style is not "technical"...

No shit. Style is not technical? Damn, am I ever glad I clicked on this post
today! Woah, you sure set me straight on that one!
 
Richard Herring

But default-constructed "null" objects can often be passed by reference
to give the same effect.

David Rubin said:
Typically, you should pass parameters *by* *address*

1. whenever the parameter is going to be modified
   e.g., int getValue(int *result, const my_Object& key);

But a pointer might be null or point to something that's been deleted.
If you pass by non-const reference, you don't have to worry about that.

David Rubin said:
2. whenever arrays are involved
   e.g., int bubbleSort(int *array);

Well, yes, but why use arrays? (Why write your own bubblesort? Why not
pass two iterators to it? ...)

David Rubin said:
3. whenever you need to express an optional argument
   e.g., my_Object::my_Object(int initialValue, an_Allocator *allocator = 0);

Except when it's more appropriate to pass it by value. Or reference
(e.g. see how the STL passes allocators to constructors.)

David Rubin said:
4. whenever you want to delegate ownership
   e.g., objectVector.push_back(new my_Object(22));
         void an_Allocator::deallocate(void *buffer);

void* ? Real allocators mostly use Allocator::pointer.

David Rubin said:
5. when you want to pass a string
   e.g., int lookup(const char *name);

Same as the array case. But why use array of const char when you have
std::string?
 
David Rubin

Phlip said:
David Rubin did not know his peril when he uttered:


All rules are naive. Calling that rule naive is naive.


That is non-obvious. If the factory throws when it can't allocate, it can't
return NULL, hence the reason for pointing goes away. It could honestly
return a reference.

If your function throws an exception, it doesn't matter what it
returns; you need to deal with an exception. There are also situations
in which you do not want to support exceptions (possibly due to
run-time considerations). Furthermore, it is a good idea to follow the
rule that if you have a pointer, you should use a pointer. For
example, what do you gain from this?

Type& my_Factory::allocate() {
    Type *obj = new Type;
    return *obj;
}

You are also left with the conundrum of writing the analogous
'deallocate' method:

void my_Factory::deallocate(Type& obj);
    // 'obj' is altered even though it is passed by reference
and
factory.deallocate(object); // unconventional

or

void my_Factory::deallocate(Type *obj); // not symmetric to 'allocate'
and
factory.deallocate(&object);
    // How do you know 'object' is dynamically allocated?

[snip]
Comments suck. int result should be a full-fledged object that enforces
those behaviors, if they are important.

Really? What are the semantics of this function?

my_Result compute(my_Value& v1, my_Value& v2);

Comments are the *only* way most clients can understand how code
works. Most people speak better English (for example) than C++.

No shit. Style is not technical? Damn, am I ever glad I clicked on this post
today! Woah, you sure set me straight on that one!

Can you please tell us what you had in mind when you wrote the
original comment, rather than unleash your misguided sarcasm? Several
people here seem interested.

/david
 
David Rubin

[snip]
But a pointer might be null or point to something that's been deleted.
If you pass by non-const reference, you don't have to worry about that.

You are suggesting

int getValue(int& result, const my_Object& key);

which would be called like this

int rc = getValue(numBeans, pod);

This does not read well since it is difficult to tell what parameter
is being modified. See C++PL 3ed 5.5.

[snip]
Richard Herring said:
void* ? Real allocators mostly use Allocator::pointer.

Yes, this is a typo. And, this is not an 'Allocator', it's
'an_Allocator'. But the point is the same; you still return a pointer.

Richard Herring said:
Same as the array case. But why use array of const char when you have
std::string?

This is a subtle point. You want to choose a lowest common
denominator type. For example, an interface such as

int lookup(const std::string& name);

*forces* clients to use 'std::string'; it's not optional. Your client
might prefer to use 'Acme::string', or might not be able to use STL.
(This does happen sometimes...). In any case, with the 'const char *'
interface, clients *can* use 'std::string' or any other string type
which converts to 'const char *'.

This begs the question of whether or not you should *return* a
'std::string' (reference) or a 'const char *'...

/david
 
Phlip

David said:
If your function throws an exception, it doesn't matter what it
returns; you need to deal with an exception. There are also situations
in which you do not want to support exceptions (possibly due to
run-time considerations). Furthermore, it is a good idea to follow the
rule that if you have a pointer, you should use a pointer. For
example, what do you gain from this?

Type& my_Factory::allocate() {
    Type *obj = new Type;
    return *obj;
}

You are also left with the conundrum of writing the analogous
'deallocate' method:

void my_Factory::deallocate(Type& obj);
    // 'obj' is altered even though it is passed by reference
and
factory.deallocate(object); // unconventional

or

void my_Factory::deallocate(Type *obj); // not symmetric to 'allocate'
and
factory.deallocate(&object);
    // How do you know 'object' is dynamically allocated?

Add a smart shared pointer to the above - if your design truly needs 'new'.
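
A sketch of what that could look like. The post does not name a particular smart pointer, so using `std::shared_ptr` and `std::make_shared` from C++11's `<memory>` is my assumption; the body of `Type` and the helper `ownersWhileShared` are likewise invented for illustration.

```cpp
#include <cassert>
#include <memory>

struct Type {
    int value = 22;
};

// Hypothetical factory: ownership travels with the returned smart pointer,
// so no matching deallocate() is needed.
struct my_Factory {
    std::shared_ptr<Type> allocate() { return std::make_shared<Type>(); }
};

// Returns the number of owners while two handles to the object exist.
long ownersWhileShared() {
    my_Factory factory;
    std::shared_ptr<Type> object = factory.allocate();
    std::shared_ptr<Type> alias = object;   // second owner
    return object.use_count();              // the object is freed once both are gone
}
```

The question of whether `deallocate` should take a pointer or a reference disappears, because the last owner to go out of scope releases the object.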

David said:
Really? What are the semantics of this function?

my_Result compute(my_Value& v1, my_Value& v2);

Comments are the *only* way most clients can understand how code
works. Most people speak better English (for example) than C++.

I would work on making that function's unit test self-documenting before
working on its comments. You can't test a comment.

I'm not sure but there might be more reasons than this:

#define datatype someType*
//typedef someType * datatype;

Only the post-fixed 'const' works the same for each of those two different
ways to declare datatype.

Insert the standard newsgroup screed against #define here.
 
Richard Herring

David Rubin said:
[snip]
But a pointer might be null or point to something that's been deleted.
If you pass by non-const reference, you don't have to worry about that.

You are suggesting

int getValue(int& result, const my_Object& key);

which would be called like this

int rc = getValue(numBeans, pod);

This does not read well since it is difficult to tell what parameter
is being modified. See C++PL 3ed 5.5.

...which says, inter alia, 'Consequently "plain" reference arguments
should be used only where the name of the function gives a strong hint
that the reference argument is modified.'

"Get" in the function's name _would_ be such a hint were it not for the
fact that the function returns _another_ value as its return value. I'd
be inclined to question why. If it's an error indicator, perhaps you'd
be better throwing an exception instead of returning a code which has to
be tested.
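
A sketch of that alternative, assuming the int return value really is just an error indicator. `my_Object` is aliased to `std::string`, the one-entry table is invented, and `std::out_of_range` is only one plausible choice of exception type.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical key type and table, invented for this sketch.
typedef std::string my_Object;

int getValue(const my_Object& key) {
    static const std::map<my_Object, int> table = {{"beans", 7}};
    std::map<my_Object, int>::const_iterator it = table.find(key);
    if (it == table.end()) {
        throw std::out_of_range("no value for key: " + key);
    }
    return it->second;   // the result is the return value; no out-parameter
}

// Returns true if looking up 'key' throws std::out_of_range.
bool lookupThrows(const my_Object& key) {
    try {
        getValue(key);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

With the result as the return value there is no modified argument at all, so the pointer-versus-reference question does not arise for this function.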

David Rubin said:
[snip]
void* ? Real allocators mostly use Allocator::pointer.

Yes, this is a typo. And, this is not an 'Allocator', it's
'an_Allocator'. But the point is the same; you still return a pointer.

Same as the array case. But why use array of const char when you have
std::string?

This is a subtle point. You want to choose a lowest common
denominator type. For example, an interface such as

int lookup(const std::string& name);

*forces* clients to use 'std::string'; it's not optional. Your client
might prefer to use 'Acme::string',

So they have to write lookup(std::string(acmeString.c_str())). That's
not so terrible.

David Rubin said:
or might not be able to use STL.

If we're talking about standard (hosted) C++, I think that possibility
is somewhat academic.

David Rubin said:
(This does happen sometimes...). In any case, with the 'const char *'
interface, clients *can* use 'std::string' or any other string type
which converts to 'const char *'.

But now they can't pass a string which contains '\0'. This happens
sometimes, too.
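
The embedded '\0' problem is easy to demonstrate. In this sketch the two `lengthSeenBy...` functions are hypothetical stand-ins for the two `lookup` interfaces; they only report how much of the name each interface can see.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical stand-ins for the two lookup interfaces.
std::size_t lengthSeenByCharPtr(const char* name)       { return std::strlen(name); }
std::size_t lengthSeenByString(const std::string& name) { return name.size(); }

// Builds a 3-character string whose middle character is '\0'.
std::string nameWithEmbeddedNul() {
    return std::string("a\0b", 3);
}
```

Passing `nameWithEmbeddedNul().c_str()` to the `const char *` version silently truncates the name at the first '\0', while the `std::string` version sees all three characters.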

David Rubin said:
This begs the question of whether or not you should *return* a
'std::string' (reference) or a 'const char *'... /david

Neither. Return a std::string by value and you have no need to argue
about ownership or lifetime of the result.
 
