Don't pass by reference to non-const?

Alf P. Steinbach · May 2, 2010

I was about to post the same comment (until some prawn in a digger hit
our power cable!). I ran a quick test and found passing by reference
slower than returning a vector of 10000 doubles. I had expected them to
be about the same.

When you do it manually instead of relying on RVO you have an extra default
construction and an extra swap, but it shouldn't really matter for efficiency.

What results did you obtain?

Doing it manually is more fragile, though.

Cheers,

- Alf

Alf P. Steinbach · May 2, 2010

Yes and no. The logic is that in arguments are references, out
and inout pointers. This has two advantages:

-- It requires explicit action at the call site in order to
pass an out argument; you can see immediately that the
argument might change.

Sorry, no, you can't in general.

It's just an urban myth.

For example,

foo( v );

Here v might be a pointer denoting an out argument. Or v might be an object with
pointers to other objects. Concluding that foo can't change anything because
there's no address operator in the call is just stupid; one has to know about
foo, and for that matter, about v.
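To make the point concrete, here is a minimal sketch. The names `Handle`, `append_answer` and the two `foo` overloads are hypothetical, invented only for illustration. Neither call to `foo` has an address operator at the call site, yet both change data the caller can observe.

```cpp
#include <cassert>
#include <vector>

// Hypothetical function following the "out arguments are pointers" convention.
inline void foo(std::vector<int>* out) {
    assert(out != nullptr);
    out->push_back(42);
}

// Hypothetical handle type: a cheap, copyable object that points at other data.
struct Handle {
    std::vector<int>* target;
};

// Takes its argument BY VALUE, yet still changes what the handle refers to.
inline void foo(Handle h) {
    h.target->push_back(7);
}

// A helper that already holds a pointer, so its call to foo has no '&' in it.
inline void append_answer(std::vector<int>* v) {
    foo(v);   // v is an out argument here, with no address operator in sight
}
```

A reader who sees only `foo(v)` or `foo(h)` has to look up the declarations of `foo`, `v` and `h` to know whether anything is modified.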

-- It ensures that even if you currently compile with a
compiler that doesn't enforce the rule forbidding
initializing a non-const reference with an rvalue (VC++, for
example), your code will compile with more "correct"
compilers: all compilers require an lvalue in order to use
the (built-in)& operator.

As I recall VC warns about that, so, no problem, argument moot.

Neither are "killer" arguments

They're certainly not "killer" arguments: they're void arguments.

, and certainly a lot of other
conventions are possible. In the end, you only have two
choices, and there is far more than one binary piece of information that
might be useful to indicate. Still, way, way back, this was the
convention I preferred.

Well, I've also used silly in-practice-detrimental conventions, like

void foo()
{
bar();
}

But one would sort of expect a modern firm like Google to not /require/ its
developers to do such silly things, and to have understood the problems of
micro-management and simply not do that.

Hence my conclusion that it's most likely political.

I'd gotten out of it, because the
places I'd worked in used other conventions, but the circle has
turned, and where I work now uses it (and used it before I got
there).

The state of knowledge & understanding advances /very/ slowly.

Cheers,

- Alf

Kai-Uwe Bux · May 3, 2010

Stuart said:
James said:

( [...]
And for things like +=, as I said, there are
some very good arguments for making it a non-member. Even if I
don't do so.)

What are the arguments for making things like operator+= a non-member,
out of interest? I'm aware of an (the?) argument for things like
operator+, where you often want to allow conversions on the left-hand
argument as well as the right, but what's the argument for the '='
versions please?

If you make it a non-member, operator+= behaves more like the version for
built-in types: you cannot call it on a temporary. As a member function, you
could:

*( begin() += 5 )
*( ++ begin() )

Whether that is an advantage or a shortcoming depends on your point of view.
However, for implementations of the standard library iterators, I would
choose the non-member approach since ++begin() is not guaranteed to compile
by the standard and I would love to see the implementation catch that
non-portable use.
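The difference can be sketched with two hypothetical iterator-like types (`MemberIter` and `FreeIter` are made up for illustration and are not standard library types). The trait `plus_assign_on_temporary` is also an assumption of this sketch: it merely asks the compiler whether `temporary += 1` is well-formed.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical iterator-like type whose += is a member function.
struct MemberIter {
    int pos = 0;
    MemberIter& operator+=(int n) { pos += n; return *this; }
};

// Hypothetical iterator-like type whose += is a non-member taking T&.
struct FreeIter {
    int pos = 0;
};
inline FreeIter& operator+=(FreeIter& it, int n) { it.pos += n; return it; }

// Detects whether "temporary-of-T += int" is a well-formed expression.
template <typename T, typename = void>
struct plus_assign_on_temporary : std::false_type {};

template <typename T>
struct plus_assign_on_temporary<T, decltype(void(std::declval<T>() += 1))>
    : std::true_type {};

// A member operator can be called on a temporary ...
static_assert(plus_assign_on_temporary<MemberIter>::value,
              "member += accepts a temporary");
// ... a non-member taking a non-const reference cannot.
static_assert(!plus_assign_on_temporary<FreeIter>::value,
              "non-member += rejects a temporary");

// On named objects (lvalues) both flavours behave the same.
inline int advance_both_by(int n) {
    MemberIter m;
    FreeIter f;
    m += n;
    f += n;
    assert(m.pos == f.pos);
    return f.pos;
}
```

With the non-member form, an expression like `FreeIter() += 5` is rejected at compile time, which is the behaviour of the built-in types.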

Best

Kai-Uwe Bux

Keith H Duggar · May 3, 2010

What are the arguments for making things like operator+= a non-member,
out of interest? I'm aware of an (the?) argument for things like
operator+, where you often want to allow conversions on the left-hand
argument as well as the right, but what's the argument for the '='
versions please?

For some reasons read:

http://www.drdobbs.com/cpp/184401197

Most of the arguments apply to both operator and non-operator
functions.

KHD

James Kanze · May 4, 2010

I love it when people comment about things they know nothing
about. We made the measurements with a number of different
compilers: VC++, g++, Intel and Sun CC. In all cases, passing
the pre-constructed vector to the function was significantly
faster than returning a vector.

When you do it manually instead of relying on RVO you have an
extra default construction and an extra swap, but it
shouldn't really matter for efficiency.

The compiler can't always use RVO. Our two versions were:

std::vector<double> v(...);
for (... lots of iterations ... )
{
    // calculate some values based on the current contents
    // of v.
    v = some_function(... the calculated values ...);
    // ...
}

as opposed to

std::vector<double> v(...);
for (... lots of iterations ... )
{
    // calculate some values based on the current contents
    // of v.
    some_function(&v, ... the calculated values ...);
    // ...
}

(As far as I can tell, this is a more or less standard procedure
in numerical analysis. Although in some cases, you might have
two vectors, one with the old values, and one in which you put
the new, swapping them each time you go through the loop.)
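A self-contained sketch of the two shapes follows. It is not the code that was measured: `compute_returning`, `compute_into` and `buffer_stays_put` are hypothetical stand-ins for the elided `some_function`, and the arithmetic is a placeholder. What the sketch shows is only that the out-parameter form keeps using the caller's buffer from one iteration to the next.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Shape 1: return by value. Every call builds a fresh vector, so every call allocates.
inline std::vector<double> compute_returning(std::size_t n, double scale) {
    std::vector<double> result;
    result.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        result.push_back(scale * static_cast<double>(i));
    return result;
}

// Shape 2: out parameter passed as a pointer. The caller's storage is reused.
inline void compute_into(std::vector<double>* out, std::size_t n, double scale) {
    assert(out != nullptr);
    out->clear();   // drops the elements but keeps the capacity
    for (std::size_t i = 0; i < n; ++i)
        out->push_back(scale * static_cast<double>(i));
}

// Runs the out-parameter loop and reports whether the buffer stayed at the
// same address after the first iteration, i.e. no reallocation took place.
inline bool buffer_stays_put(std::size_t n, int iterations) {
    std::vector<double> v;
    compute_into(&v, n, 1.0);            // first call grows the buffer
    const double* first = v.data();
    for (int k = 1; k < iterations; ++k) {
        compute_into(&v, n, static_cast<double>(k));
        if (v.data() != first) return false;
    }
    return true;
}
```

With a 2010-era compiler, `v = compute_returning(...)` was a copy assignment from a temporary; with move assignment the element copy goes away, but each call still allocates a new buffer.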

James Kanze · May 4, 2010

On 02.05.2010 11:45, * James Kanze:

[...]

Sorry, no, you can't in general.

It's just an urban myth.

For example,

foo( v );

Here v might be a pointer denoting an out argument.

The caller of foo has v handy. He knows it's a pointer.

Or v might be an object with pointers to other objects.
Concluding that foo can't change anything because there's no
address operator in the call is just stupid; one has to know
about foo, and for that matter, about v.

Sure, you can do all sorts of stupid things, regardless of the
convention used.

In actual practice, this particular convention reduces
errors. Measurably. So do some of the other conventions
suggested. So you have to choose according to which one seems
most important in your context.

As I recall VC warns about that, so, no problem, argument moot.

As I recall, it doesn't, but it possibly (probably) depends on
the version.

They're certainly not "killer" arguments: they're void
arguments.

Except in actual practice.

Well, I've also used silly in-practice-detrimental conventions, like

void foo()
{
bar();
}

But one would sort of expect a modern firm like Google to not
/require/ its developers to do such silly things, and to have
understood the problems of micro-management and simply not do
that.

Hence my conclusion that it's most likely political.

No one disputes your "conclusion", since any choice between two
or more alternatives is in some way "political".

Alf P. Steinbach · May 4, 2010

On 02.05.2010 11:45, * James Kanze:
[...]

Yes and no. The logic is that in arguments are references, out
and inout pointers. This has two advantages:
-- It requires explicit action at the call site in order to
pass an out argument; you can see immediately that the
argument might change.

Sorry, no, you can't in general.

It's just an urban myth.

For example,

foo( v );

Here v might be a pointer denoting an out argument.

The caller of foo has v handy. He knows it's a pointer.

I think you mean "the one who writes the code calling foo", otherwise it doesn't
make sense.

But the alleged advantage is not for that person writing the code, who would
know it was an out-param anyway.

The alleged advantage is about someone else reading the code, who allegedly
could see from the presence or absence of an address operator in the call
whether it's an out-param. And clearly he/she can't. Hence the argument is void.

Sure, you can do all sorts of stupid things, regardless of the
convention used.

Yes, but that is not relevant.

In actual practice, this particular convention reduces
errors. Measurably. So do some of the other conventions
suggested. So you have to choose according to which one seems
most important in your context.

As I recall, it doesn't, but it possibly (probably) depends on
the version.

<example>
C:\test> cl /W4 x.cpp
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.30729.01 for 80x86
Copyright (C) Microsoft Corporation. All rights reserved.

x.cpp
x.cpp(5) : warning C4239: nonstandard extension used : 'argument' : conversion
from 'T' to 'T &'
A non-const reference may only be bound to an lvalue
Microsoft (R) Incremental Linker Version 9.00.30729.01
Copyright (C) Microsoft Corporation. All rights reserved.

/out:x.exe
x.obj

C:\test>
</example>

Some do maintain that upping the warning level is impractical due to
Microsoft's own dirty code. However, most Windows headers (in particular the
full [windows.h]) compile without warnings at level 4, when the appropriate few
silly-warnings have been turned off. Which one should do as a matter of course.
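The rule behind that diagnostic can also be checked by the compiler itself. The following is a sketch only: the struct `T` stands in for the type named in the warning, and `by_ref`, `by_ptr` and the trait `accepts_by_ref` are hypothetical names. It asserts what standard C++ requires; how a compiler with the extension enabled evaluates the trait is not something this sketch establishes.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct T { int value = 0; };   // stand-in for the 'T' in the diagnostic

// Out argument passed as a non-const reference.
inline void by_ref(T& t) { t.value = 1; }

// Out argument passed as a pointer.
inline void by_ptr(T* t) { assert(t != nullptr); t->value = 2; }

// Detects whether by_ref(expr) is well-formed for an argument of type Arg.
template <typename Arg, typename = void>
struct accepts_by_ref : std::false_type {};

template <typename Arg>
struct accepts_by_ref<Arg, decltype(void(by_ref(std::declval<Arg>())))>
    : std::true_type {};

// An lvalue binds to T&.
static_assert(accepts_by_ref<T&>::value, "lvalue binds to T&");
// In standard C++ an rvalue does not.
static_assert(!accepts_by_ref<T>::value, "rvalue must not bind to T&");
```

The pointer form gets the same protection from a different rule: the built-in `&` operator needs an lvalue, so `by_ptr(&T())` is ill-formed.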

Except in actual practice.

You're saying "except in some circumstances that I refuse to discuss". Oh well.

No one disputes your "conclusion", since any choice between two
or more alternatives is in some way "political".

Cheers,

- Alf

Balog Pal · May 8, 2010

James Kanze said:
I love it when people comment about things they know nothing
about.

.... or fail to read minds auto-correcting misleading forum comments?

The compiler can't always use RVO. Our two versions were:

std::vector<double> v(...);
for (... lots of iterations ... )
{
    // calculate some values based on the current contents
    // of v.
    v = some_function(... the calculated values ...);
    // ...
}

as opposed to

std::vector<double> v(...);
for (... lots of iterations ... )
{
    // calculate some values based on the current contents
    // of v.
    some_function(&v, ... the calculated values ...);
    // ...
}

In this example you clearly use the vector as INOUT parameter, and have a
performance gain from that fact. Originally we were talking about OUT
parameters...

James Kanze · May 8, 2010

... or fail to read minds auto-correcting misleading forum
comments?

In this example you clearly use the vector as INOUT parameter,
and have a performance gain from that fact.

In this example (in the actual code), the vector is a pure
output parameter.

Originally we were talking about OUT parameters...

So was I.

Thomas J. Gritzan · May 8, 2010

On 08.05.2010 17:43, Balog Pal wrote:

.... or fail to read minds auto-correcting misleading forum comments?

In this example you clearly use the vector as INOUT parameter, and have
a performance gain from that fact. Originally we were talking about OUT
parameters...

No. The performance gain comes from the fact that he only allocates
memory once and uses this memory for all iterations.
It's the same with std::getline (in a loop), where the output string's
capacity grows to the maximal line length at some time, and then there's
no more allocation unless you copy the strings.
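A minimal sketch of that std::getline pattern, assuming a hypothetical helper `longest_line` (and a convenience wrapper `longest_line_in` for in-memory text). The string is declared once outside the loop, so each call overwrites it and can reuse whatever capacity it has already grown to.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Reads every line from 'in', reusing ONE string object as the output
// argument of std::getline, and returns the length of the longest line.
inline std::size_t longest_line(std::istream& in) {
    std::string line;                  // declared once, outside the loop
    std::size_t longest = 0;
    while (std::getline(in, line)) {   // overwrites 'line' on each call
        if (line.size() > longest) longest = line.size();
    }
    return longest;
}

// Convenience wrapper for in-memory text.
inline std::size_t longest_line_in(const std::string& text) {
    std::istringstream in(text);
    const std::size_t longest = longest_line(in);
    assert(longest <= text.size());    // a line cannot be longer than the text
    return longest;
}
```

Declaring `line` inside the loop body instead would construct and destroy a string per iteration, which is the analogue of returning a fresh vector each time.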

Ian Collins · May 8, 2010

On 08.05.2010 17:43, Balog Pal wrote:

No. The performance gain comes from the fact that he only allocates
memory once and uses this memory for all iterations.
It's the same with std::getline (in a loop), where the output string's
capacity grows to the maximal line length at some time, and then there's
no more allocation unless you copy the strings.

That is true, but it's interesting how close the two approaches are in
performance when that factor is removed. Comparing case 1:

std::vector<std::vector<double> > v(loops);
for( int n = 0; n < loops; ++n )
{
    fn1(v[n]);
    double d = v[n][4999];
}

with case 2:

std::vector<std::vector<double> > v(loops);
for( int n = 0; n < loops; ++n )
{
    v[n] = fn2();
    double d = v[n][4999];
}

where fn1 and fn2 are:

void fn1( std::vector<double>& v ) {
    for( int i = 0; i < 1000000; ++i ) v.push_back( 42.0*i );
}

std::vector<double> fn2() {
    std::vector<double> v;
    for( int i = 0; i < 1000000; ++i ) v.push_back( 42.0*i );
    return v;
}

I see 3.73 seconds for case 1 and 4.07 seconds for case 2.
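For anyone wanting to repeat the comparison, here is one way the two cases could be wrapped in a timing harness. This is a sketch under assumptions: `seconds_for`, `time_case1`, `time_case2` and `kElements` are names invented here, the clock is `std::chrono::steady_clock` (C++11), and no timings are claimed; results depend on compiler, flags and allocator.

```cpp
#include <cassert>
#include <chrono>
#include <vector>

const int kElements = 1000000;   // elements per inner vector, as in the post

inline void fn1(std::vector<double>& v) {
    for (int i = 0; i < kElements; ++i) v.push_back(42.0 * i);
}

inline std::vector<double> fn2() {
    std::vector<double> v;
    for (int i = 0; i < kElements; ++i) v.push_back(42.0 * i);
    return v;
}

// Runs 'body' once and returns the elapsed wall-clock time in seconds.
template <typename F>
double seconds_for(F body) {
    const std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
    body();
    const std::chrono::steady_clock::time_point stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Case 1: fill each inner vector through a reference.
inline double time_case1(int loops) {
    std::vector<std::vector<double> > v(loops);
    double sink = 0.0;   // accumulated so the reads cannot be optimised away
    const double t = seconds_for([&] {
        for (int n = 0; n < loops; ++n) { fn1(v[n]); sink += v[n][4999]; }
    });
    assert(sink == loops * 42.0 * 4999);
    return t;
}

// Case 2: assign each inner vector from a returned value.
inline double time_case2(int loops) {
    std::vector<std::vector<double> > v(loops);
    double sink = 0.0;
    const double t = seconds_for([&] {
        for (int n = 0; n < loops; ++n) { v[n] = fn2(); sink += v[n][4999]; }
    });
    assert(sink == loops * 42.0 * 4999);
    return t;
}
```

Calling `time_case1(loops)` and `time_case2(loops)` with the same `loops` gives two numbers to compare. Both cases start from empty inner vectors, so neither benefits from a reused buffer.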

James Kanze · May 9, 2010

[...]

That is true, but it's interesting how close the two
approaches are in performance when that factor is removed.
Comparing case 1:

std::vector<std::vector<double> > v(loops);
for( int n = 0; n < loops; ++n )
{
    fn1(v[n]);
    double d = v[n][4999];
}

with case 2:

std::vector<std::vector<double> > v(loops);
for( int n = 0; n < loops; ++n )
{
    v[n] = fn2();
    double d = v[n][4999];
}

where fn1 and fn2 are:

void fn1( std::vector<double>& v ) {
    for( int i = 0; i < 1000000; ++i ) v.push_back( 42.0*i );
}

That's not quite the same thing. A closer approximation would
be something like:

std::vector<double> f1()
{
    // builds and returns a fresh vector on every call
    std::vector<double> retval;
    for ( int i = 0; i < 1000000; ++i ) retval.push_back( 42.0 * i );
    return retval;
}

void f2( std::vector<double>& retval )
{
    // empties the caller's vector but keeps its capacity
    retval.clear();
    for ( int i = 0; i < 1000000; ++i ) retval.push_back( 42.0 * i );
}

or even:

void f2( std::vector<double>& retval )
{
    // the caller must have sized the vector beforehand
    assert( retval.size() >= 1000000 );
    for ( int i = 0; i < 1000000; ++i ) retval[i] = 42.0 * i;
}

Where f1 or f2 is called in a loop (and in the case of f2, the
definition of the vector is outside of the loop).

std::vector<double> fn2() {
std::vector<double> v;
for( int i = 0; i < 1000000; ++i ) v.push_back( 42.0*i );
return v;
}

I see 3.73 seconds for case 1 and 4.07 seconds for case 2.

The issue isn't as clear cut as that, even in your example. In
a real program, for example, using push_back each time on a new
vector will result in more allocations, and possibly (probably?)
more fragmentation.

Anyway, the fact is that we originally used the "natural"
version, returning the vector, that the program was too slow,
that one of the changes we tried was passing the vector as an
out parameter, and that it made a significant difference.

And that the compiler we were using did RVO (both named and
unnamed), but that the context we were working in didn't allow it.
(IIRC, the vector was part of a larger struct.)

Balog Pal · May 9, 2010

James Kanze said:
In this example (in the actual code), the vector is a pure
output parameter.

It isn't, as you smuggle in its previous state across iterations. You even
init it to something before the first.

Can you show the content of some_function please to make the source of
performance clear?

Balog Pal · May 9, 2010

Thomas J. Gritzan said:
No. The performance gain comes from the fact that he only allocates
memory once and uses this memory for all iterations.

Of course. And then what does your "No" apply to? That preallocated memory is
an IN param used in the called function. That is why it is not an out, but an
inout case.

Thomas J. Gritzan · May 10, 2010

On 09.05.2010 22:49, Balog Pal wrote:

Of course. And then what does your "No" apply to? That preallocated memory
is an IN param used in the called function. That is why it is not an out, but
an inout case.

When passing memory counts as IN param, then every OUT parameter is also
an IN parameter, because if you wouldn't pass (pointers to) memory in,
you would have no place to store the OUT parameters.

Balog Pal · May 10, 2010

Thomas J. Gritzan said:
When passing memory counts as IN param, then every OUT parameter is also
an IN parameter, because if you wouldn't pass (pointers to) memory in,
you would have no place to store the OUT parameters.

IMO that is too simplistic and impractical a view.

If the called function does something like:

void foo( Vector & param)
{
Vector res;
/// ... fill res;
param.swap(res);
}

then it uses param as out only. Any state param could have will not leak
in. However,

{
param.resize(0);
param.reserve(100);
// do 100 push-backs
}
or
{
param.resize(100);
// assign to elements [0] - [99]
}

uses the state of param as it was passed in. And the performance of the function
will be different in the latter cases depending on whether you do just
Vector v;
or also do
v.reserve(100);
before calling foo(v);

Or like in a loop example just let it roll -- the first time allocation
happens then all the further calls keep the block.

Selling this as pure OUT parameter will not fly in my terminology.

The issue is not straightforward because the state of the contained elements does
not differ between the alternatives. But capacity() is also part of the state.
With another collection, we could have similar state but entirely private,
and it still would count.
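The distinction can be made observable by watching the buffer address. In this sketch `fill_by_swap`, `fill_in_place` and `keeps_callers_buffer` are hypothetical names, `Vector` is assumed to be `std::vector<double>`, and the element values are placeholders.

```cpp
#include <cassert>
#include <vector>

typedef std::vector<double> Vector;   // assumed meaning of 'Vector' in the post

// Pure OUT: the result is built locally and swapped in, so nothing about
// the incoming state of 'param' (not even its capacity) is used.
inline void fill_by_swap(Vector& param) {
    Vector res;
    for (int i = 0; i < 100; ++i) res.push_back(i);
    param.swap(res);   // the old buffer of 'param' is released along with 'res'
}

// INOUT in the sense argued above: reuses whatever capacity 'param' brings.
inline void fill_in_place(Vector& param) {
    param.resize(0);
    param.reserve(100);   // does nothing if the capacity is already sufficient
    for (int i = 0; i < 100; ++i) param.push_back(i);
    assert(param.size() == 100);
}

// Reports whether 'fill' left the caller's pre-reserved buffer in place.
template <typename Fill>
bool keeps_callers_buffer(Fill fill) {
    Vector v;
    v.reserve(100);
    v.push_back(0.0);                  // make sure data() points into the buffer
    const double* before = v.data();
    fill(v);
    return v.data() == before;
}
```

Both functions leave the same elements in the vector; they differ only in whether the capacity handed in by the caller influences what happens inside the call.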

AnonJ · Dec 14, 2013

What are you guys talking about? Seriously. What "politics"? Even the least competent company won't let such ridiculous nonsense ruin their productivity, not to mention Google, obviously. I think the reason why they use it is abundantly self-evident:

In C, if a function needs to modify a variable, the parameter must be a pointer, e.g. int foo(int *pval).

Google has loads of old code written in C. So obviously the reason why they do so is to preserve maximum interoperability even if the new code is written in C++. This is why. Clear and simple, and overwhelmingly convincing.

General caution is good, but why some people would always be so cynical and suspect the very worst at the first sight of somebody/something new is truly beyond my comprehension. I totally dislike such presumptions.
