Passing by const & and returning a temp vs passing by value and returning it

Victor Bazarov

In the project I'm maintaining I've seen two distinct techniques used for
returning an object from a function. One is

AType function(AType const& arg)
{
    AType retval(arg); // or default construction and then..
    // some other processing and/or changing 'retval'
    return retval;
}

void foo()
{
    AType somevariable;

    AType anothervariable = function(somevariable); // pass by ref
    // some other processing
}

and the other is

AType function(AType retval)
{
    // some other processing and changing 'retval'
    return retval;
}

void foo()
{
    AType somevariable;

    AType anothervariable = function(somevariable); // pass by value
    // some other processing
}

As you can see the difference is when the variable which later is returned
is constructed. Yes, I realize the difference is really minimal. It
probably has no effect on performance (and I wouldn't try to speculate one
way or another). The number of objects [copy-] constructed is probably
the same (even in theory, RVO aside). The objects being passed around are
*not* polymorphic. What am I forgetting? Copy-construction is defined
for them and in general no tricks are played. All relatively straight-
forward.

Now, my question is, why would I want to prefer/preserve one way of using
values passed to the function or should I keep both? Do you see any reefs
under the surface, any potential issues with one method or with having
both methods in the same project? Would the optimizer stumble on one of
the methods and not the other, for example?

I'd appreciate any insight. Thanks!

V
 
Howard

Ahmed MOHAMED ALI said:
The number of objects [copy-] constructed is probably
the same (even in theory, RVO aside
No.
AType function(AType retval) --> copy constructor will be called .

Eh? Look at his post again:

The copy constructor is called there, too. The difference is that here it's
called inside the function, and in the other case it was called in order to
pass a copy to the function in the first place.

Ahmed MOHAMED ALI said:
Use this version only if AType has a simple and fast initialization;
otherwise passing the argument by reference is better (const reference if you
will not modify the argument).
Ahmed MOHAMED ALI

I think you missed what I pointed out above, that both versions are making
copies. If he were writing functions that didn't return a copy (thus making
the copy necessary in both cases), then your advice would make more sense.
In this case, it's not apparent that there is a difference, which is why
Victor is looking for more insight (than he already has!) into the problem.

-Howard
 
Howard

Howard said:
If he were writing functions that didn't return a copy (thus making the
copy necessary in both cases), then your advice would make more sense.

Re-worded to be more clear:

He's writing functions that return a copy, thus making the copy-construction
necessary in both cases. If the functions were *not* returning a copy,
then your advice would make more sense.

-Howard
 
lilburne

Victor said:
Now, my question is, why would I want to prefer/preserve one way of using
values passed to the function or should I keep both? Do you see any reefs
under the surface, any potential issues with one method or with having
both methods in the same project? Would the optimizer stumble on one of
the methods and not the other, for example?

I'd appreciate any insight. Thanks!


We (http://tinyurl.com/7y38k) prefer the form

void func(const myClass& input, myClass& output);

We discourage copy construction and assignment by default; our
make-a-new-class script makes both of those methods private. Why? Well,
our classes can contain 100Mb+ of data and we don't want those being
copied about or assigned unnecessarily, thus we'll go to any lengths to
avoid someone inadvertently writing:

myClass func(myClass input);
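
For illustration, here is a minimal sketch of the scheme described above, assuming the old idiom of declaring the copy constructor and assignment operator private and never defining them. The data member and the helpers copyFrom, scale and example_usage are hypothetical and are not taken from the actual code base.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a class that owns a large amount of data.
class myClass {
public:
    myClass() {}
    explicit myClass(std::size_t n) : data_(n, 1.0) {}

    std::size_t size() const { return data_.size(); }
    double at(std::size_t i) const { return data_[i]; }

    // Explicit, named operation for the cases where a copy really is wanted.
    void copyFrom(const myClass& other) { data_ = other.data_; }

    void scale(double factor) {
        for (std::size_t i = 0; i < data_.size(); ++i) data_[i] *= factor;
    }

private:
    // Declared but never defined: client code cannot copy or assign.
    myClass(const myClass&);
    myClass& operator=(const myClass&);

    std::vector<double> data_;
};

// Preferred form: input by const reference, result in a caller-owned object.
void func(const myClass& input, myClass& output) {
    output.copyFrom(input);
    output.scale(2.0);
}

// Small self-check showing the intended call pattern.
void example_usage() {
    myClass a(4);
    myClass b;            // the caller owns the result object
    func(a, b);
    assert(b.size() == 4 && b.at(0) == 2.0);
    // myClass c = a;     // does not compile: the copy constructor is private
}
```

With this in place, passing an existing myClass object by value is rejected at compile time, so an accidental copy cannot slip in unnoticed.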
 
Ahmed MOHAMED ALI

Victor Bazarov said:
The number of objects [copy-] constructed is probably
the same (even in theory, RVO aside).

No.
AType function(AType retval) --> the copy constructor will be called.
Use this version only if AType has a simple and fast initialization;
otherwise passing the argument by reference is better (const reference if you
will not modify the argument).
Ahmed MOHAMED ALI

 
Howard

lilburne said:
We (http://tinyurl.com/7y38k) prefer the form

void func(const myClass& input, myClass& output);

Yuk! :)

lilburne said:
we discourage copy construction and assignment by default, our make-a-new
class script makes both of those methods private. Why? Well our classes
can contain 100Mb+ of data and we don't want those being copied about or
assigned unnecessarily, thus we'll go to any lengths to avoid someone
inadvertently writing:

myClass func(myClass input);

Avoiding mistakes is a noble concept, but I think you go too far. I much
prefer that whenever I have only one output from a function, then that
output be the return value. I have at least a couple of reasons for that
preference. First, it makes it obvious what the output is, without having
to name it "output". Second, it allows use in an assignment or other
statement, instead of having to assign to a local variable and then write a
whole new statement to actually use it. (Think about operators... how would
you write operator +() given your method above?)

And since I prefer to only have one output from a function whenever
possible, I pretty much always follow that rule. (And of course, if I want
to actually modify the variable passed instead of returning something, then
I pass by reference or pointer, as appropriate.)
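
To make the operator point concrete, here is a small sketch comparing the two styles. Point, add and styles_agree are hypothetical names invented for this example.

```cpp
#include <cassert>

// Hypothetical small value type, used only for illustration.
struct Point {
    double x, y;
    Point(double x_ = 0.0, double y_ = 0.0) : x(x_), y(y_) {}
};

// Return-by-value style: the single output is the return value.
Point operator+(Point const& a, Point const& b) {
    return Point(a.x + b.x, a.y + b.y);
}

// Output-parameter style: the caller must create the result object first.
void add(Point const& a, Point const& b, Point& output) {
    output.x = a.x + b.x;
    output.y = a.y + b.y;
}

// Sums three points both ways and reports whether the two styles agree.
bool styles_agree() {
    Point a(1, 2), b(3, 4), c(5, 6);

    Point sum1 = a + b + c;   // one expression, no named temporaries

    Point tmp, sum2;          // intermediate result managed by hand
    add(a, b, tmp);
    add(tmp, c, sum2);

    return sum1.x == sum2.x && sum1.y == sum2.y;
}
```

Both versions compute the same sum. The difference is only in how much bookkeeping the caller has to write.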

-Howard
 
E. Robert Tisdale

A working example might help:
cat AType.h
#ifndef GUARD_ATYPE_H
#define GUARD_ATYPE_H 1

#include <iostream>

class AType {
private:
  // representation
  int I;
public:
  friend
  std::ostream& operator<<(std::ostream& os, AType const& t) {
    return os << t.I;
  }

  // logs every construction from an int
  AType(int i = 0): I(i) {
    std::cerr << "AType(int)" << std::endl;
  }

  // logs every copy construction
  AType(AType const& t): I(t.I) {
    std::cerr << "AType(AType const&)" << std::endl;
  }
};

#endif//GUARD_ATYPE_H

cat main.cc
#include "AType.h"

#ifndef BY_VALUE
AType function(AType const& arg) {      // pass by reference
  AType retval(arg);    // or default construction and then..
  // some other processing and modifying 'retval'
  return retval;
}
#else //BY_VALUE
AType function(AType retval) {          // pass by value
  // some other processing and modifying 'retval'
  return retval;
}
#endif//BY_VALUE

int main(int argc, char* argv[]) {
  const
  AType somevariable(13);
  const
  AType anothervariable = function(somevariable);
  std::cerr << anothervariable << std::endl;
  // some other processing
  return 0;
}
g++ -Wall -ansi -pedantic -o main main.cc
./main
AType(int)
AType(AType const&)
13
g++ -DBY_VALUE -Wall -ansi -pedantic -o main main.cc
./main
AType(int)
AType(AType const&)
AType(AType const&)
13

When you pass by value, the copy constructor is called *twice* --
once to copy the function argument and once again
to copy the [modified] function argument into the return value.
For this reason, pass by [const] reference is always preferred
over pass by value when passing large objects.
 
Cy Edmunds

I would slightly prefer the const ref version. I find it a little clearer
and more logical. The reader might wonder, "Why is the client being asked to
pass in the return value?" It's also a little more flexible if you need to
refactor. Maybe at some point you will want to pass in a const BType & which
can be used to construct an AType but isn't an AType.

There are no big issues that I see though.
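
A sketch of the refactoring mentioned above, assuming a hypothetical BType from which an AType can be constructed. The members and the arithmetic are made up for illustration.

```cpp
#include <cassert>

// Hypothetical source type.
struct BType {
    int raw;
    explicit BType(int r) : raw(r) {}
};

// Hypothetical AType that can be built from a BType.
struct AType {
    int value;
    explicit AType(int v = 0) : value(v) {}
    explicit AType(BType const& b) : value(b.raw * 10) {}
};

// Original signature: takes the source object by const reference.
AType function(AType const& arg) {
    AType retval(arg);
    retval.value += 1;
    return retval;
}

// Added later: same body shape, different source type.
AType function(BType const& arg) {
    AType retval(arg);   // built directly from the BType
    retval.value += 1;
    return retval;
}

// Exercises both overloads.
void check_overloads() {
    assert(function(AType(4)).value == 5);
    assert(function(BType(2)).value == 21);
}
```

With the const ref form the body stays the same when the source type changes. The pass-by-value form has no obvious equivalent, because the parameter itself is the object being returned.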
 
Victor Bazarov

E. Robert Tisdale said:
A working example might help:

No, it won't. See notes below.
[...]

When you pass by value, the copy constructor is called *twice* --
once to copy the function argument and once again
to copy the [modified] function argument into the return value.

For this reason, pass by [const] reference is always preferred
over pass by value when passing large objects.

When you pass by reference, the copy constructor is also called twice --
once to create a local variable from the reference passed as the argument
and once again to copy the modified function-local object into the return
value. Your compiler is apparently doing RVO behind your back. Try it with
a different compiler. On VC++ v7.1 you get exactly the same output for the
two different ways of passing the argument.

For this reason never trust the results any particular compiler gives you
when you need a conclusion about the behaviour in general.
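
One way to check this on a given compiler, rather than trusting a single run, is to count the copies. The sketch below uses a hypothetical instrumented type (Counted) and helper (copies_made). It deliberately makes no claim about the exact numbers, because those depend on which copies the compiler elides.

```cpp
#include <cassert>

// Hypothetical instrumented type: counts copy-constructor calls.
struct Counted {
    static int copies;
    int value;
    explicit Counted(int v = 0) : value(v) {}
    Counted(Counted const& other) : value(other.value) { ++copies; }
};
int Counted::copies = 0;

// Version 1: pass by const reference, copy made inside the function.
Counted by_ref(Counted const& arg) {
    Counted retval(arg);
    retval.value += 1;
    return retval;
}

// Version 2: pass by value, copy made while passing the argument.
Counted by_value(Counted retval) {
    retval.value += 1;
    return retval;
}

// Returns how many copy-constructor calls a single call to f performed.
template <typename F>
int copies_made(F f) {
    Counted some(13);
    Counted::copies = 0;
    Counted another = f(some);
    assert(another.value == 14);
    return Counted::copies;
}
```

Calling copies_made(by_ref) and copies_made(by_value) gives one number for each version. Under the language rules the first lies between 1 and 3 and the second between 2 and 3. A compiler that performs the named return value optimization would typically report 1 and 2, while a build with elision switched off (for example GCC's -fno-elide-constructors in a pre-C++17 mode) may report 3 for both.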

V
 
E. Robert Tisdale

Victor said:
E. Robert Tisdale said:
A working example might help:

No, it won't. See notes below.
[...]

When you pass by value, the copy constructor is called *twice* --
once to copy the function argument and once again
to copy the [modified] function argument into the return value.
For this reason, pass by [const] reference is always preferred
over pass by value when passing large objects.

Victor said:
When you pass by reference, the copy constructor is also called twice --
once to create a local variable from the reference passed as the argument
and once again to copy the modified function local object
into the return value.

This is a deficiency in your optimizing compiler.

Victor said:
Your compiler is apparently doing RVO behind your back.

You probably mean
the *Named* Return Value Optimization (NRVO) in this case.
No, it isn't doing the NRVO behind my back.
I expect any good optimizing C++ compiler
to perform this optimization for me.

Victor said:
Try it with a different compiler.
On VC++ v7.1 you get two exact same outputs
for two different ways of passing the argument.

For this reason never trust the results any particular compiler gives you
when you need a conclusion about the behaviour in general.

You are confused.
Your reasoning is flawed.

The problem with the pass by value version is that
*no* compiler will be able to optimize away the superfluous copy.
Even VC++ will eventually implement the NRVO
but the pass by value version -- function(AType) --
will *still* require two copies.

It is a bad idea to cobble your code
just to accommodate an inferior compiler.
If your C++ compiler does not implement the NRVO,
you should be shopping for a better optimizing C++ compiler.

The ANSI/ISO C++ standards don't specify
which optimizations are implemented.
They only specify which optimizations are allowed
and it is up to the marketplace
to weed out inferior implementations.
 
Hang Dog

Howard said:
Yuk! :)

Avoiding mistakes is a noble concept, but I think you go too far. I much
prefer that whenever I have only one output from a function, then that
output be the return value. I have at least a couple of reasons for that
preference. First, it makes it obvious what the output is, without having
to name it "output". Second, it allows use in an assignment or other
statement, instead of having to assign to a local variable and then write a
whole new statement to actually use it.

Shrug. None of that makes much difference. What we don't
want is many Kb and Mbs being copied about, or loads of
memory being allocated and deallocated needlessly (with the
attendant problems of memory fragmentation). As said before
our objects can contain large amounts of data.

(Think about operators... how would
you write operator +() given your method above?)

In general we wouldn't return by value, well except for
small objects like Vectors, Points, smart pointers, and
other minuscule classes. Anyway I think that out of the
1000s of classes we have, fewer than 10 would have an
operator+(). We prefer an idiom that isn't going to produce
surprises over one that does.

Howard said:
And since I prefer to only have one output from a function whenever
possible, I pretty much always follow that rule. (And of course, if I want
to actually modify the variable passed instead of returning something, then
I pass by reference or pointer, as appropriate.)

In many cases it is preferable to reuse the memory that an
existing object already contains rather than creating a new one.
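
A minimal sketch of the memory reuse meant here, using std::vector<double> as a hypothetical stand-in for a large object. Samples, scale_into, scaled and second_call_reuses_buffer are illustrative names only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a large object: a buffer of samples.
typedef std::vector<double> Samples;

// Output-parameter style: writes into storage the caller already owns.
void scale_into(Samples const& input, double factor, Samples& output) {
    output.clear();                 // in practice keeps the existing allocation
    output.reserve(input.size());   // no reallocation if capacity already suffices
    for (std::size_t i = 0; i < input.size(); ++i)
        output.push_back(input[i] * factor);
}

// Return-by-value style: builds a fresh buffer on every call.
Samples scaled(Samples const& input, double factor) {
    Samples result;
    scale_into(input, factor, result);
    return result;
}

// Calls scale_into twice on the same output object and reports whether
// the second call reused the buffer allocated by the first.
bool second_call_reuses_buffer() {
    Samples in(1000, 1.5), out;
    scale_into(in, 2.0, out);
    double const* first = &out[0];
    scale_into(in, 3.0, out);
    return &out[0] == first && out[0] == 4.5;
}
```

When the same output object is passed repeatedly, only the first call has to allocate. The return-by-value version starts from an empty buffer each time.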
 
Victor Bazarov

E. Robert Tisdale said:
[...]
The problem with the pass by value version is that
*no* compiler will be able to optimize away the superfluous copy.

Which copy out of the two is superfluous? And if it's superfluous,
why no compiler will be able to optimize it away?
 
Howard

Hang Dog said:
Shrug. None of that makes much difference. What we don't want is many Kb
and Mbs being copied about, or loads of memory being allocated and
deallocated needlessly (with the attendant problems of memory
fragmentation). As said before our objects can contain large amounts of
data.

Here's your code again:

void func(const myClass& input, myClass& output);

Now, in order to do something with that output object, it has to be created
somewhere. Since you're passing it by reference, it apparently is created
outside this function, then passed in via reference, then its contents
filled (or altered) by the code in func().

How is that any more efficient? The output object has to run through its
constructor (at some point prior to calling func()), and then run through
func() in order to get its data. That sounds like the same amount of work
to me.

From your comment below about re-using memory, though, I'm guessing that the
output object is something that you have in reserve, and simply re-use it as
needed. Is that correct? In that case, it sounds like a good solution for
your particular problem. But as a general practice (i.e., outside your
particular case), I still see no real benefit.

Hang Dog said:
In general we wouldn't return by value, well except for small objects like
Vectors, Points, smart pointers, and other minuscule classes. Anyway I
think that out of the 1000s of classes we have, fewer than 10 would have an
operator+(). We prefer an idiom that isn't going to produce surprises over
one that does.

In many cases it is preferable to reuse the memory that an existing object
already contains rather than creating a new one.

-Howard
 
Pete Becker

Howard said:
And since I prefer to only have one output from a function whenever
possible, I pretty much always follow that rule. (And of course, if I want
to actually modify the variable passed instead of returning something, then
I pass by reference or pointer, as appropriate.)

One lesson I learned early on: CAD folks live in a different world from
the rest of us. So don't be too hard on lilburne: his problems aren't
like the ones you and I are used to.
 
E. Robert Tisdale

Victor said:
E. Robert Tisdale said:
[...]
The problem with the pass by value version is that
*no* compiler will be able to optimize away the superfluous copy.

Which copy out of the two is superfluous?
And if it's superfluous,
why no compiler will be able to optimize it away?

AType function(AType retval) { // pass by value
// some other processing and modifying 'retval'
return retval;
}

When your C++ compiler emits code to invoke:

AType anothervariable = function(somevariable);

it calls the copy constructor to copy somevariable into retval.
Then, function(AType) calls the copy constructor again
to copy retval into anothervariable.
One of these copies is superfluous.
If, instead, you implement pass by [const] reference,

AType function(AType const& arg) { // pass by reference
AType retval(arg); // or default construction and then..
// some other processing and modifying 'retval'
return retval;
}

the compiler is allowed to emit code
to copy somevariable directly into anothervariable.
The compiler recognizes retval
as a synonym for the return value -- anothervariable in this case.

So, in your example,
pass by value *requires* two copies but
pass by reference requires only one copy.
 
Victor Bazarov

E. Robert Tisdale said:
Victor said:
E. Robert Tisdale said:
[...]
The problem with the pass by value version is that
*no* compiler will be able to optimize away the superfluous copy.


Which copy out of the two is superfluous?
And if it's superfluous,
why no compiler will be able to optimize it away?


AType function(AType retval) { // pass by value
// some other processing and modifying 'retval'
return retval;
}

When your C++ compiler emits code to invoke:

AType anothervariable = function(somevariable);

it calls the copy constructor to copy somevariable into retval.

Why do you think it can't avoid doing that? If it can recognize that
the argument is going to be returned, why can't it just construct the
temporary which will be returned?

E. Robert Tisdale said:
Then, function(AType) calls the copy constructor again
to copy retval into anothervariable.
One of these copies is superfluous.

How do you know, and what prevents the compiler from optimizing one of
them away? That's what I asked and that's what you have failed to answer
so far.

E. Robert Tisdale said:
If, instead, you implement pass by [const] reference,

AType function(AType const& arg) { // pass by reference
AType retval(arg); // or default construction and then..
// some other processing and modifying 'retval'
return retval;
}

the compiler is allowed to emit code
to copy somevariable directly into anothervariable.

Is that NRVO and copy-initialisation baked into one? I guess that you
mean to say that in a stand-alone function

AType function() {
AType retval;
..
return retval;
}

the temporary to be returned can be used instead of the local variable and
no additional construction takes place there, right? And, in conjunction
with

AType var1;
..
AType var2 = var1;

which is the same as

AType var1;
..
AType var2 = AType(var1);

creation of a temporary can be omitted, thus making it

AType var1;
..
AType var2(var1);

Yes, I knew about those. However, the following is still confusing:

E. Robert Tisdale said:
So, in your example,
pass by value *requires* two copies but
pass by reference requires only one copy.

If it *requires* two copies, why do you say that one of them is
"superfluous"? If it is truly superfluous, why is the copy *required*?
Do you see my problem here? I'd appreciate a quote or at least a pointer
to the passage from the Standard.

V
 
E. Robert Tisdale

Victor said:
E. Robert Tisdale said:
Victor said:
E. Robert Tisdale wrote:

[...]
The problem with the pass by value version is that
*no* compiler will be able to optimize away the superfluous copy.

Which copy out of the two is superfluous?
And if it's superfluous,
why no compiler will be able to optimize it away?

AType function(AType retval) { // pass by value
// some other processing and modifying 'retval'
return retval;
}

When your C++ compiler emits code to invoke:

AType anothervariable = function(somevariable);

it calls the copy constructor to copy somevariable into retval.

Victor said:
Why do you think it can't avoid doing that?
If it can recognize that the argument is going to be returned,
why can't it just construct the temporary which will be returned?

E. Robert Tisdale said:
Then, function(AType) calls the copy constructor again
to copy retval into anothervariable.
One of these copies is superfluous.

Victor said:
How do you know
and what prevents the compiler from optimizing one of them away,

The compiler could "optimize one of them away"
if it could inline function(AType).
I'm not sure but I don't think that that is what you meant.
If function(AType) is defined externally
(and not visible to the compiler when it is invoked)
the optimizer will *not* be able to inline it
and the compiler will be obliged to emit code
to pass by value (copy somevariable into retval).

Victor said:
that's what I asked and that's what you failed to answer so far.

E. Robert Tisdale said:
If, instead, you implement pass by [const] reference,

AType function(AType const& arg) { // pass by reference
AType retval(arg); // or default construction and then..
// some other processing and modifying 'retval'
return retval;
}

the compiler is allowed to emit code
to copy somevariable directly into anothervariable.

Is that NRVO and copy-initialisation baked into one?
I guess that you mean to say that in a stand-alone function

AType function(void) {
AType retval;
..
return retval;
}

the temporary to be returned can be used instead of the local variable and
no additional construction takes place there, right?

Yes.

Victor said:
And, in conjunction with

AType var1;
..
AType var2 = var1;

which is the same as

AType var1;
..
AType var2 = AType(var1);

creation of a temporary can be omitted, thus making it

AType var1;
..
AType var2(var1);

Yes, I knew about those. However the following is still confusing:

E. Robert Tisdale said:
So, in your example,
pass by value *requires* two copies but
pass by reference requires only one copy.

Victor said:
If it *requires* two copies,
why do you say that one of them is "superfluous"?

It is "superfluous" in the sense that
an extra unnecessary copy is required.
It is superfluous in the same sense that temporary
is superfluous in the following example:

AType function(AType const&);
AType somevariable(13);
AType temporary(somevariable);
AType anothervariable = function(temporary);

Victor said:
If it is truly superfluous, why is the copy *required*?

Because pass by value requires a copy.
That's what pass by value means.

Victor said:
Do you see my problem here?

I'm not sure I do. I probably don't.
You might be wondering, "Why doesn't the optimizer recognize
the formal function argument passed by value (retval)
as a synonym for the return value?"
Of course, the answer to that question is that
the calling program doesn't know that
retval is a synonym for the return value
so it can't copy it into the return value instead.

Victor said:
I'd appreciate a quote
or at least a pointer to the passage from the Standard.

I can't help you here.
I'm not even sure what you are looking for from the standards.
They don't specify which optimizations are implemented.
At best, they may specify which optimizations are allowed.
The NRVO is allowed.
 
Daniel T.

lilburne said:
We (http://tinyurl.com/7y38k) prefer the form

void func(const myClass& input, myClass& output);

we discourage copy construction and assignment by default, our
make-a-new class script makes both of those methods private. Why? Well
our classes can contain 100Mb+ of data and we don't want those being
copied about or assigned unnecessarily, thus we'll go to any lengths to
avoid someone inadvertently writing:

myClass func(myClass input);

In that case, I would suggest:

myClass& func( const myClass& input, myClass& output ) {
/* modify output and return it */
}

That way you can nest calls more easily; this is a pretty standard C idiom.

In general though, I would rather see this func as a member of myClass:

class myClass {
public:
void func( const myClass& input ); /* modifies 'this' */
};
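
As a rough sketch of the nesting meant by the free-function form, assuming a hypothetical myClass with a single illustrative member (total) and a made-up helper doubled_twice:

```cpp
#include <cassert>

// Hypothetical stand-in for myClass; 'total' is an illustrative member.
struct myClass {
    int total;
    explicit myClass(int t = 0) : total(t) {}
};

// Free-function form: fills 'output' and returns a reference to it.
myClass& func(const myClass& input, myClass& output) {
    output.total = input.total * 2;
    return output;
}

// Applies func twice by nesting; every step writes into a caller-owned
// object, so no myClass is copied along the way.
int doubled_twice(int start) {
    myClass a(start), tmp, result;
    return func(func(a, tmp), result).total;
}
```

The inner call fills tmp and hands it straight to the outer call as its input, which a void return type would not allow.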
 
E. Robert Tisdale

Daniel said:
I would suggest:

myClass& func(const myClass& input, myClass& output) {
// modify output and return it
}

I would *never* suggest this.

Daniel said:
That way you can nest calls more easily; this is a pretty standard C idiom.

No it isn't. It's just a bad habit.

Daniel said:
In general though, I would rather see this func as a member of myClass:

class myClass {
public:
void func(const myClass& input); // modifies *this
};

void functions are almost always a very bad idea
because they can't be used in expressions.
You are obliged to manage temporary intermediate results explicitly
and your code ends up looking like assembler.
They just make your programs more difficult
to read, analyze, understand and maintain.

If you must have a member function that modifies *this,
it should return a reference to *this:

class myClass {
public:
myClass& func(const myClass& input) {
// modify *this
return *this;
}
};
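
A short sketch of what returning *this makes possible, assuming a hypothetical data member total_ and a made-up helper chained_total:

```cpp
#include <cassert>

// Hypothetical myClass with one illustrative data member.
class myClass {
public:
    explicit myClass(int t = 0) : total_(t) {}

    // Modifies *this and returns a reference to it, so calls can be chained.
    myClass& func(const myClass& input) {
        total_ += input.total_;
        return *this;
    }

    int total() const { return total_; }

private:
    int total_;
};

// Accumulates three inputs in a single expression.
int chained_total() {
    myClass acc, a(1), b(2), c(3);
    return acc.func(a).func(b).func(c).total();
}
```

With a void return type the same work would need three separate statements.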
 
