why is binding non-const ref to return value bad?

P

PeteUK

Hello,

I'm doing a code review for someone and picked them up on this:

BigClass AFunction() { /*returns a large object by value*/ }

Caller()
{
BigClass bc = AFunction();
/* use bc */
}

I saif they should do this in the caller to avoid copying the
temporary:

Caller()
{
const BigClass& bc = AFunction();
/* use bc */
}

They came back with the following code which compiled but I asked them
to change to const but I had trouble justifying why it should be
const:

Caller()
{
BigClass& bc = AFunction();
/* use bc */
}

Does the life of the temporary get extended using the non-const
reference as it does from using the const reference? Was I right to
pick him up on it not being const? Please give me some rationale if
I'm right!

Thanks,

PeteUK
 
J

Johannes Schaub (litb)

PeteUK said:
Hello,

I'm doing a code review for someone and picked them up on this:

BigClass AFunction() { /*returns a large object by value*/ }

Caller()
{
BigClass bc = AFunction();
/* use bc */
}

I saif they should do this in the caller to avoid copying the
temporary:

Caller()
{
const BigClass& bc = AFunction();
/* use bc */
}
This won't avoid a copy in current C++ more than it does if you remove the
reference.
They came back with the following code which compiled but I asked them
to change to const but I had trouble justifying why it should be
const:

Caller()
{
BigClass& bc = AFunction();
/* use bc */
}

Does the life of the temporary get extended using the non-const
reference as it does from using the const reference? Was I right to
pick him up on it not being const? Please give me some rationale if
I'm right!

You can justify by saying making it nonconst violates the C++ specs. A
temporary in itself has no bearing of being referenced by name after its
created. Having it able to bind to a const reference is a sort of workaround
to be able to accept both named object and the temporary. This can be seen
by noticing that the compiler may still copy the temporary - it doesn't try
to keep the temporaries' object identity the same.

If you need a modifiable object, then copy it into a local non-const, non-
reference variable.
 
P

PeteUK

This won't avoid a copy in current C++ more than it does if you remove the
reference.

Oh - I thought having the reference would force the compiler to
prevent a copy. My understanding is that without the reference, the
compiler is at liberty to invoke the copy constructor to construct bc
with the temporary as an argument. Is my understanding wrong?
You can justify by saying making it nonconst violates the C++ specs. A
temporary in itself has no bearing of being referenced by name after its
created. Having it able to bind to a const reference is a sort of workaround
to be able to accept both named object and the temporary. This can be seen
by noticing that the compiler may still copy the temporary - it doesn't try
to keep the temporaries' object identity the same.

Violate the C++ specs - that's good enough. This page
http://herbsutter.spaces.live.com/blog/cns!2D4327CC297151BB!378.entry
from Herb Sutter also says it's wrong to use a non-const ref as it's
not portable C++.
If you need a modifiable object, then copy it into a local non-const, non-
reference variable.

OK.

I would still like to clear up the point above on whether using a
const ref can guarantee that a copy will not take place.

Thanks,

Pete
 
P

peter koch

Hello,

I'm doing a code review for someone and picked them up on this:

BigClass AFunction() {   /*returns a large object by value*/  }

Caller()
{
  BigClass bc = AFunction();
  /* use bc */

}

I saif they should do this in the caller to avoid copying the
temporary:

Caller()
{
  const BigClass& bc = AFunction();
  /* use bc */

}

They came back with the following code which compiled but I asked them
to change to const but I had trouble justifying why it should be
const:

Caller()
{
  BigClass& bc = AFunction();
  /* use bc */

}

Does the life of the temporary get extended using the non-const
reference as it does from using the const reference? Was I right to
pick him up on it not being const? Please give me some rationale if
I'm right!

Well, first of all I believe you should know C++ better to do a code
review. First of all, you misunderstand the consequence of extending
the life of the temporary. By having a const& you essentially have the
code

BigClass __hidden = AFunction();
BigClass const& bc = __hidden;

Secondly you seem to be unaware of RVO. There will be no copy of the
big object in the code shown - at least not with any compiler I know
of when you optimise the code.

Last, you are complaining about some code because of some undefined
performance concern. To do this without doing measurements is quite
naïve.

/Peter
 
P

PeteUK

Well, first of all I believe you should know C++ better to do a code
review.

I'm more versed in C++ than the other programmers in the organisation
I
am part of. Please note that I am not making a claim that I know it
well.
Far from it!
It is part of the job description of all programmers here to provide
code
reviews. The guy whose code I'm reviewing also reviews my code. That's
the
way the organisation works. Anyway, by asking the question on here I
am
attempting to know C++ better.
First of all, you misunderstand the consequence of extending
the life of the temporary. By having a const& you essentially have the
code

 BigClass __hidden = AFunction();
 BigClass const& bc = __hidden;

Following on from that, would you then say that following two forms
are equivalent?:

const BigClass bc = AFunction();
const BigClass& bc = AFunction();

Because I thought the first form is *potentially* less efficient than
the second because in the first form the compiler is given leway to
construct bc via the copy constructor.
Secondly you seem to be unaware of RVO. There will be no copy of the
big object in the code shown - at least not with any compiler I know
of when you optimise the code.

Read about RVO a long time ago. Probably from Meyers' book. When I
think of RVO and how best to enable it from the programmer's POV, I've
distilled it down to trying to return an unnamed object - i.e.
construct
the object at return time. As opposed to default constructing the
object,
calling setters, then returning.
Last, you are complaining about some code because of some undefined
performance concern. To do this without doing measurements is quite
naïve.

I fully agree that premature optimisation is the root of all evil. I
also
preach this to my fellow developers who use unsigned chars instead of
enums
or eschew virtual functions and try to write their own dispatch
mechanism!

The point I'm talking about here is analogous (in my mind anyway!) to
how
to accept an object as a parameter. If using a builtin, then accept by
value (copy), if an object then accept by const ref. These are
guidelines
I stick to always and I figured my guideline of attaching to a return
value
of a non builtin type by const ref also had *potential* performance
merit.

Pete
 
M

Miles Bader

PeteUK said:
Read about RVO a long time ago. Probably from Meyers' book. When I
think of RVO and how best to enable it from the programmer's POV,
I've distilled it down to trying to return an unnamed object -
i.e. construct the object at return time. As opposed to default
constructing the object, calling setters, then returning.

With gcc, at least, there seems to be almost no difference -- if gcc
can figure out which "named object" you're going to return, it will use
RVO for that too, instead of creating it locally, just as if you had
done "return X(...)". I imagine the same is true of other modern
compilers.

I've basically found it almost completely unnecessary to even think
about the issue, even where performance is important; as long as the
code isn't obfuscated, just writing things naturally and returning by
value generally works very well.

For instance, gcc 4.4 will generate _exactly_ the same code for
"fun1" and "fun2" below (given -O2 or so):

class C {
public:

C ()
: x (0), y (0), z (0)
{ big_array[0] = 0; }

C (int _x, int _y, int _z, char _b)
: x (_x), y (_y), z (_z)
{ big_array[0] = _b; }

void set_x (int _x) { x = _x; }
void set_y (int _y) { y = _y; }
void set_z (int _z) { z = _z; }
void set_b (char _b) { big_array[0] = _b; }

int x, y, z;

char big_array[100];
};

C fun1 (bool t)
{
int z = t ? 3 : 7;
return C (1, 2, z, 4);
}

C fun2 (bool t)
{
C c;
c.set_x (1);
c.set_y (2);
if (t)
c.set_z (3);
else
c.set_z (7);
c.set_b (4);
return c;
}


yields:

_Z4fun1b:
cmpb $1, %sil
movq %rdi, %rax
sbbl %edx, %edx
movl $1, (%rdi)
andl $4, %edx
movl $2, 4(%rdi)
addl $3, %edx
movb $4, 12(%rdi)
movl %edx, 8(%rdi)
ret

_Z4fun2b:
cmpb $1, %sil
movq %rdi, %rax
sbbl %edx, %edx
movl $1, (%rdi)
andl $4, %edx
movl $2, 4(%rdi)
addl $3, %edx
movb $4, 12(%rdi)
movl %edx, 8(%rdi)
ret


-Miles
 
J

James Kanze

I'm doing a code review for someone and picked them up on this:
BigClass AFunction() { /*returns a large object by value*/ }
Caller()
{
BigClass bc = AFunction();
/* use bc */

I saif they should do this in the caller to avoid copying the
temporary:
Caller()
{
const BigClass& bc = AFunction();
/* use bc */
}

I'd disagree with that recommendation. There will normally be
the same number of copies, and the latter is just more verbose
and slightly less clear. It's also dangerous if at some later
date, AFunction is modified to return a const reference.
They came back with the following code which compiled but I
asked them to change to const but I had trouble justifying why
it should be const:
Caller()
{
BigClass& bc = AFunction();
/* use bc */
}

The simple answer is: because the standard says so.
Does the life of the temporary get extended using the
non-const reference as it does from using the const reference?

Nothing to do with lifetimes. You can't use a temporary to
initialize a non-const reference.
Was I right to pick him up on it not being const? Please give
me some rationale if I'm right!

It won't compile if the reference isn't const. It that a
sufficiently good reason?
 
J

James Kanze

I'm more versed in C++ than the other programmers in the
organisation I am part of. Please note that I am not making a
claim that I know it well. Far from it! It is part of the
job description of all programmers here to provide code
reviews. The guy whose code I'm reviewing also reviews my
code. That's the way the organisation works. Anyway, by asking
the question on here I am attempting to know C++ better.

There should be no requirement that all of the reviewers be
exceptionally gifted in C++. On the other hand, I don't think
reviewers should encourage less readable and more dangerous
coding styles. Unless there's a good reason for using a
reference, stick with the value.
Following on from that, would you then say that following two
forms are equivalent?:
const BigClass bc = AFunction();
const BigClass& bc = AFunction();
Because I thought the first form is *potentially* less
efficient than the second because in the first form the
compiler is given leway to construct bc via the copy
constructor.

The compiler always has leeway to throw in an extra copy. In
both cases. In practice, however: when a function returns a
class type, the compiler will pass a hidden pointer to the
function, pointing to where the returned object is to be
constructed. And it has to be constructed somewhere if it's
going to outlive the function. In the first case, the compiler
will pass the address of bc; in the second, the address of an
unnamed temporary. About the only difference is that the
compiler will then have to initialize the reference (almost
certainly a pointer) with the address of the hidden temporary.
Read about RVO a long time ago. Probably from Meyers' book.
When I think of RVO and how best to enable it from the
programmer's POV, I've distilled it down to trying to return
an unnamed object - i.e. construct the object at return time.
As opposed to default constructing the object, calling
setters, then returning.

Note that this is *not* a question of RVO. RVO occurs within
the called function.

[...]
The point I'm talking about here is analogous (in my mind
anyway!) to how to accept an object as a parameter. If using a
builtin, then accept by value (copy), if an object then accept
by const ref.

Some would also consider that premature optimization:). In
practice, however, it's pretty much standard practice. And if
the function is in a separate translation unit, it's almost
impossible for the compiler to optimize out the copy of pass by
value.
These are guidelines I stick to always and I figured my
guideline of attaching to a return value of a non builtin type
by const ref also had *potential* performance merit.

It makes no difference, and can be error prone.
 
P

PeteUK

I'm more versed in C++ than the other programmers in the
organisation I am part of. Please note that I am not making a
claim that I know it well.  Far from it!  It is part of the
job description of all programmers here to provide code
reviews. The guy whose code I'm reviewing also reviews my
code. That's the way the organisation works. Anyway, by asking
the question on here I am attempting to know C++ better.

There should be no requirement that all of the reviewers be
exceptionally gifted in C++.  On the other hand, I don't think
reviewers should encourage less readable and more dangerous
coding styles.  Unless there's a good reason for using a
reference, stick with the value.
Following on from that, would you then say that following two
forms are equivalent?:
const BigClass bc = AFunction();
const BigClass& bc = AFunction();
Because I thought the first form is *potentially* less
efficient than the second because in the first form the
compiler is given leway to construct bc via the copy
constructor.

The compiler always has leeway to throw in an extra copy.  In
both cases.  In practice, however: when a function returns a
class type, the compiler will pass a hidden pointer to the
function, pointing to where the returned object is to be
constructed.  And it has to be constructed somewhere if it's
going to outlive the function.  In the first case, the compiler
will pass the address of bc; in the second, the address of an
unnamed temporary.  About the only difference is that the
compiler will then have to initialize the reference (almost
certainly a pointer) with the address of the hidden temporary.
Read about RVO a long time ago. Probably from Meyers' book.
When I think of RVO and how best to enable it from the
programmer's POV, I've distilled it down to trying to return
an unnamed object - i.e.  construct the object at return time.
As opposed to default constructing the object, calling
setters, then returning.

Note that this is *not* a question of RVO.  RVO occurs within
the called function.

    [...]
The point I'm talking about here is analogous (in my mind
anyway!) to how to accept an object as a parameter. If using a
builtin, then accept by value (copy), if an object then accept
by const ref.

Some would also consider that premature optimization:).  In
practice, however, it's pretty much standard practice.  And if
the function is in a separate translation unit, it's almost
impossible for the compiler to optimize out the copy of pass by
value.
These are guidelines I stick to always and I figured my
guideline of attaching to a return value of a non builtin type
by const ref also had *potential* performance merit.

It makes no difference, and can be error prone.

James - your answer is exactly what I was looking for. I'm very
grateful to have learnt a bit more and will modify my practice (and
come up with guidelines for the team) accordingly.

Pete
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,432
Messages
2,571,680
Members
48,796
Latest member
Greg L.

Latest Threads

Top