Pass array by value

E

E. Robert Tisdale

Richard said:
Would you give the same advice
if the object being modified were a million-element vector?

Yes.

Elaborate a little. An example might help.
You might be surprised
 
E

E. Robert Tisdale

Siemel said:
If one is allowed to pass in a NULL integer (not zero, which is something),
meaning that 'i' is not supplied, then you have to choose the second
version. Otherwise prefer the first method as it makes it clear that the
caller has to pass in a valid integer.

A better option would be to overload the function

void afunc(void);

"to pass in a NULL integer".
 
C

Cy Edmunds

Berk Birand said:
Thanks to both for your answers. Also another question that pops in my
mind is whether to use references or pointers in function headers. Is it
better practice to declare a function as, say:

void afunc(int& i);

or is it better to declare it as

void afunc(int* i);

I know that in the second case, one has to explicitly dereference i
whenever one wants to access it. But this can be considered a good thing,
as it makes it more obvious that you are manipulating pointers, and the
values are going to change. On the other hand, it's just more work!

So, what do you say about this?

Consider these possible calling sequences with a comment saying what they
look like to me:

int j = 0, k;
BigObject obj;

func1(j); // inputting an int
k = func2(j); // inputting an int and outputting another int
func3(&obj); // outputting an object
func4(obj); // inputting an object by const reference
func5(&j, &k); // outputting two ints

Of course many of these reasonable guesses could be wrong depending on the
actual declarations, but I think the code is easier to understand if you
write your declarations in such a way that they are right! So that means:

1) express output by return value if possible
2) use a pointer to express output otherwise
3) express input by call-by-value or const reference
4) never use a non-const reference (looks like input)

Note that using this system

func6(&j); // should have been either j = func6(); or j = func6(j);

I break these "rules" myself whenever I feel like it, so don't take this too
seriously. On the other hand, when you have a choice it is a good idea to
give some thought to what the calling sequence will look like.

One further note: using exceptions for error handling lets you make some of
these distinctions. If every function returns an error code it muddies up
the waters.

Finally: the distinction between output by pointer and update by pointer
cannot be made by syntax alone.
 
J

Jonathan Turkanis

Cy said:
int j = 0, k;
BigObject obj;

func1(j); // inputting an int
k = func2(j); // inputting an int and outputting another int
func3(&obj); // outputting an object
func4(obj); // inputting an object by const reference
func5(&j, &k); // outputting two ints

Of course many of these reasonable guesses could be wrong depending
on the actual declarations, but I think the code is easier to
understand if you write your declarations in such a way that they are
right! So that means:

1) express output by return value if possible
2) use a pointer to express output otherwise
3) express input by call-by-value or const reference
4) never use a non-const reference (looks like input)

4) seems questionable. There's a broad class of cases where you have an object
that's noncopyable or expensive to copy -- such as a stream or a container --
which you want to pass to a modifying algorithm.

Generally, if you believe that functions which operate on a objects of a given
type should be non-members as often as possible, then most of the functions
which would otherwise be non-const member functions will become non-member
functions taking a reference to non-const.

Jonathan
 
S

Siemel Naran

E. Robert Tisdale said:
Siemel Naran wrote:

A better option would be to overload the function

void afunc(void);

"to pass in a NULL integer".

Say you call a function find(x,y) that returns a pointer to an integer, and
NULL if the integer does not exist. Then one could just say

afunc(find(x,y));

where the signature of afunc is void afunc(int *). Surely this is more
convenient than

int * f = find(x,y);
if (f) afunc(*f);
else afunc();
 
C

Cy Edmunds

Jonathan Turkanis said:
4) seems questionable. There's a broad class of cases where you have an
object
that's noncopyable or expensive to copy -- such as a stream or a
container --
which you want to pass to a modifying algorithm.

This rule set suggests using a pointer in this case. I mentioned that
updating and output by pointers would be indistinguishable.
Generally, if you believe that functions which operate on a objects of a
given
type should be non-members as often as possible,

(I certainly do agree.)
then most of the functions
which would otherwise be non-const member functions will become non-member
functions taking a reference to non-const.

Jonathan

There is no reason to make functions into member functions to avoid
non-const references. Use pointers:

func3(&obj); // outputting an object

Again, outputting and updating are not distinguishable.
 
J

Jonathan Turkanis

Cy said:
This rule set suggests using a pointer in this case. I mentioned that
updating and output by pointers would be indistinguishable.

Okay, I see what you're saying.
(I certainly do agree.)


There is no reason to make functions into member functions to avoid
non-const references.

I wouldn't dream of it. ;-) Often it's impossible, anyway, since you don't
control the class definition.
Use pointers:

func3(&obj); // outputting an object

Again, outputting and updating are not distinguishable.

Having to sprinkle ampersands all around creates a lot of noise, IMO. I'd
rather give functions descriptive names:

int n = read_int32(std::cin, flags);

sort(vec, by_age());

Jonathan
 
R

Richard Herring

E. Robert Tisdale said:
Yes.

Elaborate a little. An example might help.
You might be surprised

struct I {
int i_;
}

typedef std::vector<I> Vector;

/* Function which sets element p of a Vector to value q
As recommended, function doesn't modify its arguments
but returns a copy.
*/
Vector afunct (const Vector & v, int p, int q)
{
Vector temp(v);
temp[p].i_ = q;
return temp;
}

int main()
{
Vector v(1000000);
for (int i=0; i<1000000; ++i)
{
v = afunct(v, p, q);
}
/*...*/
}

Still think it's a good idea?
 
R

Richard Herring

[oops - hit post too soon :-( Second attempt:]

E. Robert Tisdale said:
Yes.

Elaborate a little. An example might help.
You might be surprised

struct I {
int i_;
/* and maybe some other stuff */
}

typedef std::vector<I> Vector;

/* Function which sets element p of a Vector to value q
As recommended, function doesn't modify its arguments
but returns a copy.
*/
Vector afunct (const Vector & v, int p, int q)
{
Vector temp(v);
temp[p].i_ = q;
return temp;
}

int main()
{
Vector v(1000000);
for (int i=0; i<1000000; ++i)
{
v = afunct(v, i, i);
}
/*...*/
}

Still think it's a good idea?
 
E

E. Robert Tisdale

Richard said:
E. Robert Tisdale writes
Yes.

Elaborate a little. An example might help.
You might be surprised


struct I {
int i_;
/* and maybe some other stuff */
}

typedef std::vector<I> Vector;

/* Function which sets element p of a Vector to value q
As recommended, function doesn't modify its arguments
but returns a copy.
*/

Vector afunct (const Vector & v, int p, int q) {
Vector temp(v);
temp[p].i_ = q;
return temp;
}

int main(int argc, char* argv[]) {

Vector v(1000000);
for (int i = 0; i < 1000000; ++i) {
v = afunct(v, i, i);
}
/*...*/ return 0;
}

Still think it's a good idea?
I'm thinking more like
cat main.cc
#include <cmath>
#include <vector>
#include <iostream>

std::vector<double> ramp(size_t n) {
std::vector<double> x(n);
for (size_t j = 0; j < x.size(); ++j)
x[j] = j;
return x;
}

std::vector<double> sqrt(const std::vector<double>& x) {
std::vector<double> y(x.size());
for (size_t j = 0; j < x.size(); ++j)
y[j] = sqrt(x[j]);
return y;
}

std::eek:stream& operator<<(std::eek:stream& os,
const std::vector<double>& x) {
for (size_t j = 0; j < x.size(); ++j)
os << ' ' << x[j];
return os << std::endl;
}

int
main(int argc, char* argv[]) {
const
size_t n = 4;
const
std::vector<double> x = ramp(n);
std::cout << "x =" << std::endl;
std::cout << x << std::endl;
const
std::vector<double> y = sqrt(x);
std::cout << "x =" << std::endl;
std::cout << y << std::endl;
return 0;
}

The problem with your example is that
you have not justified modifying std::vector<I> v *in-place*.
You contrived a ridiculous *straw-man* function afunct
then *assign* the return value to v a million times!
Didn't you notice that assignment modifies v?

Some objects (I/O streams, for example) are not useful
unless they can be modified.
Containers often must be declared as variables
and modified in place (as in the ramp and sqrt functions above).
But most other object including most containers
can be defined as constants initialized with a function.

I think that most experience programmers agree that
it is important to minimize the number of [state] variables
in your program because it makes your programs easier
to analyze (read, understand and maintain).
So my advice is,
"Avoid defining functions that modify their arguments
because they compel programmers to declare an pass variables
which may otherwise be constants in the rest of the program.
Reserve functions that modify their arguments
to situations where an object *must* be modified in-place."

For example, I think that almost everybody would agree that

void sqrt(std::vector<double>& x) {
std::vector<double> y(x.size());
for (size_t j = 0; j < x.size(); ++j)
x[j] = sqrt(x[j]);
}

would probably be a bad design.
 
E

E. Robert Tisdale

Siemel said:
Say you call a function find(x,y) that returns a pointer to an integer, and
NULL if the integer does not exist. Then one could just say

afunc(find(x, y));

where the signature of afunc is void afunc(int *).
Surely this is more convenient than

int * f = find(x, y);
if (f) afunc(*f);
else afunc();

Would it be more convenient
if find(x, y) returned an array subscript instead of a pointer?
 
S

Shezan Baig

E. Robert Tisdale said:
Richard said:
E. Robert Tisdale writes
Richard Herring wrote:

Would you give the same advice
if the object being modified were a million-element vector?

Yes.

Elaborate a little. An example might help.
You might be surprised


struct I {
int i_;
/* and maybe some other stuff */
}

typedef std::vector<I> Vector;

/* Function which sets element p of a Vector to value q
As recommended, function doesn't modify its arguments
but returns a copy.
*/

Vector afunct (const Vector & v, int p, int q) {
Vector temp(v);
temp[p].i_ = q;
return temp;
}

int main(int argc, char* argv[]) {

Vector v(1000000);
for (int i = 0; i < 1000000; ++i) {
v = afunct(v, i, i);
}
/*...*/ return 0;
}

Still think it's a good idea?
I'm thinking more like
cat main.cc
#include <cmath>
#include <vector>
#include <iostream>

std::vector<double> ramp(size_t n) {
std::vector<double> x(n);
for (size_t j = 0; j < x.size(); ++j)
x[j] = j;
return x;
}

std::vector<double> sqrt(const std::vector<double>& x) {
std::vector<double> y(x.size());
for (size_t j = 0; j < x.size(); ++j)
y[j] = sqrt(x[j]);
return y;
}

std::eek:stream& operator<<(std::eek:stream& os,
const std::vector<double>& x) {
for (size_t j = 0; j < x.size(); ++j)
os << ' ' << x[j];
return os << std::endl;
}

int
main(int argc, char* argv[]) {
const
size_t n = 4;
const
std::vector<double> x = ramp(n);
std::cout << "x =" << std::endl;
std::cout << x << std::endl;
const
std::vector<double> y = sqrt(x);
std::cout << "x =" << std::endl;
std::cout << y << std::endl;
return 0;
}

The problem with your example is that
you have not justified modifying std::vector<I> v *in-place*.
You contrived a ridiculous *straw-man* function afunct
then *assign* the return value to v a million times!
Didn't you notice that assignment modifies v?


This isn't the point. The point is that it's best not to return a
large object by value, that's all. If you really need to "justify
modifying std::vector<I> v *in-place*" then you should set 'n' to
10000000 in your code above, you will probably notice a difference in
performance.

Some objects (I/O streams, for example) are not useful
unless they can be modified.


Of course.

So my advice is,
"Avoid defining functions that modify their arguments
because they compel programmers to declare an pass variables
which may otherwise be constants in the rest of the program.
Reserve functions that modify their arguments
to situations where an object *must* be modified in-place."


This is good advice in most cases, but it cannot be used for all cases.
Obviously if you have *very* large objects, you must take a different
approach.

For example, I think that almost everybody would agree that

void sqrt(std::vector<double>& x) {
std::vector<double> y(x.size());
for (size_t j = 0; j < x.size(); ++j)
x[j] = sqrt(x[j]);
}

would probably be a bad design.


Yes, I agree. For many reasons, none of which have anything to do with
this thread. Why do we need 'y'? And why do we need a sqrt function
that takes 'std::vector<double>&'? What's wrong with plain old 'double
sqrt(double)' and using 'for_each' to apply it to each element in a
vector (or *any* container).

Hope this helps,
-shez-
 
E

E. Robert Tisdale

Shezan said:
E. Robert Tisdale said:
Richard said:
E. Robert Tisdale writes

Richard Herring wrote:

Would you give the same advice
if the object being modified were a million-element vector?

Yes.

Elaborate a little. An example might help.
You might be surprised

struct I {
int i_;
/* and maybe some other stuff */
}

typedef std::vector<I> Vector;

/* Function which sets element p of a Vector to value q
As recommended, function doesn't modify its arguments
but returns a copy.
*/

Vector afunct (const Vector & v, int p, int q) {
Vector temp(v);
temp[p].i_ = q;
return temp;
}

int main(int argc, char* argv[]) {

Vector v(1000000);
for (int i = 0; i < 1000000; ++i) {
v = afunct(v, i, i);
}
/*...*/

return 0;
}

Still think it's a good idea?

I'm thinking more like
cat main.cc
#include <cmath>
#include <vector>
#include <iostream>

std::vector<double> ramp(size_t n) {
std::vector<double> x(n);
for (size_t j = 0; j < x.size(); ++j)
x[j] = j;
return x;
}

std::vector<double> sqrt(const std::vector<double>& x) {
std::vector<double> y(x.size());
for (size_t j = 0; j < x.size(); ++j)
y[j] = sqrt(x[j]);
return y;
}

std::eek:stream& operator<<(std::eek:stream& os,
const std::vector<double>& x) {
for (size_t j = 0; j < x.size(); ++j)
os << ' ' << x[j];
return os << std::endl;
}

int
main(int argc, char* argv[]) {
const
size_t n = 4;
const
std::vector<double> x = ramp(n);
std::cout << "x =" << std::endl;
std::cout << x << std::endl;
const
std::vector<double> y = sqrt(x);
std::cout << "x =" << std::endl;
std::cout << y << std::endl;
return 0;
}

The problem with your example is that
you have not justified modifying std::vector<I> v *in-place*.
You contrived a ridiculous *straw-man* function afunct
then *assign* the return value to v a million times!
Didn't you notice that assignment modifies v?

This isn't the point. The point is that
it's best not to return a large object by value, that's all.
If you really need to "justify modifying std::vector<I> v *in-place*"
then you should set 'n' to 10000000 in your code above,
you will probably notice a difference in performance.

I don't think you said what you meant to say here.
Of course it will take about 2.5 million times as long
to process 10 million elements as it does to process 4 elements.

I suspect that you meant that

void sqrt(std::vector<double>& x) {
std::vector<double> y(x.size());
for (size_t j = 0; j < x.size(); ++j)
x[j] = sqrt(x[j]);
}

would be faster than

std::vector<double> sqrt(const std::vector<double>& x) {
std::vector<double> y(x.size());
for (size_t j = 0; j < x.size(); ++j)
y[j] = sqrt(x[j]);
return y;
}

If that's what you meant, you're wrong.
But don't take my word for it.
Try it yourself.
This is good advice in most cases, but it cannot be used for all cases.
Obviously, if you have *very* large objects,
you must take a different approach.

No. you are wrong. Try it.
For example, I think that almost everybody would agree that

void sqrt(std::vector<double>& x) {
std::vector<double> y(x.size());
for (size_t j = 0; j < x.size(); ++j)
x[j] = sqrt(x[j]);
}

would probably be a bad design.

Yes, I agree. For many reasons,

none of which have anything to do with this thread.
Why do we need 'y'?
And why do we need a sqrt function
that takes 'std::vector<double>&'?
What's wrong with plain old 'double sqrt(double)' and using 'for_each'
to apply it to each element in a vector (or *any* container).

If it isn't obvious to you,
provide an example showing what you mean to do.
Note that your solution would compel the programmer
to declare a variable vector (y) to receive the result
even if y is never modified anywhere else in the program.
 
S

Shezan Baig

E. Robert Tisdale said:
I don't think you said what you meant to say here.
Of course it will take about 2.5 million times as long
to process 10 million elements as it does to process 4 elements.

I suspect that you meant that

void sqrt(std::vector<double>& x) {
std::vector<double> y(x.size());


Why would you need this 'y' vector???? I suspect this is why you don't
notice the difference in performance.

for (size_t j = 0; j < x.size(); ++j)
x[j] = sqrt(x[j]);
}

would be faster than

std::vector<double> sqrt(const std::vector<double>& x) {
std::vector<double> y(x.size());
for (size_t j = 0; j < x.size(); ++j)
y[j] = sqrt(x[j]);
return y;
}

If that's what you meant, you're wrong.
But don't take my word for it.
Try it yourself.


Just for the sake of argument, I tried it. There is a big difference
when you are running with debugging information (no optimisation).
When you switch on optimisation, the difference is reduced - but still
not the same. If you look at the code (and my comment above), it's
obvious why its not the same.

Personally, I would prefer good performance even in debug builds. It
just makes maintenance easier :)

No. you are wrong. Try it.


I tried it. I still prefer taking a different approach :)


For example, I think that almost everybody would agree that

void sqrt(std::vector<double>& x) {
std::vector<double> y(x.size());
for (size_t j = 0; j < x.size(); ++j)
x[j] = sqrt(x[j]);
}

would probably be a bad design.

Yes, I agree. For many reasons,

none of which have anything to do with this thread.
Why do we need 'y'?
And why do we need a sqrt function
that takes 'std::vector<double>&'?
What's wrong with plain old 'double sqrt(double)' and using 'for_each'
to apply it to each element in a vector (or *any* container).

If it isn't obvious to you,
provide an example showing what you mean to do.
Note that your solution would compel the programmer
to declare a variable vector (y) to receive the result
even if y is never modified anywhere else in the program.


I don't see anything wrong with this. I trust you are an experienced
enough programmer to know not to make 'y' visible anywhere else. Try
to focus on higher level issues (like scoping your data) and you'll
find that there is no real need to make 'y' const. And your code will
be cleaner too :)


Hope this helps,
-shez-
 
E

E. Robert Tisdale

Shezan said:
Why would you need this 'y' vector?

You have discovered "a paste-o" in my code.
Please omit this object.
I suspect this is why you don't notice the difference in performance.
for (size_t j = 0; j < x.size(); ++j)
x[j] = sqrt(x[j]);
}

would be faster than

std::vector<double> sqrt(const std::vector<double>& x) {
std::vector<double> y(x.size());
for (size_t j = 0; j < x.size(); ++j)
y[j] = sqrt(x[j]);
return y;
}

If that's what you meant, you're wrong.
But don't take my word for it.
Try it yourself.

Just for the sake of argument, I tried it.

Please show us the code that you used.
There is a big difference
when you are running with debugging information (no optimisation).
When you switch on optimisation,

Please tell us which compiler and optimization options you used.
the difference is reduced - but still not the same.

Please show us the timing results.
If you look at the code (and my comment above),
it's obvious why its not the same.

Personally, I would prefer good performance even in debug builds.
It just makes maintenance easier :)

So do I. But passing a non-const reference
instead of returning by value won't improve performance.
I tried it. I still prefer taking a different approach :)

You can lead a [horse] to water but ...
For example, I think that almost everybody would agree that

void sqrt(std::vector<double>& x) {
std::vector<double> y(x.size());
for (size_t j = 0; j < x.size(); ++j)
x[j] = sqrt(x[j]);
}

would probably be a bad design.

Yes, I agree. For many reasons,

none of which have anything to do with this thread.
Why do we need 'y'?
And why do we need a sqrt function
that takes 'std::vector<double>&'?
What's wrong with plain old 'double sqrt(double)' and using
'for_each'
to apply it to each element in a vector (or *any* container).

If it isn't obvious to you,
provide an example showing what you mean to do.
Note that your solution would compel the programmer
to declare a variable vector (y) to receive the result
even if y is never modified anywhere else in the program.

I don't see anything wrong with this.
I trust you are an experienced enough programmer
to know not to make 'y' visible anywhere else.
Try to focus on higher level issues
(like scoping your data) and you'll find that
there is no real need to make 'y' const.
And your code will be cleaner too :)

Please show us an example of what you mean.
I'm pretty sure that you're confused.
 
S

Shezan Baig

E. Robert Tisdale said:
Please show us an example of what you mean.

{
{ // <-- new scope
// declare y and use it (localise)
}

// "anywhere else in the program"
}
I'm pretty sure that you're confused.

Hope this helps,
-shez-
 
E

E. Robert Tisdale

Shezan said:
{
{ // <-- new scope
// declare y

// const
std::vector<double> x(n);

// x is defined as a variable
// only so that I can initialize it with
extern void sqrt(std::vector<double>&);

sqrt(x);

// Now, x is supposed to be constant.
// and use it (localize)

g(x);

// Does x still have the same value?
// Or did function g modify it?

x[0] = 13.0; // error! x is supposed to be constant

// The compiler won't catch this error.
 
R

Richard Herring

[restoring over-snipped context]
struct I {
int i_;
/* and maybe some other stuff */
}
typedef std::vector<I> Vector;
/* Function which sets element p of a Vector to value q
As recommended, function doesn't modify its arguments
but returns a copy.
*/
Vector afunct (const Vector & v, int p, int q) {
Vector temp(v);
temp[p].i_ = q;
return temp;
}
int main(int argc, char* argv[]) {
Vector v(1000000);
for (int i = 0; i < 1000000; ++i) {
v = afunct(v, i, i);
}
/*...*/ return 0;
}
Still think it's a good idea?
[snip]

The problem with your example is that
you have not justified modifying std::vector<I> v *in-place*.
You contrived a ridiculous *straw-man* function afunct

Ridiculous, certainly, But it's not a strawman, it's a counter-example.
It's a function which precisely complies with your "recommendation" and
illustrates its absurdity:

.... which I notice you conveniently snipped from your reply.

You might want to check on the definition of "strawman" before you use
it again.
then *assign* the return value to v a million times!
Didn't you notice that assignment modifies v?

Of course I "noticed". That's precisely my point.

[...]
 
S

Shezan Baig

E. Robert Tisdale said:
Shezan said:
{
{ // <-- new scope
// declare y

// const
std::vector<double> x(n);

// x is defined as a variable
// only so that I can initialize it with
extern void sqrt(std::vector<double>&);

sqrt(x);

// Now, x is supposed to be constant.
// and use it (localize)

g(x);

// Does x still have the same value?
// Or did function g modify it?

x[0] = 13.0; // error! x is supposed to be constant

// The compiler won't catch this error.
}

// "anywhere else in the program"
}



If you really had code like this in your software, I would suggest
refactoring.

Hope this helps,
-shez-
 
E

E. Robert Tisdale

Shezan said:
E. Robert Tisdale said:
Shezan said:
E. Robert Tisdale wrote:

Please show us an example of what you mean.

{
{ // <-- new scope
// declare y

// const
std::vector<double> x(n);

// x is defined as a variable
// only so that I can initialize it with
extern void sqrt(std::vector<double>&);

sqrt(x);

// Now, x is supposed to be constant.

// and use it (localize)

g(x);

// Does x still have the same value?
// Or did function g modify it?

x[0] = 13.0; // error! x is supposed to be constant

// The compiler won't catch this error.
}

// "anywhere else in the program"
}

If you really had code like this in your software,
I would suggest refactoring.

Please show us how *you* would "refactor" the code above.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,769
Messages
2,569,582
Members
45,058
Latest member
QQXCharlot

Latest Threads

Top