Question about evaluation order

Jeroen · Feb 28, 2007

Hi all,

I've got a very specific question about the evaluation order in C++.
Assume some kind of custom array class, with an overloaded subscript
operator. In the following code:

{
my_array a, b, c;

a[6] = b[5] + c[4];
}

I would assume the following order:

1) The subscript operators are evaluated for 'b' and 'c', both
returning a reference (double& for example)
2) The add-operator is called, returning a temporary object
3) The subscript operator is evaluated for 'a', returning a reference.
4) The = operator is evaluated.

My question: is this the way things go, or is this order not defined in
C++? To be more specific: is it possible that the subscript evaluation
for 'a' (step 3) is done already in step 1 (before the add-operator)?

Thanx,

Jeroen

Pete Becker · Feb 28, 2007

Jeroen said:
To be more specific: is it possible that the subscript evaluation
for 'a' (step 3) is done already in step 1 (before the add-operator)?

Yes.

--

-- Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com)
Author of "The Standard C++ Library Extensions: a Tutorial and
Reference." (www.petebecker.com/tr1book)

Jeroen · Feb 28, 2007

Pete Becker schreef:

Yes.

Thanks Pete! That keeps me from working out an implementation that
doesn't work in the end....

Chris Theis · Feb 28, 2007

Jeroen said:
Hi all,

I've got a very specific question about the evaluation order in C++.
Assume some kind of custom array class, with an overloaded subscript
operator. In the following code:

{
my_array a, b, c;

a[6] = b[5] + c[4];
}

I would assume the following order:

1) The subscript operators are evaluated for 'b' and 'c', both returning
a reference (double& for example)
2) The add-operator is called, returning a temporary object
3) The subscript operator is evaluated for 'a', returning a reference.
4) The = operator is evaluated.

My question: is this the way things go, or is this order not defined in
C++? To be more specific: is it possible that the subscript evaluation for
'a' (step 3) is done already in step 1 (before the add-operator)?

It's certainly possible that the subscript op of a is evaluated before the
right side of the assignment is taken care of. This would be required for
example to allow for optimizing the temporary created by + operator.

Just out of curiosity, why or rather in what sense would this have an impact
on your implementation of the subscript op.

Cheers
Chris

Jeroen · Feb 28, 2007

Chris Theis schreef:

Just out of curiosity, why or rather in what sense would this have an impact
on your implementation of the subscript op.

Cheers
Chris

I am working on a multidimensional matrix which allows subscripting
resulting in submatrices instead of single cells (just like Matlab does
if you are familliar with that). The problem however is that you cannot
pass a reference to that submatrix after evaluating the subscript
operator, because the submatrix cannot be represented by the original
matrix memory structure. And I don't want to return a temporary object
because that can cause tremendous overhead if the matrix is large, and
(if I am not mistaken) I cannot do correct assignemnts if the submatrix
'selection' is on the lhs (only the temporary submatrix object will be
assigned the new value!):

a("1:6") = 3;

So I thought that I could return a reference to the original matrix, but
in the class I save the subscript string as an indicator for the
selected submatrix. My operators then take care of that information.
That should work fine with:

a("1:6") = b("1:6") + c("1:6");

but we get a serious problem (what the original question was about) with:

a("1:6") = a("2:7") + a("11:16");

The subscript string as saved in the class is overwritten by sequential
evaluation of the subscript operator, and there's no way to get around
the problem like this.

In the meantime I think I have a solution that does work, but that's
another 'big' story. I have to work that out first and think of possible
complications. If I got questions, or a working matrix class available
on a webpage, I'll post agin!

Jeroen

Chris Theis · Feb 28, 2007

Jeroen said:
Chris Theis schreef:

I am working on a multidimensional matrix which allows subscripting
resulting in submatrices instead of single cells (just like Matlab does if
you are familliar with that). The problem however is that you cannot pass
a reference to that submatrix after evaluating the subscript operator,
because the submatrix cannot be represented by the original matrix memory
structure. And I don't want to return a temporary object because that can
cause tremendous overhead if the matrix is large, and (if I am not
mistaken) I cannot do correct assignemnts if the submatrix 'selection' is
on the lhs (only the temporary submatrix object will be assigned the new
value!):

a("1:6") = 3;

So I thought that I could return a reference to the original matrix, but
in the class I save the subscript string as an indicator for the selected
submatrix. My operators then take care of that information. That should
work fine with:

a("1:6") = b("1:6") + c("1:6");

but we get a serious problem (what the original question was about) with:

a("1:6") = a("2:7") + a("11:16");

The subscript string as saved in the class is overwritten by sequential
evaluation of the subscript operator, and there's no way to get around the
problem like this.

Okay, now I see your point. Well, that's quite a tricky thing I gotta admit.
At first glance I'd guess that you'd have to sacrifice the syntax with the
subscript operator to have it working. But I'm curious to see what you will
pull off.

Cheers
Chris

Mark P · Feb 28, 2007

Jeroen said:
Chris Theis schreef:

I am working on a multidimensional matrix which allows subscripting
resulting in submatrices instead of single cells (just like Matlab does
if you are familliar with that). The problem however is that you cannot
pass a reference to that submatrix after evaluating the subscript
operator, because the submatrix cannot be represented by the original
matrix memory structure. And I don't want to return a temporary object
because that can cause tremendous overhead if the matrix is large, and
(if I am not mistaken) I cannot do correct assignemnts if the submatrix
'selection' is on the lhs (only the temporary submatrix object will be
assigned the new value!):

a("1:6") = 3;

So I thought that I could return a reference to the original matrix, but
in the class I save the subscript string as an indicator for the
selected submatrix. My operators then take care of that information.
That should work fine with:

a("1:6") = b("1:6") + c("1:6");

but we get a serious problem (what the original question was about) with:

a("1:6") = a("2:7") + a("11:16");

The subscript string as saved in the class is overwritten by sequential
evaluation of the subscript operator, and there's no way to get around
the problem like this.

The problem is your design, specifically your choice to embed this
active subscript within the matrix class. Perhaps a better approach is
to make a bona fide class to represent a submatrix. Give it a reference
to an underlying matrix, some information about its range in that
matrix, and overload it's array index and assignment operators to read
and write the underlying matrix. It may even make sense to define an
abstract matrix class and derive from it two classes, a concrete matrix
class which holds its data values, and a relative matrix class which
holds a reference and a range.

Mark

=?ISO-8859-15?Q?Juli=E1n?= Albo · Feb 28, 2007

Jeroen said:
matrix memory structure. And I don't want to return a temporary object
because that can cause tremendous overhead if the matrix is large, and
(if I am not mistaken) I cannot do correct assignemnts if the submatrix
'selection' is on the lhs (only the temporary submatrix object will be
assigned the new value!):

a("1:6") = 3;

So I thought that I could return a reference to the original matrix, but
in the class I save the subscript string as an indicator for the
selected submatrix. My operators then take care of that information.
That should work fine with:

a("1:6") = b("1:6") + c("1:6");

but we get a serious problem (what the original question was about) with:

a("1:6") = a("2:7") + a("11:16");

The subscript string as saved in the class is overwritten by sequential
evaluation of the subscript operator, and there's no way to get around
the problem like this.

An intermediate solution with not so big overhead can be to use references
to the original matrix for the subscripts but creating new objects for the
result of operators.

Jeroen · Mar 1, 2007

Mark P schreef:

The problem is your design, specifically your choice to embed this
active subscript within the matrix class. Perhaps a better approach is
to make a bona fide class to represent a submatrix. Give it a reference
to an underlying matrix, some information about its range in that
matrix, and overload it's array index and assignment operators to read
and write the underlying matrix. It may even make sense to define an
abstract matrix class and derive from it two classes, a concrete matrix
class which holds its data values, and a relative matrix class which
holds a reference and a range.

Mark

The thing I came up with yesterday partly resembles your suggestion I guess:

class matrix {
private:
vector<double> *data;
bool matrix_is_a_submatrix;
submatrix_specs s;

// plus ctors, dtor, operators....
};

What I want to do is that ctors always generate a matrix object in which
'matrix_is_a_submatrix=false', so that object actually stores matrix
data. The subscript operators however return a temporary object of this
matrix class (not a reference), but with the specifications of the
selected submatrix stored in 's', 'matrix_is_a_submatrix=true' and
'data' points to the original matrix data (so I don't have to copy the
actual matrix data). My matrix operators then only have to check the
'matrix_is_a_submatrix' flag to decide on correct behaviour. This flag
is also used in the dtor to decide if the dtor should delete the matrix
data (only if 'matrix_is_a_submatrix=false'. I have to think about the
consequences of this approach, but I think it will work.

Jeroen

Chris Theis · Mar 1, 2007

Jeroen said:
The thing I came up with yesterday partly resembles your suggestion I
guess:

class matrix {
private:
vector<double> *data;
bool matrix_is_a_submatrix;
submatrix_specs s;

// plus ctors, dtor, operators....
};

What I want to do is that ctors always generate a matrix object in which
'matrix_is_a_submatrix=false', so that object actually stores matrix data.
The subscript operators however return a temporary object of this matrix
class (not a reference), but with the specifications of the selected
submatrix stored in 's', 'matrix_is_a_submatrix=true' and 'data' points to
the original matrix data (so I don't have to copy the actual matrix data).
My matrix operators then only have to check the 'matrix_is_a_submatrix'
flag to decide on correct behaviour. This flag is also used in the dtor to
decide if the dtor should delete the matrix data (only if
'matrix_is_a_submatrix=false'. I have to think about the consequences of
this approach, but I think it will work.

I agree that the approach will work but IMO it introduces unnecessary
complications because you have to check whether it's a submatrix or not etc.
I'd suggest to consider each matrix object as a kind of proxy, holding the
data as well as always a submatrix object. This submatrix object is keeping
track of the data indices that are referenced by the matrix. So this
embedded object is working as an interface to give you access to the data,
of whatever range you want to. You can think of this submatrix object as a
clever kind of subscript operator which filters out exactly the parts that
you request.

Cheers
Chris

Jeroen · Mar 1, 2007

Chris Theis schreef:

I agree that the approach will work but IMO it introduces unnecessary
complications because you have to check whether it's a submatrix or not etc.
I'd suggest to consider each matrix object as a kind of proxy, holding the
data as well as always a submatrix object. This submatrix object is keeping
track of the data indices that are referenced by the matrix. So this
embedded object is working as an interface to give you access to the data,
of whatever range you want to. You can think of this submatrix object as a
clever kind of subscript operator which filters out exactly the parts that
you request.

Cheers
Chris

Thanks for thinking with me Chris. Let me try to understand your
suggestion, as I'm certainly not so experienced in this matter as most
of you....

I think your suggestion is somethink like:

class submatrix_description {
// parameters that describe the size or selection of a matrix.
};

class matrix {
// original matrix description.
vector<double> data;
submatrix_description original_size;

// current selection.
submatrix_description selected_size;

// the rest below....
};

The member 'selected_size' is equal to 'original_size' if it is the
original matrix, or a subset otherwise.

What I understand from your suggestion is that every matrix object
always contains the full matrix data. That is something I want to avoid
because I tend to work with matrices/vectors which may be tens of Mb big
or even bigger. Every temporary copy then introduces a massive overhead.
For instance:

a = b('e:end') + c('2:end');

requires 3 creations of a temporary object. In your case that is also 3
times creation of all matrix data, with my proposed solution only once
(result of the + operator).

Another issue, take a look again at:

a("1:6") = b("2:7") + c("11:16");

If I allow to select submatrices also at the LHS, then I think that the
subscript operator only can pass some sort of reference to the original
data in order to enable the = operator to modify the original data. That
is accomplished with the pointer construction I proposed.

I might get your idea wrong, then please correct me

Thanks for your
time sofar!

Jeroen

Chris Theis · Mar 1, 2007

Jeroen said:
Chris Theis schreef: [SNIP]

I think your suggestion is somethink like:

class submatrix_description {
// parameters that describe the size or selection of a matrix.
};

class matrix {
// original matrix description.
vector<double> data;
submatrix_description original_size;

// current selection.
submatrix_description selected_size;

// the rest below....
};

The member 'selected_size' is equal to 'original_size' if it is the
original matrix, or a subset otherwise.

What I understand from your suggestion is that every matrix object always
contains the full matrix data.

What I had in mind is more quite close to what you had suggested with
pointers to the original data which are used. I only would use an interface
object (submatrix) to access this data or keep track of the pointers to the
original data, no matter if it's a subset or the whole matrix. This way you
have a clean interface on how to access data and you won't need to
differentiate if you're working on subsets or "whole"matrices.

That is something I want to avoid because I tend to work with
matrices/vectors which may be tens of Mb big or even bigger. Every
temporary copy then introduces a massive overhead.

You are right on that, although the number of temporaries created depends on
the compiler and its abilities to optimize them. But it's certainly prudent
to be on the safe side and think of the worst ;-) BTW, if you're working
with matrices that may be tens of Mbs I suspect you're doing some FEM or
solid state Monte Carlo for example. With respect to performance & memory
consumption it might be an idea to consider sparse storage layouts for the
matrices.

[SNIP]

HTH
Chris

Grizlyk · Mar 6, 2007

Chris said:
{
my_array a, b, c;

a[6] = b[5] + c[4];
}

I would assume the following order:

1) The subscript operators are evaluated for 'b' and 'c', both
returning a reference (double& for example)
2) The add-operator is called, returning a temporary object
3) The subscript operator is evaluated for 'a', returning a reference.
4) The = operator is evaluated.

My question: is this the way things go, or is this order not defined in
C++? To be more specific: is it possible that the subscript evaluation
for 'a' (step 3) is done already in step 1 (before the add-operator)?

Click to expand...

It's certainly possible that the subscript op of a is evaluated before the
right side of the assignment is taken care of. This would be required for
example to allow for optimizing the temporary created by + operator.

I think OP did not ask about possibility, but for required order. What is
explanation?

In the original expression "a[6] = b[5] + c[4];" there are several operators
with different priority:

1.
"operator[]" has highest priority, and is processed "from left to right"

so process sequence could be "a[6],b[5],c[4]"

2.
"operator+" has high priority, and theorder does not matter here

(
The "operator+" is processed "from left to right", but it seems to me, that
if sequence of "operator+" has equal types of parameters, that theorder can
be indefinit (not only "from left to right")

type a,b,c,d;

(a+b+c+d); //can be any order, am i right?
)

3.
so process sequence could be "a[6],b[5],c[4],+"

"operator=" has low priority, and theorder does not matter here

processed "from right to left"

so process sequence could be "a[6],b[5],c[4],+,="

4. ***
But I am not sure that here theorder never have been broken by "operator=".
Does C++ garantee that "operator=" will not break expression into some
independent parts with indefinit theorder between them?

Why do i thik so? Consider: "b[5] + c[4]" is the same as

operator+
(
b.operator[](5),
c.operator[](4)
)

Some people have written in the group, that there are no predefined order in
standard between several function parameters, as

b.operator[](5), c.operator[](4)

in spite of comma (',') that does not work here as "operator,()". C++
garantees only that both of them will be called befor "operator+()".

In the original expression "a[6] = b[5] + c[4];" the "operator=" is the same
as

lvalue.operator=
(
rvalue
);

It is not evidently, that rvalue must be evaluated alvays after lvalue,
because they are independent. The convention (probably with with random
rationale) of behaviour for the case must be explicitly declared by
standard.

I think the best disign to do not trust to theorder between "operator ="
here.

--
Maksim A. Polyanin
http://grizlyk1.narod.ru/cpp_new

"In thi world of fairy tales rolls are liked olso"
/Gnume/

Bo Persson · Mar 6, 2007

Grizlyk said:
Chris said:

{
my_array a, b, c;

a[6] = b[5] + c[4];
}

I would assume the following order:

1) The subscript operators are evaluated for 'b' and 'c', both
returning a reference (double& for example)
2) The add-operator is called, returning a temporary object
3) The subscript operator is evaluated for 'a', returning a
reference. 4) The = operator is evaluated.

My question: is this the way things go, or is this order not
defined in C++? To be more specific: is it possible that the
subscript evaluation for 'a' (step 3) is done already in step 1
(before the add-operator)?

Click to expand...

It's certainly possible that the subscript op of a is evaluated
before the right side of the assignment is taken care of. This would
be required for example to allow for optimizing the temporary
created by + operator.

Click to expand...

I think OP did not ask about possibility, but for required order.
What is explanation?

In the original expression "a[6] = b[5] + c[4];" there are several
operators with different priority:

1.
"operator[]" has highest priority, and is processed "from left to
right"

No, there is no left-to-right order. The order is unspecified.

There also in no requirement that all higher priority operators be evaluated
before any lower priority operation starts, just that b[5] and c[4] are both
evaluated before operator+ (but in any order).

Bo Persson

Grizlyk · Mar 7, 2007

Bo said:
It's certainly possible that the subscript op of a is evaluated
before the right side of the assignment is taken care of. This would
be required for example to allow for optimizing the temporary
created by + operator.

Click to expand...

I think OP did not ask about possibility, but for required order.
What is explanation?

In the original expression "a[6] = b[5] + c[4];" there are several
operators with different priority:

1.
"operator[]" has highest priority, and is processed "from left to
right"

so process sequence could be "a[6],b[5],c[4]"

Click to expand...

No, there is no left-to-right order. The order is unspecified.

"there is no left-to-right order <where>"? You probably have forgotten to
point correct place of "unspecified order". Without <where> you are wrong,
because series of several "operator[]" and other operators for the group of
priority is processed "from left to right":

assumig foo(),boo(),voo() are functions

"*b[boo()]().foo(voo());" is the same as

operator*(
b$operator[]( boo() )$operator()$operator.(type::foo)( voo() )
);

Here boo() and voo() can be called befor the expression, but the expression
will be processed as written "from left to right".

Do you want to say, that declared in standard the strict order (as "from
left to right") can be applied only for each unbroken group with equal
priority?

priority groups example:

() [] -> :: . :left to right
! ~ + - ++ -- & * sizeof new delete :right to left
....

There also in no requirement that all higher priority operators be
evaluated before any lower priority operation starts, just that b[5] and
c[4] are both evaluated before operator+ (but in any order).

Has been written "could be". Theoder _can_ be broken if we have some
independent parts in an expression, as for "operator+" and "operator=".
- for "operator+" as for ordinary function the order is unspecified,
- for "operator=" as for ordinary function it is unspecified _maybe_ also.

But C++ can request any other in the special case of the operators. I do not
know (i have no standerd near me), so because no one have proved the
opposite i have advised to do not trust to the order.

In the example, he can use explicit temporary:

{
type &temp=a[5];
temp=b[5] + c[4];
}

Some people have written in the group, that order can be unspecified only
between ";" and seems "," signs.

--
Maksim A. Polyanin
http://grizlyk1.narod.ru/cpp_new

"In thi world of fairy tales rolls are liked olso"
/Gnume/

Bo Persson · Mar 7, 2007

Grizlyk said:
Bo said:

It's certainly possible that the subscript op of a is evaluated
before the right side of the assignment is taken care of. This
would be required for example to allow for optimizing the temporary
created by + operator.

I think OP did not ask about possibility, but for required order.
What is explanation?

In the original expression "a[6] = b[5] + c[4];" there are several
operators with different priority:

1.
"operator[]" has highest priority, and is processed "from left to
right"

so process sequence could be "a[6],b[5],c[4]"

Click to expand...

No, there is no left-to-right order. The order is unspecified.

Click to expand...

"there is no left-to-right order <where>"? You probably have
forgotten to point correct place of "unspecified order". Without
<where> you are wrong, because series of several "operator[]" and
other operators for the group of priority is processed "from left to
right":

There is no specified left-to-right order in a C++ expression.

That's where.

assumig foo(),boo(),voo() are functions

"*b[boo()]().foo(voo());" is the same as

operator*(
b$operator[]( boo() )$operator()$operator.(type::foo)( voo() )
);

Here boo() and voo() can be called befor the expression, but the
expression will be processed as written "from left to right".

Do you want to say, that declared in standard the strict order (as
"from left to right") can be applied only for each unbroken group
with equal priority?

priority groups example:

() [] -> :: . :left to right
! ~ + - ++ -- & * sizeof new delete :right to left

No, I was saying that this is NOT so. Right here:

There also in no requirement that all higher priority operators be
evaluated before any lower priority operation starts, just that b[5]
and c[4] are both evaluated before operator+ (but in any order).

Click to expand...

Has been written "could be". Theoder _can_ be broken if we have some
independent parts in an expression, as for "operator+" and
"operator=". - for "operator+" as for ordinary function the order is
unspecified,
- for "operator=" as for ordinary function it is unspecified _maybe_
also.
But C++ can request any other in the special case of the operators. I
do not know (i have no standerd near me), so because no one have
proved the opposite i have advised to do not trust to the order.

In the example, he can use explicit temporary:

{
type &temp=a[5];
temp=b[5] + c[4];
}

Some people have written in the group, that order can be unspecified
only between ";" and seems "," signs.

Right, you have a "sequence point" at the semi-colon, and the comma operator
(which is not all uses of a comma

.

All side effects must be complete at a sequence point, like at the
semi-colon at the end of a full expression. (There are other sequence points
as well).

Bo Persson

Operator evaluation order	16	May 12, 2011
Evaluation order in if statement	4	Nov 24, 2011
expressions and order of evaluation	18	Nov 23, 2008
Evaluation order of initializer list parameters	2	Nov 17, 2011
Order of expression evaluation	6	May 15, 2008
Question about my projects	3	Jul 23, 2021
a.f(exp1).f(exp2), what's the evaluation order of exp1 and exp2?	1	Mar 13, 2012
order of evaluation of arguments to constructors	7	Mar 22, 2007

Question about evaluation order

Jeroen

Pete Becker

Jeroen

Chris Theis

Jeroen

Chris Theis

Mark P

=?ISO-8859-15?Q?Juli=E1n?= Albo

Jeroen

Chris Theis

Jeroen

Chris Theis

Grizlyk

Bo Persson

Grizlyk

Bo Persson

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads