Operator Assignment in Parameters - Shocking Results!

Anthony Paul

Hello everyone,

I have a snippet of C code (found at the bottom of this post) that is
puzzling me, and I get different results on different compilers (which
is also disturbing).

On my Visual Studio 2008 C++ compiler and on gcc, I get the following
values :
a = 1308, b = 665, c = 665, d = 665

When I saw these values I was shocked! c and d have no business being
what they are, and I'm hoping someone can explain why.

On Tanenbaum's compiler, I get what's expected :
a = 1308, b = 665, c = 666, d = -1

After some experimenting, a friend and I found that if you wrap the
expressions in printf statements (eg. func(printf("%d", 1973-z),
etc...)) gcc would change its tune and output what's expected (the
same as Tanenbaum's compiler); Visual Studio, on the other hand,
continues to output the same nonsense.

So my question (as a C compiler writer who wants to implement this
correctly) to you C gurus is... is this behavior defined and if so,
which compiler is correct and why the discrepancies between these
compilers?

Regards,

Anthony

void func(int a, int b, int c, int d)
{
// what are the values of a, b, c, and d at this point (assuming
parameters are parsed right to left) ??
}

void main()
{
int z = 1973;

func(1973-z, --z, z=666, z=-1);
}
 
Keith Thompson

Anthony Paul said:
I have a snippet of C code (found at the bottom of this post) that is
puzzling me, and I get different results on different compilers (which
is also disturbing). [...]
void func(int a, int b, int c, int d)
{
// what are the values of a, b, c, and d at this point (assuming
parameters are parsed right to left) ??

Do you mean "parsed" or "passed"? Either way, it's an unwarranted
assumption.
}

void main()

This is not good; it should be "int main(void)". See section 11 of
the comp.lang.c FAQ, <http://www.c-faq.com/>, starting at question
11.12a. (Please don't try to dispute this until you've read the FAQ;
this is probably the most common, and most boring, issue discussed
here.)
{
int z = 1973;

func(1973-z, --z, z=666, z=-1);

The behavior of this call is undefined. There are no sequence points
between the evaluations of function arguments (the commas are not
comma operators, they're just part of the syntax of a function call).
See section 3 of the comp.lang.c FAQ, starting at question 3.1. See
also C99 6.5p2:

Between the previous and next sequence point an object shall have
its stored value modified at most once by the evaluation of an
expression. Furthermore, the prior value shall be read only to
determine the value to be stored.

Any possible (or impossible) behavior is conforming. So don't do
that.
 
James Kuyper

Anthony said:
Hello everyone,

I have a snippet of C code (found at the bottom of this post) that is
puzzling me, and I get different results on different compilers (which
is also disturbing).

On my Visual Studio 2008 C++ compiler and on gcc, I get the following
values :
a = 1308, b = 665, c = 665, d = 665

When I saw these values I was shocked! c and d have no business being
what they are, and I'm hoping someone can explain why.

On Tanenbaum's compiler, I get what's expected :
a = 1308, b = 665, c = 666, d = -1

After some experimenting, a friend and I found that if you wrap the
expressions in printf statements (eg. func(printf("%d", 1973-z),
etc...)) gcc would change its tune and output what's expected (the
same as Tanenbaum's compiler); Visual Studio, on the other hand,
continues to output the same nonsense.

So my question (as a C compiler writer who wants to implement this
correctly) to you C gurus is... is this behavior defined and if so,
which compiler is correct and why the discrepancies between these
compilers?

Regards,

Anthony

void func(int a, int b, int c, int d)
{
// what are the values of a, b, c, and d at this point (assuming
parameters are parsed right to left) ??
}

void main()
{
int z = 1973;

func(1973-z, --z, z=666, z=-1);

The C standard does not specify the order in which the arguments of a
function are evaluated (6.5.2.2p10). Since three of the arguments in
your function call change the value of z, and two of the arguments have
values which depend upon the value of z, that alone would be sufficient
to make such a function call a very bad idea. However, c would at
least be guaranteed to be 666, and d would be guaranteed to be -1, and
there would be only a small list of different permitted combinations of
values for a and b.

However, the problem with your code is much worse than simply an
unspecified order of evaluation. There are no sequence points separating
evaluation of the arguments of a function from each other. Section 6.5p2
says "Between the previous and next sequence point an object shall have
its stored value modified at most once by the evaluation of an
expression." Your function call modifies the value of z three times
between consecutive sequence points.

Even worse, the next sentence of that same paragraph says: "Furthermore,
the prior value shall be read only to determine the value to be stored."
Your use of --z does not, in itself, violate this requirement, because
the prior value of z is retrieved for the purpose of determining what
the new value will be. However, your use of 1973-z in conjunction with
any one of the other three arguments does violate that requirement.

Therefore, the behavior of this code is undefined, and as a result
there's no wrong way for a compiler to implement it. A fully conforming
compiler could replace your function call with

printf("Pay closer attention to sequence points!");
 
Eric Sosman

Anthony said:
Hello everyone,

I have a snippet of C code (found at the bottom of this post) that is
puzzling me, and I get different results on different compilers (which
is also disturbing).
[...]
func(1973-z, --z, z=666, z=-1);

"Undefined behavior" means that the C language assigns
no meaning to the construct and makes no guarantees about
anything that might happen. It's like throwing a pair of
dice coated with nitrogen triiodide: You might get a number
between two and twelve, but more probably the whole thing
will explode in your face. No matter what happens, even if
demons fly out of your nose, you have no cause for surprise.

References: 6.5p2, 4p2, 3.4.3, FAQ 3.2-3.4.
 
Anthony Paul

Do you mean "parsed" or "passed"?  Either way, it's an unwarranted
assumption.

I meant parsed... and my assumption was due to the fact that all of
the C compilers I've worked with seem to evaluate them from right to
left. However, I have not confirmed this with the C standard; from
what I understand, it costs quite a bit of money to order.
This is not good; it should be "int main(void)".  See section 11 of
the comp.lang.c FAQ, <http://www.c-faq.com/>, starting at question
11.12a.  (Please don't try to dispute this until you've read the FAQ;
this is probably the most common, and most boring, issue discussed
here.)

Yes, sorry, it was written in haste, but the correctness of the main
function has nothing to do with the point of this post.
The behavior of this call is undefined.  There are no sequence points
between the evaluations of function arguments (the commas are not
comma operators, they're just part of the syntax of a function call).
See section 3 of the comp.lang.c FAQ, starting at question 3.1.  See
also C99 6.5p2:

    Between the previous and next sequence point an object shall have
    its stored value modified at most once by the evaluation of an
    expression. Furthermore, the prior value shall be read only to
    determine the value to be stored.


Any possible (or impossible) behavior is conforming.  So don't do
that.

--
Keith Thompson (The_Other_Keith) (e-mail address removed)  <http://www.ghoti.net/~kst>
Nokia
"We must do something.  This is something.  Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"

Ahhhh... then there it is. I was debating with my friend whether it
was defined or not, and I took the (wrong) position that it must be
defined since I couldn't see why it wouldn't treat these expressions
as it would any other. Thank you for the info; is there a copy of the
standard online that is free to download?

Regards,

Anthony
 
Anthony Paul

     "Undefined behavior" means that the C language assigns
Thank you for putting it ever so eloquently; had I realized that it
was undefined in the first place it would certainly have been
deserved! Now to get my hands on a copy of the C standard...

Regards,

Anthony
 
Keith Thompson

I wrote the above. Please don't delete attribution lines for quoted
text.
I meant parsed... and my assumption was due to the fact that all of
the C compilers I've worked with seem to evaluate them from right to
left. However, I have not confirmed this with the C standard; from
what I understand, it costs quite a bit of money to order.

Parsing takes place during compilation; it means that the compiler
interprets the input text in accordance with the language grammar. It
has very little to do with run-time evaluation order.
Yes, sorry, it was written in haste, but the correctness of the main
function has nothing to do with the point of this post.

Agreed, but in principle just having "void main()" means that your
program's behavior is undefined (unless your implementation
specifically documents that it accepts it). In practice, any compiler
is vanishingly unlikely to do anything other than (a) reject it, (b)
accept it with a warning, or (c) accept it and behave as you'd expect
it to. But it's just as easy to get it right in the first place and
not have to worry about it.

(Please don't quote signatures.)
Ahhhh... then there it is. I was debating with my friend whether it
was defined or not, and I took the (wrong) position that it must be
defined since I couldn't see why it wouldn't treat these expressions
as it would any other. Thank you for the info; is there a copy of the
standard online that is free to download?

There's no free (legal) copy of the standard itself, but the latest
post-C99 draft is
<http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1256.pdf>. It
includes the C99 standard with all three Technical Corrigenda merged
into it. It's not *quite* official, but it's close enough for almost
all purposes.

Note that most compilers don't (yet?) implement all of C99.
 
Anthony Paul

Hi Keith,

Thanks for the link, I finally have something other than my ANSI C K&R
book for reference.

Technically you're right about parsing not having anything to do with
the order of evaluation; however, my case is unique since my compiler
(or perhaps I should call it a C assembler) is a one-pass C compiler
that emits assembly language as it parses. Since my parse phase == my
emit phase...

On a side note, you seem to have an excellent grasp of the C language;
perhaps you could recommend a C book which will demystify pointers for
me? I haven't programmed in C in over 15 years and I've retained only
a general understanding of pointer syntax, but when I look at some of
the slightly more exotic pointer syntax, especially those involving
function pointers, I realize I don't understand it at all and I get
dizzy, lol. Any recommendations?

Cheers!

Anthony
 
Keith Thompson

Anthony Paul said:
On a side note, you seem to have an excellent grasp of the C language;
perhaps you could recommend a C book which will demystify pointers for
me? I haven't programmed in C in over 15 years and I've retained only
a general understanding of pointer syntax, but when I look at some of
the slightly more exotic pointer syntax, especially those involving
function pointers, I realize I don't understand it at all and I get
dizzy, lol. Any recommendations?

The classic book on C, widely considered to be one of the best
programming books overall, is K&R2 (Kernighan & Ritchie, "The C
Programming Language", 2nd edition).

Harbison & Steele's "C, A Reference Manual", 5th edition, is an
excellent reference.

The comp.lang.c FAQ, <http://www.c-faq.com>, is not designed as a
resource for learning C, but it's an excellent cure for the inevitable
misconceptions that tend to crop up. I often recommend section 6,
Arrays and Pointers; you'd also be interested in section 4, Pointers.
And section 18, Tools and Resources, has some book recommendations
(though some of the information may be out of date).
 
BartC

On a side note, you seem to have an excellent grasp of the C language;
perhaps you could recommend a C book which will demystify pointers for
me? I haven't programmed in C in over 15 years and I've retained only
a general understanding of pointer syntax, but when I look at some of
the slightly more exotic pointer syntax, especially those involving
function pointers, I realize I don't understand it at all and I get
dizzy, lol.

You're not alone. I think it's generally acknowledged here that the syntax
(for mixed pointer/function/array declarations) is not the best.
Any recommendations?

The usual recommendation is to decompose these things using extra typedefs,
rather than try and understand them by studying a book. (Even if you
understood your code, someone else might not.)
 
James Kuyper

Anthony Paul wrote:
....
Ahhhh... then there it is. I was debating with my friend whether it
was defined or not, and I took the (wrong) position that it must be
defined since I couldn't see why it wouldn't treat these expressions
as it would any other.

Consider the following expression:

int x = (1973-z) + --z + (z=666) + (z=-1);

It has undefined behavior for exactly the same reasons as your function
call.
 
