Does this program have undefined behavior?


nitsnow

#include <stdio.h>
int g = 1;

int func_2()
{
    g = 2;
    return 3;
}

int main()
{
    int *l = &g;
    *l = func_2(); /* LHS evaluates to g, and
                      RHS writes to g */
    printf("g = %d\n", g);
    return 0;
}

My colleague and I had a lengthy discussion, and nobody can be
convinced by the other. The question is: can g ever gets the value of
2 at the end of the program?

Appreciate your expert thoughts...

Jason
 
James Dow Allen

[snip] ; [snip]

Semi-colon. Less well known, perhaps, than the Comma "sequence point"
but it also serializes.
The question is: can g ever gets the value of
2 at the end of the program?

No. And BTW, we don't use "gets".
Appreciate your expert thoughts...

I'm no expert and am probably wrong again! :)

James
 
Nick

#include <stdio.h>
int g = 1;

int func_2()
{
g = 2;
return 3;
}

int main()
{
int *l = &g;
*l = func_2(); /* LHS evaluates to g, and
RHS writes to g */
printf("g = %d\n", g);
return 0;
}

My colleague and I had a lengthy discussion, and nobody can be
convinced by the other. The question is: can g ever gets the value of
2 at the end of the program?

Appreciate your expert thoughts...

I'm not an expert, particularly on things like this. My feeling tends
to be "if this is likely to lead people into lengthy discussions as to
whether it's a bug or not, then don't do it unless you really, really
need to - and then comment it!".

But I'm pretty sure that this is defined. func_2 is executed, changing
g and returning 3. Then this is written into g.

The only reason I'm tentative at all is that I remember a long
discussion from a long time ago involving changing an array index that
turned out to be undefined.

Nick, confidently waiting to be proved wrong.
 
Keith Thompson

#include <stdio.h>
int g = 1;

int func_2()
{
g = 2;
return 3;
}

int main()
{
int *l = &g;
*l = func_2(); /* LHS evaluates to g, and RHS writes to g */
printf("g = %d\n", g);
return 0;
}

My colleague and I had a lengthy discussion, and nobody can be
convinced by the other. The question is: can g ever gets the value of
2 at the end of the program?

Strictly speaking, I believe the program's behavior is undefined
because you wrote "int main()" rather than "int main(void)", but
I'm sure that's not what you had in mind. 8-)}

I don't believe doing the assignment through a pointer changes
anything. You could have dropped the declaration of l and written
g = func_2();
with the same effect.

This assigns the result of the function call to g. The assignment
cannot modify the value of g until after it's evaluated the
expression func_2(). During the evaluation of func_2(), g is set
to 2, then there's a sequence point, then the value 3 is returned.

So "g = 2" must be evaluated, and its side effect must occur,
before "return 3" is executed (because of the sequence point), and
"return 3" must be executed before g is modified by the assignment
(because it computes the value to be assigned).

Note that this reasoning does not apply to the classic "i = i++".
The value yielded by "i++" must be determined before the assignment
can modify i, but the side effect of "i++" can take place either
before or after the side effect of the assignment.
 
nitsnow

Seems like a unanimous vote for "defined". But what if I tell you the
argument for "undefined" is this: at line "*l = func_2();", both RHS
and the whole assignment expression write to "g", and there is only
one sequence point, which is after ";". According to C99, it is
undefined to have more than one write to the same variable between one
sequence point and the next sequence point.

How would you argue against this?
 
Flash Gordon

pete said:
A function call is also a sequence point.

That shows there is a sequence point, but it's not the important one.
The important one is the sequence point on returning from the function.
 
S R

#include <stdio.h>
int g = 1;
int func_2()
{
   g = 2;
   return 3;
}
int main()
{
   int *l = &g;
   *l = func_2(); /* LHS evaluates to g, and RHS writes to g */
   printf("g = %d\n", g);
   return 0;
}
[snip]

Strictly speaking, I believe the program's behavior is undefined
because you wrote "int main()" rather than "int main(void)", but
I'm sure that's not what you had in mind.  8-)}

[SR] Maybe I am missing something, but I don't see why the code
invokes undefined behaviour.

From N1256 (I am aware that this is not the latest, but I think it is
sufficient for this discussion)

6.7.5.3 Function declarators (including prototypes)
"An empty list in a function declarator that is part of a
definition of that function specifies that the function has no
parameters"

Since int main(), in the code presented by the OP, is a definition, I
don't see any cause of undefined behaviour.

Also,
5.1.2.2.1 Program startup

" It shall be defined with a return type of int and with no
parameters "

( Although the example given is of the form int main(void), I am not
able to see how int main() does not satisfy the requirement stated
above. Also, things specified in the future language directions do not
apply here. It is *indeed* better to write int main (void), but int
main() is equally correct)


[snip]
 
Joachim Schmitz

S said:
#include <stdio.h>
int g = 1;
int func_2()
{
g = 2;
return 3;
}
int main()
{
int *l = &g;
*l = func_2(); /* LHS evaluates to g, and RHS writes to g */
printf("g = %d\n", g);
return 0;
}
[snip]

Strictly speaking, I believe the program's behavior is undefined
because you wrote "int main()" rather than "int main(void)", but
I'm sure that's not what you had in mind. 8-)}

[SR] Maybe I am missing something, but I don't see why the code
invokes undefined behaviour.

From N1256 (I am aware that this is not the latest, but I think it is
sufficient for this discussion)


It is not? AFAIK it is the latest valid Standard including all 3 existing
TCs.

Everything after that is drafts, I think.

Bye, Jojo
 
Ben Bacarisse

Seems like a unanimous vote for "defined". But what if I tell you the
argument for "undefined" is this: at line "*l = func_2();", both RHS
and the whole assignment expression write to "g", and there is only
one sequence point, which is after ";". According to C99, it is
undefined to have more than one write to the same variable between one
sequence point and the next sequence point.

How would you argue against this?

There is one at the ; as you say (too late to matter) but there is
also one at the function call itself (too early to matter). The
important point is that there are two more in func_2: one just after 2 is
assigned and another just before 3 is returned.
 
Ben Bacarisse

James Dow Allen said:
[snip] ; [snip]

Semi-colon. Less well known, perhaps, than the Comma "sequence point"
but it also serializes.

; does not mean there is a sequence point. For example, there is no
sequence point in

break;

nor in

continue;

<snip>
 
Keith Thompson

S R said:
Strictly speaking, I believe the program's behavior is undefined
because you wrote "int main()" rather than "int main(void)", but
I'm sure that's not what you had in mind.  8-)}

[SR] Maybe I am missing something, but I don't see why the code
invokes undefined behaviour.
[...]

C99 5.1.2.2.1p1:

It shall be defined with a return type of int and with no
parameters:
int main(void) { /* ... */ }
or with two parameters (referred to here as argc and argv, though
any names may be used, as they are local to the function in which
they are declared):
int main(int argc, char *argv[]) { /* ... */ }
or equivalent; or in some other implementation-defined manner.

with a footnote:

Thus, int can be replaced by a typedef name defined as int, or the
type of argv can be written as char ** argv, and so on.

My argument is that
int main() { /* ... */ }
is not equivalent to
int main(void) { /* ... */ }
and therefore is not covered.

The only difference between this program:
int main(void) { return 0; }
int func(void) { return main(42); }
and this program:
int main() { return 0; }
int func(void) { return main(42); }
is the void keyword in the declaration of main. The first requires a
diagnostic; the second does not. So they're not equivalent.

In practice, I'd be surprised to see any compiler that doesn't quietly
accept "int main()" (which is, of course, one possible consequence of
undefined behavior). And of course "int main(void)" is better style
anyway.

I should probably set up a web page with this argument so I can refer
to it rather than reconstructing the argument each time.
 
Seebs

Seems like a unanimous vote for "defined". But what if I tell you the
argument for "undefined" is this: at line "*l = func_2();", both RHS
and the whole assignment expression write to "g", and there is only
one sequence point, which is after ";". According to C99, it is
undefined to have more than one write to the same variable between one
sequence point and the next sequence point.
How would you argue against this?

There is another sequence point. Look at func_2(). Notice that it has
a statement. The end of that statement is a sequence point. That sequence
point must be hit before the function returns its value. So, before func_2()
returns 3, there has been a sequence point after the modification of g.

-s
 
Seebs

My argument is that
int main() { /* ... */ }
is not equivalent to
int main(void) { /* ... */ }
and therefore is not covered.

I'm pretty sure that they're "equivalent" in the relevant sense.

An identifier list declares only the identifiers of the parameters of
the function. An empty list in a function declarator that is part of a
definition of that function specifies that the function has no
parameters.

The special case of an unnamed parameter of type void as the only item
in the list specifies that the function has no parameters.

It seems to me that these are the same definition. There are differences
elsewhere, but they are still equivalent.

In short, I think your argument is good, except that I believe "equivalent"
doesn't quite mean the same thing as "identical". Consider:

int main(int argc, char **argv) {
return argc != 1;
}
vs
int main(int intArgumentCount, char **arrayArrayArgumentVector) {
return argc != 1;
}

One of these requires a diagnostic, so clearly the two declarations are
not equivalent. :) (That said, note that the standard explicitly mentions
that the names are local variables and don't matter.)

Basically, there's a distinction to be made between "make the same
specification of the function" and "be precisely identical in all ways
including information available to the compiler".

-s
 
Barry Schwarz

#include <stdio.h>
int g = 1;

int func_2()
{
g = 2;
return 3;
}

int main()
{
int *l = &g;
*l = func_2(); /* LHS evaluates to g, and
RHS writes to g */
printf("g = %d\n", g);
return 0;
}

My colleague and I had a lengthy discussion, and nobody can be
convinced by the other. The question is: can g ever gets the value of
2 at the end of the program?

No. There is a sequence point at the end of the first statement in
func_2. There is also a sequence point when func_2 returns, prior to the
returned value being assigned to *l.

This assumes you are not playing word games because at the end of the
program g ceases to exist and questions about its value become
metaphysical.
 
S R

[SR] Maybe I am missing something, but I don't see why the code
invokes undefined behaviour.

[...]

C99 5.1.2.2.1p1:

[snip CV]

[SR] I had quoted this in my reply earlier. So this is not what
I missed.
My argument is that
    int main() { /* ... */ }
is not equivalent to
    int main(void) { /* ... */ }
and therefore is not covered.

[SR] This is where I think I disagree with you. From 6.7.5.3 Function
declarators (including prototypes) ( I'll quote again),

6.7.5.3 Function declarators (including prototypes)
"An empty list in a function declarator that is part of a
definition of that function specifies that the function has no
parameters"

Now, int main(){ /**/}, here, satisfies the above stated rule. So,
main() has no parameters. ---> (1)

Also,

5.1.2.2.1 Program startup

" It shall be defined with a return type of int and with no
                                                    ^^^^^^^
parameters "
^^^^^^^^^^

(I intentionally ignore the example provided in the standard. The
example might include a void, but there is no wording in the
standard to push the use of void in the definition)

From (1), I conclude that int main(){} satisfies the above rule. So, I
see no omission to make the case for int main(){} to be undefined.

[snip]
 
James Dow Allen

James Dow Allen said:
    [snip]  ;  [snip]
Semi-colon.  Less well known, perhaps, than the Comma "sequence point"
but it also serializes.

; does not mean there is a sequence point.  For example, there is no
sequence point in

  break;

nor in

  continue;

foo(a++ + a++)
gives undefined behavior; one could say that's because a needed
sequence point is missing. In this way, the concept "sequence
point" becomes meaningful.

Is there an example program where saying " 'continue;' lacks
a sequence point" is meaningful? That is, where the same
program could give different results *because* of that "missing"
sequence point? I can't think of one; maybe I'm lacking
imagination.

James
 
Ben Bacarisse

James Dow Allen said:
James Dow Allen said:
    [snip]  ;  [snip]
Semi-colon.  Less well known, perhaps, than the Comma "sequence point"
but it also serializes.

; does not mean there is a sequence point.  For example, there is no
sequence point in

  break;

nor in

  continue;

foo(a++ + a++)
gives undefined behavior; one could say that's because a needed
sequence point is missing. In this way, the concept "sequence
point" becomes meaningful.

Is there an example program where saying " 'continue;' lacks
a sequence point" is meaningful? That is, where the same
program could give different results *because* of that "missing"
sequence point? I can't think of one; maybe I'm lacking
imagination.

I can't either or I would have included one! I don't think there is
one. My point was only about how C is specified.
 
Keith Thompson

S R said:
S R said:
Strictly speaking, I believe the program's behavior is undefined
because you wrote "int main()" rather than "int main(void)", but
I'm sure that's not what you had in mind.  8-)}
[SR] Maybe I am missing something, but I don't see why the code
invokes undefined behaviour.

[...]

C99 5.1.2.2.1p1:

[snip CV]

[SR] I had quoted this in my reply earlier. So this is not what
I missed.
My argument is that
    int main() { /* ... */ }
is not equivalent to
    int main(void) { /* ... */ }
and therefore is not covered.

[SR] This is where I think I disagree with you. From 6.7.5.3 Function
declarators (including prototypes) ( I'll quote again),

6.7.5.3 Function declarators (including prototypes)
"An empty list in a function declarator that is part of a
definition of that function specifies that the function has no
parameters"

Now, int main(){ /**/}, here, satisfies the above stated rule. So,
main() has no parameters. ---> (1)

Also,

5.1.2.2.1 Program startup

" It shall be defined with a return type of int and with no
                                                    ^^^^^^^
parameters "
^^^^^^^^^^

(I intentionally ignore the example provided in the standard. The
example might include a void, but there is no wording in the
standard to push the use of void in the definition)

From (1), I conclude that int main(){} satisfies the above rule. So, I
see no omission to make the case for int main(){} to be undefined.

That's an interesting argument. However, the definitions that you
intentionally ignored are not examples. There are numerous examples
in the standard, and they're specifically marked as such; these are
not.

Quoting the standard yet again (not to imply that you missed anything,
just to keep the context near the commentary):

[...] It shall be defined with a return type of int and with no
parameters:
int main(void) { /* ... */ }
or with two parameters (referred to here as argc and argv, though
any names may be used, as they are local to the function in which
they are declared):
int main(int argc, char *argv[]) { /* ... */ }
or equivalent; [footnote 9] or in some other implementation-defined manner.

If the second definition were to be ignored, we'd have no way to
know the types of the two parameters; the fact that they're int
and char*[] (or char**) is only stated in the definition.

So the standard doesn't just say that main can be defined with
a return type of int and no parameters (which "int main()" would
satisfy); it specifies *how* it can be defined with a return type
of int and no parameters ("int main(void)" or equivalent).

The fact that:

The use of function declarators with empty parentheses (not
prototype-format parameter type declarators) is an obsolescent
feature.

(C99 6.11.6) doesn't directly bear on the issue, since old-style
declarators are still fully part of the language. But it might
suggest that, if this issue is going to be resolved, it's more
likely to be made irrelevant by dropping old-style declarators
from the language than by actually settling the issue of whether
"int main()" is equivalent to "int main(void)".
 
Keith Thompson

Seebs said:
I'm pretty sure that they're "equivalent" in the relevant sense.

An identifier list declares only the identifiers of the parameters of
the function. An empty list in a function declarator that is part of a
definition of that function specifies that the function has no
parameters.

The special case of an unnamed parameter of type void as the only item
in the list specifies that the function has no parameters.

It seems to me that these are the same definition. There are differences
elsewhere, but they are still equivalent.

In short, I think your argument is good, except that I believe "equivalent"
doesn't quite mean the same thing as "identical". Consider:

int main(int argc, char **argv) {
return argc != 1;
}
vs
int main(int intArgumentCount, char **arrayArrayArgumentVector) {
return argc != 1;
}

One of these requires a diagnostic, so clearly the two declarations are
not equivalent. :) (That said, note that the standard explicitly mentions
that the names are local variables and don't matter.)

Basically, there's a distinction to be made between "make the same
specification of the function" and "be precisely identical in all ways
including information available to the compiler".

Another interesting argument. I haven't changed my mind, but I'm
considering it.

It's not obvious what the "relevant sense" of "equivalent" really is.

One counterargument I thought of is this: Suppose a compiler uses a
different calling sequence for
int foo(void) { /* ... */ }
than for
int foo() { /* ... */ }
There might be good reasons for doing so; the latter might pass
additional information for error checking (to detect calls whose
behavior is undefined) while the former doesn't need to, since any
such errors will be caught at compile time. The standard already
requires the calling environment to support two different calling
sequences for main; it doesn't require a third.

But I think that argument falls apart because an implementation is
already required to be able to call either function without knowing
its type. The following, as far as I can tell, is strictly
conforming:

#include <stdio.h>

int foo(void) {
    puts("int foo(void)");
    return 0;
}

int bar() {
    puts("int bar()");
    return 0;
}

int main(void) {
    int (*ptr)();
    ptr = foo;
    ptr();
    ptr = bar;
    ptr();
    return 0;
}

and 6.7.5.3 seems to say that the types of foo and bar are compatible.

So I think the argument in favor of "int main()"'s behavior being
defined is that "equivalent" means equivalent with respect to the
definition itself, not to its impact on code outside the definition.
And I'm still not convinced. I still think it's not equivalent,
because it can affect the legality of other code in the same
translation unit.
 
