Understanding a simple C program

Richard

Martien Verbruggen said:
Martien Verbruggen wrote:

[removal of various levels of quotation]
Mistake 4 is the only part of the code that I'm talking about.

Even if there were no other mistakes,
mistake 4 would be sufficient to prevent the program from
being a "correct program".

Whatever behavior results from running that program,
is not defined between the code and the standard.
Even if the behavior is unremarkable, it's still undefined.

Can you quote the bit from the standard that says that it produces
undefined behaviour?

My understanding is, as I have already stated, that it is implementation
defined, not undefined. I base that opinion on 7.19.2-2 in the c99
standard, which defines a text stream, and states:

Whether the last line requires a terminating new-line character is
implementation-defined.

I believe the C90 standard has similar, if not identical, wording.
Previous discussions on this group tend to agree that the above wording
also means that whether a program needs to end its (textual) output with
a newline is implementation defined, not undefined.

A slightly silly question maybe, but how does one know if and when
something is implementation defined? E.g., suppose no newline meant system
A crashed every time? Is this now defined behaviour by that
implementation?
 
Chris Dollin

Richard said:
A slightly silly question maybe, but how does one know if and when
something is implementation defined?

If I recall correctly, the Standard will /say/ that it's implementation
defined, and the implementation is required to document its definition.
E.g., suppose no newline meant system
A crashed every time? Is this now defined behaviour by that
implementation?

Only if the documentation says so.
 
Richard Bos

Martien Verbruggen said:
Martien Verbruggen wrote:

[removal of various levels of quotation]
Whatever behavior results from running that program,
is not defined between the code and the standard.
Even if the behavior is unremarkable, it's still undefined.

Can you quote the bit from the standard that says that it produces
undefined behaviour?

My understanding is, as I have already stated, that it is implementation
defined, not undefined. I base that opinion on 7.19.2-2 in the c99
standard, which defines a text stream, and states:

Whether the last line requires a terminating new-line character is
implementation-defined.

That's the bit you're asking for. Specifically, it says that the
implementation must define whether the last line requires a newline; but
where the newline is required, neither that nor any other clause in the Standard
says what happens if there isn't one. So
- where the implementation defines that the last line does not require a
newline, the above code works;
- where the implementation defines that the last line _does_ require a
newline, the code in question has undefined behaviour _by default_,
i.e., not because it is explicitly said to have UB, but because there
is nothing which does define its behaviour.

So this code has defined behaviour on some implementations, and
undefined behaviour on others; and therefore, /in toto/, its behaviour
is undefined by ISO C.
(In this, it is no different from code in which, e.g., the constant
40000 is converted to int; where INT_MAX is greater than 39999, the
result will be a normal int with the value 40000, but where INT_MAX is
less than 40000 (typically 32767), the behaviour is undefined.)

Richard
 
Martien Verbruggen

Martien Verbruggen said:
Martien Verbruggen wrote:

[removal of various levels of quotation]
mistake 4: missing \n in printf causing undefined behavior
Whatever behavior results from running that program,
is not defined between the code and the standard.
Even if the behavior is unremarkable, it's still undefined.

Can you quote the bit from the standard that says that it produces
undefined behaviour?

My understanding is, as I have already stated, that it is implementation
defined, not undefined. I base that opinion on 7.19.2-2 in the c99
standard, which defines a text stream, and states:

Whether the last line requires a terminating new-line character is
implementation-defined.

That's the bit you're asking for. Specifically, it says that the
implementation must define whether the last line requires a newline; but
where the newline is required, neither that nor any other clause in the Standard
says what happens if there isn't one. So
- where the implementation defines that the last line does not require a
newline, the above code works;
- where the implementation defines that the last line _does_ require a
newline, the code in question has undefined behaviour _by default_,
i.e., not because it is explicitly said to have UB, but because there
is nothing which does define its behaviour.

So this code has defined behaviour on some implementations, and
undefined behaviour on others; and therefore, /in toto/, its behaviour
is undefined by ISO C.
(In this, it is no different from code in which, e.g., the constant
40000 is converted to int; where INT_MAX is greater than 39999, the
result will be a normal int with the value 40000, but where INT_MAX is
less than 40000 (typically 32767), the behaviour is undefined.)

I find the above logic a little stretched. The standard specifies that
whether or not a newline is needed is implementation defined. You now
argue that the behaviour is undefined, because the standard does not
describe what the behaviour should be if the implementation requires a
newline, but none is provided. I'd expect that the implementation
documents what happens. That should be its job after this clause.

How could the standard ever document all possible behaviours? I would
expect that, if the behaviour was supposed to be undefined, the
standard would specifically have stated that. It does so in many other
places. The standard could have included one single small sentence to
call it undefined behaviour. The fact that that wasn't done indicates,
to me, that it was never meant to be undefined.

Do you maybe have a reference to an interpretation or discussion in this
group or comp.std.c where this is resolved either way? I couldn't find
anything I'd call a resolution, just more or less a majority leaning
towards interpreting this whole area as implementation defined, not
undefined.

Martien
 
Sheth Raxit

Hi Friends -

This is a pretty naive question, so be patient. I have been thinking
about the following simple C program:

void main()
{
int a, b, c;
a=1;
b=2;
c=a+b;
printf("%d", c);

}

Let's imagine what the compiler does to this program. Probably it says
"OK, so here we have some variables. We've got an accumulator and some
registers X and Y hanging around. Let's do something like:
LDA 0x1 # put 1 into accumulator
LDX 0x2 # put 2 into register X
ADC X # add register X to accumulator with carry
for example."

So let's fast forward a bit, and our program is executing. But now maybe
we've got A and X set up, but before we get chance to execute the add
instruction, we get interrupted and the kernel passes control to some
other process. In this case, probably all the registers will have to be
pushed onto the stack, then popped back when control returns to our
process.

But haven't we then lost all the benefits of using registers for storage
instead of RAM?

Or to put it another way, why do people worry so much about hitting
cache and optimizing register usage, when this can easily (and probably
will be) wiped out by control passing to another process in the mean
time?

Interesting but off-topic question.
The real problem is that the OS is not smart enough: it does not know
which takes less time, the context switch or the execution.

The best scheduling algorithm is (I may be wrong :) ) Shortest Job
First, but it is very hard for the scheduler to predict which job is
shortest (except in specialized environments).

-raxit
 
Richard Bos

Martien Verbruggen said:
I find the above logic a little stretched. The standard specifies that
whether or not a newline is needed is implementation defined. You now
argue that the behaviour is undefined, because the standard does not
describe what the behaviour should be if the implementation requires a
newline, but none is provided. I'd expect that the implementation
documents what happens. That should be its job after this clause.

How could the standard ever document all possible behaviours? I would
expect that, if the behaviour was supposed to be undefined, the
standard would specifically have stated that. It does so in many other
places. The standard could have included one single small sentence to
call it undefined behaviour. The fact that that wasn't done indicates,
to me, that it was never meant to be undefined.

Section 4, clause 2, of the Standard:
'If a "shall" or "shall not" requirement that appears outside of a
constraint is violated, the behavior is undefined. Undefined behavior
is otherwise indicated in this International Standard by the words
"undefined behavior" ***or by the omission of any explicit definition
of behavior. There is no difference in emphasis among these three; they
all describe "behavior that is undefined"***.'

Emphasis mine.

Richard
 
Larry__Weiss

Larry__Weiss said:
So code, even if never executed, can cause undefined behaviour?

I've been reminded via email that I didn't think about compiler
optimizations. The effects of undefined code could well have effects,
even if that code is never executed.

- Larry
 
Richard

Larry__Weiss said:
I've been reminded via email that I didn't think about compiler
optimizations. The effects of undefined code could well have effects,
even if that code is never executed.

- Larry

You should never have to worry about the compiler in either defined or
undefined situations.
 
James Kuyper

Sorry - I hadn't meant to send this as e-mail. I pressed the wrong
button the first time:

Larry__Weiss said:
user923005 said:
Does the side-effect of the undefined behavior have to wait to manifest
itself until that statement's time has come to be executed?

"Undefined behavior" isn't a side effect. If a program contains code
which makes the behavior of that program undefined, that means it can do
anything.

As a practical matter, if the code has undefined behavior that
might or might not happen, depending upon the program's inputs, then the
code must continue operating in accordance with the other applicable
requirements, right up until the point where it becomes inevitable that
the situation allowing undefined behavior will occur. That's because the
compiler might make optimizations based upon the assumption that the
undefined behavior can't occur. Some of those optimizations might have
unexpected effects which occur far earlier than the actual execution of
the code whose undefined behavior allows those unexpected effects to occur.

In principle, if predestination were true, then everything that will
ever happen became inevitable from the moment the universe started. In
that case, a conforming implementation of C could have some of the
consequences of the undefined behavior occur before the author was even
born, simply because it correctly predicts that the program is
predestined to be written, translated, and executed. :)
 
pete

Martien said:
Martien Verbruggen wrote:

[removal of various levels of quotation]
Mistake 4 is the only part of the code that I'm talking about.

Even if there were no other mistakes,
mistake 4 would be sufficient to prevent the program from
being a "correct program".

Whatever behavior results from running that program,
is not defined between the code and the standard.
Even if the behavior is unremarkable, it's still undefined.

Can you quote the bit from the standard that says that it produces
undefined behaviour?

My understanding is, as I have already stated, that it is implementation
defined, not undefined. I base that opinion on 7.19.2-2 in the c99
standard, which defines a text stream, and states:

Whether the last line requires a terminating new-line character is
implementation-defined.

Yes.
So the question is:
When an implementation requires a terminating new-line character
on the last line of a text stream, and the thing that was opened
as a text stream on that implementation
doesn't have a terminating new-line character in its last line,
then what happens?
The standard doesn't say what happens,
and the standard also doesn't say that the implementation
is required to document what happens in that case.

The limits of behavior of that program are unbounded.
 
pete

Richard said:
Martien Verbruggen said:
Martien Verbruggen wrote:

[removal of various levels of quotation]
mistake 4: missing \n in printf causing undefined behavior
Whatever behavior results from running that program,
is not defined between the code and the standard.
Even if the behavior is unremarkable, it's still undefined.

Can you quote the bit from the standard that says that it produces
undefined behaviour?

My understanding is,
as I have already stated, that it is implementation
defined, not undefined. I base that opinion on 7.19.2-2 in the c99
standard, which defines a text stream, and states:

Whether the last line requires
a terminating new-line character is
implementation-defined.

That's the bit you're asking for. Specifically, it says that the
implementation must define whether
the last line requires a newline; but
where the newline is required,
neither that nor any other clause in the Standard
says what happens if there isn't one. So
- where the implementation defines that
the last line does not require a
newline, the above code works;
- where the implementation defines that the last line _does_ require a
newline, the code in question has undefined behaviour _by default_,
i.e., not because it is explicitly
said to have UB, but because there
is nothing which does define its behaviour.

So this code has defined behaviour on some implementations, and
undefined behaviour on others; and therefore, /in toto/, its behaviour
is undefined by ISO C.

That's the way that I see it.
 
pete

user923005 said:
It isn't undefined because the implementation *has* to define it.
It's in the same category as what the maximum value of a double is.

I also think it is a mistake, but the mistake does not result in
undefined behavior. It results in implementation defined behavior.
(That behavior might be that 'Hello world!' does not show up on stdout
and that might not be the _expected_ behavior, but it is acceptable if
that is what the implementation says it should do).

There is no requirement for an implementation
that requires the last line of a text stream to have
a terminating newline,
to document what happens if the character isn't there.
 
Army1987

I've been reminded via email that I didn't think about compiler
optimizations. The effects of undefined code could well have effects,
even if that code is never executed.

So will
a[i] = ++a[j];
cause UB because i could equal j, regardless of whether they're
actually equal?
Will
#define d 0
if (d) {
r /= d;
}
cause UB even if that expression statement will never be executed
(nor translated, if the compiler is smart enough)?
 
Charlie Gordon

Army1987 said:
I've been reminded via email that I didn't think about compiler
optimizations. The effects of undefined code could well have effects,
even if that code is never executed.

So will
a[i] = ++a[j];
cause UB because i could equal j, regardless of whether they're
actually equal?


It invokes UB *if* i equals j and only then.
Will
#define d 0
if (d) {
r /= d;
}
cause UB even if that expression statement will never be executed
(nor translated, if the compiler is smart enough)?

This IMHO does not invoke UB.

On the other hand, the function below may invoke UB if invoked with well
chosen arguments.

int smart_divide(int a, int b) {
    return b ? a / b : 0; /* not so smart */
}
 
Richard Bos

Army1987 said:
I've been reminded via email that I didn't think about compiler
optimizations. The effects of undefined code could well have effects,
even if that code is never executed.

So will
a[i] = ++a[j];
cause UB because i could equal j, regardless of whether they're
actually equal?
Will
#define d 0
if (d) {
r /= d;
}
cause UB even if that expression statement will never be executed
(nor translated, if the compiler is smart enough)?


No in both cases. And since in the first case the implementation can
very probably, and in the latter it can most definitely not prove that
the theoretically-UB-having code is ever executed, it may not even
refuse to translate that translation unit. Contrastingly, if the code is

i=j;
a[i]=++a[j];

or

x=a && b;
if (x)
    c/=(x-1);
else
    c/=x;

then the implementation could, if clever enough, prove that UB _must_
occur, and AFAICT is allowed to refuse to compile it.

Richard
 
Keith Thompson

So this code has defined behaviour on some implementations, and
undefined behaviour on others; and therefore, /in toto/, its behaviour
is undefined by ISO C.
Agreed.

(In this, it is no different from code in which, e.g., the constant
40000 is converted to int; where INT_MAX is greater than 39999, the
result will be a normal int with the value 40000, but where INT_MAX is
less than 40000 (typically 32767), the behaviour is undefined.)

That turns out to be a poor example. If a value outside the range of
a signed type is converted to that type, the behavior is not
undefined; either the result is an implementation-defined value, or an
implementation-defined signal is raised.

Signed arithmetic overflow does cause undefined behavior. (I can't
think of any good reason why conversion and arithmetic should be
inconsistent, but they are.)
 
Keith Thompson

Martien Verbruggen said:
I find the above logic a little stretched. The standard specifies that
whether or not a newline is needed is implementation defined. You now
argue that the behaviour is undefined, because the standard does not
describe what the behaviour should be if the implementation requires a
newline, but none is provided. I'd expect that the implementation
documents what happens. That should be its job after this clause.

No, the implementation's only job here is to document whether a
trailing new-line is required or not. The standard doesn't require
anything more than that (though an implementer is free to provide
additional documentation).
 
Martien Verbruggen

No, the implementation's only job here is to document whether a
trailing new-line is required or not. The standard doesn't require
anything more than that (though an implementer is free to provide
additional documentation).

Thanks for the explanations, from both you and Richard Bos.

Martien
 
Kenneth Brody

Keith said:
No, the implementation's only job here is to document whether a
trailing new-line is required or not. The standard doesn't require
anything more than that (though an implementer is free to provide
additional documentation).

Is an implementation allowed to say that the behavior of leaving off
the terminating newline is dependent on things outside the scope of
the C compiler?

For example, suppose you are sending the output to a printer, which
won't print the last line if it's not newline-terminated, even though
the C implementation guarantees that it will send everything up to
the final non-newline character? However, if going to the screen,
the output will be just fine.

--
+-------------------------+--------------------+-----------------------+
| Kenneth J. Brody | www.hvcomputer.com | #include |
| kenbrody/at\spamcop.net | www.fptech.com | <std_disclaimer.h> |
+-------------------------+--------------------+-----------------------+
 
