C Test Incorrectly Uses printf() - Please Confirm

Willem

Shao Miller wrote:
) This comment of mine was specifically in regards to operand evaluation
) order. If it were standardized, programmers could expect consistent
) treatment on any conforming implementation. It was not meant as an
) all-encompassing statement.

Then the question becomes: Why would you single out that one specific
issue as something that should be standardized, when there are so many
other things where programmers cannot expect consistent treatment ?


SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
 
Shao Miller

Willem said:
Shao Miller wrote:
) This comment of mine was specifically in regards to operand evaluation
) order. If it were standardized, programmers could expect consistent
) treatment on any conforming implementation. It was not meant as an
) all-encompassing statement.

Then the question becomes: Why would you single out that one specific
issue as something that should be standardized, when there are so many
other things where programmers cannot expect consistent treatment ?

Please re-read Mr. K. Brody's piece within the post of Mr. R. Bos. My
comment therein is a response to Mr. K. Brody's piece. Please ask Mr.
K. Brody.

Please also note that I have not and do not suggest that this issue
should be standardized.

Thank you.
 
Willem

Shao Miller wrote:
) Please also note that I have not and do not suggest that this issue
) should be standardized.

Then why are you making supportive statements to this suggestion ?


SaSW, Willem
 
Shao Miller

Willem said:
Shao Miller wrote:
) Please also note that I have not and do not suggest that this issue
) should be standardized.

Then why are you making supportive statements to this suggestion ?

Why shouldn't I? Please note that "do not suggest that this issue
should be standardized" is not the same as "suggest that this issue should
not be standardized."

Imagine a world in which operand evaluation order was standardized, if
you will. Imagine that 6.5p2 from C99 is then different or gone. Suddenly,

int a = 1;
printf("%d", ++a, a + 5);

meets with (b) as the correct answer in the original post. Then also,
programmers are free to use such constructs.

Further imagine, if you will, that implementations retain their license
to optimize, as long as the actual semantics yield results identical to
the abstract semantics.

Now we might lose speed in translation, since each operand must have its
dependencies on any previous operands noted and properly processed. For
a left-to-right standard order, this would mean that the third operand
above would have to have its read of 'a' scheduled for after the
modification of 'a'. 30 more operands might not depend on 'a', and
might be scheduled concurrently.

Compilers would thus require re-working; there is thus an associated
cost for changing the meaning of conformance.

Does that help to clarify?

The current C1X draft defines "indeterminately sequenced" versus
"unsequenced". It seems that those responsible for that portion of the
draft perceive a value in distinguishing between the two. The current
case is that the arguments and expression denoting the called function
in a function call (which could be considered as similar to operands for
an N-ary operator) are evaluated in an "unsequenced" manner by this C1X
draft.

That's too bad for programmers who might enjoy well-defined behaviour for:

int a = 1;
printf("%d", ++a, a + 5);

It's not too bad for those who would need to re-work an implementation.
It's not too bad for those who would not appreciate the associated
translation-time cost.
 
Tim Streater

On the contrary, it's a piss-poor piece of hardware which will not
optimise the running time of my correct code just to bail out the
piss-poor code which overly-clever newbie hacks write.

In 45 years of writing code I've never come across a processor that
suffered from such deficiencies. If you're having to write for one, then
you have my condolences.
 
FredK

Tim Streater said:
In 45 years of writing code I've never come across a processor that
suffered from such deficiencies. If you're having to write for one, then
you have my condolences.

I'm with you there - although I've only programmed for a living since 1978 -
during which time I've done such things as being the platform implementation
lead for an OS. I look at the arguments expressed here as simply stretching
reality a bit to prove a point... essentially suggesting a
multi-CPU/thread architecture which cannot handle different threads
concurrently accessing memory... yet allowing the compiler and applications
to attempt the operation - and then to apply it to a specific operation
where the compiler would have had to emit *very* odd code that does
something the HW can't do.

The point however is taken: The statement by the rules is UB. Since the
quiz provided that as one of the answers... that answer should be accepted
as valid. Pragmatically, the likelihood is that every C compiler on every
platform that actually compiles the statement would print "2". In the
absence of exotic broken HW, the question of what the UB "a + 5" evaluates
to - is pretty much immaterial.
 
Tim Streater

FredK said:
I'm with you there - although I've only programmed for a living since 1978 -
during which time I've done such things as being the platform implementation
lead for an OS. I look at the arguments expressed here as simply stretching
reality a bit to prove a point... essentially suggesting a
multi-CPU/thread architecture which cannot handle different threads
concurrently accessing memory... yet allowing the compiler and applications
to attempt the operation - and then to apply it to a specific operation
where the compiler would have had to emit *very* odd code that does
something the HW can't do.

The point however is taken: The statement by the rules is UB. Since the
quiz provided that as one of the answers... that answer should be accepted
as valid. Pragmatically, the likelihood is that every C compiler on every
platform that actually compiles the statement would print "2". In the
absence of exotic broken HW, the question of what the UB "a + 5" evaluates
to - is pretty much immaterial.

Yes. I'm not suggesting that the standard or anything else be altered to
define what a+5 would have yielded.
 
Seebs

In 45 years of writing code I've never come across a processor that
suffered from such deficiencies. If you're having to write for one, then
you have my condolences.

How would you know that you have never encountered such a processor, when
it is quite possible that you have, except that compilers or other tools
have hidden it, or you've simply never triggered one of the affected cases?

I've seen plenty of processors where surprising results (as in, not merely
one of the two or three obvious execution outcomes) could happen from code
with undefined behavior that had solely to do with side-effects and sequence
points. At least sometimes it looked like an object larger than some natural
size had gotten copied in two hunks each of which came from a different
result, but I've also seen "WTF random bits?!?!" results.

-s
 
Denis McMahon

I took an online C test a few months ago. I actually thought the test
was better than some I've taken, but one question in particular I
think has the wrong answer. The question is this:

What is printed:

int a = 1;
printf("%d", ++a, a + 5);

a. 1
b. 2
c. 7
d. undefined

I selected d. This is the explanation given as to why b is the correct
answer.

The first expression in the parameter list following the format
string is paired with the first (and only) conversion specification.
The increment is a prefix so the result of the operation is the a + 1.
Since there are more items in the value list than there are conversion
specifications, the extra value is not shown.

I believe the correct answer is d because according to K&R2 (and by
implication the Standard) the order in which function arguments are
evaluated is not defined; in fact, K&R2's example, in Section 2.12,
shows the variable n being used twice in the same printf call (albeit
with the correct number of conversion specifications).

Am I correct that d is the correct answer?

I don't think so. Evaluating a+5 will not affect the value of a, so
regardless of whether a+5 is evaluated before ++a or after ++a, the only
evaluation that will affect the value of a is ++a.

Hence, at the point that ++a is evaluated, a has the initial value 1,
irrespective of whether a+5 has been evaluated or not, and the prefix
operator applies according to the test answer explanation.

If they were asking either of the following, then undefined would be
correct for the reasons you give:

printf("%d", ++a, a+=5);
printf("%d", a+5, ++a);

Rgds

Denis McMahon
 
Tim Streater

Seebs said:
How would you know that you have never encountered such a processor, when
it is quite possible that you have, except that compilers or other tools
have hidden it, or you've simply never triggered one of the affected cases?

I've written at least some assembler for every CPU I've used up to about
1990. Now, a lot of water under the bridge since then, but an inability
to handle concurrency properly is not something I've ever seen or heard
about, before or since.

Indeed, with the ability to put billyuns and billyuns of gates on a
square micron of silicon these days, I'd find it hard to believe that a
shortage of real-estate has anything to do with it either. If hardware
designers are saying, here's this dual-core processor but be careful how
you use it, then I'd say the tail was wagging the dog. Never let
hardware people specify processors.
 
Seebs

I've written at least some assembler for every CPU I've used up to about
1990. Now, a lot of water under the bridge since then, but an inability
to handle concurrency properly is not something I've ever seen or heard
about, before or since.

Interesting. I am pretty sure that at least SPARC had operations where
you could get really strange stuff if you accessed a register between
writes to the high and low portions of it.

Actually, that points out another obvious case:

64-bit ints on 32-bit machines, or 32-bit ints on 16-bit machines.

Both of these can easily result in a case where, even if every instruction
is atomic, you can easily get an intermediate result if two execution
threads are running in parallel.

Furthermore...

Isn't it odd that a lot of processor manuals seem to describe SOME
instructions as "atomic"? Doesn't that imply that some of the others
aren't?

Indeed, with the ability to put billyuns and billyuns of gates on a
square micron of silicon these days, I'd find it hard to believe that a
shortage of real-estate has anything to do with it either.

Huh? That makes no sense. If you dedicate a large amount of processing
to something else, it takes away from what you'd do otherwise.

-s
 
Malcolm McLean

I've written at least some assembler for every CPU I've used up to about
1990. Now, a lot of water under the bridge since then, but an inability
to handle concurrency properly is not something I've ever seen or heard
about, before or since.

One games processor I worked with shipped with a hardware bug. I've
forgotten the details, but it was something like if you read from an
address and then wrote to it in the next instruction, it caused
hardware damage.

It was easier to patch the compiler and tell assembly programmers to
be careful than to recut the die and recall machines from consumers.
 
Tim Streater

Malcolm McLean said:
One games processor I worked with shipped with a hardware bug. I've
forgotten the details, but it was something like if you read from an
address and then wrote to it in the next instruction, it caused
hardware damage.

It was easier to patch the compiler and tell assembly programmers to
be careful than to recut the die and recall machines from consumers.

This was by accident, then. Not the same as saying, we could handle
concurrency properly, we're just not going to.
 
Tim Streater

Seebs said:
Interesting. I am pretty sure that at least SPARC had operations where
you could get really strange stuff if you accessed a register between
writes to the high and low portions of it.

Actually, that points out another obvious case:

64-bit ints on 32-bit machines, or 32-bit ints on 16-bit machines.

You mean (say) 32-bit ints on a machine with only 16 bit regs (oh, sorry
Kenny, registers)? Where you add the low end 16 bits and in the next
instruction do an add-carry on the upper 16 bits? I wouldn't expect
concurrency to work on that. You'd have to assign each core to separate
processes.
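
The add / add-with-carry sequence described here can be sketched in portable C. This is an illustration only: `add32_via_16` is a hypothetical helper, and the two commented steps stand in for the two machine instructions a 16-bit target might emit.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Adds two 32-bit values using only 16-bit pieces. */
static uint32_t add32_via_16(uint32_t x, uint32_t y)
{
    uint16_t xlo = (uint16_t)x, xhi = (uint16_t)(x >> 16);
    uint16_t ylo = (uint16_t)y, yhi = (uint16_t)(y >> 16);

    uint16_t lo = (uint16_t)(xlo + ylo);          /* step 1: add low halves */
    uint16_t carry = lo < xlo;                    /* carry out of the low half */
    /* Anything that looks at the result here sees a new low half paired
       with an old high half. */
    uint16_t hi = (uint16_t)(xhi + yhi + carry);  /* step 2: add with carry */

    return ((uint32_t)hi << 16) | lo;
}

int main(void)
{
    assert(add32_via_16(0x0000FFFFu, 1u) == 0x00010000u);
    printf("0x%08lx\n", (unsigned long)add32_via_16(0x0000FFFFu, 1u));
    return 0;
}
```
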
Both of these can easily result in a case where, even if every instruction
is atomic, you can easily get an intermediate result if two execution
threads are running in parallel.

Then I don't see how you can guarantee correct operation of the machine.
If you have two parallel threads, and you're expecting them to work on
the same area of memory, it then becomes an accident of timing as to
whether the two threads happen to want to access the same memory
location.

Furthermore...

Isn't it odd that a lot of processor manuals seem to describe SOME
instructions as "atomic"? Doesn't that imply that some of the others
aren't?

I'd assume this applied to instructions that do a lot (that is, a lot
more than some people might expect). I recall the Vax having some
instructions to do with procedure calls that moved parameters around and
did some stuff on the stack, indivisibly. I think they farmed these out
to firmware in the MicroVax, presumably with interrupts disabled.
 
Seebs

You mean (say) 32-bit ints on a machine with only 16 bit regs (oh, sorry
Kenny, registers)? Where you add the low end 16 bits and in the next
instruction do an add-carry on the upper 16 bits? I wouldn't expect
concurrency to work on that. You'd have to assign each core to separate
processes.

And that kind of thing is why C is pretty careful to leave room for
surprises if you mix reads and writes inappropriately.

Then I don't see how you can guarantee correct operation of the machine.
If you have two parallel threads, and you're expecting them to work on
the same area of memory, it then becomes an accident of timing as to
whether the two threads happen to want to access the same memory
location.

Someone should invent mutexes, semaphores, and other ways of ensuring that
the behavior of such code can be predictable.

Hint: This is WHY those things exist -- because this was a problem which
needed to be solved.

I'd assume this applied to instructions that do a lot (that is, a lot
more than some people might expect). I recall the Vax having some
instructions to do with procedure calls that moved parameters around and
did some stuff on the stack, indivisibly. I think they farmed these out
to firmware in the MicroVax, presumably with interrupts disabled.

What it comes down to is that, on every architecture I'm aware of, there
are some operations which, if you execute them while something else is
touching the same memory, can yield surprising results. You may not
have encountered this, but then, you may not have been writing code which
needed to pay attention to it -- a great deal of code doesn't, but that
doesn't mean you can't write invalid code on these targets.

Basically, once you started having multiple simultaneous execution
units possible (say, multiple arithmetic operators which could run
in parallel), it became an issue, and ever since, the solution has been
"don't do that, then". C's restriction here is not abnormal. The
manuals for a lot of FPU and vector chips have warnings about when
results become available. In some cases, that stalls execution; in
others, it results in unspecified outputs.

-s
 
Ben Bacarisse

Denis McMahon said:
I don't think so. Evaluating a+5 will not affect the value of a, so
regardless of whether a+5 is evaluated before ++a or after ++a, the only
evaluation that will affect the value of a is ++a.

Hence, at the point that ++a is evaluated, a has the initial value 1,
irrespective of whether a+5 has been evaluated or not, and the prefix
operator applies according to the test answer explanation.

What about the rule that has now been quoted so often? Rather than
introduce your own argument about what can and can't influence the value
of one or other expression, to argue against 'd' you would have to come
up with a reason why the C rule that makes it undefined does not apply.

It seems clear that it does: 'a' is updated and also read for a purpose
other than determining the new value for 'a' -- all between two sequence
points.

If they were asking either of the following, then undefined would be
correct for the reasons you give:

printf("%d", ++a, a+=5);
printf("%d", a+5, ++a);

Well, again, I disagree. They are indeed both undefined but not for the
reason the OP gave. Both your explanation and that of the OP
concentrate on the order of evaluation, but that doesn't matter here.
It makes no difference what order is used -- left to right, right to
left, unspecified -- they all violate 6.5 p2.

Only the addition of a sequence point somewhere between the argument
evaluations (or the abolition of 6.5 p2, of course) would prevent
undefined behaviour. Furthermore, to get predictable behaviour on all
implementations you need both an extra sequence point and a defined
order of evaluation. It is likely that if C were to specify the order
of evaluation of function arguments then sequence points would also be
added between them, simply because having one with the other is not very
helpful. Nonetheless, the two are still separate ideas.
 
blmblm

You mean (say) 32-bit ints on a machine with only 16 bit regs (oh, sorry
Kenny, registers)? Where you add the low end 16 bits and in the next
instruction do an add-carry on the upper 16 bits? I wouldn't expect
concurrency to work on that. You'd have to assign each core to separate
processes.

Say what? I was under the impression that "multicore" meant
"multiple CPU-like entities" , and that this would necessarily
mean that each core would have its own set of registers. No?
Or maybe that's missing your point anyway?

[ snip ]
 
Seebs

Say what? I was under the impression that "multicore" meant
"multiple CPU-like entities" , and that this would necessarily
mean that each core would have its own set of registers. No?
Or maybe that's missing your point anyway?

Typically multicore means each core has its own registers, but it
gets weirder:

* In some cases, there is more than one register set for a single core.
* A single core may have multiple processing elements which share a
register set.

-s
 
Tim Streater

Say what? I was under the impression that "multicore" meant
"multiple CPU-like entities" , and that this would necessarily
mean that each core would have its own set of registers. No?
Or maybe that's missing your point anyway?

Well, whatever :)

I got bored with the discussion, I perhaps used the word "core" loosely.
In any case, the only machine with several execution units I've done
assembler for is the CDC 6600. You could start a floating mult on two
registers and another on two other registers, and these would proceed in
parallel. If one operation specified as its destination a register
used by the other operation, then the two operations would proceed
serially. No conflict, no possibility of anything "funny" going on. That
would seem a reasonable way to me for hardware to operate.
 
Shao Miller

Ike said:
[ about printf("%d", ++a, a + 5); ]
'++a' and 'a + 5' are separate expressions.
The value of 'a' is intact until the sequence point.

It looks like you are mistaken about the nature of sequence points.
A sequence point is a point in time where nothing happens,
a stability point, a point of rest. It is _not_ a point of action.
Transitions happen _between_ sequence points; at the sequence point
itself, the machine for a moment has come to a halt, all transitions
that happened before the sequence point have completed, and all
transitions that will happen after the sequence point have not yet started.

Ok. After Mr. K. Thompson's response, it was clear that this claim was
incomplete. The treatment was: Reading non-volatile objects' values
isn't a side effect, but can be required to accomplish evaluation. It
seemed natural that all writes should pend for _just_before_ the
sequence point (after evaluations are complete), thus allowing for
non-volatile reads to have ensured consistency for the duration of
evaluation. That is: Previous sequence point 1-> All reads and
computations 2-> All writes 3-> Next sequence point.

But that's not the case. Though I meant 3-> by "until", it's all
invention unless established by agreement beforehand... Which it's not
at all.

We could draw a timeline, like this:

   activity  rest     activity    rest   activity
---------------+--------------------+--------------
              seq                  seq

From what you write, it looks like you think that all the action
happens _at_ sequence points (sorry if I'm understanding you wrong).

Well yes, "at" as "just before, but after all evaluations have been
completed," and only in as much as the context of the write to 'a'. But
C is not so strict. Side effects just happen at random between sequence
points, as far as analysis with only a C Standard and not the
implementation's specifics goes, right?

In `` printf("%d", ++a, a + 5); '' the first sequence point, S0, is
before the statement starts. The next sequence point, S1, is before the
call to printf. In between those sequence points, the arguments
(the expressions ``"%d"'', ``++a'' and ``a+5'') are evaluated.
Several accesses to ``a'' will happen here: the ``a'' in ``++a''
will be read to obtain its previous value (event ar0),
the ``a'' in ``++a'' will be modified to store
the incremented value (event aw0) and ``a'' in ``a+5'' will be
read to compute the sum of ``a'' and ``5'' (event ar1).

Yes, an excellent explanation of the goings-on, thanks.

Necessarily, ar0 must happen before aw0, because the value written
at aw0 depends on the value read at ar0. But ar1 can happen at any
time, so, the following interleavings are possible:

     ar1,ar0,aw0
--+---------------+--
  S0              S1

     ar0,ar1,aw0
--+---------------+--
  S0              S1

     ar0,aw0,ar1
--+---------------+--
  S0              S1

That's why the value of the third argument ``a+5'' is not well-defined.
It depends on the order in which aw0 and ar1 occur, and any order is
possible.
Right. But why not "unspecified" rather than "undefined behaviour"?
Isn't it easy to get unspecified program results which are not
undefined, but still useful based on knowledge of the implementation, or
still portable in that a program does not crash and is guaranteed to
translate?

In standardese: the variable ``a'' is both read and modified
between two sequence points, and the value is read (at ar1) for another
purpose than to compute the value written (at aw0).

....And all within evaluation of an expression (the function call),
apparently.

When the write does not depend on the read, the implementation has
the freedom to schedule the read and the write in any order it wishes.

Sure.

Here are three chunks about "Expressions", from C99's section 6.5:

"1 An expression is a sequence of operators and operands that specifies
computation of a value, or that designates an object or a function, or
that generates side effects, or that performs a combination thereof.

"2 Between the previous and next sequence point an object shall have its
stored value modified at most once by the evaluation of an
expression.72) Furthermore, the prior value shall be read only to
determine the value to be stored.73)

"3 The grouping of operators and operands is indicated by the syntax.74)
Except as specified later (for the function-call (), &&, ||, ?:, and
comma operators), the order of evaluation of subexpressions and the
order in which side effects take place are both unspecified."

So what we must accept is:

- At least one of:
  - Each function call "argument" is an "operand" and the function call
    is an N-ary operator.
  - An optional argument-expression-list is an "operand" and the
    postfix-expression denoting the called function is the other
    "operand".
  - Zero or more commas within the argument-expression-list are part of
    the function call operator and not a syntactic means of creating
    expressions disjoint from the whole function call expression.

- Having a "previous" and "next" sequence point does not imply a linear
(even if unspecified or specified) ordering of object modifications and
reads.

- The evaluation of an expression endures the evaluation of all
sub-expressions.

- Any expression which reads or modifies objects and which appears
within a syntactically encompassing expression can be said to contribute
those attributes to the larger expression.

- The first "shall" in p2 is not a constraint on implementation
conformance, but a constraint on source code. That is, it's not a
constraint which means, "a conforming implementation shall ensure that
expression evaluation does not result in multiple modifications of an
object between sequence points," but means, "the programmer shall not
design an expression which appears to modify an object multiple times
between sequence points."

- The second "shall" in p2 is not a constraint on implementation
conformance, but a constraint on source code. That is, it's not a
constraint which means, "a conforming implementation shall ensure that
an expression and all sub-expressions do not yield any reads of an
object outside of using those values to determine a value to modify that
object with, if and only if that object is modified," but means, "the
programmer shall not design an expression which contains a
sub-expression which modifies an object, but also contains other
sub-expressions outside of that sub-expression which read the object."

- p2 does not prevent the read of an object from contributing to not
just a new value for the object, but even towards computations for
syntactically encompassing expressions. [(i = j = j + 1)] [(i = (j = j
+ 1) + 1)] (But wait a minute...)

- p3 does not call attention to the function call operator to
distinguish it from _all_ other operators, but because it has a sequence
point just as the other noted operators do, and that group also has some
detail about what comes before the sequence point and what comes after.

- Whereas type-name and assignment-operator in the syntax are not by
themselves a sub-expression, the optional argument-expression-list and
any included, syntactic commas do constitute something which is
considered part of the evaluation of the expression containing the
function call.

- The list of expressions in "Function calls", section 6.5.2.2, which
constitute the arguments with their unspecified evaluation order, are to
be regarded as sub-expressions of the whole function call expression; a
modification in one is a modification in the whole.

Ok. Isn't it interesting how "any order" for the function calls in
example 6.5.2.2p12 means "one at a time" but we are allowed simultaneous
order for the write of '++a' and the read of 'a + 5'? Or does it?
Isn't it interesting that 6.5.2.2p10 details the unspecified order for
"the function designator, the actual arguments, and subexpressions
within the actual arguments." Why the distinction on "subexpressions"
here? Why not the distinction in 6.5p2?

It's interesting that the Standard grants a license (optimization iff
congruent results to abstract semantics) to the implementation, then the
implementation returns the favour (specify anything other than a linear
order for evaluation, because I might optimize using concurrent
evaluations).

Even in such a case, rather than making 6.5p2 violations unspecified
behaviour, it's outright undefined behaviour... Meaning an
implementation can refuse to translate, or can accidentally optimize a
simultaneous read and write to the same memory, rather than making an
arbitrary choice for a linear order, and landing the original post at
answer (b). Very well!
 
