offsetof() Macro Attempt


Tim Rentsch

Harald van Dijk said:
[snip]
Consider:

1. Examples in the Rationale show that some expressions with
undefined behavior still can be valid ICE's and not run afoul
of 6.6p4;
Yes, but those examples are undefined in non-constant expressions for
reasons other than 6.5p5.

That doesn't matter. Undefined behavior is undefined behavior;
there aren't different kinds of UB (per 4p2).

4p2 doesn't say that undefined behaviour with constraint violation is
equivalent to undefined behaviour without constraint violation. Some
forms of undefined behaviour in non-constant expressions violate a
constraint in constant expressions, others don't.

The rules for UB and CV's are very clear: if there is a
constraint violation, there must be at least one diagnostic
message issued, whether there is UB or not; if there isn't a
constraint violation, UB allows an implementation complete
freedom to do whatever it wants. I don't see any basis for your
assertion. There is nothing magical about 6.5p5; it's just a
shorthand way of identifying numerous circumstances that have
undefined behavior, and nothing more than that.

I understand. There are multiple implementations that do define it as
an extension.

(Just curious - can you name some? I'm a little surprised that
they would actually document it.)

If the same wording in 6.5p5 can only logically be taken to refer to
the mathematical value, I assume it also refers to the mathematical
value in 6.6p4.

But it isn't the same wording. It's true some words are the
same, but others aren't. If what was meant in 6.6p4 were the
same condition as in 6.5p5, normally we would expect the earlier
defined term ('exceptional condition') to be used in the later
paragraph. It isn't.

Just to be clear: I'm not claiming any sort of absolute truth
here. I believe the most natural readings for 6.5p5 and 6.6p4
are talking about different things, more specifically one being
mathematical and the other being defined by C semantics. That
doesn't mean other people might not read them differently (and
I mean reasonable and rational people, not insane ones :).

Unsigned types are reduced modulo 2**N. Taking UINT_MAX+1, the
mathematical result of (UINT_MAX+1) modulo (UINT_MAX+1) is zero, which
is within the range of unsigned int.

This strikes me as a tautological statement that doesn't
address the point of my comment. Can you rephrase it
so there is a clearer relationship?

I do not believe either 6.5p5 or 6.6p4 applies or is intended to apply
to pointer arithmetic.

Leaving aside 6.5p5 for the moment, I don't see any reason
why 6.6p4 would not apply to pointer arithmetic (in cases
where the condition holds).
Do you know of an implementation that treats it
that way -- that diagnoses

a.c:
  int array[1];

b.c:
  extern int array[];
  int *pointer = array + 20;

as a constraint violation?

No, because the result of evaluating 'array + 20' is in the
range of representable values for its type. It might not be
a very useful value, but it is a representable value. (And
if it is _not_ in the range of representable values, then
6.6p4 requires a diagnostic message -- doesn't it?)
I believe this is simply undefined
behaviour because of 6.5.6p8, both in constant and in non-constant
expressions.

It surely is undefined behavior, but if 6.6p4 is not violated
then it must also be in the range of representable values for
that pointer type.

Surely 6.6p5 is meant to

[ 6.5p5? ]

Yes (my brain corrected that typo, but my fingers didn't...)
The only difference in the wording is that 6.5p5 allows for cases
where the result is not mathematically defined.

That is a difference, but it is not the only difference.
The wording "in the
range of representable values for its type" is identical between 6.5p5
and 6.6p4, so how could it mean two different things?

The surrounding text is different.
Besides, 6.6p4
makes more sense to me when reading it as referring to the
mathematical value than otherwise.

My counter-example is floating-point, where X*Y can
be mathematically out-of-range, but under evaluation
rules might be +INF, which is representable. (I am
not an expert, but I believe such things are allowed
as CE's when IEEE floating-point is in effect.)
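
(For concreteness, a sketch of the kind of case I mean, assuming
IEC 60559 / Annex F evaluation is in effect; DBL_MAX is from
<float.h>.)

  #include <float.h>

  /* Mathematically DBL_MAX * 2.0 lies outside double's finite
     range, but under IEEE round-to-nearest the product evaluates
     to +infinity, which is a representable value of type double --
     so a C-semantics reading of 6.6p4 requires no diagnostic. */
  static double d = DBL_MAX * 2.0;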

Unless 6.6p4 is intended to apply to 0/0 -- of which I am now unsure,
as already stated:

*If* the implementation defines 0/0 as 1/!__has_entered_main(), *and*
documents it as such, it may technically be permitted. It is a silly
implementation that only attempts to stretch the rules, and I will not
take such an implementation any more seriously than I would one that
diagnoses every program with only "hello".

I agree it's a silly implementation. Imagining silly
implementations is sometimes useful to explore the
boundaries of what the Standard allows.

I do not believe there is any inconsistency, either in your
interpretation or in mine.

I'm not sure I understand what you're saying.
[...]
1. Undefined behavior always means an out-of-range value
(for any type);

This is not something I have claimed, or you have claimed. If you do
believe this, or believe that I do, then [snip]

I don't, for either one. The statement (1.) above is meant to be
a natural generalization (put that in quotes if you prefer) of
examples in the DR's. That's a generic problem with many
DR's - they give only examples, and don't actually clarify
what the rules are more generally, leaving one to guess.
 

Harald van Dijk

[snip]

Please recall that I had, possibly incorrectly, taken 6.6p4 to apply
to the same conditions as 6.5p5.

[ wrapping signed integer overflow ]
(Just curious - can you name some?  I'm a little surprised that
they would actually document it.)

gcc, when using the -fwrapv option. clang supports the same option
with the same behaviour, but doesn't actually mention the option in
its documentation. TenDRA (at least the version I am occasionally
working on).
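
(A minimal sketch of the extension being described: compiled with
'gcc -fwrapv' or 'clang -fwrapv', signed integer overflow is
defined to wrap modulo 2**N.)

  #include <stdio.h>
  #include <limits.h>

  int main(void)
  {
      int x = INT_MAX;
      x = x + 1;          /* undefined in standard C; defined to
                             wrap to INT_MIN under -fwrapv */
      printf("%d\n", x);
      return 0;
  }
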
But it isn't the same wording.  It's true some words are the
same, but others aren't.

I believe the relevant words are the same.
 If what was meant in 6.6p4 were the
same condition as in 6.5p5, normally we would expect the earlier
defined term ('exceptional condition') to be used in the later
paragraph.  It isn't.

Yes, that would have made more sense.
This strikes me as a tautological statement that doesn't
address the point of my comment.  Can you rephrase it
so there is a clearer relationship?

The reduction modulo 2**N of unsigned arithmetic means unsigned
operations never violate 6.6p4, regardless of whether it is taken to
refer to the mathematical result of the operation. So it doesn't serve
as a counterexample to my interpretation of the text.
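
(For concreteness, a minimal sketch of that reduction; UINT_MAX is
from <limits.h>.)

  #include <limits.h>

  /* Unsigned arithmetic is reduced modulo 2**N (6.2.5), so
     UINT_MAX + 1u evaluates to 0 -- within the range of unsigned
     int, hence a valid constant expression under either reading
     of 6.6p4. */
  static unsigned int u = UINT_MAX + 1u;
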
Leaving aside 6.5p5 for the moment, I don't see any reason
why 6.6p4 would not apply to pointer arithmetic (in cases
where the condition holds).

Based solely on the standard, there is no reason why it would not
apply, but do you know of a real-world implementation that diagnoses
constant expression overflows for pointer types?
Do you know of an implementation that treats it
that way -- that diagnoses
a.c:
  int array[1];
b.c:
  extern int array[];
  int *pointer = array + 20;
as a constraint violation?

No, because the result of evaluating 'array + 20' is in the
range of representable values for its type.  It might not be
a very useful value, but it is a representable value.

Are you sure? I was under the impression that array + 20 was not a
value, and that the representation that would correspond to that value
if the array were larger is / may be a trap representation. Do you
believe a pointer type (on a typical modern desktop system with a
linear address space and all bits contributing to the value) has trap
representations at all? (If it doesn't, then C99 permits reading
uninitialised pointers.)

Besides, consider this, then, on a 32-bit system:

  extern int array[];
  int *pointer = array + (20LL << 40);

Does your compiler diagnose this as an error? Do you believe it
violates any constraints? If not, why not?
 (And
if it is _not_ in the range of representable values, then
6.6p4 requires a diagnostic message -- doesn't it?)

Yes, that was my point: we're both taking 6.6p4 not to apply to the
above, but for completely different reasons.
It surely is undefined behavior, but if 6.6p4 is not violated
then it must also be in the range of representable values for
that pointer type.

If 6.6p4 applies to pointer types at all. FWIW, I agree that the
literal wording does not exclude pointer types.
That is a difference, but it is not the only difference.


The surrounding text is different.

In what relevant way is the surrounding text different? What part
of the surrounding text alters the meaning of those words?
My counter-example is floating-point, where X*Y can
be mathematically out-of-range, but under evaluation
rules might be +INF, which is representable.  (I am
not an expert, but I believe such things are allowed
as CE's when IEEE floating-point is in effect.)

Fair point, but how would you apply 6.5p5 to X*Y on that system? Is it
an exceptional condition? Is the behaviour defined?
 

Tim Rentsch

Harald van Dijk <(e-mail address removed)> writes:
[some snipping of an unrelated side point was done]
[snip]

Please recall that I had, possibly incorrectly, taken 6.6p4 to apply
to the same conditions as 6.5p5.
But it isn't the same wording. It's true some words are the
same, but others aren't.

I believe the relevant words are the same.

Usually I expect, when the Standard uses different phrasings in
different places, that the phrasings are different because they
delimit requirements which may be different in the two cases
(assuming no other evidence to the contrary). Of course this
doesn't prove anything about 6.6p4, but it does suggest that
the differences may be relevant and additional investigation is
warranted.
Yes, that would have made more sense.


The reduction modulo 2**N of unsigned arithmetic means unsigned
operations never violate 6.6p4, regardless of whether it is taken to
refer to the mathematical result of the operation. So it doesn't serve
as a counterexample to my interpretation of the text.

Going back and re-reading 6.5p5, it now appears to me that the
part about "not in the range of representable values" is talking
about C semantics, not mathematical semantics. That strengthens
my conviction that 6.6p4 is talking about C semantics and not
about mathematical value.
Based solely on the standard, there is no reason why it would not
apply, but do you know of a real-world implementation that diagnoses
constant expression overflows for pointer types?

I don't, but I expect pointer arithmetic is defined so
that the result is always plausibly a pointer, ie, one
that could be valid in some execution. (It's also true
that I don't have much experience with addressing schemes
other than the usual linear ones, so there may be some such
systems out there that I don't know about.)
Do you know of an implementation that treats it
that way -- that diagnoses
a.c:
  int array[1];
b.c:
  extern int array[];
  int *pointer = array + 20;
as a constraint violation?

No, because the result of evaluating 'array + 20' is in the
range of representable values for its type. It might not be
a very useful value, but it is a representable value.

Are you sure? I was under the impression that array + 20 was not a
value, and that the representation that would correspond to that value
if the array were larger is / may be a trap representation. Do you
believe a pointer type (on a typical modern desktop system with a
linear address space and all bits contributing to the value) has trap
representations at all? (If it doesn't, then C99 permits reading
uninitialised pointers.)

The relationship between pointer values and trap representations is
rather fuzzy, because which ones are trap representations changes
over time. I think the question boils down to whether 'array + 20'
produces a well-formed address, regardless of whether the result is
a currently valid address. If 'array + 20' did not produce a
well-formed address, ie an address that could not be legal in any
possible execution, then I think the Standard requires a diagnostic
to be issued. In most cases though I think implementations simply
define pointer arithmetic so that expressions like this always
produce a well-formed address.
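
(A sketch of the distinction, assuming a typical linear address
space: forming the address is undefined per 6.5.6p8, but the value
computed is plausibly a well-formed address. Nothing is
dereferenced.)

  #include <stdio.h>

  int array[1];

  int main(void)
  {
      int *p = array + 20;         /* UB per 6.5.6p8: well past
                                      the end of the array */
      printf("%p\n", (void *) p);  /* typically array's address
                                      plus 20 * sizeof(int) */
      return 0;
  }
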
Besides, consider this, then, on a 32-bit system:

  extern int array[];
  int *pointer = array + (20LL << 40);

Does your compiler diagnose this as an error? Do you believe it
violates any constraints? If not, why not?

It doesn't, but I believe no constraints are violated because
the initializing expression produces (on my implementation) a
well-formed address.
Yes, that was my point: we're both taking 6.6p4 not to apply to the
above, but for completely different reasons.


If 6.6p4 applies to pointer types at all. FWIW, I agree that the
literal wording does not exclude pointer types.


In what relevant way is the surrounding text different? What part
of the surrounding text alters the meaning of those words?

6.5p5 is talking about a condition that occurs _during_ evaluation.
6.6p4 references a value produced _by_ evaluating, ie, after the
evaluation is finished. Perhaps that isn't what was meant, but
it is (I would say) how the existing text reads.
Fair point, but how would you apply 6.5p5 to X*Y on that system? Is it
an exceptional condition? Is the behaviour defined?

The behavior is undefined by the Standard, defined by the
implementation. I think 6.5p5 is what grants a license so that
the implementation may define it to be +INF (and so, yes, this
expression would correspond to an exceptional condition in the
sense of 6.5p5).

Here's a related example which I'm pretty sure is right:

  static int foo = (int) (0. / 0.);

Under IEEE arithmetic, I believe the initializing expression is
guaranteed to be defined and produce a legal (though unspecified)
int value. The conversion to (int) is the more interesting part:
for this to be allowed (ie, not a constraint violation), we
must get a representable value of type (int). Whether that
happens clearly depends on the evaluation rules in effect in the
implementation in question.
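
(A runtime sketch of the same conversion, assuming Annex F is in
effect: the division yields a quiet NaN, and per F.4 converting a
NaN to int raises the 'invalid' floating-point exception and
produces an unspecified value.)

  #include <stdio.h>

  int main(void)
  {
      double nan_val = 0. / 0.;  /* a quiet NaN under Annex F */
      int i = (int) nan_val;     /* 'invalid' raised; resulting
                                    value is unspecified */
      printf("%d\n", i);         /* INT_MIN is common on x86 */
      return 0;
  }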


At this point in the discussion I think I should retreat to the
position that the existing wording is just unclear in its meaning
or intention, and ought to be revised and clarified. You offer
some excellent counter-points, sir! Thank you for providing
much thought-provoking discussion.
 

Tim Rentsch

Michael Press said:
Tim Rentsch said:
5. Looking at the specifics, 'INT_MAX + 2' has a well-understood
mathematical value, and one outside the range of int, but '0/0'
does not -- it would not be unreasonable to define '0/0' as 1,
and any other integer over 0 as an exceptional condition;

I think it is unreasonable to try to define 0/0.
[snip mathematical elaboration]

To clarify my earlier statement - it would not be unreasonable
for a C implementation to define, for pragmatic reasons,
a value for evaluating 0/0. I believe this is true regardless
of any mathematical reasoning about whether 0/0 has a value.
 

Harald van Dijk

At this point in the discussion I think I should retreat to the
position that the existing wording is just unclear in its meaning
or intention, and ought to be revised and clarified.

At last, something we can agree on!
 You offer
some excellent counter-points, sir!  Thank you for providing
much thought-provoking discussion.

Thank you, and likewise, it was interesting. I will resist addressing
the points you made in those last two messages. :)
 

Tim Rentsch

Harald van Dijk said:
At last, something we can agree on!


Thank you, and likewise, it was interesting. I will resist addressing
the points you made in those last two messages. :)

You are most welcome, and nolo contendere. :)
 
