Sizeof query

James Kuyper

Dik said:
Isn't sizeof(int) implementation defined?

"The expression ++E is equivalent to (E+=1)." (6.5.3.1p2)
"The type of an assignment expression is the type of the left operand
...." (6.5.16p3).
 
Ben Bacarisse

The hard part being, of course, remembering what sort of automatic
conversions apply to new++

I seem to recall these being defined in various circumstances:

usual arithmetic conversions
integer promotions
default argument promotions
default maze of promotion
usual twisty conversions

...all different.

Now will one of those result in new++ being an int? It means new=new+1, and
the 1 is an int. Something could happen.

I don't think conversions come into it. sizeof needs to know the type
of new++. The description of ++ directs us to the section on += where
we read that the type of this expression is that of the left hand side
(modulo type qualification). I don't think conversions come into it.
Before running the code, I guessed correctly. But I have a feeling it was
because there were a pair of hidden conversions that cancelled each other
out, not because there were no hidden conversions at all.

I think it is the latter -- there are no conversions involved.
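
A minimal sketch of that reading, assuming a hypothetical variable `s` of type short standing in for `new` (whose declaration is not shown in this thread). The function name is made up for illustration. It reports the size of the expression `s++` and checks that the operand of sizeof was not evaluated:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the size of the expression s++. */
size_t size_of_postfix_increment(void)
{
    short s = 0;

    /* The operand of sizeof is not evaluated here (it is not a
       variable length array), so s is not incremented. */
    size_t n = sizeof(s++);

    assert(s == 0);
    return n;
}
```

Called from any function, `size_of_postfix_increment()` should compare equal to `sizeof(short)` if the type of `s++` is the type of `s`. Some compilers warn about a side effect inside sizeof, which is expected here.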
 
James Kuyper

James said:
something in my desire to become evil?

Isn't sizeof(int) implementation defined?

"The expression ++E is equivalent to (E+=1)." (6.5.3.1p2)

Sorry - wrong operator, wrong citation - I'm not doing well this
morning. Correction:

You're apparently assuming that integer promotions must be applied to
new++. However, as footnote 48 says, "48) The integer promotions are
applied only: as part of the usual arithmetic conversions, to certain
argument expressions, to the operands of the unary +, -, and ~
operators, and to both operands of the shift operators, as specified by
their respective subclauses." Footnotes are not normative, but footnote
48 correctly summarizes the fact, that you can verify by reading the
normative text, that those are the only locations where the integer
promotions apply. The postfix ++ operator is not one of those locations.
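
As a rough illustration of the footnote's list, the sketch below contrasts unary +, which is on the list, with postfix ++, which is not. The function names are hypothetical, and a plain char operand is assumed:

```c
#include <assert.h>
#include <stddef.h>

/* Unary + applies the integer promotions, so +c has a promoted type
   (int, or unsigned int on unusual implementations). */
size_t size_after_unary_plus(void)
{
    char c = 0;
    return sizeof(+c);
}

/* Postfix ++ is not on the footnote's list; c++ keeps the type char.
   The operand is not evaluated inside sizeof. */
size_t size_after_postfix_inc(void)
{
    char c = 0;
    return sizeof(c++);
}
```

Since int and unsigned int have the same size, the first function should return `sizeof(int)` either way, while the second should return 1.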
 
Ben Bacarisse

James Kuyper said:
Sorry - wrong operator, wrong citation - I'm not doing well this
morning. Correction:

You're apparently assuming that integer promotions must be applied to
new++. However, as footnote 48 says, "48) The integer promotions are
applied only: as part of the usual arithmetic conversions, to certain
argument expressions, to the operands of the unary +, -, and ~
operators, and to both operands of the shift operators, as specified
by their respective subclauses." Footnotes are not normative, but
footnote 48 correctly summarizes the fact, that you can verify by
reading the normative text, that those are the only locations where
the integer promotions apply. The postfix ++ operator is not one of
those locations.

I disagree. Specifically, I disagree that conversions are excluded
from the ++ operator. Not that it matters in this case, since there
is no evaluation involved, but the description of ++ (6.5.2.4 p2)
says: "See the discussions of additive operators and compound
assignment for information on constraints, types, and /conversions/
[...]" (my emphasis). The description of compound assignment is clear
that, except for the fact that the lvalue E1 is evaluated only once,
E1 += E2 behaves exactly like E1 = E1 + E2.

I think this only matters on "odd" architectures.
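
One way to see conversions taking part in the evaluation without changing the type of the expression is the sketch below. It assumes an unsigned char operand and uses a hypothetical function name. The addition is carried out in a promoted type and the result is converted back to unsigned char when it is stored:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: increments an unsigned char holding UCHAR_MAX. */
unsigned char increment_wraps(void)
{
    unsigned char u = UCHAR_MAX;

    /* Behaves like u = u + 1: u + 1 is computed in a promoted type,
       then converted back to unsigned char, which reduces the value
       modulo UCHAR_MAX + 1. */
    u++;

    /* The type of the expression itself is still unsigned char. */
    assert(sizeof(u++) == sizeof(unsigned char));
    return u;
}
```

On this reading `increment_wraps()` returns 0 whether the promoted type is int or unsigned int.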
 
William Hughes

An object is, for purposes of that, equivalent to an array of one object,
so "&obj+1" is a defined value which would be the address of the next
object if there were an array of them.



Ok, we now have (&obj+1) - (&obj) = 1

However, It would seem to me that (when obj is not a char)

(char*)(&obj+1) and (char*)(&obj)
do not point within the same object

so the subtraction is still undefined. (admittedly, only
a perverse implementation would get this wrong)

- William Hughes
 
Keith Thompson

William Hughes said:
Ok, we now have (&obj+1) - (&obj) = 1

However, It would seem to me that (when obj is not a char)

(char*)(&obj+1) and (char*)(&obj)
do not point within the same object

so the subtraction is still undefined. (admittedly, only
a perverse implementation would get this wrong)

They don't point *within* the same object, but (char*)(&obj+1) points
just past the end of the object, which is permitted.

Any object may be treated as an array of char. (I don't have my copy
of the standard handy at the moment.)
 
William Hughes

[...]
Ok, we now have (&obj+1) - (&obj) = 1
However,  It would seem to me that  (when obj is not a char)
     (char*)(&obj+1) and (char*)(&obj)
     do not point within the same object
so the subtraction is still undefined.  (admittedly, only
a perverse implementation would get this wrong)

They don't point *within* the same object, but (char*)(&obj+1) points
just past the end of the object, which is permitted.

Indeed. (note to self, when being insanely pedantic, do
not be sloppy)

Any object may be treated as an array of char.  (I don't have my copy
of the standard handy at the moment.)

Yes, one character past an object that *may be treated* as an
array of char, not one character past an object that *is*
an array of char.

So my question becomes: Is this enough
wiggle room for the DS2K to insert
nasal demons? (We can't invoke the "as if"
rule without knowing what the "correct" behaviour
is)

On the other hand can you argue the type of pointer
makes no difference in determining whether where
they point is "legal" for the purpose of pointer
subtraction?

(Boy the light emanating from here won't reach practical
for decades)

- William Hughes
 
jameskuyper

William said:
[...]
Ok, we now have (&obj+1) - (&obj) = 1
However, It would seem to me that (when obj is not a char)
(char*)(&obj+1) and (char*)(&obj)
do not point within the same object
so the subtraction is still undefined. (admittedly, only
a perverse implementation would get this wrong)

They don't point *within* the same object, but (char*)(&obj+1) points
just past the end of the object, which is permitted.

Indeed. (note to self, when being insanely pedantic, do
not be sloppy)
Any object may be treated as an array of char. (I don't have my copy
of the standard handy at the moment.)

Yes, one character past an object that *may be treated* as an
array of char, not one character past an object that *is*
an array of char.

No, that's not the case, and it's not what's relevant. What is
relevant is that the rules for pointer arithmetic have explicit
special exceptions for pointers one past the end of an array; it is
for the purpose of those rules, among others, that a single object can
be treated as a 1-element array.

Section 6.5.6p8 says: "... if the expression P points to the last
element of an array object, the expression (P)+1 points one past the
last element of the array object, ..."; since obj is the only element
in the array, &obj points at the last element of the array, and you
can safely write &obj+1 (you cannot safely write &obj+2 or &obj-1).
You can treat any object as an array of unsigned char, and (unsigned
char*)(&obj+1) points one char beyond the end of the char array
corresponding to obj.

Section 6.5.6p9 says: "When two pointers are subtracted, both shall
point to elements of the same array object, or one past the last
element of the array object; ...", so the subtraction is also
perfectly acceptable.
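
A minimal sketch of the subtraction being discussed, assuming the reading given above (the object's bytes form an array of unsigned char). The struct and function names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object type; any non-char object type would do. */
struct sample {
    int a;
    double b;
};

/* Returns the distance, in bytes, from the start of obj to one past
   its end. */
ptrdiff_t byte_span_of_object(void)
{
    struct sample obj = {0, 0.0};

    unsigned char *first = (unsigned char *)&obj;       /* first byte */
    unsigned char *past  = (unsigned char *)(&obj + 1); /* one past the last byte */

    return past - first;
}
```

Under that reading the result equals `sizeof(struct sample)`. Neither pointer is dereferenced, and `&obj + 2` or `&obj - 1` would not be safe to form.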

So my question becomes: Is this enough
wiggle room for the DS2K to insert
nasal demons? ...

No.

....
(Boy the light emanating from here won't reach practical
for decades)

As a purely practical matter, checked-pointer implementations are
pretty rare, and used mainly for debugging purposes. However, I
believe that there are non-DS9K implementations which do perform
optimizations (such as removal of anti-aliasing checks) that are
guaranteed to work as intended only if developers avoid performing
pointer arithmetic under circumstances where that arithmetic has
undefined behavior. Therefore, it is of practical importance to know
whether or not that is the case.
 
William Hughes

William said:
[...]
Ok, we now have (&obj+1) - (&obj) = 1
However,  It would seem to me that  (when obj is not a char)
     (char*)(&obj+1) and (char*)(&obj)
     do not point within the same object
so the subtraction is still undefined.  (admittedly, only
a perverse implementation would get this wrong)
They don't point *within* the same object, but (char*)(&obj+1) points
just past the end of the object, which is permitted.
Indeed.  (note to self, when being insanely pedantic, do
not be sloppy)
Yes, one character past an object that *may be treated* as an
array of char, not one character past an object that *is*
an array of char.

No, that's not the case, and it's not what's relevant. What is
relevant is that the rules for pointer arithmetic have explicit
special exceptions for pointers one past the end of an array;  It is
for the purpose of those rule, among others, that a single object can
be treated as a 1-element array.

Section 6.5.6p8 says: "... if the expression P points to the last
element of an array object, the expression (P)+1 points one past the
last element of the array object, ..."; since obj is the only element
in the array, &obj points at the last element of the array, and you
can safely write &obj+1 (you cannot safely write &obj+2 or &obj-1).
You can treat any object as an array of unsigned char, and (unsigned
char*)(&obj+1) points one char beyond the end of the char array
corresponding to obj.

Ok we now have several ways of looking at things

obj + 1, a "valid" place to point
We know we can treat obj as a character array
There is a char array corresponding to obj

<are the last two equivalent?>


My question remains: Is there enough
wiggle room for the DS2K to insert
nasal demons?

Possible analyses.

A Defined: If the pointers point into an array, or 1 past
the array things are defined. We can treat obj as
an array of char, so things are defined.

B Undefined: For things to be defined we need both
pointers to point into an array or one past the
array. The fact that we can treat obj as an
array of char does not mean we have an array
of char.

C Defined: For things to be defined we need both
pointers to point into an array or one past the
array. The type of the array does not have
to match the type of pointers. Casting to
(char*) may change what a pointer points at, it
does not change where it points.

Personally, I think B trumps A (being able to
treat obj as an array of char does not
provide enough existence of the char array to
make the subtraction defined) but C trumps B.

-William Hughes
 
Keith Thompson

William Hughes said:
Ok we now have several ways of looking at things

obj + 1, a "valid" place to point
We know we can treat obj as a character array
There is a char array corresponding to obj

<are the last two equivalent?>


My question remains: Is there enough
wiggle room for the DS2K to insert
nasal demons?

Possible analyses.

A Defined: If the pointers point into an array, or 1 past
the array things are defined. We can treat obj as
an array of char, so things are defined.

B Undefined: For things to be defined we need both
pointers to point into an array or one past the
array. The fact that we can treat obj as an
array of char does not mean we have an array
of char.

I disagree with your point B. If a pointer just past the end of the
notional char array isn't a valid pointer, then we're not able to
treat the object as an array of char. But the standard says that we
*can* treat an object as an array of char; therefore, the pointer just
past the end of the char array is valid.

On the other hand, I don't think the standard actually says that an
object can be treated as an array of char, at least not in so many
words. It's implied by C99 6.2.6.1:

Except for bit-fields, objects are composed of contiguous
sequences of one or more bytes, the number, order, and encoding of
which are either explicitly specified or implementation-defined.

...

Values stored in non-bit-field objects of any other object type
consist of n * CHAR_BIT bits, where n is the size of an object of
that type, in bytes. The value may be copied into an object of
type unsigned char [n] (e.g., by memcpy); the resulting set of
bytes is called the object representation of the value.

(I've replaced the multiplication symbol by *.)

Note that it talks about *copying* the value into an array of unsigned
char, not treating it in place as if it were an array of unsigned
char. There may be other wording that guarantees that the latter will
work as well.

The distinction between char and unsigned char may or may not be
significant. (I prefer to use unsigned char for this kind of thing
anyway.)

[snip]
 
William Hughes

[...]


Ok we now have several ways of looking at things
       obj + 1, a "valid" place to point
       We know we can treat obj as a character array
       There is a char array corresponding to obj
       <are the last two equivalent?>
My question remains: Is there enough
wiggle room for the DS2K to insert
nasal demons?
Possible analyses.
A Defined:  If the pointers point into an array, or 1 past
the array things are defined.  We can treat obj as
an array of char, so things are defined.
B Undefined:  For things to be defined we need both
pointers to point into an array or one past the
array.   The fact that we can treat obj as an
array of char does not mean we have an array
of char.

I disagree with your point B.  If a pointer just past the end of the
notional char array isn't a valid pointer, then we're not able to
treat the object as an array of char.

Ok, you have convinced me.

 But the standard says that we
*can* treat an object as an array of char; therefore, the pointer just
past the end of the char array is valid.

But just when you thought it was safe to go
back in the water.
On the other hand, I don't think the standard actually says that an
object can be treated as an array of char, at least not in so many
words.  It's implied by C99 6.2.6.1:

    Except for bit-fields, objects are composed of contiguous
    sequences of one or more bytes, the number, order, and encoding of
    which are either explicitly specified or implementation-defined.

    ...

    Values stored in non-bit-field objects of any other object type
    consist of n * CHAR_BIT bits, where n is the size of an object of
    that type, in bytes. The value may be copied into an object of
    type unsigned char [n] (e.g., by memcpy); the resulting set of
    bytes is called the object representation of the value.

(I've replaced the multiplication symbol by *.)

Note that it talks about *copying* the value into an array of unsigned
char, not treating it in place as if it were an array of unsigned
char.

And I change back. Clearly a statement that something
can exist is not the same as a statement that something
does exist.

 There may be other wording that guarantees that the latter will
work as well.
The distinction between char and unsigned char may or may not be
significant.  (I prefer to use unsigned char for this kind of thing
anyway.)

Oh well, there is always C.

- William Hughes
 
jameskuyper

William Hughes wrote:
....
Possible analyses.

A Defined: If the pointers point into an array, or 1 past
the array things are defined. We can treat obj as
an array of char, so things are defined.

B Undefined: For things to be defined we need both
pointers to point into an array or one past the
array. The fact that we can treat obj as an
array of char does not mean we have an array
of char.

The fact that it can be treated as an array of char is all we need to
know in order to make the behavior defined. That's what "can be
treated as an array of char" means.

C Defined: For things to be defined we need both
pointers to point into an array or one past the
array. The type of the array does not have
to match the type of pointers.

The behavior of pointer arithmetic is defined in terms of the elements
of the corresponding array. In order for the rules of pointer
arithmetic to make any sense whatsoever, the element type of the array
being referred to must be the same as the type the pointers point at.
If that were not the case, then in the context of a declaration

int array[10];

according to the rules in 6.5.6p8, the expression (char*)array + 5
would refer to array[5], regardless of what value sizeof(int) has.

Personally, I think B trumps A (being able to
treat obj as an array of char does not
provide enough existence of the char array to
make the subtraction defined) but C trumps B.

Correctness trumps everything, and of the three options you've given,
only A is correct.
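
The scaling argument can be sketched as follows, assuming (as option A does) that the bytes of `array` can be addressed through a char pointer. The function name is hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* Returns the byte offset of array[5] from the start of array. */
ptrdiff_t bytes_to_fifth_element(void)
{
    int array[10] = {0};

    char *base  = (char *)array;      /* first byte of array[0] */
    char *fifth = (char *)&array[5];  /* first byte of array[5] */

    /* char pointer arithmetic counts char elements, so reaching
       array[5] takes 5 * sizeof(int) steps, not 5. */
    assert(base + 5 * sizeof(int) == fifth);

    return fifth - base;
}
```

The result is `5 * sizeof(int)`, so `(char*)array + 5` refers to `array[5]` only on an implementation where `sizeof(int)` is 1.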
 
jameskuyper

Keith Thompson wrote:
....
On the other hand, I don't think the standard actually says that an
object can be treated as an array of char, at least not in so many
words. It's implied by C99 6.2.6.1:

Except for bit-fields, objects are composed of contiguous
sequences of one or more bytes, the number, order, and encoding of
which are either explicitly specified or implementation-defined.

...

Values stored in non-bit-field objects of any other object type
consist of n * CHAR_BIT bits, where n is the size of an object of
that type, in bytes. The value may be copied into an object of
type unsigned char [n] (e.g., by memcpy); the resulting set of
bytes is called the object representation of the value.

(I've replaced the multiplication symbol by *.)

Note that it talks about *copying* the value into an array of unsigned
char, not treating it in place as if it were an array of unsigned
char. There may be other wording that guarantees that the latter will
work as well.

The key point in putting together the inference is the memcpy()
reference. That memcpy() is given only as an example, and not
explicitly stated as being the only way of doing it, implies that
there's nothing magical about memcpy(). In other words, any code which
has the same defined behavior as memcpy() should do the job equally
well. It's perfectly feasible to write ordinary C code that does the
same thing as memcpy(), though not perhaps as efficiently as the built-
in version. Given the definition of memcpy(), it's pretty hard for me
to see how such code could perform such a copy unless the object being
copied could indeed be treated "in place as if it were an array of
unsigned char."
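
A sketch of the kind of "ordinary C code" meant here, with hypothetical names (`copy_bytes`, `copy_matches_memcpy`). It is not how any particular library implements memcpy(); it only shows that a byte-by-byte copy has to read the source object in place through an unsigned char pointer:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical byte-wise copy with the same interface as memcpy(). */
void *copy_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Reads the source object in place as n unsigned chars. */
    for (size_t i = 0; i < n; i++) {
        d[i] = s[i];
    }
    return dst;
}

/* Returns nonzero if copy_bytes() and memcpy() produce the same
   object representation for a double. */
int copy_matches_memcpy(void)
{
    double value = 3.25;
    unsigned char a[sizeof value];
    unsigned char b[sizeof value];

    copy_bytes(a, &value, sizeof value);
    memcpy(b, &value, sizeof value);

    return memcmp(a, b, sizeof value) == 0;
}
```

Note that the loop indexes `s[i]` only for `i < n`. It never needs to form `(unsigned char *)(&obj + 1)` explicitly, so this sketch supports the "in place" inference but does not by itself settle the one-past-the-end question.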
 
William Hughes

William Hughes wrote:

...




The fact that it can be treated as an array of char is all we need to
know in order to make the behavior defined. That's what "can be
treated as an array of char" means.


Indeed. I have agreed to this. However, it has been
noted that the standard does not use the form
"can be treated as an array of char", but only
mandates that the bytes can be copied. I do not
agree that this means that the pointer arithmetic must
be defined.

The behavior of pointer arithmetic is defined in terms of the elements
of the corresponding array. In order for the rules of pointer
arithmetic to make any sense whatsoever, the element type of the array
being referred to must be the same as the type the pointers point at.

Indeed. The *value* of

ptr1-ptr2

depends on the type of ptr1 (which is the same
as the type of ptr2). However, the *validity* of

ptr1-ptr2

is defined in terms of both pointing into
or one beyond, the same object.
If pointers ptr3 and ptr4 with different types can be said
to point to the same location then the
validity may be determined by an object whose
type is different from *ptr1

- William Hughes
 
Flash Gordon

Dik said:
Isn't sizeof(int) implementation defined?

That's what I was thinking of, but I was wrong to think it was relevant.

Oh well, I'm never going to use sizeof like that anyway.
 
William Hughes

Keith Thompson wrote:

...


On the other hand, I don't think the standard actually says that an
object can be treated as an array of char, at least not in so many
words.  It's implied by C99 6.2.6.1:
    Except for bit-fields, objects are composed of contiguous
    sequences of one or more bytes, the number, order, and encoding of
    which are either explicitly specified or implementation-defined.
    ...
    Values stored in non-bit-field objects of any other object type
    consist of n * CHAR_BIT bits, where n is the size of an object of
    that type, in bytes. The value may be copied into an object of
    type unsigned char [n] (e.g., by memcpy); the resulting set of
    bytes is called the object representation of the value.
(I've replaced the multiplication symbol by *.)
Note that it talks about *copying* the value into an array of unsigned
char, not treating it in place as if it were an array of unsigned
char.  There may be other wording that guarantees that the latter will
work as well.

The key point in putting together the inference is the memcpy()
reference. That memcpy() is given only as an example, and not
explicitly stated as being the only way of doing it, implies that
there's nothing magical about memcpy(). In other words, any code which
has the same defined behavior as memcpy() should do the job equally
well. It's perfectly feasible to write ordinary C code that does the
same thing as memcpy(), though not perhaps as efficiently as the built-
in version. Given the definition of memcpy(), it's pretty hard for me
to see how such code could perform such a copy unless the object being
copied could indeed be treated "in place as if it were an array of
unsigned char."

I don't see this at all. It is not even necessary
for memcpy to read the array as char
(The array might have to be read in larger chunks)
And even if we do conclude that this section means that
we must be able to form and dereference a char
pointer that points within the object, the section
says nothing about doing pointer arithmetic
with (char*)(&obj +1)

- William Hughes
 
