Representation of Pointer-to-Struct

Shao Miller

Good day to All,

Is the translation and execution behaviour of the following program
well-defined by C99?

Does it matter which member of 'the_ptrs' was last used to modify the
value of the 'the_ptrs' object, in this particular instance, given that
pointers to structs have the same representation?

Thanks. :)

Some possibly relevant references from the draft 'n1256.pdf':
- 6.2.5 "Types", point 27 (and non-normative footnote 39)
- 6.5 "Expressions", point 7
- 6.5.2.3 "Structure and union members", points 3 (with non-normative
footnote 82) and 5
- 6.7.2.1 "Structure and union specifiers", point 14

struct foo {
    int i;
};

struct bar {
    double d;
};

union baz {
    struct foo f;
    struct bar b;
};

union eep {
    struct foo *fp;
    struct bar *bp;
};

int main(void) {
    union baz the_objs;
    union eep the_ptrs;
    struct bar *test;

    /* Effective type established here */
    the_objs.f.i = 5;

    /* Effective type established here */
    the_ptrs.fp = &the_objs.f;

    /* Undefined behaviour? */
    test = the_ptrs.bp;

    return 0;
}
 
Ersek, Laszlo

Is the translation and execution behaviour of the following program
well-defined by C99?

I believe it is implementation-dependent.

struct foo {
int i;
};

struct bar {
double d;
};

union baz {
struct foo f;
struct bar b;
};

union eep {
struct foo *fp;
struct bar *bp;
};

int main(void) {
union baz the_objs;
union eep the_ptrs;
struct bar *test;

/* Effective type established here */
the_objs.f.i = 5;

/* Effective type established here */
the_ptrs.fp = &the_objs.f;

/* Undefined behaviour? */
test = the_ptrs.bp;

return 0;
}

If the sizes of both pointer types match on the given implementation, and
any valid bit pattern for the one is a valid bit pattern for the other
(segmentation, alignment etc), then I think the code is valid.

(If you try to dereference the type punned pointer, you'll invoke
implementation-dependent behavior again.)

What's your reason not to use (void *) and pointer conversions instead of
union eep?

lacos
 
Ben Bacarisse

Ersek said:
I believe it is implementation-dependent.

I think it's OK.

If the sizes of both pointer types match on the given implementation,
and any valid bit pattern for the one is a valid bit pattern for the
other (segmentation, alignment etc), then I think the code is valid.

All pointers to structures must have the same size (6.2.5 p27). They
all have the same representation and alignment requirements in a manner
that means that they are interchangeable as members of a union.

(If you try to dereference the type punned pointer, you'll invoke
implementation-dependent behavior again.)

I think that will be undefined.

What's your reason not to use (void *) and pointer conversions instead
of union eep?

It's important to note that using a union like this does not convert the
pointers. For example:

union {
void *vp;
int *ip;
} u;

int i = 0;
u.ip = *i;
memset(u.vp, 0, sizeof u); /* undefined */
memset((void *)u.ip, 0, sizeof u); /* ok */

On a word addressed machine, the cast to void * may well generate code
to convert the pointer. Using the union re-interprets the bits in a way
that bypasses the crucial conversion. That's just an example but it
helps to explain /why/ the above is undefined.
 
Ersek, Laszlo

All pointers to structures must have the same size (6.2.5 p27). They
all have the same representation and alignment requirements in a manner
that means that they are interchangeable as members of a union.

Hm yes, thanks. (In my copy, it's actually 6.2.5 p26: "[...] All pointers
to structure types shall have the same representation and alignment
requirements as each other. [...]")

I think that will be undefined.

I guess I was wrong. 6.5 p7?

It's important to note that using a union like this does not convert the
pointers. For example:

union {
void *vp;
int *ip;
} u;

int i = 0;
u.ip = *i;

ITYM &i.

memset(u.vp, 0, sizeof u); /* undefined */
memset((void *)u.ip, 0, sizeof u); /* ok */

On a word addressed machine, the cast to void * may well generate code
to convert the pointer. Using the union re-interprets the bits in a way
that bypasses the crucial conversion. That's just an example but it
helps to explain /why/ the above is undefined.

Yes -- I was curious why the OP wants to reinterpret pointers' contents.

Thanks,
lacos
 
Ben Bacarisse

Ersek said:
All pointers to structures must have the same size (6.2.5 p27).
They all have the same representation and alignment requirements in
a manner that means that they are interchangeable as members of a union.

Hm yes, thanks. (In my copy, it's actually 6.2.5 p26: "[...] All
pointers to structure types shall have the same representation and
alignment requirements as each other. [...]")

Ah yes. It looks like a paragraph got added into n1256.pdf by one of
the TCs.

I guess I was wrong. 6.5 p7?

Probably! But also the bits of the floating point value might be a trap
representation for an int.

Yes, thanks. Sadly, they are next to each other on my keyboard.

<snip>
 
Shao Miller

... and "sizeof i", right?
Thanks, Ben and lacos. :)

Yes, I am imagining a theoretical implementation which might encode
various information into its pointer representation, beyond the mere
notion of "an address." "Fat pointers" that carry bounds info, as an
example, have been discussed before[1] and I have the impression that
in the strictest treatment of the C Standard, one requires casts to
ensure that pointer conversions yield defined results... Except
possibly where pointer representation is defined to be the same for
pointer types.

My concern is that in the code of the original post, if the pointer
has a representation something like "address,end-address", then
although that representation is the same when used as 'the_ptrs.bp',
we might subsequently get some unusual results if we try to use the
pointer.

/*
* We have a proper ptr-to-struct value.
* We have the right alignment where we point.
* We have enough _real_ space where we point, but...
* What if the bounds are those of struct foo?
* Or the pointer contains other esoteric info?
*/
test->d = 6.0;

[1] By Dennis Ritchie, at least: http://cm.bell-labs.com/cm/cs/who/dmr/vararray.html
 
Shao Miller

My concern is that in the code of the original post, if the pointer
has a representation something like "address,end-address", then

I should have typed "position,start-address,end-address".
 
Barry Schwarz

I think it's OK.

All pointers to structures must have the same size (6.2.5 p27). They
all have the same representation and alignment requirements in a manner
that means that they are interchangeable as members of a union.

I think that will be undefined.

It's important to note that using a union like this does not convert the
pointers. For example:

union {
void *vp;
int *ip;
} u;

int i = 0;
u.ip = *i;

Did you mean &i?

memset(u.vp, 0, sizeof u); /* undefined */
memset((void *)u.ip, 0, sizeof u); /* ok */

Did you mean sizeof i? As it stands, if the size of u exceeds the
size of i, memset will set some bytes to 0 beyond those which i
occupies causing undefined behavior.

Since the first argument of memset is declared as void* in string.h,
does the cast actually do anything that wouldn't happen automatically
anyway?
 
Ben Bacarisse

Barry Schwarz said:
Did you mean &i?

Did you mean sizeof i?

Yes to both, though you are a little behind the curve!

As it stands, if the size of u exceeds the
size of i, memset will set some bytes to 0 beyond those which i
occupies causing undefined behavior.

Since the first argument of memset is declared as void* in string.h,
does the cast actually do anything that wouldn't happen automatically
anyway?

No, but I wanted to make it explicit since the code was an attempt to
highlight what a cast does over simple re-interpretation. In what
followed it was easier to talk about "the cast" rather than "the
conversion caused by the prototype".
 
Shao Miller

Yes to both, though you are a little behind the curve!

No, but I wanted to make it explicit since the code was an attempt to
highlight what a cast does over simple re-interpretation. In what
followed it was easier to talk about "the cast" rather than "the
conversion caused by the prototype".

Agreed. The cast is an explicit conversion. Use of the union member
which happens to have the same representation and alignment
requirements as another union member is not even an implicit
conversion; the programmer is essentially saying "This is what you
have. Use it." And as you point out, 'void *' needn't have the same
representation as 'int *'.
 
Ersek, Laszlo

(Sorry for following up on this so late.)

Probably! But also the bits of the floating point value might be a trap
representation for an int.

I did think of that -- I looked at 6.2.6 "Representation of types" and I
think whatever I saw there was "unspecified" or
"implementation-dependent". Umm... 6.2.6.1 p1-2.

That is, accessing or producing a trap representation is surely UB
(6.2.6.1 p5), but what constitutes a trap representation and whether it is
possible at all is at worst unspecified. Or so it seems to me. This is why
I only mentioned 6.5 p7 above.

Cheers,
lacos
 
Tim Rentsch

Ben Bacarisse said:
I think it's OK.

I believe the issue is actually somewhat more subtle. But the
question isn't really too interesting, because no sensible
person would ever write code like that.
 
Shao Miller

Tim said:
I believe the issue is actually somewhat more subtle. But the
question isn't really too interesting, because no sensible
person would ever write code like that.

It's a demonstration, Tim. It's minimal.

The point is in regards to pointer representation. Since all we have is
a notion that pointers point to the lowest byte address for the
pointed-to type, this demonstration shows how certain implementation
details could lead to unexpected results (like "fat pointers" might).

Suppose you have:

#include <stdlib.h>
#include <stdio.h>

enum obj_type {
    obj_type_a,
    obj_type_b
};

struct obj {
    enum obj_type type;
};

struct a {
    struct obj obj;
    int i;
};

union any_obj_ptr {
    struct obj *as_obj;
    struct a *as_a;
};

static int access_count = 0;

static union any_obj_ptr gimme_thing(void) {
    static struct a foo = {
        .obj = { .type = obj_type_a },
        .i = 15,
    };
    access_count++;
    return (union any_obj_ptr){ .as_a = &foo };
}

int main(void) {
    switch (gimme_thing().as_obj->type) {
    case obj_type_a:
        printf("%d\n", gimme_thing().as_a->i);
        printf("accesses: %d\n", access_count);
        return EXIT_SUCCESS;
    case obj_type_b:
    default:
        return EXIT_FAILURE;
    }
}

If instead we had declared a 'union any_obj_ptr' in 'main' and assigned
to it the result of 'gimme_thing', then worked with that, we could
bypass the 'access_count' incrementation.

Same thing if we'd cast and assigned and worked with a 'struct a *'.

The alternative,

printf("%d\n", ((struct a *)gimme_thing().as_obj)->i);

is that really as attractive? Given the possibility of "fat pointers"
and the general fact that pointers are opaque beyond what's specified in
a C Standard, do we have a choice? Does the specification that all
pointers to 'struct's have the same representation allow us to avoid the
cast?
 
Tim Rentsch

Shao Miller said:
Tim said:
I believe the issue is actually somewhat more subtle. But the
question isn't really too interesting, because no sensible
person would ever write code like that.
It's a demonstration, Tim. It's minimal. [snip elaboration]

Three comments:

1. I wasn't addressing you. It's not appropriate to respond as
though I was.

2. When I said the question isn't interesting, I meant that in
particular it isn't interesting to me. Do you not understand
what it means when someone says something isn't interesting?
Maybe you should look up 'interesting' in a dictionary.

3. In polite society, sir, you don't call someone by their first
name unless you've been invited to do so. I certainly haven't
done that.
 
Shao Miller

Tim said:
Shao Miller said:
Tim said:
On Fri, 6 Aug 2010, Shao Miller wrote:

Is the translation and execution behaviour of the following program
well-defined by C99?

I believe it is implementation-dependent.
I think it's OK.
I believe the issue is actually somewhat more subtle. But the
question isn't really too interesting, because no sensible
person would ever write code like that.
It's a demonstration, Tim. It's minimal. [snip elaboration]

Three comments:

1. I wasn't addressing you. It's not appropriate to respond as
though I was.

Well please forgive my misunderstanding. I will try not to make the
same mistake again. I was responding as though it was a public forum
where anyone is free to respond to anyone else's posts. I started the
thread, I've appreciated your expertise in previous threads, so I wished
to elaborate. You've chosen to discard the elaboration and chosen not
to address it. So be it! :)

2. When I said the question isn't interesting, I meant that in
particular it isn't interesting to me. Do you not understand
what it means when someone says something isn't interesting?
Maybe you should look up 'interesting' in a dictionary.

Well gee, "...because no sensible person would ever write code like
that" right underneath my code; I thought that made another good reason
for elaboration. I attempted to describe why it mightn't meet your
expectations for what a "sensible person would ever write".

3. In polite society, sir, you don't call someone by their first
name unless you've been invited to do so. I certainly haven't
done that.

In polite society, sir, one apologizes when one offends another. I
apologize, Mr. T. Rentsch.

It can be a challenge to please all of the people all of the time. Your
feedback warrants further efforts.
 
Keith Thompson

Tim Rentsch said:
3. In polite society, sir, you don't call someone by their first
name unless you've been invited to do so. I certainly haven't
done that.

I suppose it depends on the society. In my experience, in most
circumstances, one doesn't require permission to use someone's
first name, and people do it all the time in this newsgroup.
YMMV.
 
