About casts (and pointers)

sunglo

Some time ago, in a discussion here in comp.lang.c, I learnt that it's
better not to use a (sometype **) where a (void **) is expected (using
a cast). Part of the discussion boiled down to the rule: if I cast a
(sometype **) to a (void **) I am making a number of assumptions about
the implementation's (void **) representation and length. Specifically,
if I do the above cast I'm assuming that a (sometype **) and a (void
**) have the same size and representation, and this might not always be
true. Ok, all clear up to this point.

But now my question is: does the above rule generalize to *every*
possible cast (especially those where pointers are involved)? Is every
explicit cast unsafe? (Here I'm not talking about conversions to/from
void *, which I know are safe if performed implicitly by the language.
I also know that casts are better avoided unless strictly needed. But
I'm curious to know how things work).

As an example, look at the widely used cast (struct sockaddr_in *) to
(struct sockaddr *). Is that safe?
Seems to me that, as in the case of (void **), that cast assumes that
the size and representation of a (struct sockaddr *) are the same as those of a
(struct sockaddr_in *). Should this be done using an intermediate (void
*)?

What is (if any) the general rule? Or, in general, what can be said and
assumed about the casted object?

Thanks and sorry for the possibly silly questions.
 
Eric Sosman

Some time ago, in a discussion here in comp.lang.c, I learnt that it's
better not to use a (sometype **) where a (void **) is expected (using
a cast). Part of the discussion boiled down to the rule: if I cast a
(sometype **) to a (void **) I am making a number of assumptions about
the implementation's (void **) representation and length. Specifically,
if I do the above cast I'm assuming that a (sometype **) and a (void
**) have the same size and representation, and this might not always be
true. Ok, all clear up to this point.

Almost clear, but still slightly murky. When you (try
to) use a `sometype **' and a `void **' interchangeably, the
assumption is that `sometype *' (one asterisk) and `void *'
have the same representation.

But now my question is: does the above rule generalize to *every*
possible cast (especially those where pointers are involved)? Is every
explicit cast unsafe? (Here I'm not talking about conversions to/from
void *, which I know are safe if performed implicitly by the language.
I also know that casts are better avoided unless strictly needed. But
I'm curious to know how things work).

Some explicit casts are safe, some are not. For example,
any data pointer can be converted to `unsigned char *' and
then used to access the individual bytes of the original object;
this is perfectly safe (although what you actually do to the
bytes might not be).

Honesty is the best policy. If you've got an object
of type `sometype', use a `sometype *' to point to it.
If you've got an object of type `sometype *', point to it
with a `sometype **'. If that's not possible (for example,
when writing a qsort() comparison function), then converting
a data pointer to `void *' (one asterisk) and back is all
right. Other pointer conversions should be viewed with
suspicion, although not necessarily with horror.

As an example, look at the widely used cast (struct sockaddr_in *) to
(struct sockaddr *). Is that safe?
Seems to me that, as in the case of (void **), that cast assumes that
the size and representation of a (struct sockaddr *) are the same as those of a
(struct sockaddr_in *). Should this be done using an intermediate (void
*)?

Ah, yes, well, you've encountered a special case. If
you've got two struct types that begin with the same sequence
of elements, then you can use a pointer to either type to get
at those initial elements. A `struct sockaddr' (presumably)
starts with a few elements that let the called function
decide whether it's been given a `struct sockaddr_in' or a
`struct sockaddr_something_else', and thereafter the called
function can convert the pointer to the proper struct type
so as to access the remaining elements.

Fans of object-oriented languages sneer at this "poor
man's polymorphism," and some of their sneers are perhaps
justified: it works, but it's fragile in the sense that the
compiler usually cannot warn you about simple errors. If you
must use an API that indulges in this sort of thing -- well,
that's what the API demands, and you haven't a lot of choice.
When designing your own functions, though, I'd suggest you avoid
this practice unless you find compelling reasons to adopt it.
 
Christian Bau

Eric Sosman said:
Ah, yes, well, you've encountered a special case. If
you've got two struct types that begin with the same sequence
of elements, then you can use a pointer to either type to get
at those initial elements. A `struct sockaddr' (presumably)
starts with a few elements that let the called function
decide whether it's been given a `struct sockaddr_in' or a
`struct sockaddr_something_else', and thereafter the called
function can convert the pointer to the proper struct type
so as to access the remaining elements.

Strictly speaking, this is only true if you define a union containing a
"struct sockaddr_in" and a "struct sockaddr_something_else", and the
compiler must have seen the declaration of that union before your code
handles "struct sockaddr_in" and "struct sockaddr_something_else"
interchangeably.

In practice, it will always work except perhaps on the DeathStation 9000
because it would be quite difficult for a compiler to make it work when
the compiler is forced to make it work but not in other cases.
 
Eric Sosman

Christian said:
Strictly speaking, this is only true if you define a union containing a
"struct sockaddr_in" and a "struct sockaddr_something_else", and the
compiler must have seen the declaration of that union before your code
handles "struct sockaddr_in" and "struct sockaddr_something_else"
interchangeably.

Yeah, I thought about the "in a union" thing when composing
my post, but decided to ignore it. As far as I can tell, the
compiler could only behave perversely if it could somehow prove
that the two structs could never possibly appear as members
of the same union, even in translation units the compiler has not
yet seen that might or might not be linked into the same final
program. I believe such a proof is beyond the capabilities of
current compilers, and is likely to remain so until I die and no
longer care about it ...

In practice, it will always work except perhaps on the DeathStation 9000
because it would be quite difficult for a compiler to make it work when
the compiler is forced to make it work but not in other cases.

Perversity is always a possibility. It's not very marketable,
though, outside the realm of popular "music."
 
Daniel Vallstrom

Some time ago, in a discussion here in comp.lang.c, I learnt that it's
better not to use a (sometype **) where a (void **) is expected (using
a cast). Part of the discussion boiled down to the rule: if I cast a
(sometype **) to a (void **) I am making a number of assumptions about
the implementation's (void **) representation and length. Specifically,
if I do the above cast I'm assuming that a (sometype **) and a (void
**) have the same size and representation, and this might not always be
true. Ok, all clear up to this point.

But now my question is: does the above rule generalize to *every*
possible cast (especially those where pointers are involved)? Is every
explicit cast unsafe? (Here I'm not talking about conversions to/from
void *, which I know are safe if performed implicitly by the
language.

The standard defines exact width integer types intN_t in a strict
way saying that there must be no padding etc. As a result, IMO/AFAIK
it's safe to convert a pointer from one intN_t type to a different
intM_t type. For example, the following is safe:

uint32_t * a; ...
uint8_t * b = a;
b[3] = b[5];
a = b+4;
a[7] = a[9];

(On the other hand, the intN_t types need not be supported of
course.)

Daniel Vallstrom
 
Keith Thompson

Daniel Vallstrom said:
The standard defines exact width integer types intN_t in a strict
way saying that there must be no padding etc.

Yes.

As a result, IMO/AFAIK
it's safe to convert a pointer from one intN_t type to a different
intM_t type.

How does that follow?

For example, the following is safe:

uint32_t * a; ...
uint8_t * b = a;

There is no implicit conversion from uint32_t* to uint8_t*, so you
need an explicit cast here (though some compilers may allow it).

uint8_t is a special case because, if it exists, it's very likely
(certain?) to be a typedef for unsigned char. But let's change the
example to:

uint32_t arr[10];
uint32_t *a = arr;
uint16_t *b = a;

You can safely convert a pointer to one type to a pointer to another
type *and back again* if the intermediate pointer is correctly
aligned; if it isn't, the conversion invokes undefined behavior. It's
likely that int32_t has stricter alignment requirements than int16_t;
if so, converting from int16_t* to int32_t* can invoke undefined
behavior. (It's possible, but unlikely in reality, that int16_t has
stricter alignment requirements than int32_t.)

Even if the conversion is allowed, the result isn't necessarily going
to be useful.
 
Daniel Vallstrom

Keith said:
Daniel Vallstrom said:
The standard defines exact width integer types intN_t in a strict
way saying that there must be no padding etc.

Yes.

As a result, IMO/AFAIK
it's safe to convert a pointer from one intN_t type to a different
intM_t type.

How does that follow?

It doesn't. (It would if pointers were addresses.)

There is no implicit conversion from uint32_t* to uint8_t*, so you
need an explicit cast here (though some compilers may allow it).

Right.

uint8_t is a special case because, if it exists, it's very likely
(certain?) to be a typedef for unsigned char.

Right. Change 8 to 16.

But let's change the
example to:

Let's not ;) I'll instead apologise for a very poor post riddled
with errors and try again with a proper example:

This, I believe, should be safe --- even though it's rather pointless:

int16_t * a = malloc( sizeof *a * 10 ); ...
int32_t * b = (int32_t*)(a+2);
int16_t * c = (int16_t*)b - 2;
assert( a == c );

You can safely convert a pointer to one type to a pointer to another
type *and back again* if the intermediate pointer is correctly
aligned; if it isn't, the conversion invokes undefined behavior. It's
likely that int32_t has stricter alignment requirements than int16_t;
if so, converting from int16_t* to int32_t* can invoke undefined
behavior. (It's possible, but unlikely in reality, that int16_t has
stricter alignment requirements than int32_t.)

Even if the conversion is allowed, the result isn't necessarily going
to be useful.

Right. Thanks for all the corrections.


Daniel Vallstrom
 
Chris Croughton

This, I believe, should be safe --- even though it's rather pointless:

int16_t * a = malloc( sizeof *a * 10 ); ...
int32_t * b = (int32_t*)(a+2);
int16_t * c = (int16_t*)b - 2;
assert( a == c );

No, because the intX_t types are guaranteed only to be of a type with at
least X bits. An int16_t could have 16 bits and an int32_t 64 bits, for
instance. Or an int32_t might have to be aligned at 8 byte boundaries
but an int16_t only need to be aligned at 1 byte boundaries. Even
setting a pointer to the type might result in a trap.

A pathological but possible implementation:

char (int8_t) is 12 bits (1 byte)
short (int16_t) is 24 bits (2 bytes), aligned on a 2-byte boundary
int (int32_t) is 36 bits (3 bytes), aligned on a 3-byte boundary.

Your code would generate b as (char*)a + 4, which is not on the 3-byte
boundary required for an int. The compiler could legitimately round the
address up (or down) when converting to an int32_t* (it could also
launch ICBMs at the White House, cause a plague of frogs or any other
undefined behaviour) and converting it back to an int16_t* could do the
same...

You can convert a pointer to any type to a pointer to any other type,
providing that its alignment is satisfactory. I can find no guarantee
that doing pointer arithmetic on it and converting it back will result
in anything defined.

Chris C
 
Richard Bos

Chris Croughton said:
No, because the intX_t types are guaranteed only to be of a type with at
least X bits. An int16_t could have 16 bits and an int32_t 64 bits, for
instance. Or an int32_t might have to be aligned at 8 byte boundaries
but an int16_t only need to be aligned at 1 byte boundaries. Even
setting a pointer to the type might result in a trap.

Nope.

# The typedef name intN_t designates a signed integer type with width N,
# no padding bits, and a two’s complement representation. Thus, int8_t
# denotes a signed integer type with a width of exactly 8 bits.

(7.18.1.1#1)

You're thinking of int_leastN_t.

Richard
 
Mark Piffer

Eric said:
Ah, yes, well, you've encountered a special case. If
you've got two struct types that begin with the same sequence
of elements, then you can use a pointer to either type to get
at those initial elements. A `struct sockaddr' (presumably)
starts with a few elements that let the called function
decide whether it's been given a `struct sockaddr_in' or a
`struct sockaddr_something_else', and thereafter the called
function can convert the pointer to the proper struct type
so as to access the remaining elements.

To add another question to this special case:
the standard guarantees in 6.2.5#26 that (struct sockaddr *) and
(struct sockaddr_in *) are of the same representation and alignment,
which would allow us to assume that (struct sockaddr **) and (struct
sockaddr_in **) are portably convertible into each other (and usable
afterwards) as they are pointing to mutually correctly aligned
locations. Is this reasoning correct?

Mark
 
Eric Sosman

Mark said:
To add another question to this special case:
the standard guarantees in 6.2.5#26 that (struct sockaddr *) and
(struct sockaddr_in *) are of the same representation and alignment,
which would allow us to assume that (struct sockaddr **) and (struct
sockaddr_in **) are portably convertible into each other (and usable
afterwards) as they are pointing to mutually correctly aligned
locations. Is this reasoning correct?

The conclusion is correct (they are interconvertible),
but I think the reasoning is faulty. It's not that the target
structs have the same alignment requirement -- they needn't --
but that the representations of the two types of struct pointer
are identical. At "the bare bits level," the conversion from
one type to the other is therefore a no-op, hence no information
is lost and when the pointer is converted back again it still
compares equal to the original and properly points to the same
target.

There are a bunch of special cases about pointers, mostly
(I think) to protect pre-Standard code -- a Standard that broke
a significant fraction of that large amount of code would have
had difficulty gaining acceptance! Off the top of my head:

- Qualified and unqualified versions of pointers to the
  same type have the same representation. Thus, `int*'
  and `const int*' and `volatile int*' ... look alike.

- `void*' and `char*' and `unsigned char*' and `signed char*'
  share the same representation.

- Pointers to all kinds of structs look alike.

- Pointers to all kinds of unions look alike.

- A pointer to a struct can be converted to and from a
  pointer to the struct's first element (if it isn't a
  bit field) safely.

- A pointer to a union can be converted to and from a
  pointer to any element of the union (except bit fields)
  safely.

- There's the above-mentioned rule about structs with
  identical initial sequences of elements, although (as
  Christian Bau mentioned) there are a few additional
  conditions on this one.

- ... and there may be a few others I can't think of at
  the moment.

Still and all, honesty remains the best policy whenever you
can get away with it. Point to T objects with T* pointers and
you're sure to be safe. Always.
 
Lawrence Kirby

Some time ago, in a discussion here in comp.lang.c, I learnt that it's
better not to use a (sometype **) where a (void **) is expected (using
a cast). Part of the discussion boiled down to the rule: if I cast a
(sometype **) to a (void **) I am making a number of assumptions about
the implementation's (void **) representation and length.

That's not the real problem (or at least only half of the problem).
Casting something ** to void ** gives the compiler the chance to change
the representation of the result to something appropriate for a void **
value. There's no guarantee that void ** can represent the result, so
there is a possibility of failure, but this can work even if something **
and void ** have different representations.

The bigger problem is what you do with the void ** value once you have it.
You probably want to dereference it to get a void * value (trying to use
void ** as some sort of "generic" pointer to pointer). This is where
things get very nasty. Your sometype ** pointer is presumably pointing at
a sometype * pointer. If you dereference the void ** value you are trying
to reinterpret the something * pointer as if it is a void * pointer. This
would be a direct reinterpretation of the object representation which
would fail if the representations are different.

Specifically,
if I do the above cast I'm assuming that a (sometype **) and a (void
**) have the same size and representation, and this might not always be
true. Ok, all clear up to this point.

They don't have to, but something * and void * probably do.

But now my question is: does the above rule generalize to *every*
possible cast (especially those where pointers are involved)? Is every
explicit cast unsafe? (Here I'm not talking about conversions to/from
void *, which I know are safe if performed implicitly by the language.
I also know that casts are better avoided unless strictly needed. But
I'm curious to know how things work).

Conversion to and from pointers to character types is as safe as void *.

As an example, look at the widely used cast (struct sockaddr_in *) to
(struct sockaddr *). Is that safe?

As others have pointed out it would be difficult for an implementation to
make conversion between pointers to structure types fail. But that's not
really the correct way of looking at this. If we assume that the types you
mention are part of an implementation provided library that adheres to a
specification that requires such conversions to work (i.e. as an extension
to C) then those implementations will have to make sure that it works
irrespective of what C itself guarantees.

So, yes, this is safe. But not because C says that it is safe.

From the Unix world you just have to look at dlsym() for something far
more evil (mixing function and object pointers), but the same
considerations apply.

Seems to me that, as in the case of (void **), that cast assumes that
the size and representation of a (struct sockaddr *) are the same as those of a
(struct sockaddr_in *). Should this be done using an intermediate (void
*)?

Again, when you convert between two types those types don't have to have
the same representation, the conversion can make any necessary
adjustments, except that this doesn't guarantee that the value is
representable in the target type.

Lawrence
 
Chris Croughton

Nope.

# The typedef name intN_t designates a signed integer type with width N,
# no padding bits, and a two’s complement representation. Thus, int8_t
# denotes a signed integer type with a width of exactly 8 bits.

(7.18.1.1#1)

You're thinking of int_leastN_t.

So I am, the intX_t ones are the ones which are not useful because they
might not exist at all. However, the alignment issues still exist. An
int32_t might only have 32 bits usable but take up 8 bytes because of
alignment requirements.

Chris C
 
Keith Thompson

Daniel Vallstrom said:
This, I believe, should be safe --- even though it's rather pointless:

int16_t * a = malloc( sizeof *a * 10 ); ...
int32_t * b = (int32_t*)(a+2);
int16_t * c = (int16_t*)b - 2;
assert( a == c );

I suspect it would be safe under any non-perverse implementation, but
I don't think the standard guarantees it. As Chris Croughton points
out, the standard guarantees that you can convert from int16_t* to
int32_t* and back to int16_t* (assuming the intermediate pointer is
properly aligned), but performing arithmetic on the intermediate
pointer before converting it back voids the warranty.

Note that if a were a pointer to a declared object rather than to a
chunk of memory allocated by malloc(), there could be alignment
problems.

I'm sure this breaks on the mythical DS9K; it would be interesting to
see how.
 
Keith Thompson

Chris Croughton said:
So I am, the intX_t ones are the ones which are not useful because they
might not exist at all. However, the alignment issues still exist. An
int32_t might only have 32 bits usable but take up 8 bytes because of
alignment requirements.

I don't think that's possible. If an integer type with a width of 32
bits requires 64 bits of storage, then it has 32 padding bits and
isn't eligible to be called int32_t.

It's still likely that int16_t and int32_t have different alignment
requirements, and even possible (but not likely) that int16_t has
stricter alignment requirements than int32_t.
 
Christian Bau

Eric Sosman said:
Yeah, I thought about the "in a union" thing when composing
my post, but decided to ignore it. As far as I can tell, the
compiler could only behave perversely if it could somehow prove
that the two structs could never possibly appear as members
of the same union, even in translation units the compiler has not
yet seen that might or might not be linked into the same final
program. I believe such a proof is beyond the capabilities of
current compilers, and is likely to remain so until I die and no
longer care about it ...

Well, the compiler is "in practice" forced to use the same layout for
all structs starting with members of the same type (for example, for all
structs starting with a short and an int, offsetof gives the same result
for the second struct member).

However, the compiler is free to assume that two pointers to different
struct types cannot access the same memory without undefined behavior.
So if you only have

#include <stdio.h>
#include <stdlib.h>

struct s1 { short x; int y; };
struct s2 { short a; int b; double c; };

void f (struct s1* p1, struct s2* p2) {
    p1->y = 0;
    p2->b = 1;

    // Here the compiler can assume that because the types
    // of *p1 and *p2 are different, p1 and p2 cannot point
    // to the same memory without undefined behavior.
    if (p1->y == 1) printf ("It worked!\n");
    if (p1->y == 0) printf ("It didn't work!\n");
}

int main (void) {
    // Deliberately pass the same block as both struct types.
    void* p = malloc (sizeof (struct s1) + sizeof (struct s2));
    if (p) f (p, p);
    free (p);
    return 0;
}

could print either "It worked!" or "It didn't work!". The compiler can
assume that the assignment to p2->b cannot change p1->y (unless there is
undefined behavior).

You need a union containing both structs to avoid the undefined behavior.
 
Old Wolf

Keith said:
I suspect it would be safe under any non-perverse implementation, but
I don't think the standard guarantees it.

I'm sure this breaks on the mythical DS9K; it would be interesting to
see how.

Let's try to figure it out.
'a' is guaranteed to be correctly aligned for int32_t, since it's the
result of a malloc.
It seems to be guaranteed that sizeof(int32_t) == 2 * sizeof(int16_t).
So (a+2) must be the same as (int32_t *)a + 1, which must also be
correctly aligned for int32_t.
Looks good to me so far...

Is it guaranteed that you can dereference any of these pointers,
without triggering a trap representation or something?
 
Christian Bau

Chris Croughton said:
So I am, the intX_t ones are the ones which are not useful because they
might not exist at all. However, the alignment issues still exist. An
int32_t might only have 32 bits usable but take up 8 bytes because of
alignment requirements.

A type with 32 bits usable but taking up 8 bytes would have at least 32
padding bits. int32_t doesn't have any padding bits, so its size cannot
be 8 bytes.
 
Christian Bau

Keith Thompson said:
I suspect it would be safe under any non-perverse implementation, but
I don't think the standard guarantees it. As Chris Croughton points
out, the standard guarantees that you can convert from int16_t* to
int32_t* and back to int16_t* (assuming the intermediate pointer is
properly aligned), but performing arithmetic on the intermediate
pointer before converting it back voids the warranty.

I think this code is safe (but needs to be handled very carefully),
because sizeof (int32_t) must be twice the sizeof (int16_t), so (int32_t
*) (a+2) and ((int32_t *) a) + 1 must be the same pointer, and therefore
(int32_t *) (a+2) must be properly aligned.
 
Christian Bau

Old Wolf said:
Let's try to figure it out.
'a' is guaranteed to be correctly aligned for int32_t, since it's the
result of a malloc.
It seems to be guaranteed that sizeof(int32_t) == 2 * sizeof(int16_t).
So (a+2) must be the same as (int32_t *)a + 1, which must also be
correctly aligned for int32_t.
Looks good to me so far...

Is it guaranteed that you can dereference any of these pointers,
without triggering a trap representation or something?

No. We know that a[2] and a[3] occupy the same space as b [0]. But if
you wrote

a [2] = 100;
a [3] = 200;

... b [0] ...

you'll have undefined behavior because you accessed the memory using an
lvalue that has type int32_t, and you have to use either the type of the
stored data (int16_t) or a char type. And int32_t cannot be a char type
if int16_t exists!
 
