Type punning question

Travis Vitek · Jun 17, 2009

I'm maintaining some code that uses the SunOS priocntl system call
[*], and this has led me to some questions about type punning. To
avoid a long-winded explaination, I'm just going to post some source
and then ask my questions.

#define PC_CLNMSZ 16
#define PC_CLINFOSZ (32 / sizeof (int))

// The pc_clinfo member is used to return data describing the
// attributes of a specific class. The format of this data is
// class-specific and is described under the appropriate
// heading.

typedef struct pcinfo
{
int pc_cid; /* class id */
char pc_clname[PC_CLNMSZ]; /* class name */
int pc_clinfo[PC_CLINFOSZ]; /* class information */
} pcinfo_t;

// realtime class attributes in the pc_clinfo buffer are in
// this format.

typedef struct rtinfo
{
short rt_maxpri;
} rtinfo_t;

int main ()
{
pcinfo_t pc;

// assume a call to priocntl() will initialize pc with
// something like this...

// pc.pc_clname [0] = 'R';
// pc.pc_clname [1] = 'T';
// pc.pc_clname [2] = 0;
//if (-1L == priocntl (idtype_t(), id_t(), PC_GETCLINFO, &pc))
// return -1;

//
rtinfo_t* prt = (rtinfo_t*)pc.pc_clinfo;

return 0 == prt->rt_maxpri;
}

The gcc-4.1.1 compiler generates the following warning when casting
pc.pc_clinfo (an array of int) to rtinfo_t*.

t.cpp: In function 'int main()':
t.cpp:38: warning: dereferencing type-punned pointer will break
strict-aliasing rules

It is clear that we are type-punning the pointer, but I do not see how
this particular case is dangerous. While some of these questions may
be more appropriate on the gcc list, I'm hoping to better understand
how this is supposed to work, not how it happens to work on one
implementation. My questions are as follows...

1. is this actually a bogus warning?
2. is this a dangerous thing to do?
2. why does changing pc_clinfo to array of char eliminate the
warning? i.e., does that make the cast 'safe'?
3. what techniques are there for avoiding problems like this?

If my understanding of the standard is correct, the above does indeed
invoke undefined behavior, but I'm not certain how exactly to best
work around the issue and to avoid dangerous punning in the future.

Travis

[*] http://docs.sun.com/app/docs/doc/816-5167/priocntl-2

Alf P. Steinbach · Jun 17, 2009

* Travis Vitek:

I'm maintaining some code that uses the SunOS priocntl system call
[*], and this has led me to some questions about type punning. To
avoid a long-winded explaination, I'm just going to post some source
and then ask my questions.

Why are you posting in a C++ group?

This is C code.

#define PC_CLNMSZ 16
#define PC_CLINFOSZ (32 / sizeof (int))

// The pc_clinfo member is used to return data describing the
// attributes of a specific class. The format of this data is
// class-specific and is described under the appropriate
// heading.

typedef struct pcinfo
{
int pc_cid; /* class id */
char pc_clname[PC_CLNMSZ]; /* class name */
int pc_clinfo[PC_CLINFOSZ]; /* class information */
} pcinfo_t;

// realtime class attributes in the pc_clinfo buffer are in
// this format.

typedef struct rtinfo
{
short rt_maxpri;
} rtinfo_t;

int main ()
{
pcinfo_t pc;

// assume a call to priocntl() will initialize pc with
// something like this...

// pc.pc_clname [0] = 'R';
// pc.pc_clname [1] = 'T';
// pc.pc_clname [2] = 0;
//if (-1L == priocntl (idtype_t(), id_t(), PC_GETCLINFO, &pc))
// return -1;

//
rtinfo_t* prt = (rtinfo_t*)pc.pc_clinfo;

return 0 == prt->rt_maxpri;
}

The gcc-4.1.1 compiler generates the following warning when casting
pc.pc_clinfo (an array of int) to rtinfo_t*.

t.cpp: In function 'int main()':
t.cpp:38: warning: dereferencing type-punned pointer will break
strict-aliasing rules

It is clear that we are type-punning the pointer, but I do not see how
this particular case is dangerous.

On some machines breaking aliasing rules can cause a hardware exception, on some
machines and OSes that exception is caught by the OS and fixed but then
amounting to inefficient access.

While some of these questions may
be more appropriate on the gcc list, I'm hoping to better understand
how this is supposed to work, not how it happens to work on one
implementation. My questions are as follows...

1. is this actually a bogus warning?
No.

2. is this a dangerous thing to do?
Yes.

2. why does changing pc_clinfo to array of char eliminate the
warning? i.e., does that make the cast 'safe'?

Not sure why it eliminates the warning, although the fact that you can convert
any POD value to a sequence of chars and back is probably involved in making the
compiler not see the problem. It doesn't make the cast safe.

3. what techniques are there for avoiding problems like this?

If my understanding of the standard is correct, the above does indeed
invoke undefined behavior, but I'm not certain how exactly to best
work around the issue and to avoid dangerous punning in the future.

Travis

[*] http://docs.sun.com/app/docs/doc/816-5167/priocntl-2

C++ code would go like this (off the cuff) -- look ma, no casts!

struct RtInfo
{
short maxPriority;
char filler[32 - sizeof(short)];
};

struct PcInfo
{
int id;
char className[16];
RtInfo rtInfo;
};

int main ()
{
PcInfo pc;

// assume a call to priocntl() will initialize pc with
// something like this...

// pc.classname [0] = 'R';
// pc.classname [1] = 'T';
// pc.classname [2] = 0;
//if (-1L == priocntl (idtype_t(), id_t(), PC_GETCLINFO, &pc))
// return -1;

return (0 == pc.rtInfo.maxPriority);
}

Cheers & hth.,

- Alf

PS: What *is* this thing/urge that programmers have about introducing cryptic
shortenings, like "hlt" instead of "halt", or, as the designer later described
as the only thing he'd correct in Unix, "creat" instead of "create"?

Alf P. Steinbach · Jun 17, 2009

* Alf P. Steinbach:

* Travis Vitek:

I'm maintaining some code that uses the SunOS priocntl system call
[*], and this has led me to some questions about type punning. To
avoid a long-winded explaination, I'm just going to post some source
and then ask my questions.

Click to expand...

Why are you posting in a C++ group?

This is C code.

#define PC_CLNMSZ 16
#define PC_CLINFOSZ (32 / sizeof (int))

// The pc_clinfo member is used to return data describing the
// attributes of a specific class. The format of this data is
// class-specific and is described under the appropriate
// heading.

typedef struct pcinfo
{
int pc_cid; /* class id */
char pc_clname[PC_CLNMSZ]; /* class name */
int pc_clinfo[PC_CLINFOSZ]; /* class information */
} pcinfo_t;

// realtime class attributes in the pc_clinfo buffer are in
// this format.

typedef struct rtinfo
{
short rt_maxpri;
} rtinfo_t;

int main ()
{
pcinfo_t pc;

// assume a call to priocntl() will initialize pc with
// something like this...

// pc.pc_clname [0] = 'R';
// pc.pc_clname [1] = 'T';
// pc.pc_clname [2] = 0;
//if (-1L == priocntl (idtype_t(), id_t(), PC_GETCLINFO, &pc))
// return -1;

//
rtinfo_t* prt = (rtinfo_t*)pc.pc_clinfo;

return 0 == prt->rt_maxpri;
}

The gcc-4.1.1 compiler generates the following warning when casting
pc.pc_clinfo (an array of int) to rtinfo_t*.

t.cpp: In function 'int main()':
t.cpp:38: warning: dereferencing type-punned pointer will break
strict-aliasing rules

It is clear that we are type-punning the pointer, but I do not see how
this particular case is dangerous.

Click to expand...

On some machines breaking aliasing rules can cause a hardware exception,
on some machines and OSes that exception is caught by the OS and fixed
but then amounting to inefficient access.

While some of these questions may
be more appropriate on the gcc list, I'm hoping to better understand
how this is supposed to work, not how it happens to work on one
implementation. My questions are as follows...

1. is this actually a bogus warning?

Click to expand...

No.

Wait a minute, sorry.

In this concrete case (but not in general) I believe it's bogus, because 'short'
cannot have stricter alising than 'int' -- I didn't see / think or something.

But it's so easy to just do things Right... ;-)

2. is this a dangerous thing to do?
Yes.

2. why does changing pc_clinfo to array of char eliminate the
warning? i.e., does that make the cast 'safe'?

Click to expand...

Not sure why it eliminates the warning, although the fact that you can
convert any POD value to a sequence of chars and back is probably
involved in making the compiler not see the problem. It doesn't make the
cast safe.

3. what techniques are there for avoiding problems like this?

If my understanding of the standard is correct, the above does indeed
invoke undefined behavior, but I'm not certain how exactly to best
work around the issue and to avoid dangerous punning in the future.

Travis

[*] http://docs.sun.com/app/docs/doc/816-5167/priocntl-2

Click to expand...

C++ code would go like this (off the cuff) -- look ma, no casts!

struct RtInfo
{
short maxPriority;
char filler[32 - sizeof(short)];
};

struct PcInfo
{
int id;
char className[16];
RtInfo rtInfo;
};

int main ()
{
PcInfo pc;

// assume a call to priocntl() will initialize pc with
// something like this...

// pc.classname [0] = 'R';
// pc.classname [1] = 'T';
// pc.classname [2] = 0;
//if (-1L == priocntl (idtype_t(), id_t(), PC_GETCLINFO, &pc))
// return -1;

return (0 == pc.rtInfo.maxPriority);
}

Cheers & hth.,

- Alf

PS: What *is* this thing/urge that programmers have about introducing
cryptic shortenings, like "hlt" instead of "halt", or, as the designer
later described as the only thing he'd correct in Unix, "creat" instead
of "create"?

Travis Vitek · Jun 17, 2009

* Alf P. Steinbach:

* Travis Vitek:

I'm maintaining some code that uses the SunOS priocntl system call
[*], and this has led me to some questions about type punning. To
avoid a long-winded explaination, I'm just going to post some source
and then ask my questions.

Click to expand...

Why are you posting in a C++ group?

This is C code.

I just happen to have reduced the testcase down to C code. Obviously
it is also perfectly legal C++ code, and the project I'm working is
entirely C++ code. If there are any caveats that are specific to the C+
+ language, I want to be sure to understand them and take them into
consideration.

Not sure why it eliminates the warning, although the fact that you can convert
any POD value to a sequence of chars and back is probably involved in making the
compiler not see the problem. It doesn't make the cast safe.

Right. Converting to a sequence of bytes is fine. It is the conversion
back where things become problematic.

C++ code would go like this (off the cuff) -- look ma, no casts!

struct RtInfo
{
short maxPriority;
char filler[32 - sizeof(short)];
};

struct PcInfo
{
int id;
char className[16];
RtInfo rtInfo;
};

Just so I'm sure that we're on the same page. It looks like you're
suggesting that I create a structure that is the same size as pcinfo_t
and just use that. That seems problematic, especially since the layout
of MY structure isn't guaranteed to be the same (RtInfo is only
required to meet the alignment requirement for short, but the array
that it replaces must be int aligned). I'd think that passing anything
that isn't layout compatible with procinfo_t to priocntl (in this use
case) would result in undefined behavior.

Regardless, the solution that I started to implement was

struct my_pcinfo_t
{
id_t pc_cid;
char pc_clname [PC_CLNMSZ];

union {
int pc_clinfo [PC_CLINFOSZ];
rtinfo_t rt_info;
tsinfo_t ts_info;
} u;
};

Unfortunately, I believe that this suffers from the same potential
problem.

Wait a minute, sorry.

In this concrete case (but not in general) I believe it's bogus, because 'short'
cannot have stricter alising than 'int' -- I didn't see / think or something.

But it's so easy to just do things Right... ;-)

So you're saying that you believe the straightforward cast is safe?

Travis

ld · Jun 17, 2009

I'm maintaining some code that uses the SunOS priocntl system call
[*], and this has led me to some questions about type punning. To
avoid a long-winded explaination, I'm just going to post some source
and then ask my questions.

#define PC_CLNMSZ 16
#define PC_CLINFOSZ (32 / sizeof (int))

// The pc_clinfo member is used to return data describing the
// attributes of a specific class. The format of this data is
// class-specific and is described under the appropriate
// heading.

typedef struct pcinfo
{
int pc_cid; /* class id */
char pc_clname[PC_CLNMSZ]; /* class name */
int pc_clinfo[PC_CLINFOSZ]; /* class information */
} pcinfo_t;

// realtime class attributes in the pc_clinfo buffer are in
// this format.

typedef struct rtinfo
{
short rt_maxpri;
} rtinfo_t;

int main ()
{
pcinfo_t pc;

// assume a call to priocntl() will initialize pc with
// something like this...

// pc.pc_clname [0] = 'R';
// pc.pc_clname [1] = 'T';
// pc.pc_clname [2] = 0;
//if (-1L == priocntl (idtype_t(), id_t(), PC_GETCLINFO, &pc))
// return -1;

//
rtinfo_t* prt = (rtinfo_t*)pc.pc_clinfo;

return 0 == prt->rt_maxpri;
}

The gcc-4.1.1 compiler generates the following warning when casting
pc.pc_clinfo (an array of int) to rtinfo_t*.

t.cpp: In function 'int main()':
t.cpp:38: warning: dereferencing type-punned pointer will break
strict-aliasing rules

It is clear that we are type-punning the pointer, but I do not see how
this particular case is dangerous. While some of these questions may
be more appropriate on the gcc list, I'm hoping to better understand
how this is supposed to work, not how it happens to work on one
implementation. My questions are as follows...

1. is this actually a bogus warning?

no, it's invalid to convert a pointer to ints to a pointer to a
struct.

2. is this a dangerous thing to do?

yes, you may have a change in the pointer value by the game of (re)-
alignment constraint.

2. why does changing pc_clinfo to array of char eliminate the
warning? i.e., does that make the cast 'safe'?

yes, as far as you know what you do afterward. In effect, you should
cast it to void*, and for historical reason, char* is similar to void*
for conversion (but differs for other points) because it was the only
solution for void*-like usage before the latter was introduced.

3. what techniques are there for avoiding problems like this?

use correct cast or rewrite the code to avoid it.

If my understanding of the standard is correct, the above does indeed
invoke undefined behavior, but I'm not certain how exactly to best
work around the issue and to avoid dangerous punning in the future.

a+, ld.

Alf P. Steinbach · Jun 17, 2009

* Travis Vitek:

* Alf P. Steinbach:

* Travis Vitek:

I'm maintaining some code that uses the SunOS priocntl system call
[*], and this has led me to some questions about type punning. To
avoid a long-winded explaination, I'm just going to post some source
and then ask my questions.

Click to expand...

Why are you posting in a C++ group?

This is C code.

Click to expand...

I just happen to have reduced the testcase down to C code. Obviously
it is also perfectly legal C++ code, and the project I'm working is
entirely C++ code. If there are any caveats that are specific to the C+
+ language, I want to be sure to understand them and take them into
consideration.

Not sure why it eliminates the warning, although the fact that you can convert
any POD value to a sequence of chars and back is probably involved in making the
compiler not see the problem. It doesn't make the cast safe.

Click to expand...

Right. Converting to a sequence of bytes is fine. It is the conversion
back where things become problematic.

C++ code would go like this (off the cuff) -- look ma, no casts!

struct RtInfo
{
short maxPriority;
char filler[32 - sizeof(short)];
};

struct PcInfo
{
int id;
char className[16];
RtInfo rtInfo;
};

Click to expand...

Just so I'm sure that we're on the same page. It looks like you're
suggesting that I create a structure that is the same size as pcinfo_t
and just use that. That seems problematic, especially since the layout
of MY structure isn't guaranteed to be the same (RtInfo is only
required to meet the alignment requirement for short, but the array
that it replaces must be int aligned). I'd think that passing anything
that isn't layout compatible with procinfo_t to priocntl (in this use
case) would result in undefined behavior.

The members preceding rtInfo are multiples of sizeof(int) on all machines where
sizeof(int) = 2^n bytes with n<=4, i.e., in practice for any int type that is
128 bits or less.

So it would take a really perverse compiler to insert e.g. a 2 byte padding
before the rtInfo member: it's already int-size aligned.

Theoretically a compiler is free to insert silly-padding, but hey. ;-)

Regardless, the solution that I started to implement was

struct my_pcinfo_t
{
id_t pc_cid;
char pc_clname [PC_CLNMSZ];

union {
int pc_clinfo [PC_CLINFOSZ];
rtinfo_t rt_info;
tsinfo_t ts_info;
} u;
};

Unfortunately, I believe that this suffers from the same potential
problem.

Wait a minute, sorry.

In this concrete case (but not in general) I believe it's bogus, because 'short'
cannot have stricter alising than 'int' -- I didn't see / think or something.

But it's so easy to just do things Right... ;-)

Click to expand...

So you're saying that you believe the straightforward cast is safe?

Yes, technically, in this concrete case.

But then, the safety of the code I presented also depends on the concrete case.
It is, after all, all about layout compatibility no matter how one goes about
it. It's just so much easier to do, and easier to get right, by avoiding all
that casting or union stuff.

And if you feel that it can't really be that easy, what about TANSTAAFL?

,
then add some compile time assertion about the offsets to be really safe, e.g. like

#include <stddef.h> // offsetof

#define STATIC_ASSERT( x ) typedef char ShouldBeTrue[x]

// This stuff would presumably come from some system header, yes?
namespace detail {
#define PC_CLNMSZ 16
#define PC_CLINFOSZ (32 / sizeof (int))

// The pc_clinfo member is used to return data describing the
// attributes of a specific class. The format of this data is
// class-specific and is described under the appropriate
// heading.

typedef struct pcinfo
{
int pc_cid; /* class id */
char pc_clname[PC_CLNMSZ]; /* class name */
int pc_clinfo[PC_CLINFOSZ]; /* class information */
} pcinfo_t;
}

struct RtInfo
{
short maxPriority;
char filler[32 - sizeof(short)];
};

struct PcInfo
{
int id;
char className[16];
RtInfo rtInfo;
};
STATIC_ASSERT(
offsetof( PcInfo, rtInfo ) ==
offsetof( detail:

cinfo, pc_clinfo )
);

Note that for a POD type you're guaranteed that the offset of the first member
is 0 -- but then, you already relied on that so you're probably aware.

Cheers & hth.,

- Alf

Joshua Maurice · Jun 18, 2009

I'm maintaining some code that uses the SunOS priocntl system call
[*], and this has led me to some questions about type punning. To
avoid a long-winded explaination, I'm just going to post some source
and then ask my questions. [snip code]
It is clear that we are type-punning the pointer, but I do not see how
this particular case is dangerous. While some of these questions may
be more appropriate on the gcc list, I'm hoping to better understand
how this is supposed to work, not how it happens to work on one
implementation. My questions are as follows...

1. is this actually a bogus warning?
2. is this a dangerous thing to do?
2. why does changing pc_clinfo to array of char eliminate the
warning? i.e., does that make the cast 'safe'?
3. what techniques are there for avoiding problems like this?

If my understanding of the standard is correct, the above does indeed
invoke undefined behavior, but I'm not certain how exactly to best
work around the issue and to avoid dangerous punning in the future.

First, the behavior is undefined by the C++ standard. However, the C
and C++ standards are not the end-all-be-all guarantees of programming
in C or C++. For example, the posix interfaces, when used properly
have undefined behavior according to the C and C++ standards. However,
it does have defined behavior according to the posix standard.

If this is a system call, and this is the intended way to use said
system call, then there's nothing wrong with that. Make double sure
though that this is the correct way to use it.

According to the C and C++ standards, accessing an object through an
lvalue of the "wrong type" is undefined behavior.

Overall, this is a very black art the more I learn about it, where
black art means not well documented, easy to make mistakes, and the
specification itself, the C and C++ languages, are buggy in this
regard. (There was a recent post demonstrating that if you have two
elements of different types in a union, and let references to them
escape the translation unit, most compilers will assume they do not
alias because their types are different, contrary to the literal
wording of the C and C++ standards. The problem is that the union
rules are just inconsistent with the separate compilation rules; it is
a bug in the C and C++ standards.)

For example, the following is not entirely correct:

no, it's invalid to convert a pointer to ints to a pointer to a
struct.

You can actually cast a "pointer to int" to a "pointer to struct" and
have well defined behavior according to the C++ standard, as long as:
1- the struct is POD,
2- the first member of the struct is an int
3- and you only access that int, not other parts of the struct

Apparently this was a common idiom in C to have tagged unions, aka to
support some form of inheritance, where each struct started with the
tag.

There's the char and unsigned char exception mentioned earlier. The
intent of the standard is relatively clear that you can access POD
types through char and unsigned char, but it never actually explicitly
says this. (Note that encoding of types is still platform dependent,
so you can write to a float through a char*, but a subsequent read of
the float through a float lvalue might crash the system because the
float had a bogus value).

Then there's alignment issues, which others have mentioned.

Then there's the exception that you can cast between pointers to POD
structs with reinterpret_cast (or C-style casts, or static_cast'ing
through void*, etc.) and access the common leading part, if any.
Related to supporting inheritance in C via tagged unions, as mentioned
above.

My favorite exception is the one which specifically says that the
following is guaranteed not to assert.
T x;
T & y = x;
T * z = reinterpret_cast<T*>(y);
assert(z == & x);

Overall, lots of little exceptions and it's very easy to shoot
yourself in the foot as the compiler will not help you at all doing
any of this. Bypassing type safety leaves you on your own.

Finally, I think you were trying to get at asking "Why is it done this
way?". IIRC, 2 reasons.
1- Strict aliasing as an optimization. If different types are not
allowed to alias, then the compiler can reorder loads and stores to
different types, possibly concurrently for modern processors.
2- Pointers can have different sizes. Yes sir, sizeof(void*) can be
unequal to sizeof(int*). The only good argument I heard for this was
some rare systems where the only addressable unit of memory is like 64
bits, but they didn't want a char to have 64 bits. They wanted
something more like 8, so the pointer in the C world is not a memory
location according to the hardware. Instead some C pointers on these
rare systems are a hardware memory location and an offset into that
"byte" (where byte is defined as the smallest addressable unit as done
by the C and C++ standards, not 8 bits as commonly understood).

Bo Persson · Jun 18, 2009

Travis said:
* Alf P. Steinbach:

Right. Converting to a sequence of bytes is fine. It is the
conversion
back where things become problematic.

The compiler actually do see the problem, and potentially produces
worse code when char is involved.

The thing here isn't about alignment, but about aliasing. As a char is
also the C++ equivalent of a byte, a char* can potentially point to
any byte in any object. This limits the amout of optimizations the
compiler can perform, like keeping values in registers, when it cannot
tell exactly what the pointer points to.

On the other hand, the compiler is allowed to assume that objects of
other types do not overlap, unless they are part of a common
structure. This means, for example, that a pcinfo struct and an array
of int can never be at the same address.

Now you try to tell the compiler that sometimes they can!

Bo Persson

James Kanze · Jun 18, 2009

I'm maintaining some code that uses the SunOS priocntl system
call [*], and this has led me to some questions about type
punning. To avoid a long-winded explaination, I'm just going
to post some source and then ask my questions.

#define PC_CLNMSZ 16
#define PC_CLINFOSZ (32 / sizeof (int))

// The pc_clinfo member is used to return data describing the
// attributes of a specific class. The format of this data is
// class-specific and is described under the appropriate
// heading.

typedef struct pcinfo
{
int pc_cid; /* class id */
char pc_clname[PC_CLNMSZ]; /* class name */
int pc_clinfo[PC_CLINFOSZ]; /* class information */
} pcinfo_t;

// realtime class attributes in the pc_clinfo buffer are in
// this format.

This is a typical, but somewhat problematic, use for Posix
functions to return varying types of information. A cleaner
solution, here, would be for the system to specify a union for
pc_clinfo. An array of char or unsigned char would also be
acceptable, provided they ensured that the preceding
declarations were such that no padding would ever be inserted
according to the rules of the system API.

typedef struct rtinfo
{
short rt_maxpri;
} rtinfo_t;

int main ()
{
pcinfo_t pc;

// assume a call to priocntl() will initialize pc with
// something like this...

// pc.pc_clname [0] = 'R';
// pc.pc_clname [1] = 'T';
// pc.pc_clname [2] = 0;
//if (-1L == priocntl (idtype_t(), id_t(), PC_GETCLINFO, &pc))
// return -1;

//
rtinfo_t* prt = (rtinfo_t*)pc.pc_clinfo;

return 0 == prt->rt_maxpri;
}

The gcc-4.1.1 compiler generates the following warning when
casting pc.pc_clinfo (an array of int) to rtinfo_t*.

t.cpp: In function 'int main()':
t.cpp:38: warning: dereferencing type-punned pointer will break
strict-aliasing rules

It is clear that we are type-punning the pointer, but I do not
see how this particular case is dangerous.

Three considerations concerning "type-punning" in general:

-- As far as the standard goes, any access through ptr is
undefined behavior, since all sorts of considerations
(alignment restrictions, possible trapping representations,
etc.) apply, and it is impossible for the standard to really
delineate all of the possibilities.

-- There is an obvious intent that implementations should make
this do something reasonable for the targetted platform.
Otherwise, there would be no point in having
reinterpret_cast at all.

-- Compiler implementers need a maximum of restrictions on
possible aliasing; the fact that pointers to different types
cannot alias the same memory is important to them.

From a QoI point of view, the fact that the cast is visible (and
that there really is a possibility of intentional aliasing)
means that the second consideration should predominate, and
there should be no problem.

That's in general. In this particular case, the issue is even
clearer. You never access pc.pc_clinfo as an int[], so there's
no aliasing involved. I don't think your code contains any
undefined behavior at all; in fact, I think that accessing
pc.pc_clinfo without the case would result in undefined
behavior. The system clearly "constructed" an rtinfo_t object
in the memory defined (reserved) by pc.pc_clinfo. In this case,
it would be using pc.pc_clinfo (as an int[] or an int*) without
first casting it into a rtinfo_t* which would result in
undefined behavior, see §3.8/4 ("A program may end the lifetime
of any object by reusing the storage which the object
occupies"---i.e. what the system does with pc.pc_clinfo in
priocntl) and §3.10/15. ("If a program attempts to acces the
stored value of an object through an lvalue of other than one of
the following types the behavior is undefined [...]"; the system
has reused the memory to construct an rtinfo_t here, so only
access through an lvalue with the possibly cv_qualified type
rtinfo_t, char or unsigned char is allowed.)

While some of these questions may be more appropriate on the
gcc list, I'm hoping to better understand how this is supposed
to work, not how it happens to work on one implementation.

The standard itself is very vague about this.

My questions are as follows...

1. is this actually a bogus warning?

In this case, definitely. It's bogus unless there are actual
accesses through both types concerned (int[] and rtinfo_t). It
would be valid from a standards point of view (but not from a
QoI point of view) if you accessed through both types, and it
would be valid, generally, if you let one or both pointers
"escape", by passing them to an external function. (E.g. it
might be valid if you did the cast before calling priocntl,
supposing that g++ doesn't know what priocntl does.)

It's a reinterpret_cast, and traditionally, reinterpret_cast
(and casts in general) have been the way to tell the compiler
that you know something it can't know. And to shut up warnings.
From a QoI point of view, warning that a reinterpret_cast might
behave like a reinterpret_cast seems a poor decision to me.

2. is this a dangerous thing to do?

In a few cases.

It's a reinterpret_cast, and the usual restrictions concerning
reinterpret_cast apply: don't use it in code you consider
"portable", and don't use it unless you really know what you're
doing. In your code, there's not the slightest problem.

2. why does changing pc_clinfo to array of char eliminate the
warning? i.e., does that make the cast 'safe'?

Again, it's not so much the cast which is unsafe, but what you
might do with the resulting pointer. In the case of char* or
unsigned char*, the standard says that accesses are allowed
through the resulting pointers, which means that the compiler
must consider that a char* and a rtinfo_t* might point to the
same memory, and avoid any reordering that would cause a program
to fail if they did. In the case of int* and rtinfo_t*, the
standard says that if they are in fact aliases, and you access
through both, your code has undefined behavior, so the compiler
can assume no aliasing.

3. what techniques are there for avoiding problems like this?

Use only cleanly designed and cleanly written code, and the
problem won't occur. Of course, this means that you can forget
about using the Windows or the Posix API (or any other system
API I've seen), and just about all third party libraries. Which
is fine as long as you don't need anything not part of the
standard: no system level IO, no communications with other
processes or machines, no GUI, no threading...

In this case, I'd encapsulate the direct interface to the system
in a very low level component (not just because of the warning),
and document that this component does generate this particular
warning, and that it can safely be ignored.

If my understanding of the standard is correct, the above does
indeed invoke undefined behavior, but I'm not certain how
exactly to best work around the issue and to avoid dangerous
punning in the future.

See above. There's nothing dangerous in what you're doing. The
only thing that worries me is that such spurious warnings
strongly suggest that the people (or at least some of the
people) working on g++ don't understand low (system) level
programming.

James Kanze · Jun 18, 2009

* Travis Vitek:

I'm maintaining some code that uses the SunOS priocntl
system call [*], and this has led me to some questions about
type punning. To avoid a long-winded explaination, I'm just
going to post some source and then ask my questions.

Click to expand...

Why are you posting in a C++ group?

This is C code.

I've got a lot of similar stuff in my C++. (I code for Posix
based machines, too.)

#define PC_CLNMSZ 16
#define PC_CLINFOSZ (32 / sizeof (int))
// The pc_clinfo member is used to return data describing the
// attributes of a specific class. The format of this data is
// class-specific and is described under the appropriate
// heading.
typedef struct pcinfo
{
int pc_cid; /* class id */
char pc_clname[PC_CLNMSZ]; /* class name */
int pc_clinfo[PC_CLINFOSZ]; /* class information */
} pcinfo_t;
// realtime class attributes in the pc_clinfo buffer are in
// this format.
typedef struct rtinfo
{
short rt_maxpri;
} rtinfo_t;

Click to expand...

Just a note, but all of the preceding has in fact been cribbed
from the system header, in order to present the problem. It's
not his code.

int main ()
{
pcinfo_t pc;
// assume a call to priocntl() will initialize pc with
// something like this...
// pc.pc_clname [0] = 'R';
// pc.pc_clname [1] = 'T';
// pc.pc_clname [2] = 0;
//if (-1L == priocntl (idtype_t(), id_t(), PC_GETCLINFO, &pc))
// return -1;
//
rtinfo_t* prt = (rtinfo_t*)pc.pc_clinfo;
return 0 == prt->rt_maxpri;
}
The gcc-4.1.1 compiler generates the following warning when
casting pc.pc_clinfo (an array of int) to rtinfo_t*.
t.cpp: In function 'int main()':
t.cpp:38: warning: dereferencing type-punned pointer will break
strict-aliasing rules
It is clear that we are type-punning the pointer, but I do
not see how this particular case is dangerous.

Click to expand...

On some machines breaking aliasing rules can cause a hardware
exception, on some machines and OSes that exception is caught
by the OS and fixed but then amounting to inefficient access.

And on most, if not all machines, compilers will assume that
lvalues expressions with different types can't alias the same
memory, with certain exceptions (most notably: lvalues with char
or unsigned char).

The hardware considerations are fairly irrelevant here, however,
He's dealing with a system interface, and presumably, the people
who specified the system API have ensured that it doesn't
violate any hardware constraints. As for the aliasing issues:
the actual memory is only accessed as a rtinfo_t, never as the
int[] it was declared as. So there isn't any actual conflict:
in fact, the system function priocntl will have constructed an
rtinfo_t in the memory reserved by pc.pc_clinfo, and accessing
it as anything other than an rtinfo_t (or a char or an unsigned
char) will result in undefined behavior. (I won't even try to
defend the quality of the system API---it's a true horror. But
as a normal user, you can't do anything about it; you just have
to live with it.)

No.

Definitely yes. He's used a reinterpret_cast to tell the
compiler to shut up, that he knows more than the compiler here
(which exceptionally happens to be true). The whole reason
d'être of reinterpret_cast is to shut up warnings (and errors);
there's something almost perverted for one to trigger a warning.

Yes.

What he's doing is practically the cleanest and the safest thing
possible, given the interface he has to deal with.

Not sure why it eliminates the warning, although the fact that
you can convert any POD value to a sequence of chars and back
is probably involved in making the compiler not see the
problem. It doesn't make the cast safe.

It eliminates the warning, because the compiler doesn't see a
possible aliasing that it doesn't take into consideration. The
standard requires compilers to consider possible aliasing when
char and unsigned char are involved.

Potentially, of course, it could result in the code not working,
since char normally has less restrictive alignment requirements
than int. If the preceding char array had an odd size, for
example, using the results of the cast could result in a core
dump. (Of course, the authors of the API ensured that the
preceding array didn't have an odd size.)

C++ code would go like this (off the cuff) -- look ma, no casts!

struct RtInfo
{
short maxPriority;
char filler[32 - sizeof(short)];
};

struct PcInfo
{
int id;
char className[16];
RtInfo rtInfo;
};

int main ()
{
PcInfo pc;

// assume a call to priocntl() will initialize pc with
// something like this...

// pc.classname [0] = 'R';
// pc.classname [1] = 'T';
// pc.classname [2] = 0;
//if (-1L == priocntl (idtype_t(), id_t(), PC_GETCLINFO, &pc))
// return -1;

return (0 == pc.rtInfo.maxPriority);
}

Except, of course, that he can't redefine the types, since
they're part of a system API. And the whole point of the
interface is that the real type of the data in the pc_clinfo
field depends on the preceding data (pc_cid and pc_clname).
pc_clinfo is basically a buffer (declared int to ensure adequate
alignment), in which the function priocntl "constructs" an
object whose type depends on other information in the structure.

Conceptually, the system request is returning a polymorphic
object; if all of the code were in C++, including the system API
(and we had garbage collection---not just over the process, but
at the system level, accross system and process boundaries), the
function would return a pointer to a base type, and he'd use
dynamic_cast to get at the additional data in the derived type.
As the system API's (at least those of both Posix and Windows)
are defined in a very primitive C, however, such elegant options
aren't available, and we have to play such funny games. If I
were writing a C++ wrapper for the function, I think I'd define
the class hierarchy (including a derived class for the error
cases), and return an std::auto_ptr to an instance of it. But I
don't expect to see anything that evolved in Posix or Windows
anytime soon.

PS: What *is* this thing/urge that programmers have about
introducing cryptic shortenings, like "hlt" instead of "halt",

A tradition from the PDP-11 days which said that names had to be
either three or six characters long. On a PDP-11 (under its
native OS), many things used a base 40 encoding which allowed
putting three characters into a 16 bit word. So in the original
Unix, all of the directories in the root directory had exactly
three characters (bin, lib, usr, etc, tmp...)---even though the
file system systematically reserved 14 characters (no more, no
less) for each filename; the habit was there. (The base 40
encoding also makes no case distinction---obviously, since it
only allows 40 different characters. Unix always has, since by
making case distinctive, you gain a line of code here and
there.)

or, as the designer later described as the only thing he'd
correct in Unix, "creat" instead of "create"?

Yes, but that one doesn't fit the pattern. "create" is six
characters---unless you reduce it to three, there's no point.

Giovanni Deretta · Jun 18, 2009

* Alf P. Steinbach:

Wait a minute, sorry.

In this concrete case (but not in general) I believe it's bogus, because 'short'
cannot have stricter alising than 'int'

I think you are confusing aliasing with alignment.

Alf P. Steinbach · Jun 18, 2009

* James Kanze:

* Travis Vitek:

I'm maintaining some code that uses the SunOS priocntl
system call [*], and this has led me to some questions about
type punning. To avoid a long-winded explaination, I'm just
going to post some source and then ask my questions.

Click to expand...

Click to expand...

Why are you posting in a C++ group?

Click to expand...

This is C code.

Click to expand...

I've got a lot of similar stuff in my C++. (I code for Posix
based machines, too.)

You can save quite a bit of work by simply not adding the complexities of the
lowest abstraction level possible. ;-)

#define PC_CLNMSZ 16
#define PC_CLINFOSZ (32 / sizeof (int))
// The pc_clinfo member is used to return data describing the
// attributes of a specific class. The format of this data is
// class-specific and is described under the appropriate
// heading.
typedef struct pcinfo
{
int pc_cid; /* class id */
char pc_clname[PC_CLNMSZ]; /* class name */
int pc_clinfo[PC_CLINFOSZ]; /* class information */
} pcinfo_t;
// realtime class attributes in the pc_clinfo buffer are in
// this format.
typedef struct rtinfo
{
short rt_maxpri;
} rtinfo_t;

Click to expand...

Click to expand...

Just a note, but all of the preceding has in fact been cribbed
from the system header, in order to present the problem. It's
not his code.
Yes.

int main ()
{
pcinfo_t pc;
// assume a call to priocntl() will initialize pc with
// something like this...
// pc.pc_clname [0] = 'R';
// pc.pc_clname [1] = 'T';
// pc.pc_clname [2] = 0;
//if (-1L == priocntl (idtype_t(), id_t(), PC_GETCLINFO, &pc))
// return -1;
//
rtinfo_t* prt = (rtinfo_t*)pc.pc_clinfo;
return 0 == prt->rt_maxpri;
}
The gcc-4.1.1 compiler generates the following warning when
casting pc.pc_clinfo (an array of int) to rtinfo_t*.
t.cpp: In function 'int main()':
t.cpp:38: warning: dereferencing type-punned pointer will break
strict-aliasing rules
It is clear that we are type-punning the pointer, but I do
not see how this particular case is dangerous.

Click to expand...

Click to expand...

On some machines breaking aliasing rules can cause a hardware
exception, on some machines and OSes that exception is caught
by the OS and fixed but then amounting to inefficient access.

Click to expand...

And on most, if not all machines, compilers will assume that
lvalues expressions with different types can't alias the same
memory, with certain exceptions (most notably: lvalues with char
or unsigned char).

I'm not sure your statement is completely correct. C99 reportedly has a strict
aliasing requirement, that lvalues of different types can't refer to the same
(or overlapping) location. AFAIK C++ doesn't.

The hardware considerations are fairly irrelevant here, however,

Depends on what you mean by "here".

For the concrete case, since the cast is OK, the hardware is irrelevant.

In discussing the significance of such a warning and dangers of using such
casts, it's very relevant whether the aliased pointer points to a type that may
have stricter alignment requirements than the original data, which is a hardware
issue (or virtual machine issue).

He's dealing with a system interface, and presumably, the people
who specified the system API have ensured that it doesn't
violate any hardware constraints. As for the aliasing issues:
the actual memory is only accessed as a rtinfo_t, never as the
int[] it was declared as. So there isn't any actual conflict:
in fact, the system function priocntl will have constructed an
rtinfo_t in the memory reserved by pc.pc_clinfo, and accessing
it as anything other than an rtinfo_t (or a char or an unsigned
char) will result in undefined behavior. (I won't even try to
defend the quality of the system API---it's a true horror. But
as a normal user, you can't do anything about it; you just have
to live with it.)

Yes.

Definitely yes.

Yes, I agree, as you'd seen if you'd read my immediate correction.

He's used a reinterpret_cast to tell the
compiler to shut up, that he knows more than the compiler here
(which exceptionally happens to be true). The whole reason
d'être of reinterpret_cast is to shut up warnings (and errors);
there's something almost perverted for one to trigger a warning.

I suspect he may be compiling with C99 extensions enabled.

What he's doing is practically the cleanest and the safest thing
possible, given the interface he has to deal with.

In general it's quite dangerous.

And it's not at all clean.

On the contrary, the code deals with two differently typed views of the same
data, hence (the possibility of the silly-) the warning; as demonstrated, a
single view is possible and cleaner.

It eliminates the warning, because the compiler doesn't see a
possible aliasing that it doesn't take into consideration. The
standard requires compilers to consider possible aliasing when
char and unsigned char are involved.

I suspect that you're trying to describe the correct explanation, but I don't
quite understand it (that is, I don't quite understand what's going on in the
compiler's "mind" here, in order to produce or not produce the warning).

Potentially, of course, it could result in the code not working,
since char normally has less restrictive alignment requirements
than int. If the preceding char array had an odd size, for
example, using the results of the cast could result in a core
dump. (Of course, the authors of the API ensured that the
preceding array didn't have an odd size.)

C++ code would go like this (off the cuff) -- look ma, no casts!

Click to expand...

struct RtInfo
{
short maxPriority;
char filler[32 - sizeof(short)];
};

Click to expand...

struct PcInfo
{
int id;
char className[16];
RtInfo rtInfo;
};

Click to expand...

int main ()
{
PcInfo pc;

Click to expand...

// assume a call to priocntl() will initialize pc with
// something like this...

Click to expand...

// pc.classname [0] = 'R';
// pc.classname [1] = 'T';
// pc.classname [2] = 0;
//if (-1L == priocntl (idtype_t(), id_t(), PC_GETCLINFO, &pc))
// return -1;

Click to expand...

return (0 == pc.rtInfo.maxPriority);
}

Click to expand...

Except, of course, that he can't redefine the types, since
they're part of a system API.

The above doesn't redefine the types.

It simply replaces a silly processing sequence

[type A -> call -> cast to B* -> type B]

that's so roundabout and redundant that it causes a warning, with

[type B -> call]

Note that the called routine is variadic (or at least is documented as such).

It's that simple: using a single type, the result type, instead of 2 different
types with casting to the result type. Avoiding the casting isn't the main
issue, though, for with some other routine there would perhaps have to be a cast
in the call. Avoiding two differently typed views of the same data is an issue.

And the whole point of the
interface is that the real type of the data in the pc_clinfo
field depends on the preceding data (pc_cid and pc_clname).
pc_clinfo is basically a buffer (declared int to ensure adequate
alignment), in which the function priocntl "constructs" an
object whose type depends on other information in the structure.

Yes, the API design is sh*tty.

Conceptually, the system request is returning a polymorphic
object; if all of the code were in C++, including the system API
(and we had garbage collection---not just over the process, but
at the system level, accross system and process boundaries), the
function would return a pointer to a base type, and he'd use
dynamic_cast to get at the additional data in the derived type.

No.

With a proper design there'd be separate routines with differently typed
arguments instead of a single routine with subfunctions chosen by an argument.

As the system API's (at least those of both Posix and Windows)
are defined in a very primitive C, however, such elegant options
aren't available, and we have to play such funny games.

No, that's no excuse; see right above.

If I
were writing a C++ wrapper for the function, I think I'd define
the class hierarchy (including a derived class for the error
cases), and return an std::auto_ptr to an instance of it. But I
don't expect to see anything that evolved in Posix or Windows
anytime soon.

IMHO, don't force unrelated types into a hierarchy.

Instead, I think it best to define each "logical" routine (sub-function) as a
distinct routine.

That allows static type checking.

A tradition from the PDP-11 days which said that names had to be
either three or six characters long.

Oh, thanks!

I didn't remember, and don't remember, but then I only did very little PDP-11
programming, in college.

Cheers,

- Alf

Alf P. Steinbach · Jun 18, 2009

* Giovanni Deretta:

I think you are confusing aliasing with alignment.

I think it was a typo. ;-)

Cheers,

- Alf

Alf P. Steinbach · Jun 18, 2009

* Pete Becker:

When you're
writing code for the system's API, write code that looks like all the
examples you've seen.

That may work in SunOS, although I highly doubt it, given the vomit-reflex
invoking design of the API in question.

And possibly you meant your comment to apply to only that OS, which if so means
there could be some chance of that advice working to the OP's advantage.

But e.g. most Microsoft code I've seen has been incorrect and needlessly
complicated like a Rube Goldberg contraption, most Microsoft programs I use are
full of bugs and are grossly inefficient (that includes Windows itself), and
generally Microsoft software developers express anything at the lowest possible
level of abstraction, which probably is a main cause of all the myriad bugs.

So e.g. in Windows it's just plain stupid to emulate the most common way to call
the APIs, which is the Microsoft way.

Of course most programmers do, otherwise it wouldn't be most common, and so in
this group we occasionally see things like _tMain instead of main, and similar
idiocy, the blind unreasoning literal adoption of "examples you've seen".

Cheers & hth.,

- Alf

Alf P. Steinbach · Jun 18, 2009

* Pete Becker:

Hmm, I could have sworn I put this in a paragraph that provided more
context. Yes, looking back, that's what I did. Too bad that context got
snipped.

What is the mysterious context that you say that I snipped and that you imply
changes the meaning of your statement to something you didn't intend?

I meant it as a general statement. Striving for portability when writing
system-level code produces results that are hard to understand and hard
to maintain.

What do you mean by "striving for portability"?

Avoiding the compiler warning?

Shrug.

The context here is your advice "When you're writing code for the system's API,
write code that looks like all the examples you've seen".

It's not a good idea as general advice.

Rather, at least when all (most of) that example code is C level, it's a very
bad idea.

My experience (twenty years of system programming, much of it on
Windows) is different.

Does that mean that in your experience the most common way to call APIs is not
the Microsoft way, or does it mean that in your experience it's not stupid to do
it the Microsoft way (which is C level and very often bug-ridden)?

That's not code for the system's API. That's application code.

I'm sorry, but I fail to see that you could fail to understand what you
responded to.

Cheers,

- Alf

Alf P. Steinbach · Jun 18, 2009

* Pete Becker:

Plonk.

That's on the same level as your insinuation that you had been misrepresented
(by a direct quote, of all).

That is, insinuating things about people, refusing to answer questions about
what you mean, and "plonking" those who ask, it's all very childish behavior.

Get a grip, man.

Cheers & hth.,

- Alf

James Kanze · Jun 19, 2009

* James Kanze:

* Travis Vitek:
I'm maintaining some code that uses the SunOS priocntl
system call [*], and this has led me to some questions about
type punning. To avoid a long-winded explaination, I'm just
going to post some source and then ask my questions.
Why are you posting in a C++ group?
This is C code.

Click to expand...

I've got a lot of similar stuff in my C++. (I code for
Posix based machines, too.)

Click to expand...

You can save quite a bit of work by simply not adding the
complexities of the lowest abstraction level possible. ;-)

I'm not sure what you're actually trying to say, but the point
is: Posix has some fairly strange interfaces, which aren't
really very typesafe---the type of the out parameter of many
functions depends on arguments to the function, or in this case,
other results of the function. Sometimes, as with ioctl,
they'll use varargs, other times, it will be a void*, or, like
here, they'll declare some "untyped" buffer. When using these
interfaces, you do end up having to do some strange casts.

[...]

I'm not sure your statement is completely correct. C99
reportedly has a strict aliasing requirement, that lvalues of
different types can't refer to the same (or overlapping)
location. AFAIK C++ doesn't.

Well, it's not where you'd reasonably look for it, but §3.10/15
places a lot of restrictions on what user code can do. It
doesn't speak in terms of aliasing; however, you can only access
an "object" (i.e. a sequence of bytes) through an lvalue
expression with either the correct type, or char or unsigned
char. (Other places state you cannot access memory unless there
is an "object" there, again with the exception of accessing it
as char or unsigned char.)

[...]

I suspect that you're trying to describe the correct
explanation, but I don't quite understand it (that is, I don't
quite understand what's going on in the compiler's "mind"
here, in order to produce or not produce the warning).

The requirements in §3.10/15. "If a program attempts to access
the stored value of an object through an lvalue of other than
one of the following types the behavior is undefined[...]". The
list of allowed types includes char and unsigned char,
regardless of the type of the object, so the compiler must
assume the lvalues of type char or unsigned char might alias the
object, regardless of its type. On the other hand, if the
object has type int, it cannot be legally accessed through an
lvalue of type rtinfo_t (or whatever his type was), and vice
versa, so the compiler can assume that there is no aliasing
between lvalues of type rtinfo_t and int.

Sometimes, at least. C++ allows memory for an object to be
reused, and both C and C++ allow unions (which comes down to
basically the same thing). At any given time, the memory has a
specific type, and can only be accessed as that type (or as char
or unsigned char), but that type can change in time. G++
definitely doesn't take this possibility into consideration,
however, and in its response to a DR, the C committee seems to
indicate that in the case of unions, it really only meant it to
work if all of the accesses where through the union type.
(Which of course makes it hard for us, who can't read minds, and
can only base what we think on what the standard actually says.)

3. what techniques are there for avoiding problems like this?
If my understanding of the standard is correct, the above
does indeed invoke undefined behavior, but I'm not certain
how exactly to best work around the issue and to avoid
dangerous punning in the future.
C++ code would go like this (off the cuff) -- look ma, no casts!
struct RtInfo
{
short maxPriority;
char filler[32 - sizeof(short)];
};
struct PcInfo
{
int id;
char className[16];
RtInfo rtInfo;
};
int main ()
{
PcInfo pc;
// assume a call to priocntl() will initialize pc with
// something like this...
// pc.classname [0] = 'R';
// pc.classname [1] = 'T';
// pc.classname [2] = 0;
//if (-1L == priocntl (idtype_t(), id_t(), PC_GETCLINFO, &pc))
// return -1;
return (0 == pc.rtInfo.maxPriority);
}

Click to expand...

Except, of course, that he can't redefine the types, since
they're part of a system API.

Click to expand...

The above doesn't redefine the types.

It's passing an incorrect type to priocntl. That's undefined
behavior.

It simply replaces a silly processing sequence

[type A -> call -> cast to B* -> type B]

that's so roundabout and redundant that it causes a warning,
with

[type B -> call]

Note that the called routine is variadic (or at least is
documented as such).

Yes. Which means that the compiler can't detect the
error---it's undefined behavior, rather than a required compile
time diagnostic. The specifications of the function say that if
the cmd argument (the third argument) is PC_GETCLINFO (his
case), he must pass a pcinfo_t* as fourth argument. Your PcInfo
is *not* a pcinfo_t, so passing its address is undefined
behavior.

Practically, of course, it's going to work (because Posix
requires all pointers to have the same size and representation)
*IF* your PcInfo has exactly the same size and layout as the
pcinfo_t. But the size and layout of a pcinfo_t isn't
specified, and in fact could easily change when e.g. moving from
32 bits to 64. Not using the "official" structure is formally
illegal, and practically unmaintainable.

[...]

With a proper design there'd be separate routines with
differently typed arguments instead of a single routine with
subfunctions chosen by an argument.

That's also an alternative. Probably a better one, most of the
time---in this case, both solutions would seem appropriate:
different functions for different commands, definitely, but for
the PC_GETCLINFO command, the actual type of some of the data
returned depends on other data returned, not on the command.

No, that's no excuse; see right above.

Right above has undefined behavior, and doesn't really solve the
problem.

IMHO, don't force unrelated types into a hierarchy.

The types are very much related. They have a common base class,
with the id and the classname. Depending on the class, the
specific attributes will vary.

Instead, I think it best to define each "logical" routine
(sub-function) as a distinct routine.

That allows static type checking.

Agreed, but... Sun has already defined the routine, as part of
Solaris, so you can't choose another design. And even if you
split out each command into a different function (which is how
Sun should have designed it), you still have the fact that the
class will dynamically depend on the cid passed into the
function, and that the data concerning the scheduling policy
depends on the class.

Alf P. Steinbach · Jun 19, 2009

* James Kanze:

* James Kanze:

Click to expand...

[snippety]

C++ code would go like this (off the cuff) -- look ma, no casts!
struct RtInfo
{
short maxPriority;
char filler[32 - sizeof(short)];
};
struct PcInfo
{
int id;
char className[16];
RtInfo rtInfo;
};
int main ()
{
PcInfo pc;
// assume a call to priocntl() will initialize pc with
// something like this...
// pc.classname [0] = 'R';
// pc.classname [1] = 'T';
// pc.classname [2] = 0;
//if (-1L == priocntl (idtype_t(), id_t(), PC_GETCLINFO, &pc))
// return -1;
return (0 == pc.rtInfo.maxPriority);
}
Except, of course, that he can't redefine the types, since
they're part of a system API.

Click to expand...

Click to expand...

The above doesn't redefine the types.

Click to expand...

It's passing an incorrect type to priocntl.

On the contrary, it's passing a correct type.

Within priocntl, with the given arguments, the pointer passed is necessarily
treated as a pointer to a type that's identically defined to this type, in order
to store the result there.

As far as the API is concerned you can just as well pass an untyped char[]
buffer, provided it is suitably aligned and of suitable size.

There's absolutely no way for the function to know that the bytes it's storing
into will be interpreted by using a direct declaration of the expected type
instead of pointer cast to that pointee type: *there is no magic*.

What matters is that the calling code access the data via a correct type, and
the above type is correct.

That's undefined behavior.

I'm sorry, that's incorrect in this context.

It simply replaces a silly processing sequence

Click to expand...

[type A -> call -> cast to B* -> type B]

Click to expand...

that's so roundabout and redundant that it causes a warning,
with

Click to expand...

[type B -> call]

Click to expand...

Note that the called routine is variadic (or at least is
documented as such).

Click to expand...

Yes. Which means that the compiler can't detect the
error---it's undefined behavior, rather than a required compile
time diagnostic. The specifications of the function say that if
the cmd argument (the third argument) is PC_GETCLINFO (his
case), he must pass a pcinfo_t* as fourth argument. Your PcInfo
is *not* a pcinfo_t, so passing its address is undefined
behavior.

I'm sorry, that's incorrect: the names do not matter.

Practically, of course, it's going to work

Yes.

(because Posix
requires all pointers to have the same size and representation)

Yes but also no: if you're talking about pointer representation then the
guarantee that the code relies on is the C++ one about pointers to class type,
which stems from the ability to declare pointers to types with unknown size
(a.k.a. "incomplete" types).

*IF* your PcInfo has exactly the same size and layout as the
pcinfo_t. But the size and layout of a pcinfo_t isn't
specified,

I'm sorry, that's incorrect: the size is specified, the layout of the things
used is specified, and that's all that matters -- there is no magic.

and in fact could easily change when e.g. moving from
32 bits to 64.

Porting code from 32-bit to 64-bit isn't a case of just recompiling. However,
there are just three relevant in-practice cases: (1) the sizes of built-in types
change, in which case all's automatic, nothing to fix, at least if the symbolic
size names from the header are used instead of the example's direct numbers, (2)
member variable declaration order is rearranged, which will be caught by any
assert, but is of probability like getting being hit by a wayward lame duck, (3)
the functionality of the API is changed, or the API is removed, or whatever,
fundamental change, in which case the code needs to be fixed, perhaps
reimplemented, anyway. Summing up, one is not worse off. But one is better off.

Not using the "official" structure is formally illegal

I'm sorry, no, that's just meaningless.

, and practically unmaintainable.

Again, sorry, that's just FUD (although I believe that you believe it, and don't
mean it as FUD), see above.

There is however a point there in the direction you point at.

Namely, that if such techniques are applied mindlessly, then one ends up with an
unmaintainable mess. It's the same as always. Mindless programming => mess.

[snippety]

Cheers & hth.,

- Alf

James Kanze · Jun 20, 2009

* James Kanze:

[snippety]

C++ code would go like this (off the cuff) -- look ma,
no casts!
struct RtInfo
{
short maxPriority;
char filler[32 - sizeof(short)];
};
struct PcInfo
{
int id;
char className[16];
RtInfo rtInfo;
};
int main ()
{
PcInfo pc;
// assume a call to priocntl() will initialize pc with
// something like this...
// pc.classname [0] = 'R';
// pc.classname [1] = 'T';
// pc.classname [2] = 0;
//if (-1L == priocntl (idtype_t(), id_t(), PC_GETCLINFO, &pc))
// return -1;
return (0 == pc.rtInfo.maxPriority);
}
Except, of course, that he can't redefine the types, since
they're part of a system API.
The above doesn't redefine the types.

Click to expand...

It's passing an incorrect type to priocntl.

Click to expand...

On the contrary, it's passing a correct type.

No. Read the man page. The only correct type when the third
argument is PC_GETCLINFO is a pcinfo_t* (defined in

Within priocntl, with the given arguments, the pointer passed
is necessarily treated as a pointer to a type that's
identically defined to this type, in order to store the result
there.

I'd suggest that you start by reading the specification of the
function. (Admittedly, it's overly complex, and it's not easy
to find the necessary information.) You can't pass anything
else, even if you think it happens to look the same. And you
don't even know the order of the elements in a pcinfo_t. All
the specification says is that there must be three, with types
id_t, char[PC_CLNMSZ] and int[PC_CLINFOSZ]. And in the return
value, whether pc_clinfo actually contains an rtinfo_t or
something else depends on the pc_clname field---which is set by
the function, not by the caller.

The posted code has obviously been extracted from something more
complex (since the issue was the casting, and not the interface
of priocntl). The actual code probably looked more like:

// ...
if ( priocntl( idtype_t(), id_t(), PC_GETCLINFO, &pc )
== -1 ) {
// handle error...
} else if ( strcmp( pc.pc_clname, "RT" ) == 0 ) {
// the funny cast, etc.
} else {
// this may or may not be expected, but must
// never the less be present. If we get here,
// pc.pc_clinfo contains some other type, and
// not an rtinfo_t.
}

It's in order to handle that last else branch that I suggested
that a clean interface should return a polymorphic type.

As far as the API is concerned you can just as well pass an
untyped char[] buffer, provided it is suitably aligned and of
suitable size.

That's not what the documentation says. You would probably get
away with it, but it's undefined behavior according to both the
C standard and Sun's documentation.

There's absolutely no way for the function to know that the
bytes it's storing into will be interpreted by using a direct
declaration of the expected type instead of pointer cast to
that pointee type: *there is no magic*.

But there's no way for you to know the type it's going to store
there before calling the function. That's *why* the field is
declared as a buffer, and not declared with the correct type.

What matters is that the calling code access the data via a
correct type, and the above type is correct.

I'm sorry, that's incorrect in this context.

You say it's not. The man page from Sun says it is. Who am I
to believe?

[...]

I'm sorry, that's incorrect: the names do not matter.

According to the C and the C++ standards, they do. More
importantly, however, you haven't defined the struct in the same
way. In fact, you can't define the struct in the same way,
because you don't know how it is defined; the specifications of
the function don't provide that information.

Yes.

Yes but also no: if you're talking about pointer
representation then the guarantee that the code relies on is
the C++ one about pointers to class type, which stems from the
ability to declare pointers to types with unknown size (a.k.a.
"incomplete" types).

Yes. There's a lot more to it than that, and I probably
shouldn't have mentionned only the pointer issue. It's the if
that follows, of course, which is most important. (Formally,
both the C standard and Posix allow strict type checking on
varargs, which would mean that passing anything but the struct
defined in the appropriate header would fail. In practice, Sun
doesn't do such checking, and almost certainly won't in the
future.)

I'm sorry, that's incorrect: the size is specified, the layout
of the things used is specified,

Where? All the specification I have says is that a pcinfo_t
struct must have the following members:
id_t pc_cid ;
char pc_clname[ PC_CLNMSZ ] ;
int pc_clinfo[ PC_CLINFOSZ ] ;
I can't find anything that imposes an order; I can't find
anything which forbids additional members; I can't find anything
that specifies the values of PC_CLINFOSZ or PC_CLINFOSZ; and I
can't find anything which says that id_t is guaranteed to be an
int. (In fact, the reason id_t is used is precisely because it
isn't guaranteed to be an int.)

and that's all that matters -- there is no magic.

Porting code from 32-bit to 64-bit isn't a case of just
recompiling.

Sure it is. I do it all the time. I've got lots of code that
was developed in 32 bit mode, then moved to 64 bits (typically
in order to handle larger files), and I've never had to change a
line of code because of it.

However, there are just three relevant in-practice cases: (1)
the sizes of built-in types change, in which case all's
automatic, nothing to fix, at least if the symbolic size names
from the header are used instead of the example's direct
numbers, (2) member variable declaration order is rearranged,
which will be caught by any assert, but is of probability like
getting being hit by a wayward lame duck, (3) the
functionality of the API is changed, or the API is removed, or
whatever, fundamental change, in which case the code needs to
be fixed, perhaps reimplemented, anyway. Summing up, one is
not worse off. But one is better off.

The most likely change is that PC_CLINFOSZ (and possibly
PC_CLNMSZ, in order to ensure a stricter alignment of the
following field) will change. Or the type of id_t.

If you wanted to go the route you're suggesting (IMHO, it's more
trouble than it's worth), then you'd still have to define your
types more along the lines of:

struct RtInfo
{
short maxPriority ;
};

struct PcInfo
{
id_t id ;
char className[ PC_CLNMSZ ] ;
union {
RtInfo rtInfo ;
int forAlignment ;
char forSize[ PC_CLINFOSZ * sizeof( int ) ] ;
} ;
} ;

There's still a risque with regards to the order, or additional
members, but it's probably fairly small (although personally, I
don't like to count on anything that isn't guaranteed). But
between the union and the reinterpret_cast, what do you gain?

Alf P. Steinbach · Jun 20, 2009

* James Kanze:

* James Kanze:

* James Kanze:
* Travis Vitek:

Click to expand...

[snippety]
C++ code would go like this (off the cuff) -- look ma,
no casts!
struct RtInfo
{
short maxPriority;
char filler[32 - sizeof(short)];
};
struct PcInfo
{
int id;
char className[16];
RtInfo rtInfo;
};
int main ()
{
PcInfo pc;
// assume a call to priocntl() will initialize pc with
// something like this...
// pc.classname [0] = 'R';
// pc.classname [1] = 'T';
// pc.classname [2] = 0;
//if (-1L == priocntl (idtype_t(), id_t(), PC_GETCLINFO, &pc))
// return -1;
return (0 == pc.rtInfo.maxPriority);
}
Except, of course, that he can't redefine the types, since
they're part of a system API.
The above doesn't redefine the types.
It's passing an incorrect type to priocntl.

Click to expand...

Click to expand...

On the contrary, it's passing a correct type.

Click to expand...

No. Read the man page. The only correct type when the third
argument is PC_GETCLINFO is a pcinfo_t* (defined in
<sys/priocntl.h>) Anything else is undefined behavior.

First, Undefined Behavior is a term used in the C++ standard for certain aspects
of the C++ language, it does not apply here and is completely meaningless here,
unless you want to also describe the OP's call as Undefined Behavior.

Come to think of it you have done exactly that, described the OP's code as
formally UB. So you're technically contradicting yourself by implying that the
code above is somehow C++ UB while the OP's code isn't. I am aware of how that
technical point, the point of whether there is real meaning at all, does matter
when speaking to me, and doesn't matter in FUDding the general audience.

Second, if you try to use some more meaningful non-standard meaning of
"Undefined Behavior", sort of, "it won't necessarily work", then you have
yourself stated that of course it will work, again a self-contradiction (there
can't be anything undefined about something that clearly will and does work).

Summing up so far, for both possible intended meanings of UB above you have
engaged in self-contradiction, and the only conclusion is that you're not
talking to me, because you know that I don't suffer idiot statements gladly (not
my own, and not yours), but that you are instead trying to FUD the general
audience with some emotive techno-babble.

Third, with respect to the word "correct" it's meaningless to apply it the way
you do above, like there could be something incorrect about perfectly working
code, and so it seems that you are playing word games also with respect to that
term, and that is deceitful.

And if you're not playing word games but really manage to not see that that it's
meaningless verbiage in the above quote, then, well, I think I'll quote Newton,
"I frame no hypotheses" as to how you then manage that kind of double-thinking.

I'd suggest that you start by reading the specification of the
function. (Admittedly, it's overly complex, and it's not easy
to find the necessary information.)

It seems that you fail to understand simple logic and examples that work?

And want to muddy the waters, yes?

You know that I know that it's completely and utterly irrelevant, so it must be
other readers you are trying to send on a wild-goose chase into complexity.

Bah.

I snip the rest of your article: I do not care to engage in "speaking to the
audience": this issue is extremely simple and clear cut, and you have presented
no technical point but a lot of FUD, which in the previous article I couldn't
believe you did intentionally, but now with more of it I do, and arguing with
someone who presents FUD for the audience and nothing for me, it pisses me off.

[SNIP rest]

- Alf (pissed off)

type punning	9	Sep 4, 2008
Type Punning	3	May 17, 2009
Clarification on type punning.	24	Aug 24, 2009
type-punning?	15	May 6, 2008
Type punning, code reordering and overloaded operator new() withcustom allocator	0	Nov 23, 2007
returning a type	4	Feb 5, 2012
Variadic function can have type safety.	0	Dec 20, 2012
Question on implicit type conversion	2	Mar 7, 2012

Type punning question

Travis Vitek

Alf P. Steinbach

Alf P. Steinbach

Travis Vitek

ld

Alf P. Steinbach

Joshua Maurice

Bo Persson

James Kanze

James Kanze

Giovanni Deretta

Alf P. Steinbach

Alf P. Steinbach

Alf P. Steinbach

Alf P. Steinbach

Alf P. Steinbach

James Kanze

Alf P. Steinbach

James Kanze

Alf P. Steinbach

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads