Proper way to do casts while avoiding aliasing issues

liljencrantz

Hi,

I have a piece of code that uses hashtables to store pointers to
various bits of data. The hashtable sees all pointers as const void *,
while the application obviously uses various other pointer types for
the data. I've run into a warning with the following code:

void hash_remove( hash_table_t *h,
                  const void *key,
                  const void **old_key,
                  const void **old_data );

void function_remove( const wchar_t *name )
{
    void *key;
    function_data_t *d;
    event_t ev;

    hash_remove( &function,
                 name,
                 (const void **) &key,
                 (const void **)&d );

    ....
}

Compiling it using GCC 4.1 results in the following warning:

gcc -g -O2 -std=c99 -fno-optimize-sibling-calls -Wall -D _GNU_SOURCE \
    -c -o function.o function.c

function.c: In function 'function_remove':
function.c:204: warning: dereferencing type-punned pointer will break
strict-aliasing rules

The call to hash_remove is what is generating the warning.

My understanding of why this is a problem may be a bit blurry. In
general, using typecasting and pointer dereferencing does not seem to
be proper C99, and may end up breaking the compiler's assumptions about
(non)aliasing of pointers of different types, and I suppose that is why
GCC is a bit unhappy here.

I can get around this by using a union of a const void * and the real
datatype, but this seems like a hackish solution. Any pointers to the
proper way to do this type of typecasting, so as to avoid aliasing
problems, would be welcome.
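
For concreteness, the union workaround looks roughly like this (the
union and its member names are invented for the example; the other
declarations are as in the code above):

union
{
    const void *as_void;
    function_data_t *as_data;
} old;

hash_remove( &function,
             name,
             (const void **) &key,
             &old.as_void );
d = old.as_data;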

Also, I noticed that the comp.lang.c FAQ at
www.eskimo.com/~scs/c-faq.com/ returns an access error, and at least
the Google cache of it contains no mention of aliasing. Perhaps a few
questions about aliasing could be added to the FAQ as well?
 
Barry Schwarz

Hi,

I have a piece of code that uses hashtables to store pointers to
various bits of data. The hashtable sees all pointers as const void *,
while the application obviously uses various other pointer types for
the data. I've run into a warning with the following code:

void hash_remove( hash_table_t *h,
                  const void *key,
                  const void **old_key,
                  const void **old_data );

void function_remove( const wchar_t *name )
{
    void *key;
    function_data_t *d;
    event_t ev;

    hash_remove( &function,
                 name,
                 (const void **) &key,
                 (const void **)&d );

    ...
}

Compiling it using GCC 4.1 results in the following warning:

gcc -g -O2 -std=c99 -fno-optimize-sibling-calls -Wall -D _GNU_SOURCE \
    -c -o function.o function.c

function.c: In function 'function_remove':
function.c:204: warning: dereferencing type-punned pointer will break
strict-aliasing rules

The call to hash_remove is what is generating the warning.

My understanding of why this is a problem may be a bit blurry. In
general, using typecasting and pointer dereferencing does not seem to
be proper C99, and may end up breaking the compiler's assumptions about
(non)aliasing of pointers of different types, and I suppose that is why
GCC is a bit unhappy here.
Your prototype says that the first argument to hash_remove will have
type hash_table_t*. In your calling statement, it has type
function_data_t**. Are these really compatible?


Remove del for email
 
CBFalconer

I have a piece of code that uses hashtables to store pointers to
various bits of data. The hashtable sees all pointers as const
void *, while the application obviously uses various other pointer
types for the data. I've run into a warning with the following code:

void hash_remove( hash_table_t *h,
                  const void *key,
                  const void **old_key,
                  const void **old_data );

void function_remove( const wchar_t *name )
{
    void *key;
    function_data_t *d;
    event_t ev;

    hash_remove( &function,
                 name,
                 (const void **) &key,
                 (const void **)&d );
    ...
}

Compiling it using GCC 4.1 results in the following warning:

You may want to look at how the interface to hashlib works with
generic (i.e. void*) pointers. See:

<http://cbfalconer.home.att.net/download/hashlib.zip>
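
(Not hashlib's actual interface -- just a sketch of the general shape
of a void*-based API, with invented names, to show the idea: the table
traffics only in plain "void *", and the caller converts the result on
the way out instead of casting pointer-to-pointer arguments.)

/* Sketch only; these names are made up and are not hashlib's API. */
void *table_store(table_t *t, void *key, void *data);  /* returns replaced data, if any */
void *table_find(table_t *t, const void *key);

/* Caller side: implicit conversion from "void *", no cast needed. */
function_data_t *d = table_find(tbl, name);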

--
Some informative links:
http://www.geocities.com/nnqweb/
http://www.catb.org/~esr/faqs/smart-questions.html
http://www.caliburn.nl/topposting.html
http://www.netmeister.org/news/learn2quote.html
 
Rod Pemberton

Hi,

I have a piece of code that uses hashtables to store pointers to
various bits of data. The hashtable sees all pointers as const void *,
while the application obviously uses various other pointer types for
the data. I've run into a warning with the following code:

void hash_remove( hash_table_t *h,
                  const void *key,
                  const void **old_key,
                  const void **old_data );

void function_remove( const wchar_t *name )
{
    void *key;
    function_data_t *d;
    event_t ev;

    hash_remove( &function,
                 name,
                 (const void **) &key,
                 (const void **)&d );

    ...
}

Compiling it using GCC 4.1 results in the following warning:

gcc -g -O2 -std=c99 -fno-optimize-sibling-calls -Wall -D _GNU_SOURCE \
    -c -o function.o function.c

function.c: In function 'function_remove':
function.c:204: warning: dereferencing type-punned pointer will break
strict-aliasing rules

The call to hash_remove is what is generating the warning.

Yes, but you don't say which argument is causing the problem. From what
you've posted, 'name', 'key', or 'd' is just as likely as 'function' to be
creating the error... Which brings up the question: "Why haven't you cast
'function' and 'name'?"
My understanding of why this is a problem may be a bit blurry. In
general, using typecasting and pointer dereferencing does not seem to
be proper C99, and may end up breaking the compiler's assumptions about
(non)aliasing of pointers of different types, and I suppose that is why
GCC is a bit unhappy here.

http://mail-index.netbsd.org/tech-kern/2003/08/11/0001.html
http://www.lysator.liu.se/c/restrict.html


Rod Pemberton
 
Chris Torek

I have a piece of code that uses hashtables to store pointers to
various bits of data. The hashtable sees all pointers as const void *,
while the application obviously uses various other pointer types for
the data. I've run into a warning with the following code:

void hash_remove( hash_table_t *h,
                  const void *key,
                  const void **old_key,
                  const void **old_data );

This says that hash_remove needs:

- a pointer to a hash table, "h"
- a key, "key"
- a pointer to a "const void *", old_key
- a pointer to a "const void *", old_data

Let us assume, for the sake of argument, that "const void *" is a
special datatype that uses 16 bytes to hold lots of extra information
about the target, while other pointers (where the compiler can make
more assumptions about the target) use just 4 bytes. So a *pointer
to* "const void *" is only 4 bytes long but it points to 16 bytes.
Meanwhile let us assume that a "double" is 8 bytes. Then:

double x = 3.141592653589793238462;
const void *p = &x;
const void **q = &p;

might look like this in memory:

        x                                  p
+---------------+     +------------------------------------------+
| 3.141592...   |<----|--------------------*                     |
+---------------+     +------------------------------------------+
                          ^
                          |
                          |
                      +-------+
                    q |   *   |
                      +-------+

Now we have this:
void function_remove( const wchar_t *name ) {
    void *key;
    function_data_t *d;
    event_t ev;

    hash_remove( &function,

Presumably "&function" is the address of a hash table, and OK (although
we have not seen a declaration for the identifier "function").

Here "name" has type "const wchar_t *", but this is converted to a
32-byte "const void *" and the resulting value is passed in. This
seems fine.
(const void **) &key,

Here we have a possible problem. The variable "key" is just "void *",
not "const void *". If "key" were "const void *", &key would have
type "const void **" and be a 4-byte pointer to a 16-byte pointer.
The 16-byte pointer currently contains garbage (is uninitialized)
but this would presumably be OK -- hash_remove is going to fill it
in.

As it happens, the C standard has some fuzzy wording about how
const qualifiers should not affect the underlying representation,
so if "const void *" is a 16-byte thing, plain "void *" should
also be a 16-byte thing. So &key should be a 4-byte pointer to
a 16-byte pointer, and this should actually work. It would be
better to make "key" a "const void *" and remove the cast, though.
(const void **)&d );

Here, on the other hand, we have a definite problem.

Remember that, except for "void *", we said that pointers were all
four bytes. Here "d" has type "function_data_t *", so it is a
4-byte pointer. &d is a 4-byte pointer to a 4-byte pointer:

    &d                   d
+-------+            +-------+
|   *---|----------->|   *---|-----? [we have no idea where this points]
+-------+            +-------+

but hash_remove is expecting a 4-byte pointer to a 16-byte pointer:

                                old_data
+-------+            +-------+.....................+
|   *---|----------->|   *---|
+-------+            +-------+.....................+
                      \_____/ \____________________/
                        OK     expected, but not there
Compiling it using GCC 4.1 results in the following warning:

gcc -g -O2 -std=c99 -fno-optimize-sibling-calls -Wall -D _GNU_SOURCE \
    -c -o function.o function.c

function.c: In function 'function_remove':
function.c:204: warning: dereferencing type-punned pointer will break
strict-aliasing rules

The call to hash_remove is what is generating the warning.

My understanding of why this is a problem may be a bit blurry. In
general, using typecasting and pointer dereferencing does not seem to
be proper C99, and may end up breaking the compiler's assumptions about
(non)aliasing of pointers of different types, and I suppose that is why
GCC is a bit unhappy here.

Yes. Or, if pointers to different data types have different sizes,
you could run into a situation in which hash_remove() updates all
16 bytes of your 4-byte pointer.
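
(As a minimal, made-up illustration of that aliasing assumption -- not
part of the code above: the compiler may assume an "int *" and a
"short *" never refer to the same object, so it is free to keep *i in
a register across the store through "s".)

int f(int *i, short *s)
{
    *i = 1;
    *s = 2;      /* assumed not to modify *i */
    return *i;   /* may legitimately be compiled as "return 1" */
}
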
I can get around this by using a union of a const void * and the real
datatype, but this seems like a hackish solution. Any pointers to the
proper way to do this type of typecasting, so as to avoid aliasing
problems, would be welcome.

In general, the "right way to cast pointers" is not to do it at all.

Consider the following small rewrite of function_remove():

void function_remove(const wchar_t *name) {
    const void *key;
    const void *old_data;
    const function_data_t *d;
    event_t ev;

    /* this fills in "key" and "old_data" */
    hash_remove(&function, name, &key, &old_data);

    /*
     * We do not use "key" but we do use "old_data".
     * The "const void *" value that hash_remove() filled
     * in is actually a read-only pointer to function_data_t
     * (after being converted to "void *"), so we convert
     * it back from "void *" to "function_data_t *", all
     * const-qualified, i.e., read-only.
     */
    d = old_data;

    ... more code here, presumably ...
}

It is probably the case that hash_remove() does not modify the
data, but the data themselves are not necessarily read-only. If
this is true, you will have fewer headaches if you remove most of
the "const"s. Otherwise you will have places at which you are
forced to use casts to cast away the const-ness. This means that
"const" is not giving you any protection after all, and is therefore
useless. (This is usually true in C, which gets "const" quite
wrong. C++ is better about it, although still "not quite right".
It would all work right, and I think be more in the "spirit of C",
if const had been a storage-class modifier instead of a type-qualifier.
The other alternative is to have dynamic typing, as in real OO
languages, so that the type of a return parameter corresponds to
the type of the input parameter. This is sort of like C++ templates,
except not ugly :) . But oh well.)
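
(A hypothetical illustration of the forced cast mentioned above -- the
"use_count" member is invented for the example:)

/* old_data arrives as "const void *", but the stored object really is
   modifiable, so getting write access back means casting the const
   away by hand. */
function_data_t *d = (function_data_t *) old_data;
d->use_count++;    /* hypothetical member, for illustration only */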

Retaining just the one "useful or at least harmless" const gives:

void hash_remove(hash_table_t *h, const void *key,
                 void **old_key, void **old_data);

void function_remove(const wchar_t *name) {
    void *key;
    void *old_data;
    function_data_t *d;
    event_t ev;

    hash_remove(&function, name, &key, &old_data);
    d = old_data;

    ... more code here, presumably ...
}

In either case, though, the heart of the matter is to supply a
correct "&old_data", then derive the new value for "d" by *converting*
the value that hash_remove() fills in. It is true that, on many
(or perhaps even most) machines, this "conversion" is just "take
the bits as-is", but by doing an actual conversion -- even if the
compiler uses a total of 0 (zero) machine instructions to implement
that conversion -- you get guaranteed-to-work code. On machines
where the "conversion" is a no-op, if the compiler is half-decent
(and gcc is), you get the same zero extra instructions you would get
if you wrote non-portable, not-guaranteed code. Thus, there is no penalty
for doing it right.
 
liljencrantz

Thank you for taking the time to write this very helpful and
enlightening response. May I suggest that it be added to the FAQ in
some form?

Chris said:
Let us assume, for the sake of argument, that "const void *" is a
special datatype that uses 16 bytes to hold lots of extra information
about the target, while other pointers (where the compiler can make
more assumptions about the target) use just 4 bytes. So a *pointer
to* "const void *" is only 4 bytes long but it points to 16 bytes.
Meanwhile let us assume that a "double" is 8 bytes. Then:

double x = 3.141592653589793238462;
const void *p = &x;
const void **q = &p;

might look like this in memory:

        x                                  p
+---------------+     +------------------------------------------+
| 3.141592...   |<----|--------------------*                     |
+---------------+     +------------------------------------------+
                          ^
                          |
                          |
                      +-------+
                    q |   *   |
                      +-------+

Do any systems exist that actually _do_ this? (I want to fix the
problem regardless; I'm just curious whether any systems actually use
this extra freedom provided by C.)
Now we have this:


Presumably "&function" is the address of a hash table, and OK (although
we have not seen a declaration for the identifier "function").

My bad. Your assumption is correct.

Here "name" has type "const wchar_t *", but this is converted to a
32-byte "const void *" and the resulting value is passed in. This
seems fine.
(const void **) &key,

Here we have a possible problem. The variable "key" is just "void *",
not "const void *". If "key" were "const void *", &key would have
type "const void **" and be a 4-byte pointer to a 16-byte pointer.
The 16-byte pointer currently contains garbage (is uninitialized)
but this would presumably be OK -- hash_remove is going to fill it
in.

As it happens, the C standard has some fuzzy wording about how
const qualifiers should not affect the underlying representation,
so if "const void *" is a 16-byte thing, plain "void *" should
also be a 16-byte thing. So &key should be a 4-byte pointer to
a 16-byte pointer, and this should actually work. It would be
better to make "key" a "const void *" and remove the cast, though.
(const void **)&d );

Here, on the other hand, we have a definite problem.

Remember that, except for "void *", we said that pointers were all
four bytes. Here "d" has type "function_data_t *", so it is a
4-byte pointer. &d is a 4-byte pointer to a 4-byte pointer:

    &d                   d
+-------+            +-------+
|   *---|----------->|   *---|-----? [we have no idea where this points]
+-------+            +-------+

but hash_remove is expecting a 4-byte pointer to a 16-byte pointer:

                                old_data
+-------+            +-------+.....................+
|   *---|----------->|   *---|
+-------+            +-------+.....................+
                      \_____/ \____________________/
                        OK     expected, but not there
Compiling it using GCC 4.1 results in the following warning:

gcc -g -O2 -std=c99 -fno-optimize-sibling-calls -Wall -D _GNU_SOURCE \
    -c -o function.o function.c

function.c: In function 'function_remove':
function.c:204: warning: dereferencing type-punned pointer will break
strict-aliasing rules

The call to hash_remove is what is generating the warning.

My understanding of why this is a problem may be a bit blurry. In
general, using typecasting and pointer dereferencing does not seem to
be proper C99, and may end up breaking the compiler's assumptions about
(non)aliasing of pointers of different types, and I suppose that is why
GCC is a bit unhappy here.

Yes. Or, if pointers to different data types have different sizes,
you could run into a situation in which hash_remove() updates all
16 bytes of your 4-byte pointer.
I can get around this by using a union of a const void * and the real
datatype, but this seems like a hackish solution. Any pointers to the
proper way to do this type of typecasting, so as to avoid aliasing
problems, would be welcome.

In general, the "right way to cast pointers" is not to do it at all.

Consider the following small rewrite of function_remove():

void function_remove(const wchar_t *name) {
    const void *key;
    const void *old_data;
    const function_data_t *d;
    event_t ev;

    /* this fills in "key" and "old_data" */
    hash_remove(&function, name, &key, &old_data);

    /*
     * We do not use "key" but we do use "old_data".
     * The "const void *" value that hash_remove() filled
     * in is actually a read-only pointer to function_data_t
     * (after being converted to "void *"), so we convert
     * it back from "void *" to "function_data_t *", all
     * const-qualified, i.e., read-only.
     */
    d = old_data;

    ... more code here, presumably ...
}

It is probably the case that hash_remove() does not modify the
data, but the data themselves are not necessarily read-only. If
this is true, you will have fewer headaches if you remove most of
the "const"s. Otherwise you will have places at which you are
forced to use casts to cast away the const-ness. This means that
"const" is not giving you any protection after all, and is therefore
useless. (This is usually true in C, which gets "const" quite
wrong. C++ is better about it, although still "not quite right".
It would all work right, and I think be more in the "spirit of C",
if const had been a storage-class modifier instead of a type-qualifier.
The other alternative is to have dynamic typing, as in real OO
languages, so that the type of a return parameter corresponds to
the type of the input parameter. This is sort of like C++ templates,
except not ugly :) . But oh well.)

Ok, that makes sense. Thanks!
Retaining just the one "useful or at least harmless" const gives:

void hash_remove(hash_table_t *h, const void *key,
                 void **old_key, void **old_data);

void function_remove(const wchar_t *name) {
    void *key;
    void *old_data;
    function_data_t *d;
    event_t ev;

    hash_remove(&function, name, &key, &old_data);
    d = old_data;

    ... more code here, presumably ...
}

In either case, though, the heart of the matter is to supply a
correct "&old_data", then derive the new value for "d" by *converting*
the value that hash_remove() fills in. It is true that, on many
(or perhaps even most) machines, this "conversion" is just "take
the bits as-is", but by doing an actual conversion -- even if the
compiler uses a total of 0 (zero) machine instructions to implement
that conversion -- you get guaranteed-to-work code. On machines
where the "conversion" is a no-op, if the compiler is half-decent
(and gcc is), you get the same zero extra instructions you would get
if you wrote non-portable, not-guaranteed code. Thus, there is no penalty
for doing it right.

Yes, I've noticed that the const-ness doesn't buy you much. The
disadvantages are pretty clearly outlined in your response; the
advantage is that you never get a non-const pointer to a const memory
region, which is something that always makes me a bit uneasy. I added
it to err on the side of caution. You may be right that the
disadvantage outweighs the advantage - I might remove the consts.
 
Chris Torek

Do any systems exist that actually _do_ this? (I want to fix the
problem regardless; I'm just curious whether any systems actually use
this extra freedom provided by C.)

The answer is "well, almost". :)

The classic example these days, for machines with "weird" pointers,
is the IBM AS/400. But the AS/400 has a uniform "data pointer"
space, and a separate but also uniform "function pointer" space,
as I understand it. (I have never used an AS/400.)

Somewhat less recently, we can look at the Cray machines, the
Data General Eclipse (the MV/10000 -- this is also the machine
referred to in the title of the book "The Soul of a New Machine").
All of these have special "byte pointer" *formats*. These do
not take up *more* space than other pointers, but they do have
"extra bits" when compared to other "normal" (word) pointers.
This comes about because the machines are, at some fundamental
level, "word-addressed" rather than "byte-addressed", with a
word being sufficiently large so that the C-compiler-writers
decided to give you "sub-word" data types to hold characters.
In order to fit two or more "char"s into a word, you have to
start with a word pointer, then add at least one -- and in the
case of the Cray, 3 -- extra bits somewhere, to pick out which
of the (2 or 8, respectively) bytes you really want.

On the Eclipse, this extra bit is actually handled by hardware:
the machine, at machine-code level, has separate "byte pointers"
and "word pointers", and instructions to convert from one to
the other.

If we go back into "C prehistory" (long before there was an ANSI
standard for C), we also find machines like the PR1ME, with 32
bit pointers but 48-bit "byte pointers". As with the Cray, the
reason for the extra-long "byte pointer" is that machine-level
pointers only point to whole words. To select a byte within
a machine word, the "byte pointer" uses additional memory.

While it is not directly related to this: in the bad old days of
the 80186, 80286, and early 80386 Intel chips, one also had to deal
with "segmented" systems and multiple "memory models" (usually
called "small", "large", "compact", and "huge"). These also result
in multiple pointer formats, although in this case, the formats
are based not on the data type, but rather on the size of the target
data. With the x86-64 now appearing, these days have returned:
the x86-64 necessarily provides multiple memory models, so that
you can run old 32-bit code and new 64-bit code on the same machine.

I would bet that data-type-dependent pointer formats will also be
"reinvented", and will re-appear at some point in the future. One
mistake people make in disregarding history is that they believe
"we are smarter now". We are not any smarter -- people back then
were just as clever -- we just have different materials and tool
sets, that allow us to tackle tougher problems. But the solutions
they invented had a reason to exist -- they were solving real
problems -- and those reasons, those same problems, tend to recur.
So history repeats -- or as Mark Twain was supposed to have said,
"History doesn't repeat itself, but it does rhyme." (This does
not appear to be an actual Twain quote. There is also a version
with "it echoes".)
 
liljencrantz

Chris Torek wrote:
The answer is "well, almost". :)

The classic example these days, for machines with "weird" pointers,
is the IBM AS/400. But the AS/400 has a uniform "data pointer"
space, and a separate but also uniform "function pointer" space,
as I understand it. (I have never used an AS/400.)

Somewhat less recently, we can look at the Cray machines, the
Data General Eclipse (the MV/10000 -- this is also the machine
referred to in the title of the book "The Soul of a New Machine").
All of these have special "byte pointer" *formats*. These do
not take up *more* space than other pointers, but they do have
"extra bits" when compared to other "normal" (word) pointers.
This comes about because the machines are, at some fundamental
level, "word-addressed" rather than "byte-addressed", with a
word being sufficiently large so that the C-compiler-writers
decided to give you "sub-word" data types to hold characters.
In order to fit two or more "char"s into a word, you have to
start with a word pointer, then add at least one -- and in the
case of the Cray, 3 -- extra bits somewhere, to pick out which
of the (2 or 8, respectively) bytes you really want.

On the Eclipse, this extra bit is actually handled by hardware:
the machine, at machine-code level, has separate "byte pointers"
and "word pointers", and instructions to convert from one to
the other.

If we go back into "C prehistory" (long before there was an ANSI
standard for C), we also find machines like the PR1ME, with 32
bit pointers but 48-bit "byte pointers". As with the Cray, the
reason for the extra-long "byte pointer" is that machine-level
pointers only point to whole words. To select a byte within
a machine word, the "byte pointer" uses additional memory.

While it is not directly related to this: in the bad old days of
the 80186, 80286, and early 80386 Intel chips, one also had to deal
with "segmented" systems and multiple "memory models" (usually
called "small", "large", "compact", and "huge"). These also result
in multiple pointer formats, although in this case, the formats
are based not on the data type, but rather on the size of the target
data. With the x86-64 now appearing, these days have returned:
the x86-64 necessarily provides multiple memory models, so that
you can run old 32-bit code and new 64-bit code on the same machine.

I would bet that data-type-dependent pointer formats will also be
"reinvented", and will re-appear at some point in the future. One
mistake people make in disregarding history is that they believe
"we are smarter now". We are not any smarter -- people back then
were just as clever -- we just have different materials and tool
sets, that allow us to tackle tougher problems. But the solutions
they invented had a reason to exist -- they were solving real
problems -- and those reasons, those same problems, tend to recur.
So history repeats -- or as Mark Twain was supposed to have said,
"History doesn't repeat itself, but it does rhyme." (This does
not appear to be an actual Twain quote. There is also a version
with "it echoes".)

Thank you for your detailed and interesting answers.
 
