Undefined behavior - 2 queries

Tommy Vercetti

Hi -

Great group!

I have 2 queries about undefined behavior:

1) Is the following code undefined?

float myfunction(float f)
{
    if(sizeof(float) & sizeof(int)) {
        /* use slow ordinary FP operations */
    } else {
        int i = *(int *)&f;
        /* do clever bit-twiddling floating-point operation */
    }
}

It only attempts to access the float as an integer when this makes
sense.

2) I've seen code like the following:

struct s {
    int a;
    char *b;
    float c;
} s;
int *i = (int *) s;

Isn't this undefined? Won't this blow up if the compiler inserts padding
before the first element of the struct?

Cheers!
 
Malcolm McLean

The first example should read if(sizeof(float) != sizeof(int)). If this is
false, the code is almost correct. However the float could contain a legal
float which is a trap value for an integer. You'd have to be on a pretty
pathological platform for this to be a problem, but a conforming
implementation could crash you out.

The second example is OK. The first element of a struct has the same address
as the whole. No prepadding is allowed, though padding may be inserted at
the end or between elements.
 
Harald van Dĳk

Tommy said:
Hi -

Great group!

I have 2 queries about undefined behavior:

1) Is the following code undefined?

float myfunction(float f)
{
if(sizeof(float) & sizeof(int)) {

I've tried figuring out why you would use the & operator here. I'm not
seeing it. Did you mean to use != ?

/* use slow ordinary FP operations */
} else {
int i = *(int *)&f;
/* do clever bit-twiddling floating-point operation */
}
}

It only attempts to access the float as an integer when this makes
sense.

The behaviour is outside the scope of the C standard, and very likely also
not defined as an extension by your particular implementation. If you want
your code to be portable, don't do it. If you don't need your code to be
portable, try to figure out how to do it using the extensions your compiler
provides. It's possible you actually do have a compiler that officially
allows your current code as an extension, but I doubt it.

2) I've seen code like the following:

struct s {
int a;
char *b;
float c;
} s;
int *i = (int *) s;

Isn't this undefined?

This is not allowed. You can't convert a structure to a pointer. If you
meant (int *) &s, then it's valid, though I personally prefer to write
&s.a.

Won't this blow up if the compiler inserts padding
before the first element of the struct?

Right. Which is not a problem, since the compiler is not allowed to insert
padding before the first element of the struct.
 
Richard Heathfield

Malcolm McLean said:
The second example is OK.

No, it isn't.

The first element of a struct has the same
address as the whole.

Therefore, although I cannot imagine circumstances in which I could
bring myself to do so, it would not be incorrect to write:

int *i = (int *)&s;

but that is not what he wrote. What he wrote is simply wrong.
 
Tommy Vercetti

I've tried figuring out why you would use the & operator here. I'm not
seeing it. Did you mean to use != ?

It's an optimization - at the machine code level, most processors will
have a single instruction for &.

The behaviour is outside the scope of the C standard, and very likely also
not defined as an extension by your particular implementation. If you want
your code to be portable, don't do it. If you don't need your code to be
portable, try to figure out how to do it using the extensions your compiler
provides. It's possible you actually do have a compiler that officially
allows your current code as an extension, but I doubt it.

Even though the undefined-behavior lines aren't called in situations
where they'd be undefined? E.g. in the following code:

if(1==2) {
    char *p=NULL;
    *p; /* KABOOM! */
}

can you really call this UB when the line invoking UB is guaranteed
never to be called?

This is not allowed. You can't convert a structure to a pointer. If you
meant (int *) &s, then it's valid, though I personally prefer to write
&s.a.

Yes, that was a typo.

Right. Which is not a problem, since the compiler is not allowed to insert
padding before the first element of the struct.

I think if you check, the compiler is allowed to insert padding into
structs as it sees fit. In particular, if the address of s isn't
properly aligned for an int then it will have to insert padding!
 
Keith Thompson

Malcolm McLean said:
1) Is the following code undefined?

float myfunction(float f)
{
    if(sizeof(float) & sizeof(int)) {
        /* use slow ordinary FP operations */
    } else {
        int i = *(int *)&f;
        /* do clever bit-twiddling floating-point operation */
    }
}

It only attempts to access the float as an integer when this makes
sense.
[...]

The first example should read if(sizeof(float) != sizeof(int)). If
this is false, the code is almost correct. However the float could
contain a legal float which is a trap value for an integer. You'd have
to be on a pretty pathological platform for this to be a problem, but
a conforming implementation could crash you out.
[...]

It could also fail if int and float have different alignment
requirements.
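
The alignment concern can be made concrete with a small sketch. It assumes a C11 compiler (for `_Alignof`) and that converting a pointer to `uintptr_t` yields a plain address, which is implementation-defined. The function names, including `is_aligned_for_int`, are hypothetical helpers introduced here, not something from the thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Alignment requirements of the two types. Equal size does not imply
   equal alignment, so these two values may differ. */
size_t int_alignment(void)   { return _Alignof(int); }
size_t float_alignment(void) { return _Alignof(float); }

/* Hypothetical helper: returns 1 if p is suitably aligned for an int.
   Assumes the uintptr_t conversion gives a flat address. */
int is_aligned_for_int(const void *p)
{
    return (uintptr_t)p % _Alignof(int) == 0;
}
```

A check like this only addresses alignment. It says nothing about whether reading a float through an int lvalue is otherwise permitted.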
 
Malcolm McLean

Tommy Vercetti said:
I think if you check, the compiler is allowed to insert padding into
structs as it sees fit. In particular, if the address of s isn't
properly aligned for an int then it will have to insert padding!

No, it's not.
If you think about it, inserting a padding byte before the first member
won't solve the problem of alignment.
 
Army1987

Tommy Vercetti wrote: [snip]
float myfunction(float f)
{
if(sizeof(float) & sizeof(int)) {

I've tried figuring out why you would use the & operator here. I'm not
seeing it. Did you mean to use != ?

It's an optimization - at the machine code level, most processors will
have a single instruction for &.

& is bitwise and, it will be nonzero unless the operands have no
set bits in common. Probably you mean ^ (bitwise xor), which is
zero if and only if the operands are equal.

Even though the undefined-behavior lines aren't called in situations
where they'd be undefined? E.g. in the following code:

if(1==2) {
    char *p=NULL;
    *p; /* KABOOM! */
}

can you really call this UB when the line invoking UB is guaranteed
never to be called?

The situation is different. The fact that int and float have the
same size doesn't require that they have the same alignment. For
example, imagine that on a particular machine both floats and ints
have 4 bytes, but floats can be placed anywhere, whereas ints must
be aligned to word boundaries. Converting a float* to int* and
dereferencing it causes UB if the float* points anywhere else than
to the beginning of a 4-byte word.
[snip]
I think if you check, the compiler is allowed to insert padding into
structs as it sees fit. In particular, if the address of s isn't
properly aligned for an int then it will have to insert padding!

The standard explicitly requires that a pointer to a struct, when converted
to the type of its first member, points to that member.
 
Keith Thompson

Tommy Vercetti said:
It's an optimization - at the machine code level, most processors will
have a single instruction for &.

You should worry more about writing correct and clear code than about
trimming something down to a single instruction. Let the compiler
worry about that.

In this context, using "&" works if both sizes are powers of 2; it can
fail if they aren't.

Furthermore, sizeof(float) and sizeof(int) can both be computed
at compilation time. A decent compiler will probably evaluate
'sizeof(float) != sizeof(int)'
to either 0 or 1 at compile time, and eliminate one of the branches.

[...]
I think if you check, the compiler is allowed to insert padding into
structs as it sees fit. In particular, if the address of s isn't
properly aligned for an int then it will have to insert padding!

If you check, you'll find that no padding may be inserted before the
first member of a structure.

C99 6.7.2.1p13:

Within a structure object, the non-bit-field members and the units
in which bit-fields reside have addresses that increase in the
order in which they are declared. A pointer to a structure object,
suitably converted, points to its initial member (or if that
member is a bit-field, then to the unit in which it resides), and
vice versa. There may be unnamed padding within a structure
object, but not at its beginning.
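
The guarantee quoted above can be checked mechanically. The following is a minimal sketch using the struct from the original question; `first_member_offset` and `cast_points_to_first_member` are hypothetical helper names introduced here for illustration.

```c
#include <stddef.h>

struct s {
    int a;
    char *b;
    float c;
};

/* Offset of the first member. The standard guarantees this is 0,
   because no padding may precede the initial member. */
size_t first_member_offset(void)
{
    return offsetof(struct s, a);
}

/* Returns 1 if the struct pointer, converted to int *, compares equal
   to the address of the member a. */
int cast_points_to_first_member(struct s *p)
{
    return (int *)p == &p->a;
}
```

Padding may still appear between `a` and `b` or after `c`, so `sizeof(struct s)` can exceed the sum of the member sizes.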
 
Harald van Dĳk

Tommy said:
It's an optimization - at the machine code level, most processors will
have a single instruction for &.

And most processors will also have a single instruction for !=. This is
irrelevant though, since sizeof(float) and sizeof(int) are constants, and
the compiler is required to be able to evaluate the expression at compile
time. It would take a particularly perverse compiler to not actually do so.
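
Because `sizeof` applied to a type yields an integer constant expression, the comparison can even appear where the language demands a compile-time constant. A minimal sketch, with `SAME_SIZE` and `same_size` as hypothetical names that do not come from the thread:

```c
/* An enumeration constant must be initialized with an integer constant
   expression, so this line compiles only because the comparison is
   evaluated at compile time. */
enum { SAME_SIZE = (sizeof(float) == sizeof(int)) };

/* Returns 1 if float and int have the same size, 0 otherwise. */
int same_size(void)
{
    return SAME_SIZE;
}
```

Whether a compiler then discards the dead branch of an `if` on such a value is a quality-of-implementation matter, though it is a very common optimization.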
 
Peter J. Holzer

It's an optimization -

No, it's simply wrong.

Consider the (common) case of sizeof(float) == sizeof(int) == 4:
(4 & 4) == 4, so the "then" branch (slow ordinary FP operations) is
executed. This is almost certainly not what you wanted.

If sizeof(int) == 8 and sizeof(float) == 4, (4 & 8) == 0, so the else
branch (int i = *(int *)&f) is executed, which causes undefined
behaviour due to the possible alignment mismatch.

But you haven't just reversed the test: If for example sizeof(int) == 8
and sizeof(float) == 12, then it will still test as true.

at the machine code level, most processors will
have a single instruction for &.

At the machine code level, most processors also have single instructions
for comparing integers. But most likely, the test will never be made at
runtime because it can already be evaluated at compile time.

Moral: Write what you mean and let the compiler worry about
optimization.
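
The three cases worked through above can be reproduced with a short sketch. The function names are hypothetical and exist only to put the posted test next to the alternatives suggested in the thread; the sizes are passed as parameters so that combinations other than the host's can be tried.

```c
#include <stddef.h>

/* The test as originally posted: true when the sizes share a set bit. */
int test_as_posted(size_t a, size_t b) { return (a & b) != 0; }

/* The xor variant: true exactly when the sizes differ. */
int test_with_xor(size_t a, size_t b)  { return (a ^ b) != 0; }

/* The plain comparison: also true exactly when the sizes differ. */
int test_with_neq(size_t a, size_t b)  { return a != b; }
```

With sizes 4 and 4 the posted test is true (slow path) although the sizes match; with 4 and 8 it is false (bit-twiddling path) although they differ; with 8 and 12 it is true again. The xor and `!=` versions agree with each other in all three cases.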

Actually, it only attempts to access the float as an integer when this
makes no sense. But assuming you got the test right: I think this is
still undefined behaviour:

1) Even if the size is the same, the alignment can be different (think
of the old x86 series, where integer and fp unit were separate
processors: They could easily have had different alignment
requirements).

2) Even if the alignment is the same, reinterpreting an fp number as an
int is not defined. The bit pattern can represent a trap value, for
example.

And thirdly, knowing the size is not enough for "clever bit-twiddling".
You need to know the representation. So that code should be something
like:


#if __STDC_IEC_559__ && CLEVER_BIT_TWIDDLING_IS_FASTER
/* clever bit twiddling here */
#else
/* normal FP here */
#endif

Even though the undefined-behavior lines aren't called in situations
where they'd be undefined?

No, but in your code the undefined-behavior will be called. Your checks
to prevent that are insufficient.

E.g. in the following code:

if(1==2) {
    char *p=NULL;
    *p; /* KABOOM! */
}

That's ok.

I think if you check, the compiler is allowed to insert padding into
structs as it sees fit. In particular, if the address of s isn't
properly aligned for an int then it will have to insert padding!

No, it will have to ensure that s is properly aligned for all its
members.

hp
 
CBFalconer

Tommy said:
.... snip ...


I think if you check, the compiler is allowed to insert padding into
structs as it sees fit. In particular, if the address of s isn't
properly aligned for an int then it will have to insert padding!

But not before the first item in a struct. The struct won't be
assigned any address that is not suitable for an int.
 
Army1987

Tommy Vercetti wrote: [snip]
float myfunction(float f)
{
if(sizeof(float) & sizeof(int)) {

I've tried figuring out why you would use the & operator here. I'm not
seeing it. Did you mean to use != ?

It's an optimization - at the machine code level, most processors will
have a single instruction for &.

& is bitwise and, it will be nonzero unless the operands have no
set bits in common. Probably you mean ^ (bitwise xor), which is
zero if and only if the operands are equal.
And anyway, the expression is a constant integer expression, so
it will be evaluated at compile time. And any decent compiler
will not generate any code for the branch which will not be
executed.

/* use slow ordinary FP operations */
} else {
int i = *(int *)&f;
/* do clever bit-twiddling floating-point operation */
}
}
[snip]
can you really call this UB when the line invoking UB is guaranteed
never to be called?

The situation is different. The fact that int and float have the same
size doesn't require that they have the same alignment. For example,
imagine that on a particular machine both floats and ints have 4 bytes,
but floats can be placed anywhere, whereas ints must be aligned to word
boundaries. Converting a float* to int* and dereferencing it causes UB
if the float* points anywhere else than to the beginning of a 4-byte
word.
Anyway, since the bit-twiddling is nonportable anyway (it depends
on the representation of floats), you can look up your
implementation's documentation to check that.
 
christian.bau

It's an optimization - at the machine code level, most processors will
have a single instruction for &.

Considering that sizeof (float) and sizeof (int) are constants, this
argument is rather daft. On top of that, the code is wrong, because
sizeof(float) & sizeof(int) can only ever be zero if the sizes are
different, and in that case I'm sure you don't want to execute your
bit-fiddling code.

So you tried a nano-optimisation, didn't figure out that it would be
pointless as an optimisation, and on top of that you got it wrong.
Brilliant.
 
christian.bau

Actually, it only attempts to access the float as an integer when this
makes no sense. But assuming you got the test right: I think this is
still undefined behaviour:

1) Even if the size is the same, the alignment can be different (think
of the old x86 series, where integer and fp unit were separate
processors: They could easily have had different alignment
requirements).

2) Even if the alignment is the same, reinterpreting an fp number as an
int is not defined. The bit pattern can represent a trap value, for
example.

It is undefined behaviour anyway, because accessing an object of type
float using an lvalue of any type other than float or a character type
is _always_ undefined behaviour; that is just what the C Standard
says.

There is no way in conforming Standard C to access the representation
of a float through type int, unsigned int, long or long int, even
though this would be very useful. There is not even a method that
works in practice on a reasonable range of existing compilers that are
in common use. On some compilers access through a pointer of the
integer type will "work", on others using a union "works". It is
tricky, it depends on the level of compiler optimisation, it depends
on the exact circumstances, and it is just hard to get right.
 
Army1987

There is no way in conforming Standard C to access the representation
of a float through type int, unsigned int, long or long int, even
though this would be very useful.

float f;
int i;
memcpy(&i, &f, sizeof f);
/* do what you want with i */
memcpy(&f, &i, sizeof f);
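
The memcpy approach can be wrapped into a pair of helpers. This is a minimal sketch under stated assumptions: it uses `unsigned int` rather than `int` so that bit operations on the result are well behaved, it assumes `sizeof(float) == sizeof(unsigned int)` (checked with `assert`), and the names `float_to_bits` and `bits_to_float` are hypothetical. What the bits mean still depends on the implementation's float representation.

```c
#include <assert.h>
#include <string.h>

/* Copies the object representation of f into an unsigned int.
   Assumes both types have the same size. */
unsigned int float_to_bits(float f)
{
    unsigned int u;
    assert(sizeof u == sizeof f);
    memcpy(&u, &f, sizeof f);
    return u;
}

/* Copies the bits back into a float; the inverse of float_to_bits. */
float bits_to_float(unsigned int u)
{
    float f;
    assert(sizeof f == sizeof u);
    memcpy(&f, &u, sizeof f);
    return f;
}
```

On an implementation where float is IEEE 754 single precision and `unsigned int` is 32 bits wide, `float_to_bits(1.0f)` would be expected to yield `0x3F800000`.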
 
Army1987

I don't think this changes anything.

Of course this isn't portable, because you'd need to know the
representation of float, the endianness of int, etc., but (except
in the case of trap representations) it works, even if int and
float have different alignments, and even if a compiler optimizes
away either assignment in something like
union { float f; int i; } u;
u.f = 42.0;
u.i ^= MAGIC;
printf("%g\n", u.f);
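
For comparison, the union fragment above can be completed into a self-contained sketch. The assumptions are loud ones: float is taken to be IEEE 754 single precision, `unsigned int` to be 32 bits wide, and `SIGN_BIT` stands in for the unspecified `MAGIC`. The member is `unsigned int` rather than `int`, and `flip_sign` is a hypothetical name. Whether compilers treat union punning reliably is exactly what is disputed in this thread, so this illustrates the idiom without endorsing it.

```c
#include <assert.h>

/* Assumed mask for the sign bit of an IEEE 754 single-precision value. */
#define SIGN_BIT 0x80000000u

union float_bits {
    float f;
    unsigned int u;
};

/* Stores x in the float member, toggles the assumed sign bit through
   the integer member, then reads the float member back. */
float flip_sign(float x)
{
    union float_bits v;
    assert(sizeof v.f == sizeof v.u);
    v.f = x;
    v.u ^= SIGN_BIT;
    return v.f;
}
```

The memcpy version makes the same representation assumptions but does not rely on how the compiler handles reads of a union member other than the one last stored.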
 
Harald van Dĳk

Peter said:
I don't think this changes anything.

It does. One problem with using *(int *) &f, simplified a bit, is that
you're reading and modifying a float using an lvalue of type int. Using
memcpy to move f's representation into a real int avoids this one. The
other problems are all avoided already if the implementation makes
sizeof(float) equal to sizeof(int), and int has no trap representations.
 
Peter J. Holzer

It does. One problem with using *(int *) &f, simplified a bit, is that
you're reading and modifying a float using an lvalue of type int. Using
memcpy to move f's representation into a real int avoids this one.

I don't think it does. If you think float and int variables are somehow
special, use malloced memory instead:


void *p1, *p2;
float *fp;
int *ip;

assert(sizeof (int) == sizeof (float));
p1 = malloc(sizeof (float));
p2 = malloc(sizeof (int));
fp = p1;
ip = p2;
*fp = 42.0;
memcpy(p2, p1, sizeof (float));
/* do something with *ip */
memcpy(p1, p2, sizeof (float));


How is this different from

p1 = malloc(sizeof (float));
fp = p1;
ip = p1;
*fp = 42.0;
/* do something with *ip */

?

You are accessing memory with the same content the same way. If the
sizeof (float) bytes pointed to by p1 are a float which cannot be
accessed as an integer after the assignment (*fp = 42.0), so are the
sizeof (float) bytes pointed to by p2 after the first memcpy.

hp
 
