herding the bits of a float into an integer array

frank

Elsethread there is a discussion of how to check whether a bit is set.
I'm trying to do something similar. So far it looks like this:

dan@dan-desktop:~/source$ gcc -std=c99 -Wall -Wextra bits1.c -o out
dan@dan-desktop:~/source$ ./out
value is 0
count is 0
dan@dan-desktop:~/source$ cat bits1.c
#include <stdio.h>
#define VALUE_BITS 32

int main(void)
{
    int count = 0;
    int bit;

    float x = .5;
    int value;

    value = (int) x;

    printf("value is %d\n", value);

    for (bit = 0; bit < VALUE_BITS; bit++)
    {
        if (value & 1)
            count++;
        value >>= 1;
    }
    printf("count is %d\n", count);
    return 0;
}


// gcc -std=c99 -Wall -Wextra bits1.c -o out
dan@dan-desktop:~/source$

The idea is that I want to see the bit pattern for an arbitrary float.
The above approach is what Lawrence Kirby starts out with in chapter 5 of
Unleashed. (Presumably, his works.)

I've also received the following advice on how to do this:

C casts do the appropriate conversion. If you want to see the
bits through casts you have to dereference a pointer cast to
a pointer to a different type, though in that case the standard
does not guarantee the result.

To do it following the standard (at least C90), you cast a
pointer to (unsigned char *), memcpy() the bits to a variable
through a pointer also cast to (unsigned char *).

end quote

I don't really understand the shift operators, and they seem to be all we
have to do this.

Anyways, my question is how to look at the bit representation of a float
in C. Thanks for your comment and cheers,
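
A minimal sketch of the memcpy() route the quoted advice describes,
assuming nothing beyond standard C99 (%zu is the length modifier for
size_t):

#include <stdio.h>
#include <string.h>

int main(void)
{
    float x = 0.5f;
    unsigned char bytes[sizeof x];

    /* copying through unsigned char is the portable way to
       inspect an object's representation */
    memcpy(bytes, &x, sizeof x);

    for (size_t i = 0; i < sizeof x; i++)
        printf("byte %zu: 0x%02x\n", i, bytes[i]);
    return 0;
}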
 
Ian Collins

frank said:
To do it following the standard (at least C90), you cast a
pointer to (unsigned char *), memcpy() the bits to a variable
through a pointer also cast to (unsigned char *).

end quote

I don't really understand the shift operators, and they seem to be all we
have to do this.

Anyways, my question is how to look at the bit representation of a float
in C. Thanks for your comment and cheers,

This should do the trick for little-endian systems:

void printFloatBits( float f )
{
    uint8_t* p = ((uint8_t*)&f) + sizeof(float) - 1;  // Point to MS byte

    for( int byte = 0; byte < sizeof(float); ++byte )
    {
        uint8_t mask = 1<<CHAR_BIT-1;  // Mask for MS bit

        for( int bit = 0; bit < CHAR_BIT; ++bit )
        {
            printf( "%d", (*p & mask)>0 );
            mask >>= 1;
        }
        --p;
    }
    puts("");
}
 
frank

This should do the trick for little-endian systems:

Alright, thx Ian. Nothing like an early success:

dan@dan-desktop:~/source$ ./out
00111111000000000000000000000000
dan@dan-desktop:~/source$ cat bits2.c
#include <stdio.h>
#include <stdint.h>
#include <limits.h>

void
printFloatBits (float f)
{
  uint8_t *p = ((uint8_t *) &f) + sizeof (float) - 1;   // Point to MS byte

  for (int byte = 0; byte < sizeof (float); ++byte)
    {
      uint8_t mask = 1 << CHAR_BIT - 1;   // Mask for MS bit

      for (int bit = 0; bit < CHAR_BIT; ++bit)
        {
          printf ("%d", (*p & mask) > 0);
          mask >>= 1;
        }
      --p;
    }
  puts ("");
}

int
main (void)
{
  float x = .5;
  printFloatBits (x);

  return 0;
}


// gcc -std=c99 -Wall -Wextra bits2.c -o out
dan@dan-desktop:~/source$

Apparently this works for an Ubuntu byte as well. The programs I've
looked at today all have this line:
mask >>= 1;

What precisely is the action here?
 
Keith Thompson

frank said:
Elsethread there is a discussion of how to check whether a bit is set.
I'm trying to do something similar. So far it looks like this:

dan@dan-desktop:~/source$ gcc -std=c99 -Wall -Wextra bits1.c -o out
dan@dan-desktop:~/source$ ./out
value is 0
count is 0
dan@dan-desktop:~/source$ cat bits1.c
#include <stdio.h>
#define VALUE_BITS 32

int main(void)
{
    int count = 0;
    int bit;

    float x = .5;
    int value;

    value = (int) x;

This cast merely converts the value of x from float to int.
It doesn't "herd the bits"; it discards the fractional part.
(int)0.5 == 0; (int)123.4 == 123.

(Incidentally, the cast is superfluous; the conversion would have been
done implicitly.)
printf("value is %d\n", value);

for(bit = 0; bit < VALUE_BITS; bit++)
{
if (value & 1)
count ++;
value >>= 1;
}
printf("count is %d\n", count);
return 0;
}


// gcc -std=c99 -Wall -Wextra bits1.c -o out
dan@dan-desktop:~/source$

The idea is that I want to see the bit pattern for an arbitrary float.
The above approach is what Lawrence Kirby starts out with in chapter 5 of
Unleashed. (Presumably, his works.)

I've also received the following advice on how to do this:

C casts do the appropriate conversion. If you want to see the
bits through casts you have to dereference a pointer cast to
a pointer to a different type, though in that case the standard
does not guarantee the result.

To do it following the standard (at least C90), you cast a
pointer to (unsigned char *), memcpy() the bits to a variable
through a pointer also cast to (unsigned char *).

end quote

I don't really understand the shift operators, and they seem to be all we
have to do this.

Anyways, my question is how to look at the bit representation of a float
in C. Thanks for your comment and cheers,

Converting between numeric types just converts the values; such
conversions don't tell you anything about the representation.

Pointer conversions are one way to do "type-punning", i.e.,
treating an object of one type as if it were of a different type,
reinterpreting the bits that make up its representation.

Here's something to get you started:

#include <stdio.h>
int main(void)
{
    float x = 0.5;
    unsigned char *ptr = (unsigned char*)&x;
    int i;

    for (i = 0; i < sizeof x; i++) {
        printf("byte %d: 0x%02x\n", i, ptr[i]);
    }
    return 0;
}
 
Ian Collins

pete said:
#include <limits.h>
#include <stdio.h>

#define STRING "%10f = %s\n"
#define E_TYPE float
#define INITIAL (54.3f)
#define FINAL 500
#define INC(E) ((E) *= 2)

typedef E_TYPE e_type;

Why all the nasty #defines? Or is this a piss take?
 
frank

Converting between numeric types just converts the values; such
conversions don't tell you anything about the representation.

Pointer conversions are one way to do "type-punning", i.e., treating an
object of one type as if it were of a different type, reinterpreting the
bits that make up its representation.

Here's something to get you started:

Thx, Keith.

dan@dan-desktop:~/source$ gcc -std=c99 -Wall -Wextra bits3.c -o out
bits3.c: In function ‘main’:
bits3.c:8: warning: comparison between signed and unsigned integer
expressions
dan@dan-desktop:~/source$ ./out
byte 0: 0x00
byte 1: 0x00
byte 2: 0x00
byte 3: 0x3f
dan@dan-desktop:~/source$ cat bits3.c
#include <stdio.h>
int main(void)
{
    float x = 0.5;
    unsigned char *ptr = (unsigned char*)&x;
    int i;

    for (i = 0; i < sizeof x; i++) {
        printf("byte %d: 0x%02x\n", i, ptr[i]);
    }
    return 0;
}

// gcc -std=c99 -Wall -Wextra bits3.c -o out
dan@dan-desktop:~/source$

I can see that this is correct because I've been looking at the
neighborhood of .5 for a couple weeks now.

With the warning, is it sufficient to ignore it by noting that the
comparison is never to a negative with either value and that it's well
below where these datatypes wrap around on the high end?
 
Keith Thompson

frank said:
~/source$ gcc -std=c99 -Wall -Wextra bits3.c -o out
bits3.c: In function ‘main’:
bits3.c:8: warning: comparison between signed and unsigned integer
expressions [...]
dan@dan-desktop:~/source$ cat bits3.c
#include <stdio.h>
int main(void)
{
    float x = 0.5;
    unsigned char *ptr = (unsigned char*)&x;
    int i;

    for (i = 0; i < sizeof x; i++) {
        printf("byte %d: 0x%02x\n", i, ptr[i]);
    }
    return 0;
}

// gcc -std=c99 -Wall -Wextra bits3.c -o out [...]
With the warning, is it sufficient to ignore it by noting that the
comparison is never to a negative with either value and that it's well
below where these datatypes wrap around on the high end?


Yes, you can safely ignore the warning, but it might be better to
change "int i;" to "size_t i;". (I compiled without extra warnings;
my mistake.)
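
A sketch of the loop with that change (%zu is the C99 conversion for
size_t):

#include <stdio.h>

int main(void)
{
    float x = 0.5;
    unsigned char *ptr = (unsigned char *)&x;

    /* size_t matches the unsigned type of sizeof, so the
       comparison no longer mixes signed and unsigned */
    for (size_t i = 0; i < sizeof x; i++)
        printf("byte %zu: 0x%02x\n", i, ptr[i]);
    return 0;
}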
 
Ian Collins

pete said:
It used to look like this:

#define STRING "%u = %s\n"
#define E_TYPE unsigned
#define INITIAL 0
#define FINAL 8
#define INC(E) (++(E))

but OP said "float"

Ah, I see, poor man's templates!
 
frank

When mask is non-negative, then
mask >>= 1;
means exactly the same thing as
mask = mask / 2;

If mask is equal to 0x80u before
mask >>= 1;
then mask will be equal to 0x40u after.

Ok. I've had a little time now to digest this and dink around with it.

217.199997 = 01000011010110010011001100110011
mask starts as 128
mask after >>= is 64
mask after >>= is 32
mask after >>= is 16
mask after >>= is 8
mask after >>= is 4
mask after >>= is 2
mask after >>= is 1
mask after >>= is 0

This is pete's version with mask printf'ed after it gets a new value. It
behaves like he says above.

He assigns the original value with this line:
mask = ((unsigned char)-1 >> 1) + 1;

This is Ian's with the mask printed.
mask starts as 128
00111111mask starts as 128
00000000mask starts as 128
00000000mask starts as 128
00000000

This is how Ian gets his original value:
uint8_t mask = 1 << CHAR_BIT - 1; // Mask for MS bit

How do these values both end up being 128 and is there a difference
between them?
 
Ian Collins

frank wrote:

This is pete's version with mask printf'ed after it gets a new value. It
behaves like he says above.

He assigns the original value with this line:
mask = ((unsigned char)-1 >> 1) + 1;

This is Ian's with the mask printed.
mask starts as 128
00111111mask starts as 128
00000000mask starts as 128
00000000mask starts as 128
00000000

This is how Ian gets his original value:
uint8_t mask = 1 << CHAR_BIT - 1; // Mask for MS bit

How do these values both end up being 128 and is there a difference
between them?

Mine's clearer!

Assuming CHAR_BIT = 8:

Pete starts with (unsigned char)-1 (0xff), shifts it right one bit
(0x7f) then adds one (0x80).

I start with one (0x01) and shift it left 7 bits (0x80).

Either will work for any (non-zero) CHAR_BIT value.
 
Ian Collins

Ian said:
Mine's clearer!

Assuming CHAR_BIT = 8:

Pete starts with (unsigned char)-1 (0xff), shifts it right one bit
(0x7f) then adds one (0x80).

I start with one (0x01) and shift it left 7 bits (0x80).

Either will work for any (non-zero) CHAR_BIT value.
Actually not quite; I made the implicit assumption that CHAR_BIT = 8 by
using uint8_t.
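
For comparison, a sketch with plain unsigned char, where both
constructions land on the same mask for any CHAR_BIT:

#include <stdio.h>
#include <limits.h>

int main(void)
{
    /* pete's version: the shift clears the top bit of UCHAR_MAX,
       and adding 1 carries back into it: 0xff -> 0x7f -> 0x80 */
    unsigned char a = ((unsigned char)-1 >> 1) + 1;

    /* Ian's version: walk a 1 up to the top bit position */
    unsigned char b = 1u << (CHAR_BIT - 1);

    printf("a = %u, b = %u\n", a, b);  /* both 128 when CHAR_BIT == 8 */
    return 0;
}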
 
frank

Actually not quite; I made the implicit assumption that CHAR_BIT = 8 by
using uint8_t.

Would ul be the appropriate conversion specifier for printf for this type?

I've been dinking around more with your program:

dan@dan-desktop:~/source$ gcc -std=c99 -Wall -Wextra bits2.c -o out
bits2.c: In function ‘printFloatBits’:
bits2.c:10: warning: comparison between signed and unsigned integer
expressions
bits2.c:12: warning: suggest parentheses around ‘-’ inside ‘<<’
bits2.c:20: warning: format ‘%u’ expects type ‘unsigned int’, but
argument 2 has type ‘uint8_t *’
dan@dan-desktop:~/source$ ./out
mask starts as 128
00 10 21 31 41 51 61 71 p is 3214006674mask starts as 128
00 10 20 30 40 50 60 70 p is 3214006673mask starts as 128
00 10 20 30 40 50 60 70 p is 3214006672mask starts as 128
00 10 20 30 40 50 60 70 p is 3214006671
dan@dan-desktop:~/source$ cat bits2.c
#include <stdio.h>
#include <stdint.h>
#include <limits.h>

void
printFloatBits (float f)
{
  uint8_t *p = ((uint8_t *) &f) + sizeof (float) - 1;   // Point to MS byte

  for (int byte = 0; byte < sizeof (float); ++byte)
    {
      uint8_t mask = 1 << CHAR_BIT - 1;   // Mask for MS bit
      printf ("mask starts as %u\n", mask);
      for (int bit = 0; bit < CHAR_BIT; ++bit)
        {
          printf ("%d%d ", bit, (*p & mask) > 0);
          mask >>= 1;
        }
      --p;
      printf ("p is %u", p);
    }
  puts ("");
}

int
main (void)
{
  float x = .5;
  printFloatBits (x);

  return 0;
}


// gcc -std=c99 -Wall -Wextra bits2.c -o out
dan@dan-desktop:~/source$

So this line sets the pointer to the ultimate byte of the float:
uint8_t *p = ((uint8_t *) &f) + sizeof (float) - 1;   // Point to MS byte

The control works by decrementing p and incrementing byte, which must
have the same number of useful iterations.

Inside the byte, do the bits go right to left or vice versa? I want to
think that they have to go right to left, but I just don't see it. Is
the sign bit necessarily on one side?
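
The mask settles that: it starts at the most significant bit and shifts
right, so each byte prints MSB first; whether the float's sign bit lands
in the byte the loop visits first is an endianness question, not a
bit-order one. A sketch for a single byte, assuming CHAR_BIT == 8 and
borrowing the 0x3f byte from earlier:

#include <stdio.h>
#include <limits.h>

int main(void)
{
    unsigned char byte = 0x3f;                 /* top byte of 0.5f above */
    unsigned char mask = 1u << (CHAR_BIT - 1);

    /* mask starts at the most significant bit and halves each
       pass, so the bits come out left to right: MSB first */
    for (int bit = 0; bit < CHAR_BIT; ++bit) {
        printf("%d", (byte & mask) > 0);
        mask >>= 1;
    }
    putchar('\n');                             /* prints 00111111 */
    return 0;
}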
 
Keith Thompson

Richard Heathfield said:
No, since there's no such conversion specifier. You probably meant lu,
which is for unsigned long ints. For uint8_t, it seems to me that you
should use PRIu8 although I find the Standard less than clear on this
point. Example:

uint8_t mynum = 42;
printf("The answer is %" PRIu8 " according to DNA\n", mynum);

Ghastly syntax.

<snip>

The PRI* macros defined in <inttypes.h> define the proper formats for
the various typedefs, but yes, they're ugly.

In this particular case, it's guaranteed that uint8_t promotes to
signed int, so you could just use "%d". But since the promoted
value is non-negative, you could also use "%u". (The glibc version
of <inttypes.h> has ``#define PRIu8 "u"''.)

I'd probably use an explicit conversion with a carefully chosen
target type:

printf("The answer is %u according to DNA\n", (unsigned int)mynum);
 
frank

I like (1u << CHAR_BIT - 1) better,
since the bit that winds up being set is the sign bit and the type of (1
<< CHAR_BIT - 1) is (signed int).



(-1), when cast to an unsigned type,
yields the max value for the unsigned type. (((unsigned char)-1) ==
UCHAR_MAX)

The mask that I wanted can also be expressed as
(UCHAR_MAX - UCHAR_MAX / 2)
which happens to be the same thing as
(UCHAR_MAX / 2 + 1)
which I wrote as
(((unsigned char)-1 >> 1) + 1)

I'll have to think about why those are equivalent. I wanted to see
whether your version showed the same output as Ian's, and I've got good-
looking output, but my source is just mangled:

dan@dan-desktop:~/source$ man indent
dan@dan-desktop:~/source$ indent -i3 -kr bits5.c
dan@dan-desktop:~/source$ cat bits5.c
/* BEGIN bitstr.c */

#include <limits.h>
#include <stdio.h>
#include <stdint.h>

#define STRING "%10f = %s\n"
#define E_TYPE float
#define INITIAL (54.3f)
#define FINAL 500
#define INC(E) ((E) *= 2)

typedef E_TYPE e_type;

void bitstr(char *str, const void *obj, size_t n);
void printFloatBits(float f);

int main(void)
{
   e_type e;
   char ebits[CHAR_BIT * sizeof e + 1];

   puts("\n/* BEGIN output from bitstr.c */\n");
   for (e = INITIAL; FINAL >= e; INC(e)) {
      bitstr(ebits, &e, sizeof e);
      printf(STRING, e, ebits);
      printf(" = ");
      printFloatBits(e);
   }
   puts("\n/* END output from bitstr.c */");
   return 0;
}

void bitstr(char *str, const void *obj, size_t n)
{
   unsigned mask;
   const unsigned char *const byte = obj;

   while (n-- != 0) {
      mask = ((unsigned char) -1 >> 1) + 1;

      do {
         *str++ = (char) (mask & byte[n] ? '1' : '0');
         mask >>= 1;
      } while (mask != 0);
   }
   *str = '\0';
}

void printFloatBits(float f)
{
   uint8_t *p = ((uint8_t *) &f) + sizeof(float) - 1;   // Point to MS byte

   for (size_t byte = 0; byte < sizeof(float); ++byte) {
      uint8_t mask = 1 << (CHAR_BIT - 1);   // Mask for MS bit
      for (int bit = 0; bit < CHAR_BIT; ++bit) {
         printf("%d", (*p & mask) > 0);
         mask >>= 1;
      }
      --p;
   }
   puts("");
}

/* END bitstr.c */
// gcc -std=c99 -Wall -Wextra bits5.c -o out
dan@dan-desktop:~/source$

How did this end up looking terrible?

Output looks good though:

/* BEGIN output from bitstr.c */

54.299999 = 01000010010110010011001100110011
= 01000010010110010011001100110011
108.599998 = 01000010110110010011001100110011
= 01000010110110010011001100110011
217.199997 = 01000011010110010011001100110011
= 01000011010110010011001100110011
434.399994 = 01000011110110010011001100110011
= 01000011110110010011001100110011

/* END output from bitstr.c */

Is there a bit model for floats in standard C?
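
Not one that fixes the bits: C99's Annex F describes IEC 60559 (IEEE 754)
bindings, but an implementation only opts in by defining __STDC_IEC_559__.
What the standard does guarantee is the value model behind frexp(), which
splits a value into a fraction and a power of two without peeking at the
representation. A sketch (may need -lm to link):

#include <math.h>
#include <stdio.h>

int main(void)
{
    float x = 54.3f;
    int exp;

    /* frexp() returns a fraction in [0.5, 1) such that
       x == frac * 2^exp, whatever the underlying bit layout */
    double frac = frexp(x, &exp);
    printf("%f = %f * 2^%d\n", x, frac, exp);
    return 0;
}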
 
Ian Collins

Keith said:
The PRI* macros defined in <inttypes.h> define the proper formats for
the various typedefs, but yes, they're ugly.

In this particular case, it's guaranteed that uint8_t promotes to
signed int, so you could just use "%d". But since the promoted
value is non-negative, you could also use "%u". (The glibc version
of <inttypes.h> has ``#define PRIu8 "u"''.)

I'd probably use an explicit conversion with a carefully chosen
target type:

printf("The answer is %u according to DNA\n", (unsigned int)mynum);

As you can see, I prefer to let nature take its course...
 
frank

Some compilers will warn of a signed/unsigned mismatch for the
operands of the "less than" operator.

I was thinking that it's a trade-off between having source that draws no
warnings and readability. Everyone understands int, and we know it's not
going to have a range that outstrips our fingers.
I tend to code to avoid that mismatch.

Perhaps the concomitant is that your source is more technically difficult.
 
Frank

(unsigned char)-1 is the largest possible value that can be stored in
an unsigned char.

while((unsigned char)-1 == UCHAR_MAX)
{
}

is an infinite loop.

Ok. But an unsigned char can't *have* the value negative one, right? How
about ((unsigned char) -42) ?
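
Right, it can't: conversion to an unsigned type is defined as repeatedly
adding UCHAR_MAX + 1 until the value is in range, so the result is never
negative. A quick sketch of both cases:

#include <stdio.h>
#include <limits.h>

int main(void)
{
    /* with CHAR_BIT == 8: -1 + 256 = 255, -42 + 256 = 214 */
    printf("(unsigned char)-1  = %d\n", (unsigned char)-1);
    printf("(unsigned char)-42 = %d\n", (unsigned char)-42);
    printf("UCHAR_MAX          = %d\n", UCHAR_MAX);
    return 0;
}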
 
