# analysis of floating point values

Discussion in 'C++' started by K4 Monk, Feb 18, 2011.

1. ### K4 MonkGuest

hi, is there a way by which we can see a bit-by-bit representation of
how a floating point value is stored? I was thinking something similar
to how we shift bits using the >> and << operators but they only work
for integers. In short, I'd like to first find out what the size of
the value is in bytes and then analyse every bit in every byte. Is
this possible?

thanks!

K4 Monk, Feb 18, 2011

2. ### Jens Thoms ToerringGuest

K4 Monk <> wrote:
> hi, is there a way by which we can see a bit-by-bit representation of
> how a floating point value is stored? I was thinking something similar
> to how we shift bits using the >> and << operators but they only work
> for integers. In short, I'd like to first find out what the size of
> the value is in bytes and then analyse every bit in every byte. Is
> this possible?

Yes, it's possible. One way is

#include <cstddef>
#include <cstdio>
#include <cstring>

int main( )
{
    double d = 42.1419;
    unsigned char * c = new unsigned char [ sizeof d ];
    std::memcpy( c, &d, sizeof d );   // copy the raw bytes of d into the buffer

    for ( std::size_t i = 0; i < sizeof d; ++i )
        std::printf( "%02x ", c[ i ] );
    std::printf( "\n" );
    delete [ ] c;
}

(I use printf() here since I'm too lazy to look up the
correct way to get properly formatted hex output from
std::cout).

The 'sizeof d' bits (which could also be written as
'sizeof(double)') tell you how many bytes there are in
a double. To keep things simple a buffer of unsigned
chars of exactly this size is allocated and the bytes
of the double are directly copied over there. Then
you can print out the hexadecimal values of each of
those bytes. Splitting up a hex value into its bits is
so simple it can be done without a program;-) (You
could also do without the extra buffer using casts
and a bit of pointer fiddling.)
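
The buffer-free variant mentioned above might look roughly like the following. This is only a sketch: the helper name hex_bytes is made up for illustration, it assumes 8-bit bytes, and the bytes come out in memory order, so the result depends on the machine.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: returns the bytes of any object as hex, in memory order.
template <typename T>
std::string hex_bytes(const T& value)
{
    // Inspecting an object through an unsigned char pointer is permitted.
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&value);
    std::string out;
    char buf[4];  // two hex digits, one space, terminating NUL
    for (std::size_t i = 0; i < sizeof value; ++i) {
        std::snprintf(buf, sizeof buf, "%02x ", static_cast<unsigned>(p[i]));
        out += buf;
    }
    return out;
}
```

Printing hex_bytes(d) for the double d from the program above should show the same bytes, without the new/delete pair.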

Of course, those hex values won't make too much sense
without any idea what they mean. And what they mean can
differ from machine to machine. Luckily, most machines
nowadays use IEEE 754-2008 format, see e.g.

http://en.wikipedia.org/wiki/IEEE_754-2008

and the pages linked in there. Keep in mind that there
can be endianness issues (i.e. on some machines the
low order bytes come first in memory, on others the
high order bytes and then there are further possible
variations) and that not all machines use this format.
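
To give the hex values some meaning, the three fields of a 64-bit IEEE 754 value can be pulled apart with shifts once the bytes have been copied into an integer. The sketch below assumes that double is IEEE 754 binary64 and that it uses the same byte order as a 64-bit integer on the machine; the names DoubleFields and split_double are invented for this example.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical holder for the three binary64 fields.
struct DoubleFields {
    unsigned sign;            // 1 bit
    unsigned exponent;        // 11 bits, biased by 1023
    std::uint64_t fraction;   // 52 bits, without the implicit leading 1
};

// Assumes double is IEEE 754 binary64 and shares the byte order of uint64_t.
inline DoubleFields split_double(double d)
{
    static_assert(sizeof(double) == sizeof(std::uint64_t),
                  "this sketch expects a 64-bit double");
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);   // copy the bytes, no pointer aliasing
    DoubleFields f;
    f.sign = static_cast<unsigned>(bits >> 63);
    f.exponent = static_cast<unsigned>((bits >> 52) & 0x7FF);
    f.fraction = bits & ((std::uint64_t(1) << 52) - 1);
    return f;
}
```

Under those assumptions, 1.0 splits into sign 0, biased exponent 1023 and fraction 0.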

Regards, Jens
--
\ Jens Thoms Toerring ___
\__________________________ http://toerring.de

Jens Thoms Toerring, Feb 18, 2011

3. ### gwowenGuest

K4 Monk wrote:
> hi, is there a way by which we can see a bit-by-bit representation of
> how a floating point value is stored? I was thinking something similar
> to how we shift bits using the >> and << operators but they only work
> for integers. In short, I'd like to first find out what the size of
> the value is in bytes and then analyse every bit in every byte. Is
> this possible?

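#include <climits>
#include <cstddef>
#include <iostream>

template <typename T>
void print_representation(const T* addr)
{
    // View the object as a sequence of bytes, in memory order.
    const unsigned char* chaddr = reinterpret_cast<const unsigned char*>(addr);
    for (std::size_t q = 0; q < sizeof(T); ++q) {
        for (int bit = CHAR_BIT; bit > 0; --bit) {
            std::cout << ((*(chaddr + q) & (1U << (bit - 1))) ? 1 : 0);
        }
    }
    std::cout << std::endl;
}

int main()
{
    float x = -43.43;
    double y = 43.43;
    print_representation(&x);
    print_representation(&y);
}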

gwowen, Feb 18, 2011
4. ### K4 MonkGuest

Thanks to all, Leigh, Jens, and gwowen...this is very useful...good
for practising C++ as well.

One quick note, I see you cast to (unsigned char*) but using sizeof
operator I see that "double" is 16 bytes (on my machine a core2duo
running linux x86_64) and unsigned char* is 8 bytes. But still the
code snippets you provided are able to print everything. Ah well, time
to go through the books again!

K4 Monk, Feb 18, 2011
5. ### PaulGuest

"Jens Thoms Toerring" <> wrote in message
news:-berlin.de...
> K4 Monk <> wrote:
>> hi, is there a way by which we can see a bit-by-bit representation of
>> how a floating point value is stored? I was thinking something similar
>> to how we shift bits using the >> and << operators but they only work
>> for integers. In short, I'd like to first find out what the size of
>> the value is in bytes and then analyse every bit in every byte. Is
>> this possible?

>
> Yes, it's possible. One way is
>
> #include <cstdio>
> #include <cstring>
>
> int main( )
> {
> double d = 42.1419;
> unsigned char * c = new unsigned char [ sizeof d ];
> memcpy( c, &d, sizeof d );
>
> for ( size_t i = 0; i < sizeof d; ++i )
> printf( "%02x ", c[ i ] );
> printf( "\n" );
> delete [ ] c;
> }
>
> (I use printf() here since I'm too lazy to look up the
> correct way to get properly formatted hex output from
> std::cout).

It's simply:
std::cout<<std::hex<< your_integer;
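
One catch when applying this to the byte dump: streaming an unsigned char prints it as a character, so each byte has to be converted to an integer first. A minimal sketch, with print_hex as a made-up helper name:

```cpp
#include <cstddef>
#include <iomanip>
#include <iostream>
#include <ostream>

// Hypothetical helper: writes n bytes as two-digit hex values to any ostream.
inline void print_hex(std::ostream& os, const unsigned char* bytes, std::size_t n)
{
    // Remember the stream state so the caller's formatting is not disturbed.
    const std::ios_base::fmtflags saved_flags = os.flags();
    const char saved_fill = os.fill('0');
    os << std::hex;
    for (std::size_t i = 0; i < n; ++i) {
        // The cast matters: an unsigned char would be printed as a character.
        // setw applies to the next output only, so it is repeated each time.
        os << std::setw(2) << static_cast<unsigned>(bytes[i]) << ' ';
    }
    os.fill(saved_fill);
    os.flags(saved_flags);
}
```

It can be called as print_hex(std::cout, c, sizeof d) with the buffer from the quoted program.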

HTH.

Paul, Feb 18, 2011
6. ### Jens Thoms ToerringGuest

K4 Monk <> wrote:
> Thanks to all, Leigh, Jens, and gwowen...this is very useful...good
> for practising C++ as well.

> One quick note, I see you cast to (unsigned char*) but using sizeof
> operator I see that "double" is 16 bytes (on my machine a core2duo
> running linux x86_64) and unsigned char* is 8 bytes.

In e.g.

> > unsigned char* bits = reinterpret_cast<unsigned char*>(&f);

there's a cast from 'double *' (note the '&' in front of 'f')
to 'unsigned char *', so the sizeof a double is irrelevant
here. The size of a double becomes only important when you
then iterate over the array of unsigned chars with

> > for (std::size_t i = 0; i != sizeof(f); ++i)

to stop when the end of the array (which is just the
double split up into single bytes) is reached.
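
The difference between the size of a pointer and the size of the thing it points to can be checked directly. The sketch below uses an invented helper, size_report; all the numbers it reports are implementation-defined.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: lists object sizes next to pointer sizes, in bytes.
inline std::string size_report()
{
    std::ostringstream os;
    os << "float: " << sizeof(float) << '\n'
       << "double: " << sizeof(double) << '\n'
       << "long double: " << sizeof(long double) << '\n'
       << "double*: " << sizeof(double*) << '\n'
       << "unsigned char*: " << sizeof(unsigned char*) << '\n';
    return os.str();
}
```

On typical x86_64 Linux toolchains double reports 8 and long double reports 16, so a 16 may well have come from a long double.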

Regards, Jens
--
\ Jens Thoms Toerring ___
\__________________________ http://toerring.de

Jens Thoms Toerring, Feb 18, 2011
7. ### K4 MonkGuest

On Feb 18, 7:26 pm, (Jens Thoms Toerring) wrote:
> In e.g.
>
> > > unsigned char* bits = reinterpret_cast<unsigned char*>(&f);

>
> there's a cast from 'double *' (note the '&' in front of 'f')
> to 'unsigned char *', so the sizeof a double is irrelevant
> here.

On Feb 18, 7:28 pm, Leigh Johnston <> wrote:
> On 18/02/2011 14:15, K4 Monk wrote:
> The cast is of a *pointer to* the double rather than of the *value of*
> the double; an unsigned char pointer is the same size as a double pointer..
>
> /Leigh

Ok I understand now. Thanks. So it's basically because pointers are
always the same size regardless of whether they point to a double or a
char, and we then switch the type of the pointer so it acts as a ptr
to a char. (And the reason we do this is because double* will always
take chunks of memory in 16 bytes whereas for byte-by-byte analysis we
need a pointer which takes byte-sized chunks, correct?) Wow I feel so much
wiser now. This is exciting, it means we can also analyze the layout
of other objects in memory...cool!
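
The same idea carries over to other objects, for example a small struct. This is only a sketch: Sample and object_bytes are names invented here, the helper is meant for trivially copyable types, and where the padding falls is up to the implementation.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical example type; padding between members is implementation-defined.
struct Sample {
    char tag;
    double value;
};

// Copies the object representation of a trivially copyable object into a vector.
template <typename T>
std::vector<unsigned char> object_bytes(const T& obj)
{
    std::vector<unsigned char> bytes(sizeof obj);
    std::memcpy(bytes.data(), &obj, sizeof obj);
    return bytes;
}
```

Comparing the size of object_bytes(s) for a Sample s with sizeof(char) + sizeof(double) shows how many padding bytes the compiler added, and offsetof(Sample, value) shows where they sit.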

K4 Monk, Feb 18, 2011
8. ### PaulGuest

"K4 Monk" <> wrote in message
news:...
> Thanks to all, Leigh, Jens, and gwowen...this is very useful...good
> for practising C++ as well.
>
> One quick note, I see you cast to (unsigned char*) but using sizeof
> operator I see that "double" is 16 bytes (on my machine a core2duo
> running linux x86_64) and unsigned char* is 8 bytes. But still the
> code snippets you provided are able to print everything. Ah well, time
> to go through the books again!
>

The outer loop is limited by the sizeof the float or double and not the
sizeof char.
for(size_t q=0;q<sizeof(T); ++q).

The data structure, that is the double or float, is traversed as if it were
an array of chars.
*(chaddr+q) in the inner loop, is indexing this data structure in char sized
chunks.

HTH

Paul, Feb 18, 2011
9. ### Jens Thoms ToerringGuest

K4 Monk <> wrote:
> On Feb 18, 7:26 pm, (Jens Thoms Toerring) wrote:
> > In e.g.
> >
> > > > unsigned char* bits = reinterpret_cast<unsigned char*>(&f);

> >
> > there's a cast from 'double *' (note the '&' in front of 'f')
> > to 'unsigned char *', so the sizeof a double is irrelevant
> > here.

> On Feb 18, 7:28 pm, Leigh Johnston <> wrote:
> > On 18/02/2011 14:15, K4 Monk wrote:
> > The cast is of a *pointer to* the double rather than of the *value of*
> > the double; an unsigned char pointer is the same size as a double pointer.

> Ok I understand now. Thanks. So it's basically because pointers are
> always the same size regardless of whether they point to a double or a
> char

Mostly correct. I'm not sure about the C++ standard, but the
C standard only guarantees that the size of a char or void
pointer is sufficient to allow a cast from other object types
to them (the back-cast is then also possible). That means that
there's a theoretical possibility that pointers to objects of
other types have smaller sizes. But I haven't seen any such
machine yet and it also rather likely doesn't make much sense
to cast from a char pointer to some other pointer (as long as
it's not a back-cast to the original type).
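
The one conversion that is guaranteed, from an object pointer to void* and back to the original type, can be sketched as follows (round_trip is a made-up name):

```cpp
// Hypothetical helper: converts an object pointer to void* and back again.
template <typename T>
T* round_trip(T* p)
{
    void* generic = p;                // implicit conversion to void*
    return static_cast<T*>(generic);  // converting back yields the original pointer
}
```

For a double d, round_trip(&d) compares equal to &d whatever the pointer sizes on the machine are.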

Be a bit careful with pointers to functions, they aren't
included in this (a function isn't an object). But even there
on most machines it also works.

> , and we then switch the type of the pointer so it acts as a ptr
> to a char. (And the reason we do this is because double* will always
> take chunks of memory in 16 bytes whereas for byte-by-byte analysis we
> need a pointer which takes byte-sized chunks, correct?)

Yes, exactly.
Regards, Jens
--
\ Jens Thoms Toerring ___
\__________________________ http://toerring.de

Jens Thoms Toerring, Feb 18, 2011
10. ### Juha NieminenGuest

Leigh Johnston <> wrote:
> int main()
> {
> double f = 42.42;
> unsigned char* bits = reinterpret_cast<unsigned char*>(&f);
> for (std::size_t i = 0; i != sizeof(f); ++i)
> {
> unsigned char byte = bits[i];
> for (std::size_t j = CHAR_BIT; j != 0; --j)
> std::cout << (byte >> (j-1) & 1 ? '1' : '0');
> }
> }

error: 'size_t' is not a member of 'std'

Juha Nieminen, Feb 20, 2011
11. ### Juha NieminenGuest

None of the presented solutions take into account endianness. Usually when
you want to print the binary representation of something, you want the
most significant bit to be printed first and go down from there (in other
words, you just want the base-2 representation of the value in the same
way you would print a regular base-10 one).

Also, since it's trivial in C++ to make the function work with any type,
not just doubles, why not do that while we are at it?

//---------------------------------------------------------------
#include <iostream>
#include <climits>

template<typename Type>
void printBinaryRepresentation(Type value)
{
    // Resolve if this is a big-endian or a little-endian system:
    int dummy = 1;
    bool littleEndian = (*reinterpret_cast<char*>(&dummy) == 1);

    // The trick is to create a char pointer to the value:
    const unsigned char* bytePtr =
        reinterpret_cast<const unsigned char*>(&value);

    // Loop over the bytes in the floating point value:
    for(unsigned i = 0; i < sizeof(Type); ++i)
    {
        unsigned char byte;
        if(littleEndian) // we have to traverse the value backwards:
            byte = bytePtr[sizeof(Type) - i - 1];
        else // we have to traverse it forwards:
            byte = bytePtr[i];

        // Print the bits in the byte:
        for(int bitIndex = CHAR_BIT-1; bitIndex >= 0; --bitIndex)
            std::cout << ((byte >> bitIndex) & 1);
    }

    std::cout << std::endl;
}

int main()
{
    printBinaryRepresentation(0.5);
    printBinaryRepresentation(-0.5f);
}
//---------------------------------------------------------------

Juha Nieminen, Feb 20, 2011
12. ### Juha NieminenGuest

Leigh Johnston <> wrote:
> On 20/02/2011 15:19, Juha Nieminen wrote:
>> Leigh Johnston<> wrote:
>>> int main()
>>> {
>>> double f = 42.42;
>>> unsigned char* bits = reinterpret_cast<unsigned char*>(&f);
>>> for (std::size_t i = 0; i != sizeof(f); ++i)
>>> {
>>> unsigned char byte = bits[i];
>>> for (std::size_t j = CHAR_BIT; j != 0; --j)
>>> std::cout<< (byte>> (j-1)& 1 ? '1' : '0');
>>> }
>>> }

>>
>> error: 'size_t' is not a member of 'std'

>
> What are you gibbering about? #includes are usually implied in a code
> snippet. std::size_t exists.

And someone learning C++ is supposed to know that how?

It would have cost only a couple of additional lines to post a complete
program.

Juha Nieminen, Feb 20, 2011
13. ### gwowenGuest

On Feb 20, 3:30 pm, Juha Nieminen <> wrote:
> None of the presented solutions take into account endianness.

The problem with that is that you assume every machine's endianness/
byte-order can be described completely by looking at the lowest-
addressed-byte of the representation of unsigned(1). This can be
wrong on an ARM, and is always wrong on a PDP-11. It also assumes that
the endianness of a floating-point type is the same as an integer type.
This is untrue on a smattering of crazier-than-a-bag-of-weasel

> Usually when you want to print the binary representation of something, you want the most significant bit to be printed first and go down from there

But sometimes you're printing the representation of something
precisely to poke around at the innards of the processor, to determine
byte-ordering, mantissa format, etc, and you really want the "bytes
ordered-as-they-are-arranged-in-memory". And of course, sometimes you
want to print the representation of something that is not a value type.
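
Whether the floating-point byte order matches the integer byte order on a given machine can be probed directly instead of assumed. The sketch below (the function name is invented here) copies the bytes of 1.0 into a 64-bit integer and compares the result with the IEEE 754 binary64 pattern for 1.0; it only tells you something if double really is binary64.

```cpp
#include <cstdint>
#include <cstring>

// Returns true if the bytes of 1.0, read as a 64-bit integer, give the
// IEEE 754 binary64 pattern of 1.0. A false result means either the format
// or the byte order of double differs from what this sketch assumes.
inline bool double_matches_integer_order()
{
    static_assert(sizeof(double) == sizeof(std::uint64_t),
                  "this sketch expects a 64-bit double");
    const double one = 1.0;
    std::uint64_t bits;
    std::memcpy(&bits, &one, sizeof bits);
    return bits == 0x3FF0000000000000ULL;
}
```

On common desktop hardware this is expected to return true; on the odder machines mentioned above it may not.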

gwowen, Feb 20, 2011
14. ### K4 MonkGuest

On Feb 20, 8:30 pm, Juha Nieminen <> wrote:
>     bool littleEndian = (*reinterpret_cast<char*>(&dummy) == 1);

took me a minute to understand this but now that I do, it's very
clever! I got confused because dummy is an int of value 1, and in the
line above I wasn't sure if 1 was a bool or an int.

K4 Monk, Feb 22, 2011
15. ### Joshua MauriceGuest

On Feb 18, 6:56 am, (Jens Thoms Toerring) wrote:
> K4 Monk <> wrote:
> > On Feb 18, 7:26 pm, (Jens Thoms Toerring) wrote:
> > > In e.g.

>
> > > > > unsigned char* bits = reinterpret_cast<unsigned char*>(&f);

>
> > > there's a cast from 'double *' (note the '&' in front of 'f')
> > > to 'unsigned char *', so the sizeof a double is irrelevant
> > > here.

> > On Feb 18, 7:28 pm, Leigh Johnston <> wrote:
> > > On 18/02/2011 14:15, K4 Monk wrote:
> > > The cast is of a *pointer to* the double rather than of the *value of*
> > > the double; an unsigned char pointer is the same size as a double pointer.

> > Ok I understand now. Thanks. So it's basically because pointers are
> > always the same size regardless of whether they point to a double or a
> > char

>
> Mostly correct. I'm not sure about the C++ standard, but the
> C standard only guarantees that the size of a char or void
> pointer is sufficient to allow a cast from other object types
> to them (the back-cast is then also possible). That means that
> there's a theoretical possibility that pointers to objects of
> other types have smaller sizes. But I haven't seen any such
> machine yet and it also rather likely doesn't make much sense
> to cast from a char pointer to some other pointer (as long as
> it's not a back-cast to the original type).
>
> Be a bit careful with pointers to functions, they aren't
> included in this (a function isn't an object). But even there
> on most machines it also works.

Pretty sure that's the same in C++.

However, due to forward declarations, an implementation would likely
have to go out of its way to have different pointer to struct types
which have different sizes or representations. This is true of C and C++.

Also, IIRC, some crazy mainframes do have void* and char* of a
different size than int*. The reason is that the machine is at the
hardware level only 64 bit addressable, and they didn't want to have
CHAR_BIT or whatever be 64. Instead, char has 8 bits, and a "simple"
char read or write is implemented through a hardware assembly load or
store with additional implicit bit manipulation to only change the
right 8 bits. (Needless to say, such a system wouldn't be POSIX

Joshua Maurice, Feb 22, 2011
16. ### James KanzeGuest

On Feb 22, 10:53 pm, Joshua Maurice <> wrote:
> On Feb 18, 6:56 am, (Jens Thoms Toerring) wrote:

> Also, IIRC, some crazy mainframes do have void* and char* of a
> different size than int*.

Not so much on mainframes, as on smaller, embedded machines. On
where not much text handling is to be expected, using word
addressing makes sense even today, at least if words are small.
If you're not using all of the bits for addressing, then you
might as well spend the extra bits to address bytes. On a 16
128KB, rather than just 64KB.

What did happen in the past (and may still be the case on some
exotic mainframes) is that the basic address was originally word
addressing, but that some of the unused upper bits were later
dedicated to the byte address in a word. In such cases, a char*
wouldn't be bigger than an int*, but it would have a different
representation, and casting a char* to an int* could force the