converting 64-bit fixed-point to float

John Fisher · Jul 21, 2007

Hi Group,

I'm having trouble converting signed 32.32 fixed-point values
(little-endian, two's complement) back to floating point. I have been
trying to brew it myself. I am running Python 2.5 on a Mac. Here is the
C code I have been trying to leverage:

double FPuint8ArrayToFPDouble(uint8 *buffer, int startIndex)
{
    uint32 resultDec = 0;
    uint32 resultWh = 0;
    int i;

    for(i = 0; i < 4; i++)
    {
        resultDec += (uint32)buffer[startIndex + i] * pow(2, (i*8));
        resultWh += (uint32)buffer[startIndex + i + 4] * pow(2, (i*8));
    }

    return ( (double)((int)resultWh) + (double)(resultDec)/4294967296.0 );
}

Here is my version in Python, with some test code built in:

from ctypes import *

def conv64(input):
    input1=[0]*8
    input1[0]=c_ushort(input[0])
    input1[1]=c_ushort(input[1])
    input1[2]=c_ushort(input[2])
    input1[3]=c_ushort(input[3])
    input1[4]=c_ushort(input[4])
    input1[5]=c_ushort(input[5])
    input1[6]=c_ushort(input[6])
    input1[7]=c_ushort(input[7])
    #print input1[0].value,input1[1].value,input1[2].value,input1[3].value
    #print input1[4].value,input1[5].value,input1[6].value,input1[7].value
    #print
    resultDec=c_ulong(0)
    resultWh=c_ulong(0)
    for i in range(4):
        dec_c=c_ulong(input1[i].value)
        Wh_c=c_ulong(input1[i+4].value)
        resultDec.value=resultDec.value+dec_c.value*2**(i*8)
        resultWh.value=resultWh.value+Wh_c.value*2**(i*8)
    conval=float(int(resultWh.value))+float(resultDec.value)/4294967296.0
    #print conval
    return conval
#tabs got messed up bringing this into MacSoup

#these are 64-bit fixed point format (signed 32.32, little-endian, 2's complement)
#should be -1
conv64_0=[0, 0, 0, 255, 255, 255, 255, 255]
#should be 0
conv64_1=[0, 0, 0, 0, 0, 0, 0, 0]
#should be 0.20000
conv64_1_2=[51, 51, 51, 51, 0, 0, 0, 0]
#should be 1
conv64_2=[0, 0, 0, 0, 1, 0, 0, 0]
#should be 2
conv64_3=[0, 0, 0, 0, 2, 0, 0, 0]
#should be 298.15
conv64_4=[102, 102, 102, 38, 42, 1, 0, 0]
#should be -0.2
conv64_5=[205,204,204,204,255,255,255,255]
output0=conv64(conv64_0)
print "output should be -1 is "+str(output0)
output1=conv64(conv64_1)
print "output should be 0 is "+str(output1)
output1_2=conv64(conv64_1_2)
print "output should be 0.2 is "+str(output1_2)
output2=conv64(conv64_2)
print "output should be 1 is "+str(output2)
output3=conv64(conv64_3)
print "output should be 2 is "+str(output3)
output4=conv64(conv64_4)
print "output should be 298.15 is "+str(output4)
output5=conv64(conv64_5)
print "output should be -0.2 is "+str(output5)

Finally, here is the output I get from my code:
output should be -1 is 4294967296.0
output should be 0 is 0.0
output should be 0.2 is 0.199999999953
output should be 1 is 1.0
output should be 2 is 2.0
output should be 298.15 is 298.15
output should be -0.2 is 4294967295.8

Thanks for any light you can shed on my ignorance.

wave_man

Michael Tobis · Jul 21, 2007

It appears to be correct for positive numbers.

if conval >= 2**31:
    conval -= 2**32

would appear to patch things up.

It's not very pretty, though. You could at least start with

input1 = [c_ushort(item) for item in input]

instead of your first 9 lines.

mt
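
For readers following along, the sign fix above can be sketched in current Python 3 (not the Python 2.5 used in the thread). The names `to_signed32` and `conv64_patched` are made up for this illustration, and the sketch puts the cutoff at 2**31 because that is where the sign bit of a 32-bit whole part sits:

```python
def to_signed32(value):
    """Reinterpret an unsigned 32-bit integer as two's complement."""
    if value >= 2**31:      # top bit set, so the value is negative
        value -= 2**32
    return value


def conv64_patched(data):
    """Convert 8 little-endian bytes (signed 32.32) to a float."""
    # Low four bytes hold the fraction, high four bytes hold the whole part.
    frac = sum(data[i] << (8 * i) for i in range(4))
    whole = sum(data[i + 4] << (8 * i) for i in range(4))
    return to_signed32(whole) + frac / 4294967296.0
```

Applying the correction to the whole part before adding the fraction should be equivalent to patching the final sum, since the fraction is always in the range [0, 1).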

attn.steven.kuo · Jul 21, 2007

There are a few problem spots in that C code. I tend to
think that it "works" because you're on a system that has
4-byte int and CHAR_BIT == 8. When the most-significant-bit (MSB)
of resultWh is 1, then casting to int makes that a negative
value (i.e., MSB == the sign bit).

I presume that somewhere you include <stdint.h> (from C99)
and that uint32 is really uint32_t, etc. For that to be
portable, you should probably cast to int32_t?

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

double arr2dbl (uint8_t *buffer, int startIndex)
{
    uint32_t decimal = 0;
    uint32_t whole = 0;
    size_t i;
    for (i = 0; i < 4; ++i)
    {
        /* widen to uint32_t before shifting so a set top bit cannot overflow int */
        decimal += ((uint32_t)buffer[startIndex + i] << (i*8));
        whole += ((uint32_t)buffer[startIndex + i + 4] << (i*8));
    }
    return (int32_t)whole + (decimal/(UINT32_MAX+1.0));
}

int main (void)
{
    uint8_t arr[7][8] = {
        {0, 0, 0, 0, 0xff, 0xff, 0xff, 0xff},
        {0, 0, 0, 0, 0, 0, 0, 0},
        {51, 51, 51, 51, 0, 0, 0, 0},
        {0, 0, 0, 0, 1, 0, 0, 0},
        {0, 0, 0, 0, 2, 0, 0, 0},
        {102, 102, 102, 38, 42, 1, 0, 0 },
        {205, 204, 204, 204, 0xff, 0xff, 0xff, 0xff}};
    size_t i;
    double result;
    for (i = 0; i < sizeof arr/sizeof arr[0]; ++i)
    {
        /* convert each test row in turn */
        result = arr2dbl(arr[i], 0);
        printf("%f\n", result);
    }
    return 0;
}

This is my translation:

from ctypes import *

def conv64(input):
    input1 = [c_uint8(item) for item in input]
    dec = c_uint32(0)
    whl = c_uint32(0)
    for i in xrange(4):
        dec.value += (input1[i].value << (i*8))
        whl.value += (input1[i+4].value << (i*8))
    cast_whl_to_int = c_int32(whl.value)
    return float(cast_whl_to_int.value + dec.value/4294967296.0)

for arr in [[0, 0, 0, 0, 0xff, 0xff, 0xff, 0xff],
            [0, 0, 0, 0, 0, 0, 0, 0],
            [51, 51, 51, 51, 0, 0, 0, 0],
            [0, 0, 0, 0, 1, 0, 0, 0],
            [0, 0, 0, 0, 2, 0, 0, 0],
            [102, 102, 102, 38, 42, 1, 0, 0],
            [205,204,204,204,255,255,255,255]]:
    print "%f" % conv64(arr)

However, I've not looked deeply into ctypes so I
don't know if c_int32 is really C's int, or int32_t, or ???
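
On the c_int32 question: going by the ctypes documentation, c_int32 is a fixed-width 32-bit signed type (usually an alias for c_int on platforms where int is 32 bits), so it plays the role of int32_t rather than plain int. A small check in Python 3, offered as a sketch rather than a guarantee for every platform; the variable names are only for illustration:

```python
import ctypes

# c_int32 is a fixed-width type, so its size should be 4 bytes everywhere.
size_in_bytes = ctypes.sizeof(ctypes.c_int32)

# ctypes does no overflow checking on integer arguments, so an unsigned
# 32-bit pattern passed to c_int32 is reinterpreted as two's complement.
all_ones = ctypes.c_int32(0xFFFFFFFF).value
sign_bit_only = ctypes.c_int32(0x80000000).value

# Usually True, but this depends on the platform's C int being 32 bits.
is_alias_of_c_int = ctypes.c_int32 is ctypes.c_int

print(size_in_bytes, all_ones, sign_bit_only, is_alias_of_c_int)
```

The wrap-around is what makes the `c_int32(whl.value)` step in the translation above behave like the `(int32_t)` cast in the C version.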

John Fisher · Jul 22, 2007

Actually this was very helpful, thanks.

Rgds,

wave_man

John Machin · Jul 22, 2007

Holy code bloat, Batman! The following appears to give the same
results as Steven's code ...

import struct

def conv64jm(input):
    # Assuming input is an array of unsigned 8-bit ints.
    # If it were a string, the first step could be omitted.
    input1 = struct.pack('8B', *input) # alternative: ''.join(map(chr, input))
    return struct.unpack("<q", input1)[0] / 4294967296.0
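
The struct approach works because a 64-bit two's complement integer equals (signed high 32 bits) * 2**32 + (unsigned low 32 bits), so reading all eight bytes as one signed little-endian integer and dividing by 2**32 gives the 32.32 value directly. A sketch of the same idea in Python 3, where `bytes` replaces the pack step; `conv64_struct` and `conv64_from_bytes` are names invented for this example:

```python
import struct


def conv64_struct(data):
    """Decode 8 little-endian bytes (signed 32.32 fixed point) to a float."""
    raw = bytes(data)                    # accepts a list of ints in 0..255
    (fixed,) = struct.unpack("<q", raw)  # '<q' = little-endian signed 64-bit
    return fixed / 4294967296.0          # scale by 2**32


def conv64_from_bytes(data):
    """Same conversion using int.from_bytes instead of struct."""
    fixed = int.from_bytes(bytes(data), "little", signed=True)
    return fixed / 4294967296.0


if __name__ == "__main__":
    for arr in [[0, 0, 0, 0, 0xff, 0xff, 0xff, 0xff],
                [51, 51, 51, 51, 0, 0, 0, 0],
                [102, 102, 102, 38, 42, 1, 0, 0],
                [205, 204, 204, 204, 255, 255, 255, 255]]:
        print("%f" % conv64_struct(arr))
```

Both helpers assume exactly eight input bytes; `struct.unpack` raises `struct.error` for any other length, while `int.from_bytes` would silently accept it.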
