Anyone recognize this numeric storage format - similar to "float", but not quite

geskerrett

We are working on a project to decipher the record structure of an old
accounting system that dates from the late '80s to mid-'90s.
We have come across a number format that appears to be a "float" but
doesn't match any of the more standard implementations.
So we are hoping this is a recognizable number storage format with an
identifiable name AND a pre-built conversion method
similar to the "struct" module available in Python.

Here is what we have determined so far.

Example Number: 1234567890

This gets stored on disk as 8 bytes, resulting in the following HEX
characters;
00 00 00 A4 05 2c 13 9f

If we reverse the byte order (so the most significant byte comes first) we get;
9F 13 2c 05 A4 00 00 00

If the HEX is converted to binary it looks like;
10011111 00010011 00101100 00000101 10100100 00000000 00000000
00000000

If the example number 1234567890 is converted to binary it looks like;

10010011 00101100 00000101 1010010

To extract the example number, you need to do the following;
1) take the decimal value of the first byte and subtract 128
2) This tells you how many of the following bits are significant and
must be read
3) Once the remaining bits are read, reverse the first bit of that
group (ie if it is a 0 make it a 1)
4) convert the result to decimal
.... and presto, the example number !
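
The four steps above can be sketched in Python 3 roughly as follows. This is only an illustration of the procedure exactly as described, so it handles positive whole numbers only; the function name `decode_by_bits` is made up for this example, and step 3 is taken literally as "force the leading bit to 1".

```python
def decode_by_bits(hex_on_disk):
    """Apply the four manual steps to an 8-byte value given as on-disk hex."""
    raw = bytes.fromhex(hex_on_disk.replace(" ", ""))
    flipped = raw[::-1]                    # reversed order: count byte comes first
    nbits = flipped[0] - 128               # step 1: first byte minus 128
    # The remaining 7 bytes written out as one 56-character bit string.
    bits = "".join(format(b, "08b") for b in flipped[1:])
    significant = bits[:nbits]             # step 2: keep only the first nbits bits
    significant = "1" + significant[1:]    # step 3: force the leading bit to 1
    return int(significant, 2)             # step 4: binary -> decimal


print(decode_by_bits("00 00 00 A4 05 2c 13 9f"))  # 1234567890
```

The same function also reproduces the other examples in this post, which suggests the first byte behaves like an exponent rather than a plain bit count.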

Using a fixed width font it is easy to see the match at the bit level;

10011111 00010011001011000000010110100100000000000000000000000000
-------- 1001001100101100000001011010010


If you are interested, the following are two other examples;

Orig Hex: 00 00 00 60 92 96 72 A0
Actual Value: 4069954144

Orig Hex: 00 00 80 22 A3 26 3C A1
Actual Value: 6313297477


So ... does anyone recognize this ??
Is there a "built-in" conversion method in Python ??

Thanks in advance.
 
Terry Reedy

This appears to be a repost, perhaps not by the op but due to a glitch
somewhere, of a question posted about a month ago and answered.
 
Bengt Richter

Not looking too closely, but I recall something similar (although I suspect that the bit you
are "reversing" is a sign bit that shadows a known constant MSB 1 for non-zero numbers, and
shouldn't just be reversed):

http://groups.google.com/group/comp...ccc20a1d8d5/4aadc71be8aeddbe#4aadc71be8aeddbe
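
To illustrate the sign-bit reading suggested here, a minimal Python 3 sketch follows. It assumes the layout worked out in the original question (seven mantissa bytes stored least significant first, then an exponent byte offset by 128), treats the top mantissa bit as a sign, and puts the implied leading 1 back instead of flipping the bit. The function name `decode_with_sign` is invented for this sketch, the zero-exponent rule is an assumption, and the negative test value is constructed by hand rather than taken from real data.

```python
def decode_with_sign(raw):
    """Decode 8 on-disk bytes: 7 mantissa bytes (LSB first) + exponent byte."""
    if len(raw) != 8:
        raise ValueError("expected exactly 8 bytes")
    exponent = raw[7]
    if exponent == 0:                        # assumption: zero exponent means 0.0
        return 0.0
    mantissa = int.from_bytes(raw[:7], "little")
    negative = bool(mantissa & (1 << 55))    # top mantissa bit read as the sign
    mantissa |= 1 << 55                      # restore the implied leading 1
    value = mantissa * 2.0 ** (exponent - 128 - 56)
    return -value if negative else value


positive = bytes.fromhex("000000A4052C139F")
negative = bytes.fromhex("000000A4052C939F")  # same bytes with the sign bit set
print(decode_with_sign(positive), decode_with_sign(negative))
```

With this reading, a stored 0 in that position means "positive" and a stored 1 means "negative"; in both cases the real leading mantissa bit is 1.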

Regards,
Bengt Richter
 
Bengt Richter

[...]

Terry Reedy wrote:
This appears to be a repost, perhaps not by the op but due to a glitch
somewhere, of a question posted about a month ago and answered.

UIAM the more or less recent original you are thinking of turned out to be
straight IEEE double format, and I think this is not, though I think it looks
like one that was answered (by me ;-) quite a while ago (Dec 1 2003).
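
A quick way to check whether a field is a plain IEEE double is to hand the eight bytes to `struct` and see what comes out. The Python 3 sketch below does that for the first example in this thread; it is only a sanity check, and the variable names are arbitrary. Neither byte order gives back 1234567890, which is consistent with this being a different format.

```python
import struct

raw = bytes.fromhex("000000A4052C139F")      # on-disk bytes for 1234567890

as_le_double = struct.unpack("<d", raw)[0]   # read as little-endian IEEE 754 double
as_be_double = struct.unpack(">d", raw)[0]   # read as big-endian IEEE 754 double
print(as_le_double, as_be_double)

# For comparison: how an IEEE double holding the same number is laid out.
ieee_hex = struct.pack("<d", 1234567890.0).hex()
print(ieee_hex)
```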

Regards,
Bengt Richter
 
geskerrett

Thanks Bengt for directing me to your previous post.
I think I agree with you on the "reversing bit" and the constant MSB.
In reworking my examples I was always changing the 0 to 1.
 
geskerrett

I am not sure if you are still watching this thread, but I seem to have
a bit of a problem with the code sample you so graciously provided.
It seems to work in all instances, except the original example I
provided (namely, 1234567890). On my system, the number 1234567890
gets converted to 1234567895.5.

I made a few changes to your original program, but it is largely the
same with different test samples. Any thoughts ??

Sample Code Below ----------------------
# Conversion of Microsoft Binary Format numbers to Python Floats

import binascii as bn
import struct as st

data = [(1234567890,'000000AF052C139F'),
        (4069954144,'00000060929672A0'),
        (999999.99, '703D0AD7FF237494'),
        ( 88888.88, '400ad7a3709c2d91'),
        ( 22222.22, '400ad7a3709c2d8f'),
        ( 33333.33, 'b047e17a54350290'),
        (  1500.34, '7814ae47e18a3b8b'),
        ( 42345.00, '0000000000692590'),
       ]

def msd2float(bytes):
    # take out values that don't make sense, possibly the NaN and Infinity ??
    if sum(bytes) in [0,72,127]:
        return 0.0
    b = bytes[:]
    sign = bytes[-2]&0x80
    b[-2] |= 0x80  #hidden most sig bit in place of sign
    exp = bytes[-1] - 0x80 - 56  #exponent offset
    acc = 0L
    for i,byte in enumerate(b[:-1]):
        acc |=(long(byte)<<(i*8))
    return (float(acc)*2.0**exp)*((1.,-1.)[sign!=0])

for line in data:
    val = line[0]
    binval = bn.unhexlify(line[1])
    le_bytes = list(st.unpack('BBBBBBBB',binval))
    test = msd2float(le_bytes)
    print " In:",val, "\nOut:",test,"\n"

Sample Output ------------------------
C:/Python24/pythonw.exe -u "C:/pytest/dms/Test MBF.pyw"
In: 1234567890
Out: 1234567895.5

In: 4069954144
Out: 4069954144.0

In: 999999.99
Out: 999999.99

In: 88888.88
Out: 88888.88

In: 22222.22
Out: 22222.22

In: 33333.33
Out: 33333.33

In: 1500.34
Out: 1500.34

In: 42345.0
Out: 42345.0
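
One thing stands out when this test data is compared with the first post: the hex string used here for 1234567890 is '000000AF052C139F', whereas the bytes originally quoted were 00 00 00 A4 05 2c 13 9f. The fourth byte differs (AF versus A4), so the odd result may come from the test data rather than from the conversion routine. The Python 3 sketch below decodes both strings with the same arithmetic as msd2float; the name `decode` is made up here and the special-value check is left out for brevity.

```python
def decode(hex_string):
    """Same arithmetic as msd2float above, applied to an on-disk hex string."""
    raw = bytearray.fromhex(hex_string)
    sign = raw[6] & 0x80
    raw[6] |= 0x80                            # hidden bit goes where the sign bit was
    exp = raw[7] - 0x80 - 56                  # exponent offset, as in msd2float
    acc = int.from_bytes(raw[:7], "little")   # 7 mantissa bytes, LSB first
    value = acc * 2.0 ** exp
    return -value if sign else value


as_tested = decode("000000AF052C139F")   # string from the test data above
as_posted = decode("000000A4052C139F")   # bytes from the first post
print(as_tested, as_posted, as_tested - as_posted)  # 1234567895.5 1234567890.0 5.5
```

The byte 0xAF is 0x0B (11) larger than 0xA4, and with this exponent each unit in that byte position is worth 0.5, which accounts for the 11 x 0.5 = 5.5 difference.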
 
