Anyone recognize this numeric storage format - similar to "float", but not quite

geskerrett · Aug 24, 2005

We are working on a project to decipher a record structure of an old
accounting system that originates from the late '80s to mid-'90s.
We have come across a number format that appears to be a "float" but
doesn't match any of the more standard implementations,
so we are hoping this is a recognizable number storage format with an
identifiable name AND a pre-built conversion method
similar to the "struct" module available in Python.

Here is what we have determined so far.

Example Number: 1234567890

This gets stored on disk as 8 bytes, resulting in the following HEX
characters;
00 00 00 A4 05 2c 13 9f

If we reverse the byte order, so that the most significant byte comes first, we get;
9F 13 2c 05 A4 00 00 00

If the HEX is converted to binary it looks like;
10011111 00010011 00101100 00000101 10100100 00000000 00000000
00000000

If the example number 1234567890 is converted to binary it looks like;

10010011 00101100 00000101 1010010
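
For anyone who wants to reproduce the two bit strings above, here is a minimal Python 3 sketch. It uses only built-ins, and the variable names are made up for illustration.

```python
# Bytes exactly as they appear on disk in the example above.
stored = bytes.fromhex("00 00 00 A4 05 2c 13 9f")

# Reverse the byte order and show each byte as 8 binary digits.
reversed_bits = " ".join(f"{b:08b}" for b in reversed(stored))

# Binary digits of the example number, without leading zeros.
value_bits = f"{1234567890:b}"

print(reversed_bits)
print(value_bits)
```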

To extract the example number, you need to do the following;
1) take the decimal value of the first byte and subtract 128
2) This tells you how many of the following bits are significant and
must be read
3) Once the remaining bits are read, reverse the first bit of that
group (ie if it is a 0 make it a 1)
4) convert the result to decimal
.... and presto, the example number !
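
The four steps above can be sketched in Python 3 roughly as follows. The function name decode_by_bit_count is hypothetical, and the sketch only covers positive whole numbers like the examples in this post. It makes no attempt to handle zero, negative values, or fractions.

```python
def decode_by_bit_count(hex_on_disk: str) -> int:
    """Follow the four steps literally for a positive whole number."""
    raw = bytes.fromhex(hex_on_disk)       # bytes in on-disk order
    msb_first = raw[::-1]                  # reversed, so the count byte is first

    # Step 1: first byte minus 128.
    nbits = msb_first[0] - 128
    if not 1 <= nbits <= 56:
        raise ValueError("this sketch only handles bit counts from 1 to 56")

    # Step 2: read that many of the following bits.
    bits = "".join(f"{b:08b}" for b in msb_first[1:])
    significant = bits[:nbits]

    # Step 3: force the first bit of the group to 1.
    significant = "1" + significant[1:]

    # Step 4: convert the result to decimal.
    return int(significant, 2)


print(decode_by_bit_count("00 00 00 A4 05 2c 13 9f"))  # 1234567890
```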

Using a fixed width font it is easy to see the match at the bit level;

10011111 00010011001011000000010110100100000000000000000000000000
-------- 1001001100101100000001011010010

If you are interested, the following are two other examples;

Orig Hex: 00 00 00 60 92 96 72 A0
Actual Value: 4069954144

Orig Hex: 00 00 80 22 A3 26 3C A1
Actual Value: 6313297477

So ... does anyone recognize this ??
Is there a "built-in" conversion method in Python ??

Thanks in advance.

Terry Reedy · Aug 24, 2005

This appears to be a repost, perhaps not by the op but due to a glitch
somewhere, of a question posted about a month ago and answered.

Bengt Richter · Aug 24, 2005

Not looking too closely, but I recall something similar (although I suspect that the bit you
are "reversing" is a sign bit that shadows a known constant MSB 1 for non-zero numbers, and
shouldn't just be reversed):

http://groups.google.com/group/comp...ccc20a1d8d5/4aadc71be8aeddbe#4aadc71be8aeddbe

Regards,
Bengt Richter
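
To illustrate the point about the sign bit, here is a hedged Python 3 sketch. It treats the top bit of the second-to-last byte as a sign flag and always restores the implied leading 1, instead of flipping whatever bit happens to be there. It assumes the layout inferred in this thread (7 mantissa bytes stored least significant first, then an exponent byte biased by 128) and assumes that an exponent byte of 0 means zero. The function name is hypothetical, and none of this has been checked against the original accounting system.

```python
import math


def mbf_double_to_float(raw: bytes) -> float:
    """Decode 8 bytes in on-disk order under the assumptions stated above."""
    if len(raw) != 8:
        raise ValueError("expected exactly 8 bytes")

    exponent = raw[7]
    if exponent == 0:            # assumption: zero exponent byte means 0.0
        return 0.0

    negative = bool(raw[6] & 0x80)                  # sign sits where the leading 1 would be
    mantissa = int.from_bytes(raw[:7], "little")    # 56-bit mantissa field
    mantissa |= 1 << 55                             # restore the hidden leading 1

    # Scale the 56-bit integer mantissa by the unbiased exponent.
    value = math.ldexp(mantissa, exponent - 128 - 56)
    return -value if negative else value


print(mbf_double_to_float(bytes.fromhex("000000A4052C139F")))  # 1234567890.0
```

Under these assumptions the same bytes with the sign bit set (13 changed to 93) decode to -1234567890.0 rather than to a different magnitude.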

Bengt Richter · Aug 24, 2005

[...]

This appears to be a repost, perhaps not by the op but due to a glitch
somewhere, of a question posted about a month ago and answered.

UIAM the more or less recent original you are thinking of turned out to be
straight IEEE double format, and I think this is not, though I think it looks
like one that was answered (by me ;-) quite a while ago (Dec 1 2003).

Regards,
Bengt Richter

geskerrett · Aug 24, 2005

Thanks Bengt for directing me to your previous post.
I think I agree with you on the "reversing bit" and the constant MSB.
In reworking my examples I was always changing the 0 to 1.

geskerrett · Aug 26, 2005

I am not sure if you are still watching this thread, but I seem to have
a bit of a problem with the code sample you so graciously provided.
It seems to work in all instances, except the original example I
provided (namely, 1234567890). On my system, the number 1234567890,
gets converted to 1234567895.5.

I made a few changes to your original program, but it is largely the
same with different test samples. Any thoughts ??

Sample Code Below ----------------------
# Conversion of Microsoft Binary Format numbers to Python Floats

import binascii as bn
import struct as st

data = [(1234567890,'000000AF052C139F'),
        (4069954144,'00000060929672A0'),
        (999999.99, '703D0AD7FF237494'),
        ( 88888.88, '400ad7a3709c2d91'),
        ( 22222.22, '400ad7a3709c2d8f'),
        ( 33333.33, 'b047e17a54350290'),
        ( 1500.34, '7814ae47e18a3b8b'),
        ( 42345.00, '0000000000692590'),
       ]

def msd2float(bytes):
    # take out values that don't make sense, possibly the NaN and Infinity ??
    if sum(bytes) in [0,72,127]:
        return 0.0
    b = bytes[:]
    sign = bytes[-2]&0x80
    b[-2] |= 0x80 #hidden most sig bit in place of sign
    exp = bytes[-1] - 0x80 - 56 #exponent offset
    acc = 0L
    for i,byte in enumerate(b[:-1]):
        acc |=(long(byte)<<(i*8))
    return (float(acc)*2.0**exp)*((1.,-1.)[sign!=0])

for line in data:
    val = line[0]
    binval = bn.unhexlify(line[1])
    le_bytes = list(st.unpack('BBBBBBBB',binval))
    test = msd2float(le_bytes)
    print " In:",val, "\nOut:",test,"\n"

Sample Output ------------------------
C:/Python24/pythonw.exe -u "C:/pytest/dms/Test MBF.pyw"
In: 1234567890
Out: 1234567895.5

In: 4069954144
Out: 4069954144.0

In: 999999.99
Out: 999999.99

In: 88888.88
Out: 88888.88

In: 22222.22
Out: 22222.22

In: 33333.33
Out: 33333.33

In: 1500.34
Out: 1500.34

In: 42345.0
Out: 42345.0
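
One thing that may be worth checking: the hex string paired with 1234567890 in the test table above ('000000AF052C139F') differs in its fourth byte from the bytes given in the first post (00 00 00 A4 05 2c 13 9f). The Python 3 sketch below decodes both strings side by side. The helper name mbf_to_float is hypothetical, and it assumes the same layout as the code above (7 mantissa bytes stored least significant first, a hidden leading 1 in place of the sign bit, and an exponent byte biased by 128).

```python
import math


def mbf_to_float(hex_on_disk: str) -> float:
    """Decode 8 bytes, given as hex in on-disk order, under the assumed layout."""
    raw = bytes.fromhex(hex_on_disk)
    if len(raw) != 8:
        raise ValueError("expected exactly 8 bytes")
    if raw[7] == 0:              # assumption: zero exponent byte means 0.0
        return 0.0
    negative = bool(raw[6] & 0x80)
    mantissa = int.from_bytes(raw[:7], "little") | (1 << 55)  # restore hidden 1
    value = math.ldexp(mantissa, raw[7] - 128 - 56)
    return -value if negative else value


from_first_post = mbf_to_float("000000A4052C139F")   # bytes from the first post
from_test_table = mbf_to_float("000000AF052C139F")   # bytes from the test table

print(from_first_post)  # 1234567890.0
print(from_test_table)  # 1234567895.5
```

If that reading is right, the 5.5 difference comes from the A4 versus AF byte in the test data rather than from the conversion routine.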
