IBM integer and double formats

john.goodleaf · Nov 10, 2008

I'm poking at writing data out to a SAS XPORT file (transport file).
Each record must be 80 bytes long, ASCII. Integers should be "IBM-
style integers" and floats should be "IBM-style doubles." Now I have
some idea what that means from reading a C source file documented by
the SAS Institute, but before I go over the deep end trying to write
my own routines, does anyone know of an already-done means of writing
integers and floats out to their IBM mainframe equivalents?
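For the integer half of the question, here is a minimal sketch. It assumes that "IBM-style integer" means a big-endian two's-complement integer, which is a reading of the term and not something confirmed in this thread. The helper names ibm_int16 and ibm_int32 are made up for illustration.

```python
import struct

def ibm_int16(n):
    """Pack n as a 2-byte big-endian signed integer (assumed IBM layout)."""
    return struct.pack(">h", n)

def ibm_int32(n):
    """Pack n as a 4-byte big-endian signed integer (assumed IBM layout)."""
    return struct.pack(">i", n)

print(ibm_int16(80))   # two bytes, most significant first
print(ibm_int32(-2))   # negative values use two's complement
```

struct.pack raises struct.error if the value does not fit in the requested width, which is probably the behaviour you want when writing fixed-size fields.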

Thanks,
John

Mark Dickinson · Nov 10, 2008

does anyone know of an already-done means of writing
integers and floats out to their IBM mainframe equivalents?

I don't know of anything in Python that does this. There was
a thread a while ago that may be relevant:

http://mail.python.org/pipermail/python-list/2004-March/255361.html

and Googling 'IBM VAX double conversion' produces some tools that
convert between IEEE, IBM and VAX double formats. I'm not quite
sure what you're after here: I assume that you're working on
some type of modern non-IBM (and therefore presumably IEEE
754-based) hardware, and you want to be able to convert IEEE
doubles to IBM style doubles. Is that right?

It shouldn't be too difficult to write routines, making use
of math.ldexp and math.frexp. The fun part would be deciding
how to handle NaNs and infinities when encoding to IBM format,
and how to handle out-of-range floats coming back (if I recall
correctly, the IBM format allows a wider range of exponents
than IEEE).
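As a rough illustration of the decoding direction, the sketch below uses math.ldexp to turn an IBM hexadecimal double into a Python float. It assumes the usual layout of 1 sign bit, a 7-bit excess-64 base-16 exponent and a 56-bit fraction, stored as 8 big-endian bytes. The function name ibm_to_float is made up for this example.

```python
import math
import struct

def ibm_to_float(data):
    """Decode 8 big-endian bytes holding an IBM hexadecimal double."""
    (bits,) = struct.unpack(">Q", data)
    sign = -1.0 if bits >> 63 else 1.0
    exponent = (bits >> 56) & 0x7F       # excess-64 exponent, base 16
    fraction = bits & ((1 << 56) - 1)    # 56-bit fraction, read as an integer
    # value = (fraction / 2**56) * 16**(exponent - 64)
    #       = fraction * 2**(4*(exponent - 64) - 56)
    # float(fraction) rounds a 56-bit fraction to the 53 bits a double holds.
    return sign * math.ldexp(float(fraction), 4 * (exponent - 64) - 56)

print(ibm_to_float(bytes.fromhex("4110000000000000")))
print(ibm_to_float(bytes.fromhex("C280000000000000")))
```

Because the fraction has 56 bits and a double only 53, some IBM values lose their lowest bits when decoded this way.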

Mark

Mark Dickinson · Nov 10, 2008

and how to handle out-of-range floats coming back (if I recall
correctly, the IBM format allows a wider range of exponents
than IEEE).

Whoops---wrong way around. It looks like it's IEEE that has the
larger exponent range. DBL_MAX is around 2.**252 for IBM and
2.**1024 for IEEE.
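A quick way to check those two magnitudes, assuming the largest IBM double has fourteen hex F digits in the fraction and the maximum exponent of 16**63. The arithmetic is done with exact integers because 1 - 2**-56 does not fit in a 53-bit double.

```python
import sys

# Largest IBM hexadecimal double: (1 - 16**-14) * 16**63, kept exact.
ibm_max = (2**56 - 1) * 16**63 // 2**56

# Largest finite IEEE 754 double, converted to an exact integer.
ieee_max = int(sys.float_info.max)

# bit_length() == n means the value lies in [2**(n-1), 2**n).
print(ibm_max.bit_length())
print(ieee_max.bit_length())
```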

Sorry.

Mark

Mark Dickinson · Nov 10, 2008

my own routines, does anyone know of an already-done means of writing
integers and floats out to their IBM mainframe equivalents?

Here's a quick attempt at converting doubles using Python.
It uses the isnan and isinf functions that are in Python 2.6;
if you don't have Python 2.6 or don't care about IEEE specials
then just delete the relevant lines.

Mark

from math import frexp, isinf, isnan

def IBMtoIEEE(x):
    """Convert a Python float (assumed stored as an IEEE 754 double)
    to IBM hexadecimal float format.

    NaNs and infinities raise ValueError. IEEE values that are too
    large to be stored in IBM format raise OverflowError. Values that
    are too small to be represented exactly in IBM format are rounded
    to the nearest IBM value, using round-half-to-even.

    The result is returned as a hex string.

    """
    if isnan(x) or isinf(x):
        raise ValueError("cannot convert infinity or nan to IBM format")
    if not x:
        s, m, e = 0, 0, 0
    else:
        s = 0 if x > 0.0 else 1
        m, e = frexp(x)
        # Scale the fraction to 56 bits and round the binary exponent up
        # to a multiple of 4, giving an excess-64 base-16 exponent.
        m, e = int(abs(m) * 2**(56 - -e % 4)), (e + -e % 4)//4 + 64
        if e >= 128:
            raise OverflowError("value too large to represent in IBM format")
        elif e < 0:
            # Too small for a normalized result: shift the fraction right
            # and round half to even.
            h = 2**(4*-e - 1)
            m = m // (2*h) + (1 if m & h and m & (3*h-1) else 0)
            e = 0
    return "%x" % (s*2**63 + e*2**56 + m)

John Machin · Nov 10, 2008

Here's a quick attempt at converting doubles using Python.
It uses the isnan and isinf functions that are in Python 2.6;
if you don't have Python 2.6 or don't care about IEEE specials
then just delete the relevant lines.

Mark

def IBMtoIEEE(x):
    """Convert a Python float (assumed stored as an IEEE 754 double)
    to IBM hexadecimal float format.

Call me crazy if you like, but I'd name that function IEEEtoIBM.

return "%x" % (s*2**63 + e*2**56 + m)

That's a hexadecimal representation in lowercase with no leading
zeroes ... variable length and lowercase doesn't seem very IBM to me.

The extremely ugly C code in
http://support.sas.com/techsup/technote/ts140.html
seems to indicate an 8-byte bigendian binary representation. Note that
page contains a tohex() function which is not used AFAICT.

Perhaps this would do the job:
return struct.pack('>Q', s*2**63 + e*2**56 + m)
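Put together, a bytes-returning variant might look like the sketch below. It follows the same arithmetic as the function posted earlier in the thread and only changes the output step to struct.pack. The name ieee_to_ibm_bytes is made up here, and the handling of -0.0 (encoded as all zero bytes) is an assumption carried over from the original code.

```python
import math
import struct

def ieee_to_ibm_bytes(x):
    """Encode a Python float as an 8-byte big-endian IBM hexadecimal double."""
    if math.isnan(x) or math.isinf(x):
        raise ValueError("cannot convert infinity or nan to IBM format")
    if x == 0.0:
        return struct.pack(">Q", 0)
    sign = 1 if x < 0.0 else 0
    m, e = math.frexp(abs(x))            # abs(x) == m * 2**e, 0.5 <= m < 1
    pad = -e % 4                         # bits needed to reach a multiple of 4
    frac = int(m * 2**(56 - pad))        # exact: m carries at most 53 bits
    exp = (e + pad) // 4 + 64            # excess-64 base-16 exponent
    if exp >= 128:
        raise OverflowError("value too large to represent in IBM format")
    if exp < 0:
        # Denormalize: shift right by 4 bits per missing exponent step,
        # rounding half to even.
        shift = 4 * -exp
        half = 1 << (shift - 1)
        frac = (frac >> shift) + (1 if frac & half and frac & (3 * half - 1) else 0)
        exp = 0
    return struct.pack(">Q", (sign << 63) | (exp << 56) | frac)

print(ieee_to_ibm_bytes(1.0).hex())
```

The result is always exactly 8 bytes, so it can be written straight into a binary record without any further formatting.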

Cheers,
John

Mark Dickinson · Nov 11, 2008

Call me crazy if you like, but I'd name that function IEEEtoIBM.

But it's topsy-turvy day! Didn't you get the memo?

Oh, all right. IEEEtoIBM it is.

That's a hexadecimal representation in lowercase with no leading
zeroes ... variable length and lowercase doesn't seem very IBM to me.

True. Replace "%x" with "%016X" for fixed-length uppercase. Or as you
say, bytes output is probably more natural. I was guessing that the OP
wants to write the converted float out to an ASCII file, in hex.
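To see the difference between the two format strings, here is a small example. The 64-bit pattern is arbitrary, chosen only because its top hex digit is zero and it contains letters.

```python
word = 0x0ABC000000000000  # arbitrary 64-bit pattern with a leading zero digit

short = "%x" % word        # lowercase, leading zero dropped
fixed = "%016X" % word     # uppercase, zero-padded to 16 hex digits

print(short, len(short))
print(fixed, len(fixed))
```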

Mark

John Machin · Nov 11, 2008

But it's topsy-turvy day! Didn't you get the memo?

Oh, all right. IEEEtoIBM it is.

True. Replace "%x" with "%016X" for fixed-length uppercase. Or as you
say, bytes output is probably more natural. I was guessing that the OP
wants to write the converted float out to an ASCII file, in hex.

Sheesh. It's an *IBM mainframe* file. It would need to be in EBCDIC,
not ASCII. But why guess? He said he wanted to write it out in SAS
XPORT format.

Mark Dickinson · Nov 11, 2008

Sheesh. It's an *IBM mainframe* file. It would need to be in EBCDIC,
not ASCII. But why guess? He said he wanted to write it out in SAS
XPORT format.

Which is stored in ASCII, no? From the link you gave earlier, 3rd
line of the introduction:

"All character data are stored in ASCII, regardless of the
operating system."
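For anyone unsure what the ASCII versus EBCDIC distinction means at the byte level, here is a short comparison. It uses Python's cp037 codec as one common EBCDIC code page; which code page a given mainframe would actually use is an assumption.

```python
text = "SAS"

ascii_bytes = text.encode("ascii")    # what the quoted sentence describes
ebcdic_bytes = text.encode("cp037")   # one EBCDIC code page, for comparison

print(ascii_bytes.hex())
print(ebcdic_bytes.hex())
```

The same three characters produce entirely different byte values, so a reader expecting one encoding will not make sense of the other.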

Mark

Mark Dickinson · Nov 11, 2008

"All character data are stored in ASCII, regardless of the
operating system."

But character data is not the same thing as numeric data. Okay---
you win again, John.

Sheesh. [...]

Apologies for annoying you.

Mark
