Convert IEEE single from integer representation

Discussion in 'Perl Misc' started by A. Sinan Unur, Mar 10, 2007.

  1. Hello all:

    In one of my programs, I had to read data that was saved in binary
    format. Some of the data consisted of IEEE 754 single precision floats
    saved as 32 bit integers (in network order). My pack/unpack skills are
    not that great (I don't think they can handle this case) so I wrote
    something to handle the conversion of these numbers.

    I would very much appreciate it if you could take a look at the code and
    see if I am doing anything that I should not be doing or if there is a
    better way of doing this.

    The function in question is ieee_single_from_int in the script below.

    The function is a straightforward application of the manual steps
    needed to go from the integer representation to the floating point
    number.

    I would like to know if there is an obvious way of doing this that I
    have missed or if there is a CPAN module that already handles these
    kinds of conversions. If not, I'll package this as a module and start
    preparing my first ever CPAN contribution ;-)

    You can use http://babbage.cs.qc.edu/IEEE-754/32bit.html to check for
    correctness. http://en.wikipedia.org/wiki/IEEE_754 explains the format.
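
    As a rough illustration of those manual steps, here is a minimal sketch
    that splits one 32-bit word into its sign, exponent and fraction fields
    and rebuilds the value of a normalized number. The helper name
    split_fields and the sample word 0xc0a00000 are assumptions made for the
    example; they are not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the post): split a 32-bit word into the
# three IEEE 754 single-precision fields.
sub split_fields {
    my ($word) = @_;
    my $sign = ( $word >> 31 ) & 0x1;      # 1 bit
    my $exp  = ( $word >> 23 ) & 0xff;     # 8 bits, biased by 127
    my $frac = $word & 0x007fffff;         # 23 bits
    return ( $sign, $exp, $frac );
}

# 0xc0a00000 is -5.0: sign 1, biased exponent 129, fraction 0x200000.
my ( $sign, $exp, $frac ) = split_fields(0xc0a00000);

# Normalized case only: implicit leading 1, exponent bias 127.
my $value = ( 1 + $frac / 2**23 ) * 2**( $exp - 127 );
$value = -$value if $sign;

printf "sign=%d exp=%d frac=0x%06x value=%g\n", $sign, $exp, $frac, $value;
```

    The sketch covers only the normalized case; zero, denormalized numbers,
    infinities and NaN need the extra branches on the exponent field.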


    #!/usr/bin/perl

    use strict;
    use warnings;

    my $buffer;

    # The following loop replaces the routine to read reasonably
    # sized chunks from the file.

    while ( my $line = <DATA> ) {
        my $hex;
        last unless ( $hex ) = ($line =~ /\A\d{7}: ([[:xdigit:] ]+)/);
        while ( $hex =~ /([[:xdigit:]]{2})/g ) {
            $buffer .= chr( hex $1 );
        }
    }

    for ( my $i = 0; $i < length $buffer; $i += 4 ) {
        my $uint32 = unpack 'N', substr( $buffer, $i, 4 );

        my ($v, $e) = ieee_single_from_int( $uint32 );

        if ( defined $v ) {
            printf "%8.8x : % .16f\n", $uint32, $v;
        }
        else {
            warn sprintf "%8.8x : %s\n", $uint32, $e;
        }
    }

    use constant DENOMINATOR => 0x00800000;
    use constant UINT32_MASK => 0xffffffff;
    use constant SIGN_MASK => 0x80000000;
    use constant FRAC_MASK => 0x007fffff;
    use constant EXP_MASK => 0x7f800000;

    sub ieee_single_from_int {
        my $uint32 = ( $_[0] & UINT32_MASK );
        my $exp    = ( $uint32 & EXP_MASK ) >> 23;
        my $frac   = $uint32 & FRAC_MASK;
        my $sign   = $uint32 & SIGN_MASK;

        my ($v, $e);

        if ( $exp and $exp < 0xff ) {
            # Normalized number: implicit leading 1 in the significand.
            $v = ( 1 + $frac / DENOMINATOR ) * ( 2**( $exp - 127 ) );
        }
        elsif ( $exp == 0x00 ) {
            # Zero or denormalized number: no implicit leading 1.
            $v = ( $frac / DENOMINATOR ) * ( 2**( -126 ) );
        }
        elsif ( $exp == 0xff ) {
            # Special values are reported through $e; $v stays undefined.
            $e = $frac ? "NaN"
               : $sign ? "-Infinity"
               :         "+Infinity";
        }

        $v = -$v if defined( $v ) and $sign;
        return wantarray ? ( $v, $e ) : $v;
    }


    __DATA__
    0000420: 4016 2933 3f1b 739a be86 8200 c00d c853 @.)3?.s........S
    0000430: bf18 7633 404a 3eba bfc5 b34d 3ea3 00a7 ..v3@J>....M>...
    0000440: bfae 1e10 3e30 8d00 bfa0 02da bfb9 2bed ....>0........+.
    0000450: 3f33 66da bfbc 9b4d 3fa3 c200 c088 cd93 ?3f....M?.......
    0000460: 40f2 5e4a 4005 5407 c086 b92a bf61 5f8a @.^J@.T....*.a_.
    0000470: bf2a 75da 3f5d 2a4d bf9a 1373 bfbd 475a .*u.?]*M...s..GZ


    Thank you for your time.

    Sinan
    A. Sinan Unur, Mar 10, 2007
    #1

  2. Bob Walton <> wrote in news:45f2399e$0$1369
    $:

    > A. Sinan Unur wrote:


    [ snipped by Bob ]

    > I suggest:
    >
    > use strict;
    > use warnings;
    > while ( my $line = <DATA> ) {
    >     my $hex;
    >     ($hex) = $line =~ /:((?: [[:xdigit:]]{4}){8})/;
    >     $hex =~ s/ //g;
    >     while ( $hex =~ s/([[:xdigit:]]{8})// ) {
    >         my $str   = $1;
    >         my $float = unpack 'f', reverse pack 'H8', $str;
    >         print "$str : $float\n";
    >     }
    > }


    ....

    >
    > HTH.


    Well, this certainly does help (at least in those cases where I can
    assume the platform's internal format for representing floats matches
    the IEEE format).
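
    One way to sanity-check that assumption is to pack a known value with
    the native 'f' template and look at the bytes. The sketch below relies
    only on the fact that 1.0 is 0x3f800000 as an IEEE 754 single; the
    variable names are made up for the example, and a single matching value
    is a plausibility check rather than proof that the formats agree.

```perl
use strict;
use warnings;

# 1.0 as an IEEE 754 single is 0x3f800000.
my $native = pack 'f', 1.0;

# Compare the native bytes against both byte orders of 0x3f800000.
my $layout =
      $native eq "\x3f\x80\x00\x00" ? 'big-endian IEEE single'
    : $native eq "\x00\x00\x80\x3f" ? 'little-endian IEEE single'
    :                                 'something else';

print "native 'f' layout: $layout\n";
```

    If the result is 'little-endian IEEE single', the byte reversal in the
    suggested code is what lines the network-order bytes up with the native
    layout.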

    In my original problem the file contained the binary representations
    (that is, not the hex dump I included in my post, but the actual bytes).

    So, I got rid of the ieee_single_from_int function, replaced the calls
    with:

    my $in = unpack V => substr $$record_ref, 32 + 4 * $month, 4;
    my $out = unpack f => pack H8 => sprintf '%8.8x', $in;

    given that $$record_ref contains the actual bytes rather than hex chars.
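
    On Perl 5.10 or later, the '>' byte-order modifier offers what looks
    like a shorter route: 'f>' reads four bytes as a big-endian single no
    matter which byte order the host uses. The sketch below assumes such a
    Perl and assumes the host's native float is an IEEE single, since the
    modifier only reorders bytes and does not convert between formats. The
    helper name single_from_network_bytes is made up for the example.

```perl
use strict;
use warnings;

# Requires Perl 5.10 or later for the '>' byte-order modifier.
# Hypothetical helper: takes exactly four raw bytes in network order.
sub single_from_network_bytes {
    my ($bytes) = @_;
    return unpack 'f>', $bytes;
}

# Stand-in for four bytes read from the file: 0x3fc00000 is 1.5.
my $bytes = pack 'N', 0x3fc00000;
print single_from_network_bytes($bytes), "\n";    # prints 1.5
```

    With the actual record, the four bytes would come straight from substr,
    without the detour through an integer and a hex string.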

    Thank you for showing me this. It does, of course, rely on the
    platform-specific f doing the right thing, but since I am only using
    this to convert files for my own use, I don't think that will be a
    problem.

    Sinan
    A. Sinan Unur, Mar 10, 2007
    #2

  3. [A complimentary Cc of this posting was sent to
    A. Sinan Unur
    <>], who wrote in article <Xns98EF717D7F9Dasu1cornelledu@127.0.0.1>:
    > Well, this certainly does help (at least in those cases where I can
    > assume the platform's internal format for representing floats matches
    > the IEEE format).


    Keep in mind that there is no such thing as "an IEEE format". IEEE
    requires a certain *semantic* of floats, not a particular way of
    binary representation. However, IIRC, all but 2 architectures use
    one of two representations, related to each other as V is to N
    (pack-parlance).
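
    To make the "V is to N" relation concrete, here is a minimal sketch,
    assuming Perl 5.10 or later (for the '<' and '>' modifiers) and a host
    whose native float is an IEEE single. It packs the same value in both
    byte orders and shows that the two strings are byte-reversed copies of
    each other.

```perl
use strict;
use warnings;

# Requires Perl 5.10 or later for the '<' and '>' modifiers.
my $le = pack 'f<', 1.5;    # little-endian byte order
my $be = pack 'f>', 1.5;    # big-endian byte order

# On an IEEE 754 host these print 0000c03f and 3fc00000.
printf "f< : %s\n", unpack 'H*', $le;
printf "f> : %s\n", unpack 'H*', $be;

# The same 32-bit pattern comes back from V on one and N on the other.
print "same bit pattern\n" if unpack( 'V', $le ) == unpack( 'N', $be );
print "byte-reversed copies\n" if $le eq scalar reverse $be;
```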

    Hope this helps,
    Ilya
    Ilya Zakharevich, Mar 10, 2007
    #3
  4. Ilya Zakharevich <> wrote in
    news:esu72k$2i7$:

    > [A complimentary Cc of this posting was sent to
    > A. Sinan Unur
    > <>], who wrote in article
    > <Xns98EF717D7F9Dasu1cornelledu@127.0.0.1>:
    >> Well, this certainly does help (at least in those cases where I can
    >> assume the platform's internal format for representing floats matches
    >> the IEEE format).

    >
    > Keep in mind that there is no such thing as "an IEEE format". IEEE
    > requires a certain *semantic* of floats, not a particular way of
    > binary representation. However, IIRC, all but 2 architectures use
    > one of two representations, related to each other as V is to N
    > (pack-parlance).


    I was using the word 'format' not to refer to the way they were stored
    on disk but rather what the bits mean once you have it in the
    appropriate int. Thank you for the clarification.

    Would you mind posting/letting me know which two architectures you are
    referring to above?

    Thank you.

    Sinan

    --
    A. Sinan Unur <>
    (remove .invalid and reverse each component for email address)
    clpmisc guidelines: <URL:http://www.augustmail.com/~tadmc/clpmisc.shtml>
    A. Sinan Unur, Mar 10, 2007
    #4
  5. [A complimentary Cc of this posting was sent to
    A. Sinan Unur
    <>], who wrote in article <Xns98EFBA3051F7Dasu1cornelledu@127.0.0.1>:
    > > Keep in mind that there is no such thing as "an IEEE format". IEEE
    > > requires a certain *semantic* of floats, not a particular way of
    > > binary representation. However, IIRC, all but 2 architectures use
    > > one of two representations, related to each other as V is to N
    > > (pack-parlance).

    >
    > I was using the word 'format' not to refer to the way they were stored
    > on disk but rather what the bits mean once you have it in the
    > appropriate int.


    Me too.

    > Would you mind posting/letting me know which two architectures you are
    > referring to above?


    If I remembered, I would have written it down in the initial post. ;-)

    If I can trust what I vaguely remember :-(, there were 2 very obscure
    names I have never heard in any other context. ;-)

    Hope this helps,
    Ilya
    Ilya Zakharevich, Mar 11, 2007
    #5
  6. Ilya Zakharevich <> wrote in
    news:et0m7o$1psp$:

    > [A complimentary Cc of this posting was sent to
    > A. Sinan Unur
    > <>], who wrote in article
    > <Xns98EFBA3051F7Dasu1cornelledu@127.0.0.1>:
    >> > Keep in mind that there is no such thing as "an IEEE format". IEEE
    >> > requires a certain *semantic* of floats, not a particular way of
    >> > binary representation. However, IIRC, all but 2 architectures use
    >> > one of two representations, related to each other as V is to N
    >> > (pack-parlance).

    >>
    >> I was using the word 'format' not to refer to the way they were
    >> stored on disk but rather what the bits mean once you have it in the
    >> appropriate int.

    >
    > Me too.


    Oh, OK, I have to do some more reading then.

    >
    >> Would you mind posting/letting me know which two architectures you
    >> are referring to above?

    >
    > If I remembered, I would have written it down in the initial post. ;-)


    ;-) Thank you.

    Sinan


    --
    A. Sinan Unur <>
    (remove .invalid and reverse each component for email address)
    clpmisc guidelines: <URL:http://www.augustmail.com/~tadmc/clpmisc.shtml>
    A. Sinan Unur, Mar 11, 2007
    #6
