Binary file - numbers

E-Star

I have a binary file that I want to read some numbers out of. However
the numbers are 32bit floats. How can I get the numbers into a C
program to use? I can do the calculations manually, but is there an
easier way? Are there some handy functions which do this?

example.

File contains: 42 7C 00 00
I want to read it in and end up with my C variable equalling 63.
 
Richard Bos

E-Star said:
I have a binary file that I want to read some numbers out of. However
the numbers are 32bit floats. How can I get the numbers into a C
program to use?

Well... you _could_ try fread(), but there's a catch. fread() reads
bytes into memory, not values. If those floats weren't written by a
system that arranges its floats exactly the same way as yours does, you
won't get the values you expected.
You can, of course, usually get away with having one program fwrite()
some floats, and then having another program on the same machine fread()
them later on, but if you want to read files from another machine,
you're taking a risk, and the risk is greater the less like your own
machine the source of the file is.
It's all a question of balancing demands, really. If you know that this
file is only ever going to contain "your" kind of floats, you opt for
simplicity and use fread(). If you know that this file is a cross-
platform standard and you want your program to run on several kinds of
machines, you parse it by hand.

Richard
 
Phil Tregoning

pete said:
It's a float representation for 63.0

Or more precisely, it's an IEC 60559 (IEEE 754) single precision representation
for 63.0. The code below produces the output "63", and is based on the
description here:

http://www.psc.edu/general/software/packages/ieee/ieee.html

The code is not optimised or well tested.

For systems that actually use IEEE single floats, bear in mind that
they might have a different byte order.

Phil T



#include <stdio.h>
#include <math.h>
#include <float.h>
#include <assert.h>

/* Decode four bytes, most significant byte first, as an IEEE single. */
double ieee_single(const void *v)
{
    const unsigned char *data = v;
    int s, e;
    unsigned long src;
    long f;
    double value = 0.0;

    /* Assemble the 32-bit pattern from the big-endian bytes */
    src = ((unsigned long)data[0] << 24) +
          ((unsigned long)data[1] << 16) +
          ((unsigned long)data[2] << 8) +
          ((unsigned long)data[3]);

    s = (int)((src & 0x80000000UL) >> 31);   /* sign bit */
    e = (int)((src & 0x7F800000UL) >> 23);   /* biased exponent */
    f = (long)(src & 0x007FFFFFUL);          /* fraction bits */

    if (e == 255 && f != 0) {
        /* NaN - Not a number (reported here as DBL_MAX) */
        value = DBL_MAX;
    }
    else if (e == 255 && f == 0 && s == 1) {
        /* Negative infinity (reported here as -DBL_MAX) */
        value = -DBL_MAX;
    }
    else if (e == 255 && f == 0 && s == 0) {
        /* Positive infinity (reported here as DBL_MAX) */
        value = DBL_MAX;
    }
    else if (e > 0 && e < 255) {
        /* Normal number: restore the implicit leading 1 bit */
        f += 0x00800000L;
        if (s) f = -f;
        value = ldexp((double)f, e - 127 - 23);
    }
    else if (e == 0 && f != 0) {
        /* Denormal number: no implicit leading 1 bit */
        if (s) f = -f;
        value = ldexp((double)f, -126 - 23);
    }
    else if (e == 0 && f == 0 && s == 1) {
        /* Negative zero (plain zero if the host has no negative zero) */
        value = -0.0;
    }
    else if (e == 0 && f == 0 && s == 0) {
        /* Positive zero */
        value = 0.0;
    }
    else {
        /* Never happens */
        printf("s = %d, e = %d, f = %ld\n", s, e, f);
        assert(!"Oops, unhandled case in ieee_single()");
    }

    return value;
}

int main(void)
{
    unsigned char combo[] = {0x42, 0x7c, 0x00, 0x00};
    printf("%g\n", ieee_single(combo));
    return 0;
}
 
CBFalconer

James said:
.... snip ...
I'm still trying to figure out how *any* combo of 0x42, 0x7C,
0x0 and 0x0 == 63?????

To start with 63 has (in binary) 6 successive 1 bits. Most FP
representations suppress the MS bit, and replace it with a sign
bit, leaving 5 bits. Where does your representation have 5
successive 1 bits? That would appear to locate the significand.
Most systems also use an offset method for the exponent, such that
the common integers are in the middle of the range. That would
appear to locate the exponent portion. Now all you have left to
worry about is the order of those two 0x0 bytes. A simple
experiment or two should resolve that.

The representation appears to be: sign, 8 bit binary exponent
offset 127, 23 bit significand, in order. Looks awfully IEEEish
to me.
 
Dhruv

I have a binary file that I want to read some numbers out of. However
the numbers are 32bit floats. How can I get the numbers into a C
program to use? I can do the calculations manually, but is there an
easier way? Are there some handy functions which do this?

example.

File contains: 42 7C 00 00
I want to read it in and end up with my C variable equalling 63.


If your program stored them in the standard IEEE floating point format,
then you could find out which bits represent what (sign, exponent,
significand), mask and shift the appropriate bits, and rebuild the
equivalent 32-bit float. You would have been better off storing the
number as ASCII text, if you were worried about portability.

-Dhruv.
 
Jim Fischer

James said:
I'm still trying to figure out how *any* combo of 0x42, 0x7C, 0x0 and
0x0 == 63?????

Floating-point values are commonly implemented internally in accordance
with IEEE Standard 754 for Binary Floating-Point Arithmetic (IEEE 754).
Assuming your floating-point values are in fact IEEE 754 compliant (and
they probably are), take a look at the following web site:

http://babbage.cs.qc.edu/courses/cs341/IEEE-754.html

This site shows how base-10 floating-point values are represented in
IEEE 754 format, and vice versa.

FWIW, here's another web site with IEEE 754 info if you're interested:

http://www.sns.ias.edu/Main/info//sw/Workshop/Workshop/common/ug/
 
Martin Ambuhl

I have a binary file that I want to read some numbers out of. However
the numbers are 32bit floats. How can I get the numbers into a C
program to use? I can do the calculations manually, but is there an
easier way? Are there some handy functions which do this?

example.

File contains: 42 7C 00 00
I want to read it in and end up with my C variable equalling 63.

If you use fread() on the file -- opened as a binary file -- you can get
the bytes into a char[]. Just using memmove or memcpy to move them to a
float (or freading into a float) may not suffice. For example, on my
implementation the bytes for 63.0f are { 0x0, 0x0, 0x7c, 0x42 }, so
would require reordering before using memmove to transfer them to the
float.
 
E-Star

Just to let everyone know...for simplicity I did write the bytes in
reverse.

ie. The file really contains 00 00 7C 42



 
Richard Heathfield

Martin said:
Martin Ambuhl
Returning soon to the
Fourth Largest City in America

Um, would that be Bogota?

As far as I can make out[1], Sao Paulo comes in first (assuming "large"
means "high population"), at 9969000, Mexico City is second with 8605000,
New York City is third with 8008000, and Bogota is fourth with 6260000.

;-)



[1] I didn't check /every/ country in America.
 
Richard Heathfield

Lew said:
Assuming he meant "North America", it probably would be Chicago <grin>

Mexico City (8,657,000)
New York City (8,039,000)
Los Angeles (3,829,000)
Chicago (2,926,000)

If he meant "all the Americas" (North, Central, and South), it probably
would be Los Angeles (Sao Paulo is 3rd)

LA fourth? Hmmm. Here are the top five that I know of.

Sao Paulo 9969000
Mexico City 8605000
New York City 8008000
Bogota 6260000
Toronto 4682000

LA ain't even in the same ball-park, y'all. :)
 
Blah

LA fourth? Hmmm. Here are the top five that I know of.

Sao Paulo 9969000
Mexico City 8605000
New York City 8008000
Bogota 6260000
Toronto 4682000

LA ain't even in the same ball-park, y'all. :)

Much like C coding, citing population statistics can have pitfalls for
the unwary.

[ using the U.S. million as in million == 1,000,000 ]

By some conventions, Sao Paulo has 10 million, NYC has 8 million, and Tokyo
has 20 million (citing Tokyo from memory). By other conventions, Sao Paulo
has 18 million, NYC has 25 million, and Tokyo has 35 million. Why the
discrepancy?

Some census takers feel that "metro area" is a more accurate measure of a
city's population. A "metro area" is somewhat more vaguely defined than
stricter means of determining boundaries, but it is aimed more at
identifying large population centers than worrying about politically drawn
lines. A "metro area" usually includes what is politically considered part
of the city as well as immediate suburbs - mainly anywhere that a given city
is likely to draw its work force from.

Not having the metro area numbers in front of me beyond Sao Paulo and NYC
for the Americas, I can't say what number 4 on the list might be in those
terms.

BUT, making a few assumptions here, I think we can determine where Mr.
Ambuhl lives.

assumption 1) By "America" he actually meant "United States"
assumption 2) He is counting political boundaries and not metro areas
assumption 3) He recognizes Brooklyn's uniqueness and counts it as separate
from NYC.

Thus we get:

1) NYC (sans Brooklyn) 5.3 M
2) LA 3.8 M
3) Chicago 2.9 M
4) Brooklyn 2.7 M

It's fortunate that he counts Brooklyn as separate, as without it #4 would
be Houston, and I've heard of life there being referred to as "exile" and
the state described as "the armpit of the world".
 
Blah

Simon Biber said:
Blah said:
[ using the U.S. million as in million == 1,000,000 ]

What other million is there?!

I knew the American and European *illions differed at some point, and
just assumed the whole system was different. Upon research I discovered the
divergence occurs after million, but up to that point they stay the same.

So the comment is accurate, if meaningless.
 
Dan Pop

E-Star said:
Just to let everyone know...for simplicity I did write the bytes in
reverse.

ie. The file really contains 00 00 7C 42

You've lost me here. Why was it any simpler to write the bytes in
reverse order?

Anyway, both orders are actually used by real implementations: the one you
originally posted is used by big endian implementations, while the
one actually used in your file is used by little endian implementations.

Dan
 
jjp

(snip)
It's fortunate that he counts Brooklyn as separate, as without it #4 would
be Houston, and I've heard of life there being referred to as "exile" and
the state described as "the armpit of the world".

Obviously you've been hearing from people who don't know much about it
then.

Houston is a big and international port city -- in that way much like
New York, L.A., Seattle and Miami, and with some of the country's best
big city amenities.

And if Texas were such an "armpit" I don't think people would be
moving here at the rates they are.
 
Blah

jjp said:
Obviously you've been hearing from people who don't know much about it
then.

Houston is a big and international port city -- in that way much like
New York, L.A., Seattle and Miami, and with some of the country's best
big city amenities.

And if Texas were such an "armpit" I don't think people would be
moving here at the rates they are.

I *knew* that would go over someone's head.

Take a look at whose signature started this subthread, and then google for
the phrase "armpit of the world" and that should turn up what I was getting
at.
 
Martin Ambuhl

I *knew* that would go over someone's head.

Take a look at whose signature started this subthread, and then
google for the phrase "armpit of the world" and that should turn up
what I was getting at.

If you did that, you would notice that I never mentioned Houston, so your
rant has absolutely no foundation.
 
Blah

Martin Ambuhl said:
If you did that, you would notice that I never mentioned Houston, so your
rant has absolutely no foundation.

Well that attempt at a joke went quite sour.

My rant's foundation is in misunderstanding. My apologies for
unwittingly putting words in your mouth.

Between one sig referencing TX and another referencing the 4th largest
US city, I assumed you actually meant Houston. Oh well. That'll teach me
not to make dumb jokes.
 
