reading Java floats from C

sbalko

Hi,

I am trying to read Java-floats (IEEE 754 encoding) stored in a binary
file from C (gcc on linux/i386, more specifically). Unfortunately, C
seems to expect floats to be stored somewhat differently than Java
does. I suspected an endianness problem and tried out ntohl/htonl, but it
doesn't help.

Any clues?

Thanks,
Sören
 
Lawrence Kirby

Hi,

I am trying to read Java-floats (IEEE 754 encoding) stored in a binary
file

The question there would be how Java stores floats in a file, which would
depend on the code used to store them. From the information given there's
no reason to assume that the Java code is storing them in the same binary
format used internally. I think you'll need to discuss this in a
Java-related newsgroup.
from C (gcc on linux/i386, more specifically). Unfortunately, C
seems to expect floats to be stored somewhat differently than Java
does. I suspected an endianness problem and tried out ntohl/htonl but it
doesn't help.

C doesn't specify the representation used by floating point types,
although IEEE 754 is typical. If you give some information about the
file format the Java code is using, and the C code you are using to read
the values we should be able to help you.

Lawrence
 
sbalko

Lawrence said:
The question there would be how Java stores floats in a file, which would
depend on the code used to store them. From the information given there's
no reason to assume that the Java code is storing them in the same binary
format used internally. I think you'll need to discuss this in a
Java-related newsgroup.
On the java side, I am using DataOutputStream's writeFloat method which
explicitly uses IEEE 754 to encode a float into 4 bytes.
C doesn't specify the representation used by floating point types,
although IEEE 754 is typical. If you give some information about the
file format the Java code is using, and the C code you are using to read
the values we should be able to help you.
Actually the Java file is a plain format with intermixed ASCII and
subsequently stored floats. On the C side, things are a bit more
complex. I am using mmap to map the file to a main memory address
(cast to a char* pointer). Then I memcpy 4 bytes from certain offsets
in the buffer to a float variable. I also tried to apply ntohl to that
float, but that doesn't solve my problem either.
 
Malcolm

Actually the Java file is a plain format with intermixed ASCII and
subsequently stored floats. On the C side, things are a bit more
complex. I am using mmap to map the file to a main memory address
(cast to a char* pointer). Then I memcpy 4 bytes from certain offsets
in the buffer to a float variable. I also tried to apply ntohl to that
float, but that doesn't solve my problem either.
Do you know how floating point numbers are generally constructed?

By doing some experiments you ought to be able to work out what
representation your Java platform and C compiler use, and to convert. Watch
out for special cases like NaN, infinity, and very small numbers.
 
Lawrence Kirby

On Mon, 20 Jun 2005 15:34:53 -0700, sbalko wrote:

....
Actually the Java file is a plain format with intermixed ASCII and
subsequently stored floats.

I suggest you log the representation of the data you've read in. You
access the representation of an object by treating it as an array of
unsigned char e.g.

TYPE var = value;
const unsigned char *ptr = (const unsigned char *)&var;
size_t i;

/* print each byte of var's object representation in hex */
for (i = 0; i < sizeof var; i++)
    printf(" %02x", ptr[i]);

Also do this for the same values set in the C environment. You should then
be able to see if

a) you've read the data in correctly

b) how the Java and C representations correspond.
On the C side, things are a bit more
complex. I am using mmap to map the file to a main memory address
(cast to a char* pointer). Then I memcpy 4 bytes from certain offsets
in the buffer to a float variable. I also tried to apply ntohl to that
float, but that doesn't solve my problem either.

ntohl isn't a standard C library function. Given the common socket-related
definition of it, you can't apply it directly to a float: you would be
converting the value to a long, swapping bytes and converting back again,
which will give a completely wrong result.

Instead of using memcpy() to copy into the float try code that copies the
bytes to the float object in reverse order. E.g.

void unmarshall_float(float *fl, const unsigned char *data)
{
    unsigned char *flrep = (unsigned char *)fl;
    size_t i;

    /* copy the bytes in reverse order to swap the byte order */
    for (i = 0; i < sizeof(float); i++)
        flrep[i] = data[sizeof(float)-1-i];
}

Lawrence
 
Joe Wright

I would, if possible, coerce Java to write text like '1.23456789e2' for
floats. Convert them with strtod() on the C side.
 
Robert Maas, see http://tinyurl.com/uh3t

(I've cross-posted this to comp.programming where it's more relevant.
Also I've blacked out the specific names of programming languages
because that's irrelevant to my general answer.)
From: (e-mail address removed)
I am trying to read ###-floats (IEEE ??? encoding) stored in a binary
file from %%% (??? on ???, more specifically). Unfortunately, %%% seems
to expect floats to be stored somewhat differently than ### does. I
suspected an endianness problem and tried out ntohl/htonl but it
doesn't help. Any clues?

If you can't find such an answer from online documents, why didn't you
just do some experiments? For example, try this to see how ### writes
floats in binary mode: Write a test program that writes out exactly
five values of exactly 0.0, then write out these values in sequence:
9.0 0.0 10.0 0.0 11.0 0.0 12.0 0.0 13.0 0.0 14.0 0.0 15.0 0.0, and then
examine the resultant file to see if you can find:
- The same exact pattern repeating exactly five times before it's
broken by other patterns not the same, to show you what the 0.0 looks
like in binary file format.
- Alternating original pattern and other patterns the same length, to
make sure you haven't accidentally used different precision for the
non-zero values generated from the index variable in your loop and the
zero values generated by literals.
- Among those non-zero groups of bytes, see if you can find a bit
pattern that goes somewhat like this:
1001
1010
1011
1100
1101
1110
1111
The '1' might be missing if it's in a notation where the 1 is assumed
rather than explicit, but the other bits should follow that pattern.

At that point you have a good idea where the mantissa is located. Now
to find where the exponent is located, generate this sequence:
0.0 1.0 0.0 2.0 0.0 4.0 0.0 8.0 0.0 16.0 0.0 32.0 0.0 64.0 0.0
You should see a similar pattern in the bits.

Finally you need to know how negative numbers are expressed.
I leave that as an exercise for the reader.

Once you know all that for ###, do the same for %%%.
Write that test data from the program that will be doing the reading.
(Unless it's totally broken, it should write data in the same layout
that it expects to read it in.)

Now compare what you learned about ### and %%%, whether sequence of
bytes is the only difference, or there's a more complicated difference
in representation.
 
