How to make binary data portable?

PengYu.UT · Jun 30, 2005

Hi,

I write the contents of a to the file "data" (on a Sun machine). Then I read
"data" on both SunOS and Linux, but the results differ. Do you
know how to make binary data portable?

Best wishes,
Peng

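#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int a = 100;
    int b;
    FILE *fp;

    /* Run once on the Sun machine to create "data" ("wb" = binary mode):
    fp = fopen("data", "wb");
    if (fp == NULL) { perror("data"); return EXIT_FAILURE; }
    fwrite(&a, sizeof a, 1, fp);
    fclose(fp);
    */
    (void)a; /* only used by the write step above */

    /* Read the raw bytes back in binary mode. */
    fp = fopen("data", "rb");
    if (fp == NULL) {
        perror("data");
        return EXIT_FAILURE;
    }
    if (fread(&b, sizeof b, 1, fp) != 1) {
        fprintf(stderr, "short read\n");
        fclose(fp);
        return EXIT_FAILURE;
    }
    fclose(fp);

    /* The value printed depends on the reader's int size and byte order. */
    printf("b = %x\n", (unsigned int)b);

    return 0;
}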

Martin Ambuhl · Jun 30, 2005

Peng said:
Hi,

I write the content of a in file "data" (in Sun Machine). Then I read
"data" in both SunOS and linux. But the result is different. Do you
know how to make it binary data portable.

Binary numeric data is inherently not portable. If you want files to be
portable, your best bet is to write numeric data as text. Even that
assumes that the different implementations|platforms use a common form
of encoding text. You will find that when transporting data from one
implementation|platform to another you still need to consider whether
you need to convert that data.

Walter Roberson · Jun 30, 2005

Martin said:
Binary numeric data is inherently not portable. If you want files to be
portable, your best bet is to write numeric data as text. Even that
assumes that the different implementations|platforms use a common form
of encoding text.

The "xdr" library (which is NOT part of the C standard itself) was
written to try to deal with these issues. "xdr" stands for
"external data representation". It is commonly used for
Remote Procedure Calls, so it is available for a wide variety
of systems.

I seem to recall that the xdr folk got around to extending xdr to
work with 64 bit values, but I am not sure how widely those extensions
got implemented.

PengYu.UT · Jun 30, 2005

Martin said:
Binary numeric data is inherently not portable. If you want files to be
portable, your best bet is to write numeric data as text. Even that
assumes that the different implementations|platforms use a common form
of encoding text. You will find that when transporting data from one
implementation|platform to another you still need to consider whether
you need to convert that data.

Is there any easy way to convert the data?

PengYu.UT · Jun 30, 2005

Walter said:
The "xdr" library (which is NOT part of the C standard itself) was
written to try to deal with these issues. "xdr" stands for
"external data representation". It is commonly used for
Remote Procedure Calls, so it is available for a wide variety
of systems.

I seem to recall that the xdr folk got around to extending xdr to
work with 64 bit values, but I am not sure how widely those extensions
got implemented.

Do you have a rough idea how much performance will be lost using xdr
instead of using native representations, when I don't have to use xdr?

Randy Howard · Jun 30, 2005

Peng said:
Is there any easy way to convert the data?

Define 'easy'.

You could just write it all out as ASCII text, using a known
format, then read it in and convert it based upon that format.

The short example you used only involved an int, so it's pretty
simple. What are you really trying to do?

Or you could use something like XML if you have managers around
that like buzzwords.

Charles Mills · Jun 30, 2005

Peng said:
Hi,

I write the content of a in file "data" (in Sun Machine). Then I read
"data" in both SunOS and linux. But the result is different. Do you
know how to make it binary data portable.

Best wishes,
Peng

If you are consistent about the following three things you should be OK
on the vast majority of platforms:
1) type (float, signed integer, unsigned integer)
2) size
3) endianness

For example if you always represent some value in your file as a 32 bit
big endian unsigned integer you will have no problems as long as you
are consistent about this. (Always read and write the value as a 32
bit big endian unsigned integer. It would be good programming practice
to have one module which handles this.)

The C99 header stdint.h provides definitions of signed and unsigned
integers with specific sizes/widths.

Floating point numbers can be a headache, especially if your data is
moving across machines that don't use IEEE floats. If you search the
internet you will probably be able to find C code which converts other
floating point representations to the IEEE representation.

To ensure consistent endianness, byte-swapping macros will probably come
in handy. glib and other libraries provide this kind of macro
(http://developer.gnome.org/doc/API/glib/); also see htonl() and
friends.

-Charlie

Sensei · Jun 30, 2005

Can all this be avoided using a byte-wise representation? I mean,
choosing to write, forcing the representation:

0xAABBCCDD

as

AA BB CC DD

Is this what glib does?

Walter Roberson · Jul 1, 2005

Sensei said:
Can all this be avoided using a byte-wise representation?

You did not quote enough context to indicate what "all this" is.

Sensei said:
I mean,
choosing to write, forcing the representation:

AA BB CC DD

Is this what glib does?

glib does a lot of different things; you would need to be more
specific.

There is a standard for 32 bit 2s-complement integers, which is
known as "network byte order"; that standard is "big-endian".

The original poster did not, however, indicate that the values to
be exchanged are integers, and did not indicate a size -- and the
original poster listed operating systems, not machine representations
(one could run Linux on a 1's complement machine for example.)

It turns out that the common representation of double is more
pervasive than the common representation of float (or to put that
another way, the representation of float is more variable than
the representation for double.) But one gets into issues such
as native 80-bit doubles, and one gets into "long double"
difficulties -- and the fact that a particular representation
of plain double is common does not indicate that representation
is the one that will be used on the OP's Linux systems.

Walter Roberson · Jul 1, 2005

Peng said:
Do you have a rough idea how much performance will be lost using xdr
instead of using native representations, when I don't have to use xdr?

No, I can't rightly say that I do.

SunOS is an operating system, which is produced for multiple
processors.

Linux is an operating system, which is produced for a wide variety
of processors.

Telling us that you are taking the data from SunOS to Linux
narrows down the source data representations to one of a few,
but leaves the destination data representation pretty wide open.

We can't meaningfully speak about "efficiency" without knowing
the hardware details of the source and destination computers
and of exactly how the data is to be processed. For example,
if the data is just sitting around on the Sun box and you
write a program that does nothing other than read it there,
serialize it, copy it to the Linux box, and deserialize it,
and you run that program in the background, then how much
"efficiency" is lost compared to getting faster but incorrect
answers due to having used incompatible binary formats?

Were you aware that even if both sides happen to use IEEE 754
representations, merely doing byte-order conversions is not
sufficient? IEEE 754 nails down the representation for most
arithmetic values, but there are values for which the implementation is
given more flexibility. IEEE 754 includes representations
for positive and negative infinities, negative zero, various
signaling numbers, denormalized numbers, and sets of
"Not A Number" (NaN). The available denormalized numbers and
NaNs are especially implementation dependent, if my memory serves
me correctly.

You didn't tell us anything about the characteristics of the binary
data, so we must assume that you are using "long double" on the Sun, and
that the data includes some of the IEEE 754 special cases. And since
you didn't tell us anything about the destination Linux system, we
must assume that it is a bit-sliced machine that uses either
one's-complement or separate-sign and that it doesn't have "long
double" available at all.
