Structure size and binary format

Discussion in 'C Programming' started by gamehack, Dec 31, 2005.

  1. gamehack

    gamehack Guest

    Hi all,

    I've been wondering when I write a structure like:

    struct {
    int a;
    unsigned int b;
    float c;
    } mystruct;

    And then I'm using this as a record for a binary file. The problem is
    that the size of the types is different on different
    platforms (win/lin/osx), so if a file was copied to another platform and
    read there, the first, say, 16 bits could be regarded as
    the integer a, but it could have been created on a system where an integer
    was 32 bits. Is there a portable solution to this? Moreover, I've been
    looking for some resource on designing your own binary format and I
    couldn't find anything apart from short tutorials how to read binary
    files. Are there any good resources?

    Thanks a lot
    gamehack, Dec 31, 2005
    #1

  2. On 30 Dec 2005 16:05:03 -0800, in comp.lang.c, "gamehack" wrote:

    >Hi all,
    >
    >I've been wondering when I write a structure like:
    >
    >struct {
    >int a;
    >unsigned int b;
    >float c;
    >} mystruct;
    >
    >And then I'm using this as a record for a binary file. The problem is
    >that the size of the types is different on different
    >platforms(win/lin/osx) so if a file was copied on another platform and
    >attempted to be read then the first say 16 bits could be regarded as
    >the integer a but it could have been created on system where integer
    >was 32 bits. Is there a portable solution to this?


    The simplest is to store the data as text, not binary data. Other
    methods might involve using fixed-width data types (if your platforms
    support them), or writing custom load/save functions for each platform
    which still store in binary but do it element by element and take into
    account the differing sizes of types on each platform.


    Mark McIntyre
    Mark McIntyre, Dec 31, 2005
    #2

  3. Chuck F.

    Chuck F. Guest

    gamehack wrote:
    >
    > I've been wondering when I write a structure like:
    >
    > struct {
    > int a;
    > unsigned int b;
    > float c;
    > } mystruct;
    >
    > And then I'm using this as a record for a binary file. The
    > problem is that the size of the types is different on different
    > platforms(win/lin/osx) so if a file was copied on another
    > platform and attempted to be read then the first say 16 bits
    > could be regarded as the integer a but it could have been
    > created on system where integer was 32 bits.


    Good. You recognize the existence of a problem. The answer is
    "Don't do that". Binary representations are, in general, not
    portable. You can convert things into a sequence of bytes and
    write/read those to a file, but that means you also have to write
    the conversion mechanisms. Now such things as byte sex can bite you.

    Far and away the most portable transportation mechanism is pure
    text. You already have conversion routines in the standard
    library, and all you need to do is use them. Anybody and their dog
    can read the files.
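
    A minimal sketch of the text approach, using only standard library
    conversions. The struct tag `record` and the two function names are
    made up for illustration, and the one-record-per-line layout is an
    assumption, not something specified in the thread. Values that do not
    fit the reader's int or float would still need range checks.

```c
#include <assert.h>
#include <stdio.h>

/* Same members as the struct in the question; the tag is hypothetical. */
struct record {
    int a;
    unsigned int b;
    float c;
};

/* Format one record as a line of text. Returns the number of characters
   written (not counting the terminator), or a negative value on error. */
int record_to_text(const struct record *r, char *buf, size_t size)
{
    assert(r != NULL && buf != NULL);
    /* 9 significant digits are enough to round-trip an IEEE single. */
    return snprintf(buf, size, "%d %u %.9g\n", r->a, r->b, (double)r->c);
}

/* Parse a line produced by record_to_text. Returns 1 on success, 0 otherwise. */
int record_from_text(const char *line, struct record *r)
{
    assert(line != NULL && r != NULL);
    return sscanf(line, "%d %u %f", &r->a, &r->b, &r->c) == 3;
}
```

    Writing the buffer with fputs() and reading it back with fgets()
    keeps the file readable in any text editor.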

    --
    "If you want to post a followup via groups.google.com, don't use
    the broken "Reply" link at the bottom of the article. Click on
    "show options" at the top of the article, then click on the
    "Reply" at the bottom of the article headers." - Keith Thompson
    More details at: <http://cfaj.freeshell.org/google/>
    Chuck F., Dec 31, 2005
    #3
  4. Malcolm

    Malcolm Guest

    "gamehack" wrote
    >
    > I've been wondering when I write a structure like:
    >
    > struct {
    > int a;
    > unsigned int b;
    > float c;
    > } mystruct;
    >
    > And then I'm using this as a record for a binary file. The problem is
    > that the size of the types is different on different
    > platforms(win/lin/osx) so if a file was copied on another platform and
    > attempted to be read then the first say 16 bits could be regarded as
    > the integer a but it could have been created on system where integer
    > was 32 bits. Is there a portable solution to this? Moreover, I've been
    > looking for some resource on designing your own binary format and I
    > couldn't find anything apart from short tutorials how to read binary
    > files. Are there any good resources?
    >

    Integers are easy. Just use the AND and OR operators, together with the
    bitshifts ( >> <<) to break up an integer into 8-bit chunks, and store it,
    big-endian, in a file.

    It is necessary to use the big-endian format because otherwise those
    little-endians might take over the world, and force us all to store our
    bytes at the little end, and we don't want that happening.

    The float is a bit more tricky. Floating-point numbers have their own
    internal format. The good news is that virtually all are 32-bit IEEE format
    (sign, exponent, mantissa). You can probably get away with a binary dump,
    making sure of the endianness. However, to be really portable, you do need to
    break the number up into its constituents, and then rebuild it, using the
    frexp() and ldexp() functions.
    Malcolm, Dec 31, 2005
    #4
  5. Eric Sosman

    Eric Sosman Guest

    Chuck F. wrote:

    > gamehack wrote:
    >
    >>
    >> I've been wondering when I write a structure like:
    >>
    >> struct {
    >> int a;
    >> unsigned int b;
    >> float c;
    >> } mystruct;
    >>
    >> And then I'm using this as a record for a binary file. The
    >> problem is that the size of the types is different on different
    >> platforms(win/lin/osx) so if a file was copied on another
    >> platform and attempted to be read then the first say 16 bits
    >> could be regarded as the integer a but it could have been
    >> created on system where integer was 32 bits.

    >
    >
    > Good. You recognize the existence of a problem. The answer is "Don't
    > do that". Binary representations are, in general, not portable. You
    > can convert things into a sequence of bytes and write/read those to a
    > file, but that means you also have to write the conversion mechanisms.
    > Now such things as byte sex can bite you.
    >
    > Far and away the most portable transportation mechanism is pure text.
    > You already have conversion routines in the standard library, and all
    > you need to do is use them. Anybody and their dog can read the files.
    >
    Eric Sosman, Dec 31, 2005
    #5
  6. gamehack

    gamehack Guest

    Thanks a lot guys.
    gamehack, Dec 31, 2005
    #6
  7. Eric Sosman

    Eric Sosman Guest

    (Please excuse the vacuous reply that I fat-fingered
    a moment ago.)

    Chuck F. wrote:

    > gamehack wrote:
    >
    >>
    >> I've been wondering when I write a structure like:
    >>
    >> struct {
    >> int a;
    >> unsigned int b;
    >> float c;
    >> } mystruct;
    >>
    >> And then I'm using this as a record for a binary file. The
    >> problem is that the size of the types is different on different
    >> platforms(win/lin/osx) so if a file was copied on another
    >> platform and attempted to be read then the first say 16 bits
    >> could be regarded as the integer a but it could have been
    >> created on system where integer was 32 bits.

    >
    >
    > Good. You recognize the existence of a problem. The answer is "Don't
    > do that". Binary representations are, in general, not portable. You
    > can convert things into a sequence of bytes and write/read those to a
    > file, but that means you also have to write the conversion mechanisms.
    > Now such things as byte sex can bite you.


    "Don't do that" needs a little qualification, I think.
    If "that" means "just read and write the struct in whatever
    form the compiler happens to choose," the advice is sound.
    But the claim that binary representations are not portable
    (I'm not sure what "in general" means here) doesn't hold up.
    Who has not transported a ZIP or GIF or JPEG file between
    dissimilar systems? At a lower level, who has not exchanged
    IP packets with other systems? Portability is a matter of
    agreed-upon standards, not of the underlying representations
    chosen.

    > Far and away the most portable transportation mechanism is pure text.
    > You already have conversion routines in the standard library, and all
    > you need to do is use them. Anybody and their dog can read the files.


    Text has a few pitfalls of its own. Even without appealing
    to the multitude of character encoding schemes, some difficulties
    are apparent. For example, it is no simple matter to devise a
    portable text representation for arbitrary `double' values. A
    value encoded as text, sent to another machine and decoded, then
    re-encoded and sent back again may not decode to the same value
    that was originally transmitted. It requires as much care to
    make this work for text as for binary representations. (And I've
    got the war stories from a PPOE to prove it, too ...)
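
    To illustrate the round-trip concern, here is a small sketch. The
    helper name is made up, and the "17 digits is enough" figure assumes
    that both machines use IEEE 754 doubles and that their printf and
    strtod conversions round correctly; neither is promised by the C
    standard.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Print x with the given number of significant digits, then parse the
   text back. The result equals x only if enough digits were written. */
double text_round_trip(double x, int digits)
{
    char buf[64];

    assert(digits > 0 && digits <= 40);
    snprintf(buf, sizeof buf, "%.*g", digits, x);
    return strtod(buf, NULL);
}
```

    Under those assumptions, text_round_trip(x, 17) returns x for finite
    values, while the default six digits of "%g" generally do not, for
    instance with 1.0/3.0.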

    --
    Eric Sosman
    Eric Sosman, Dec 31, 2005
    #7
  8. Eric Sosman writes:
    [...]
    > Text has a few pitfalls of its own. Even without appealing
    > to the multitude of character encoding schemes, some difficulties
    > are apparent. For example, it is no simple matter to devise a
    > portable text representation for arbitrary `double' values. A
    > value encoded as text, sent to another machine and decoded, then
    > re-encoded and sent back again may not decode to the same value
    > that was originally transmitted. It requires as much care to
    > make this work for text as for binary representations. (And I've
    > got the war stories from a PPOE to prove it, too ...)


    A hexadecimal floating-point representation (supported in C99,
    implementable in C90) should avoid at least some of the problems.
    With enough digits, you can have an exact textual representation of a
    floating-point value.
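
    A short sketch of the hexadecimal route, assuming a C99 library where
    printf supports "%a" and strtod accepts hexadecimal input. The two
    function names are invented for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Write x in C99 hexadecimal floating-point notation. Returns the
   snprintf result: characters needed, or a negative value on error. */
int double_to_hex_text(double x, char *buf, size_t size)
{
    assert(buf != NULL);
    /* With no precision given, "%a" prints enough digits for an exact
       representation when the floating-point radix is a power of two. */
    return snprintf(buf, size, "%a", x);
}

/* Parse text written by double_to_hex_text. */
double hex_text_to_double(const char *s)
{
    assert(s != NULL);
    return strtod(s, NULL);
}
```

    The exact spelling of the digits may differ between libraries, but
    each spelling denotes the same value, so the reader recovers it
    exactly if its double can hold it.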

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    We must do something. This is something. Therefore, we must do this.
    Keith Thompson, Dec 31, 2005
    #8
  9. gamehack

    gamehack Guest

    Thank you. That's why I wondered how to design a format, like .zip .jpg
    etc :) Do you basically say that each 33 bytes would be one pixel, and
    the value of red would be the first 11 bytes, green next 11 bytes, and
    then the last 11 bytes are going to be blue. And probably some fixed-size
    headers at the end of the file (or probably using some sequence of bytes to
    mark the end of fields in the header). The problem is that I haven't seen
    _any_ good resources about designing file formats. Any pointers?

    Regards,
    gamehack
    gamehack, Dec 31, 2005
    #9
  10. Eric Sosman

    Eric Sosman Guest

    gamehack wrote:
    > Thank you. That's why I wondered how to design a format, like .zip .jpg
    > etc :) Do you basically say that each 33 bytes would be one pixel, and
    > the value of red would be the first 11 bytes, green next 11 bytes, and
    > then the last 11 bytes are going to be blue. And probably some fixed-size
    > headers at the end of the file (or probably using some sequence of bytes to
    > mark the end of fields in the header). The problem is that I haven't seen
    > _any_ good resources about designing file formats. Any pointers?


    <OT>

    Visit http://www.wotsit.org/ to find descriptions of
    many file formats. Some are binary, some are textual. Some
    are designed for portability, some are not. In any event, a
    review of what's already been done should give you some ideas.
    Perhaps you'll even find an existing format that meets your
    needs; if so, adopting it might make available whole suites of
    helpful tools for dealing with it.

    </OT>
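
    As a rough idea of what many of those formats have in common, here is
    a sketch of a fixed-size header: a few identifying bytes, a version
    number and a record count, each stored at a fixed offset in a fixed
    byte order. Everything in it (the "DEMO" tag, the 12-byte size, the
    field names) is hypothetical and not taken from any real format.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HEADER_SIZE 12  /* 4 tag bytes + 4 version bytes + 4 count bytes */

/* In-memory form of the header; never written to the file directly. */
struct header {
    uint32_t version;
    uint32_t record_count;
};

static void store_u32_be(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)((v >> 24) & 0xFF);
    p[1] = (unsigned char)((v >> 16) & 0xFF);
    p[2] = (unsigned char)((v >> 8) & 0xFF);
    p[3] = (unsigned char)(v & 0xFF);
}

static uint32_t load_u32_be(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Build the on-disk bytes of the header. */
void header_encode(unsigned char out[HEADER_SIZE], const struct header *h)
{
    assert(out != NULL && h != NULL);
    memcpy(out, "DEMO", 4);                 /* hypothetical format tag */
    store_u32_be(out + 4, h->version);
    store_u32_be(out + 8, h->record_count);
}

/* Returns 1 and fills *h if the bytes carry the expected tag, else 0. */
int header_decode(const unsigned char in[HEADER_SIZE], struct header *h)
{
    assert(in != NULL && h != NULL);
    if (memcmp(in, "DEMO", 4) != 0)
        return 0;
    h->version = load_u32_be(in + 4);
    h->record_count = load_u32_be(in + 8);
    return 1;
}
```

    The tag lets a reader reject files of the wrong kind early, and the
    version field leaves room to change the record layout later.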

    --
    Eric Sosman
    Eric Sosman, Dec 31, 2005
    #10