Re: Integer of specific size

Discussion in 'C Programming' started by Kevin Easton, Aug 18, 2003.

  1. Kevin Easton

    Kevin Easton Guest

    - <> wrote:
    > I am writing a program to parse AVI files. The AVI header structures
    > contain 32 bit integers.
    >
    > My program needs to be portable, so I need to know the best way of
    > defining my structures to contain integers which are *exactly* 32 bits
    > in size.
    >
    > I can't use int as this can be 16 bits on some platforms, and even
    > long can be 64 bits.
    >
    > What's the correct way of ensuring the integers will be 32 bits in
    > size on all (or most) platforms?


    The correct way is to use a type that is big enough to hold all the
    values of interest to you - in this case, long int.

    Then you have to realise you can't read the ints directly from the file
    into the structure - you have to read them a byte at a time, in the byte
    order defined by the file format, and build up the complete value from
    its component bytes, storing it in the long int structure member.

    Quite apart from the integer size problem (there's no reason why there
    has to be *any* exactly-32-bit type), there's also the problem of
    endianness and the mostly theoretical problem of padding bits.
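
    For instance, the raw bytes might be pulled from the file like this (a
    minimal sketch; the struct and function names are only illustrative, and
    the assembly step is shown in the replies below):

    #include <stdio.h>

    /* Illustrative only: the header member is "at least 32 bits",
       not "exactly 32 bits". */
    struct avi_header {
        long micro_sec_per_frame;
        /* ... more fields ... */
    };

    /* Fetch the next four raw bytes of the file; the caller assembles
       them in the byte order the AVI format defines. */
    int read_raw4(FILE *fp, unsigned char buf[4])
    {
        return fread(buf, 1, 4, fp) == 4;
    }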

    - Kevin.
     
    Kevin Easton, Aug 18, 2003
    #1

  2. Kevin Easton

    - Guest

    Kevin Easton <> wrote in message news:<newscache$mhbtjh$j58$>...
    > Quite apart from the integer size problem (there's no reason why there
    > has to be *any* exactly-32-bit type), there's also the problem of
    > endianness and the mostly theoretical problem of padding bits.


    Thanks for the feedback everyone.

    Re endianism, is it possible to handle endianism in a general way
    rather than with a list of #defines for various specific platforms?

    I can't think how this might be done at compile time, but perhaps at
    runtime you could declare a union of an int and a 4-char array, then
    assign a signature value to the integer, say:

    myint = 0x12345678;

    and then test the values of the chars to see which contain 0x12 and
    0x34 etc.
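
    A rough sketch of that check (using unsigned long rather than int so the
    signature constant is sure to fit):

    #include <stdio.h>

    int main(void)
    {
        union {
            unsigned long u;
            unsigned char c[sizeof(unsigned long)];
        } probe;

        probe.u = 0x12345678UL;

        if (probe.c[0] == 0x78)
            printf("little-endian\n");
        else if (probe.c[sizeof probe.u - 1] == 0x78)
            printf("big-endian\n");
        else
            printf("something stranger\n");
        return 0;
    }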
     
    -, Aug 19, 2003
    #2

  3. Kevin Easton

    Eric Sosman Guest

    - wrote:
    >
    > Kevin Easton <> wrote in message news:<newscache$mhbtjh$j58$>...
    > > Quite apart from the integer size problem (there's no reason why there
    > > has to be *any* exactly-32-bit type), there's also the problem of
    > > endianness and the mostly theoretical problem of padding bits.

    >
    > Thanks for the feedback everyone.
    >
    > Re endianism, is it possible to handle endianism in a general way
    > rather than with a list of #defines for various specific platforms?
    >
    > I can't think how this might be done at compile time, but perhaps at
    > runtime you could declare a union of an int and a 4-char array, then
    > assign a signature value to the integer, say:
    >
    > myint = 0x12345678;
    >
    > and then test the values of the chars to see which contain 0x12 and
    > 0x34 etc.


    It's simpler to do as Kevin Easton suggested: process
    the bytes one by one, and use ordinary arithmetic operators
    to assemble multi-byte values. For example, if the file
    format is Big-Endian (and assuming an 8-bit byte), you can
    build a 32-bit value like this:

    unsigned char buff[4];  /* four bytes from file */
    uint32_t value;         /* see elsethread */

    value = ((uint32_t)buff[0] << 24)
          + ((uint32_t)buff[1] << 16)
          + ((uint32_t)buff[2] <<  8)
          + ((uint32_t)buff[3] <<  0);

    (There are many other essentially equivalent ways to write
    this.) If the file format is Little-Endian, use a similar
    expression but change the array subscripts. The main thing
    is that this method works no matter what endianness your
    host machine uses: Big-Endian, Little-Endian, Middle-Endian,
    World-Without-Endian -- you don't care, it just works.

    You might be wondering about the (uint32_t) casts in the
    above; they're there for a reason. Without them, an expression
    like `buff[0] << 24' would be evaluated as follows:

    - Fetch `buff[0]', an `unsigned char' value.

    - Apply the "usual arithmetic conversions" to this value,
    yielding either an `int' or an `unsigned int' depending
    on the characteristics of the platform.

    - Left-shift the resulting value by 24 bit positions.

    The problem is that you're trying to cope with the possibility
    that `int' might not be 32 bits wide; it could in fact be as
    narrow as 16 bits. If you try to apply a 24-bit shift to a
    16-bit value, you're smack in the middle of all the trouble
    you were trying to escape in the first place. Avoid this trap
    by using the explicit conversion to a 32-bit type instead of
    letting the compiler choose its own "working" type.
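
    For completeness, a sketch of the Little-Endian variant mentioned above
    (which happens to be the byte order AVI/RIFF files use), wrapped in a
    function whose name is arbitrary:

    #include <stdint.h>

    /* Assemble a 32-bit value from four bytes stored least-significant
       byte first.  Works the same no matter what byte order the host
       machine uses. */
    uint32_t le32_from_bytes(const unsigned char buff[4])
    {
        return ((uint32_t)buff[0] <<  0)
             + ((uint32_t)buff[1] <<  8)
             + ((uint32_t)buff[2] << 16)
             + ((uint32_t)buff[3] << 24);
    }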

    --
     
    Eric Sosman, Aug 19, 2003
    #3
  4. Kevin Easton

    Kevin Easton Guest

    Eric Sosman <> wrote:
    > - wrote:
    >>
    >> Kevin Easton <> wrote in message news:<newscache$mhbtjh$j58$>...
    >> > Quite apart from the integer size problem (there's no reason why there
    >> > has to be *any* exactly-32-bit type), there's also the problem of
    >> > endianness and the mostly theoretical problem of padding bits.

    >>
    >> Thanks for the feedback everyone.
    >>
    >> Re endianism, is it possible to handle endianism in a general way
    >> rather than with a list of #defines for various specific platforms?
    >>
    >> I can't think how this might be done at compile time, but perhaps at
    >> runtime you could declare a union of an int and a 4-char array, then
    >> assign a signature value to the integer, say:
    >>
    >> myint = 0x12345678;
    >>
    >> and then test the values of the chars to see which contain 0x12 and
    >> 0x34 etc.

    >
    > It's simpler to do as Kevin Easton suggested: process
    > the bytes one by one, and use ordinary arithmetic operators
    > to assemble multi-byte values. For example, if the file
    > format is Big-Endian (and assuming an 8-bit byte), you can
    > build a 32-bit value like this:
    >
    > unsigned char buff[4]; /* four bytes from file */
    > uint32_t value; /* see elsethread */
    >
    > value = ((uint32_t)buff[0] << 24)
    >       + ((uint32_t)buff[1] << 16)
    >       + ((uint32_t)buff[2] <<  8)
    >       + ((uint32_t)buff[3] <<  0);
    >

    ...and you can even use "unsigned long" instead of uint32_t.

    If you're reading a signed value from the file, it will probably be
    easiest to read it into an unsigned long as above, and then apply the
    appropriate value transformation.
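
    For example, assuming the file stores the field as 32-bit two's
    complement, the transformation might look like this (only a sketch; the
    most negative case assumes the host's long can hold -2147483648):

    /* Convert a 32-bit two's-complement bit pattern, held in an
       unsigned long, to the signed value it represents, without
       relying on the host's own representation of negative numbers. */
    long to_signed32(unsigned long u)
    {
        u &= 0xFFFFFFFFUL;                   /* keep the low 32 bits */
        if (u <= 0x7FFFFFFFUL)
            return (long)u;                  /* non-negative: use as-is */
        /* negative: value is -((2^32 - 1) - u) - 1 */
        return -(long)(0xFFFFFFFFUL - u) - 1;
    }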

    - Kevin.
     
    Kevin Easton, Aug 20, 2003
    #4
  5. Kevin Easton

    j Guest

    Eric Sosman <> wrote in message news:<>...
    > - wrote:
    > >
    > > Kevin Easton <> wrote in message news:<newscache$mhbtjh$j58$>...
    > > > Quite apart from the integer size problem (there's no reason why there
    > > > has to be *any* exactly-32-bit type), there's also the problem of
    > > > endianness and the mostly theoretical problem of padding bits.

    > >
    > > Thanks for the feedback everyone.
    > >
    > > Re endianism, is it possible to handle endianism in a general way
    > > rather than with a list of #defines for various specific platforms?
    > >
    > > I can't think how this might be done at compile time, but perhaps at
    > > runtime you could declare a union of an int and a 4-char array, then
    > > assign a signature value to the integer, say:
    > >
    > > myint = 0x12345678;
    > >
    > > and then test the values of the chars to see which contain 0x12 and
    > > 0x34 etc.

    >
    > It's simpler to do as Kevin Easton suggested: process
    > the bytes one by one, and use ordinary arithmetic operators
    > to assemble multi-byte values. For example, if the file
    > format is Big-Endian (and assuming an 8-bit byte), you can
    > build a 32-bit value like this:
    >
    > unsigned char buff[4]; /* four bytes from file */
    > uint32_t value; /* see elsethread */
    >
    > value = ((uint32_t)buff[0] << 24)
    >       + ((uint32_t)buff[1] << 16)
    >       + ((uint32_t)buff[2] <<  8)
    >       + ((uint32_t)buff[3] <<  0);
    >
    > (There are many other essentially equivalent ways to write
    > this.)


    Is your example the preferred way?
     
    j, Aug 20, 2003
    #5