bitfield confusion

Discussion in 'C Programming' started by mathog, Jul 11, 2013.

  1. mathog

    mathog Guest

    I am having one of those days - what am I doing wrong here?

    1. The Microsoft EMF+ specification, section 2.2.2.19

    http://msdn.microsoft.com/en-us/library/cc231004.aspx

    says that GraphicsVersion objects are 32 bits as:

    0-19 Metafile Signature
    20-31 GraphicsVersion enumeration

    2. So I defined what I thought would be the corresponding struct, at
    least for Intel platforms. This assumes bitfields are listed in the
    struct from least to most significant bits, perhaps they are the other
    way around? Or is this one of those areas where the compiler can do
    anything it wants?

    typedef struct {
    unsigned int Signature : 20;
    unsigned int GrfVersion : 12;
    } U_PMF_GRAPHICSVERSION;

    3. Opened a file with EMF+ records and found the corresponding 32 bits.
    This is on an Intel architecture machine and the file was made by
    Powerpoint on this machine. Examine the value various ways with this code:

    U_PMF_GRAPHICSVERSION Version;
    printf("DEBUG at offset:%8.8X\n",
    *(uint32_t *)contents);
    printf("DEBUG at offset by byte:%2.2X %2.2X %2.2X %2.2X\n",
    *(uint8_t *)(contents + 0),
    *(uint8_t *)(contents + 1),
    *(uint8_t *)(contents + 2),
    *(uint8_t *)(contents + 3)
    );
    memcpy(&Version, contents, sizeof(U_PMF_GRAPHICSVERSION));
    printf("DEBUG Sig:%X GrfV:%X\n",
    Version.Signature, Version.GrfVersion);

    The output is

    DEBUG at offset:DBC01002
    DEBUG at offset by byte:02 10 C0 DB
    DEBUG Sig:1002 GrfV:DBC

    For an EMF+ file signature must be 0xDBC01, and version can be 2.

    I must be screwing up somewhere, but where? The first two DEBUG lines
    are consistent with this being a little endian system. "DB" is clearly
    at the most significant bit end of the 32 bits, but the EMF+
    specification appears to say that it should be somewhere in the middle.
    For a little endian machine, doesn't (1) say that for sig == DBC01 and
    version == 002 the bytes in the file should be: 01 BC 2D 00 ?? Swapping
    the order of the bit fields in the struct above does put DBC01 in Sig
    and 2 in GrfV, but it does not seem to be consistent with the
    documentation.

    Thanks,

    David Mathog
     
    mathog, Jul 11, 2013
    #1

  2. mathog

    James Kuyper Guest

    On 07/11/2013 03:59 PM, mathog wrote:
    > I am having one of those days - what I am doing wrong here?
    >
    > 1. The Microsoft EMF+ specification, section 2.2.2.19
    >
    > http://msdn.microsoft.com/en-us/library/cc231004.aspx
    >
    > says that GraphicsVersion objects are 32 bits as:
    >
    > 0-19 Metafile Signature
    > 20-31 GraphicsVersion enumeration
    >
    > 2. So I defined what I thought would be the corresponding struct, at
    > least for Intel platforms. This assumes bitfields are listed in the
    > struct from least to most significant bits, perhaps they are the other
    > way around? Or is this one of those areas where the compiler can do
    > anything it wants?


    Almost. There are only a few restrictions imposed by the C standard on
    the allocation of bit-fields. The relevant requirements are in terms of
    addressable storage units, about which the standard says very little -
    it does not say how big they are, and does not require the
    implementation to document how big they are, nor does it require that
    the size be the same in all contexts.

    "... If enough space remains, a bit-field that immediately follows
    another bit-field in a structure shall be packed into adjacent bits of
    the same unit. If insufficient space remains, whether a bit-field that
    does not fit is put into the next unit or overlaps adjacent units is
    implementation-defined. The order of allocation of bit-fields within a
    unit (high-order to low-order or low-order to high-order) is
    implementation-defined. The alignment of the addressable storage unit is
    unspecified." (6.7.2.1p6)
    "... As a special case, a bit-field structure member with a width of 0
    indicates that no further bit-field is to be packed into the unit in
    which the previous bitfield, if any, was placed." (6.7.2.1p7)

    What the standard fails to guarantee about bit-field layouts renders
    them useless for such purposes, at least in code that needs to be
    portable. That's a bit of a shame, because such purposes would otherwise
    be overwhelmingly the most popular reasons for using them.
     
    James Kuyper, Jul 11, 2013
    #2

  3. mathog

    Lew Pitcher Guest

    On Thursday 11 July 2013 15:59, in comp.lang.c, mathog wrote:

    > I am having one of those days - what I am doing wrong here?
    >
    > 1. The Microsoft EMF+ specification, section 2.2.2.19
    >
    > http://msdn.microsoft.com/en-us/library/cc231004.aspx
    >
    > says that GraphicsVersion objects are 32 bits as:
    >
    > 0-19 Metafile Signature
    > 20-31 GraphicsVersion enumeration
    >
    > 2. So I defined what I thought would be the corresponding struct, at
    > least for Intel platforms. This assumes bitfields are listed in the
    > struct from least to most significant bits, perhaps they are the other
    > way around? Or is this one of those areas where the compiler can do
    > anything it wants?
    >
    > typedef struct {
    > unsigned int Signature : 20;
    > unsigned int GrfVersion : 12;
    > } U_PMF_GRAPHICSVERSION;
    >
    > 3. Opened a file with EMF+ records and found the corresponding 32 bits.
    > This is on an Intel architecture machine and the file was made by
    > Powerpoint on this machine. Examine the value various ways with this
    > code:
    >
    > U_PMF_GRAPHICSVERSION Version;
    > printf("DEBUG at offset:%8.8X\n",
    > *(uint32_t *)contents);
    > printf("DEBUG at offset by byte:%2.2X %2.2X %2.2X %2.2X\n",
    > *(uint8_t *)(contents + 0),
    > *(uint8_t *)(contents + 1),
    > *(uint8_t *)(contents + 2),
    > *(uint8_t *)(contents + 3)
    > );
    > memcpy(&Version, contents, sizeof(U_PMF_GRAPHICSVERSION));
    > printf("DEBUG Sig:%X GrfV:%X\n",
    > Version.Signature, Version.GrfVersion);
    >
    > The output is
    >
    > DEBUG at offset:DBC01002
    > DEBUG at offset by byte:02 10 C0 DB
    > DEBUG Sig:1002 GrfV:DBC
    >
    > For an EMF+ file signature must be 0xDBC01, and version can be 2.
    >
    > I must be screwing up somewhere, but where? The first two DEBUG lines
    > are consistent with this being a little endian system. "DB" is clearly
    > at the most significant bit end of the
    > 32 bits, but the EMF+ specification appears to say that it should be
    > somewhere in the middle. For a little endian machine, doesn't (1) say
    > that for sig == DBC01 sig and version == 002 the bytes in the file
    > should be: 01 BC 2D 00 ?? Swapping the order of the bit fields in the
    > struct above does put DBC01 in Sig and 2 in GrfV, but it does not seem
    > to be consistent with the documentation.


    The C standard doesn't define how bitfields are ordered within the
    underlying storage, leaving that up to the implementation. For Microsoft C
    compilers (eg Visual Studio), bitfields are allocated from lo-order to
    hi-order against the underlying integer used as storage for bitfields.
    (See http://msdn.microsoft.com/en-us/library/yszfawxh(v=vs.80).aspx)

    Your structure, when mapped by a MS C compiler, would result in
    Signature being mapped to the 2^19 through 2^0 bits, and
    GrfVersion being mapped to the 2^31 through 2^20 bits
    of the underlying unsigned integer that the bit fields are packed into.

    3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
    1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
    |<----GrfVersion------->|<-------------Signature--------------->|

    This is not a bit-level mapping of storage; it is a logical mapping against
    the interpreted value in storage; in other words, for Microsoft C products,
    the bitfields are interpreted in the same little-endian order that the
    underlying integer type is interpreted.

    When you memcpy()ed the 4 byte value into your structure, you, in effect,
    set that underlying integer value to 0xDBC01002, not 0x0210C0DB.
    Consequently, GrfVersion mapped to the 0xDBC portion of the underlying
    integer value, and Signature mapped to the 0x01002 portion.

    To fix this, change your structure to match the compiler's implicit bitmap
    mapping:

    typedef struct {
    unsigned int GrfVersion : 12;
    unsigned int Signature : 20;
    } U_PMF_GRAPHICSVERSION;

    This will map GrfVersion to the 12 lo-order bits of the underlying integer
    used as bitmap storage (in your case, the 0x002), and Signature to the 20
    hi-order bits of the underlying integer (in your case, the 0xDBC01).

    HTH
    --
    Lew Pitcher
    "In Skills, We Trust"
     
    Lew Pitcher, Jul 12, 2013
    #3
  4. mathog

    JohnF Guest

    Lew Pitcher wrote:
    > mathog wrote:
    >> typedef struct {
    >> unsigned int Signature : 20;
    >> unsigned int GrfVersion : 12;
    >> } U_PMF_GRAPHICSVERSION;

    > To fix this, [...]


    [...] just forget those bitfields entirely, and do it the hard way,
    for example,
    /* ---
    * bitfield macros (byte_bits=76543210, with lsb=bit#0 and 128=bit#7set)
    * --------------------------------------------------------------------- */
    #define getbit(x,bit) ( ((x) >> (bit)) & 1 ) /* get bit-th bit of x */
    #define setbit(x,bit) ( (x) |= (1<<(bit)) ) /* set bit-th bit of x */
    #define clearbit(x,bit) ( (x) &= ~(1<<(bit)) ) /* clear bit-th bit of x */
    #define putbit(x,bit,val) \
    if(((int)(val))==0) clearbit((x),(bit)); else setbit((x),(bit))
    #define bitmask(nbits) ((1<<(nbits))-1) /* a mask of nbits 1's */
    #define getbitfield(x,bit1,nbits) (((x)>>(bit1)) & (bitmask(nbits)))
    #define putbitfield(x,bit1,nbits,val) /* x:bit1...bit1+nbits-1 = val */ \
    if ( (nbits)>0 && (bit1)>=0 ) { /* check input */ \
    (x) &= (~((bitmask((nbits))) << (bit1))); /*set field=0's*/ \
    (x) |= (((val)&(bitmask((nbits)))) << (bit1)); /*set field=val*/ \
    } else /* let user supply final ; */
    this will be a little more portable.
    --
    John Forkosh ( mailto: where j=john and f=forkosh )
     
    JohnF, Jul 12, 2013
    #4
  5. "mathog" <> wrote in message
    news:krn2g1$gig$...
    >I am having one of those days - what I am doing wrong here?
    >
    > 1. The Microsoft EMF+ specification, section 2.2.2.19
    >
    > http://msdn.microsoft.com/en-us/library/cc231004.aspx
    >
    > says that GraphicsVersion objects are 32 bits as:
    >
    > 0-19 Metafile Signature
    > 20-31 GraphicsVersion enumeration
    >
    > 2. So I defined what I thought would be the corresponding struct, at
    > least for Intel platforms. This assumes bitfields are listed in the
    > struct from least to most significant bits, perhaps they are the other way
    > around? Or is this one of those areas where the compiler can do anything
    > it wants?
    >
    > typedef struct {
    > unsigned int Signature : 20;
    > unsigned int GrfVersion : 12;
    > } U_PMF_GRAPHICSVERSION;


    It's safer (more portable and reliable) to define bitfields manually - i.e.
    with masks and shifts. There is some simple C code showing how to access and
    manipulate such fields at

    http://codewiki.wikispaces.com/bitfield_operations.c

    James
     
    James Harris \(es\), Jul 12, 2013
    #5
  6. mathog

    Rosario1903 Guest

    On Thu, 11 Jul 2013 12:59:08 -0700, mathog wrote:
    >
    > U_PMF_GRAPHICSVERSION Version;
    > printf("DEBUG at offset:%8.8X\n",
    > *(uint32_t *)contents);
    > printf("DEBUG at offset by byte:%2.2X %2.2X %2.2X %2.2X\n",
    > *(uint8_t *)(contents + 0),
    > *(uint8_t *)(contents + 1),
    > *(uint8_t *)(contents + 2),
    > *(uint8_t *)(contents + 3)
    > );


    I don't see the definition of contents...
    I suppose it is "uint8_t *contents;"

    If contents is a pointer to uint8_t [or int8_t, or char or unsigned
    char if char is 8 bits], this would print the first 4 contiguous
    chars.

    If contents is a pointer to u32 or a pointer to int [or long or float],
    this would print the first char of each of the first 4 elements
    of the array of u32 [or int or long or float] that "contents"
    points to.
     
    Rosario1903, Jul 12, 2013
    #6
  7. mathog

    Ian Collins Guest

    James Harris (es) wrote:
    > "mathog" <> wrote in message
    > news:krn2g1$gig$...
    >> I am having one of those days - what I am doing wrong here?
    >>
    >> 1. The Microsoft EMF+ specification, section 2.2.2.19
    >>
    >> http://msdn.microsoft.com/en-us/library/cc231004.aspx
    >>
    >> says that GraphicsVersion objects are 32 bits as:
    >>
    >> 0-19 Metafile Signature
    >> 20-31 GraphicsVersion enumeration
    >>
    >> 2. So I defined what I thought would be the corresponding struct, at
    >> least for Intel platforms. This assumes bitfields are listed in the
    >> struct from least to most significant bits, perhaps they are the other way
    >> around? Or is this one of those areas where the compiler can do anything
    >> it wants?
    >>
    >> typedef struct {
    >> unsigned int Signature : 20;
    >> unsigned int GrfVersion : 12;
    >> } U_PMF_GRAPHICSVERSION;

    >
    > It's safer (more portable and reliable) to define bitfields manually - i.e.
    > with masks and shifts. There is some simple C code showing how to access and
    > manipulate such fields at
    >
    > http://codewiki.wikispaces.com/bitfield_operations.c


    I really don't understand why people get so hung up about bit fields, or
    why they'd want to muck about with shifts and masks. That low level
    stuff is the compiler's job. It isn't rocket science to determine the
    order of bit fields (my day to day platform has preprocessor macros for
    this) and to use them correctly and portably.

    --
    Ian Collins
     
    Ian Collins, Jul 12, 2013
    #7
  8. Ian Collins wrote:
    >I really don't understand why people get so hung up about bit fields, or
    >why they'd want to muck about with shifts and masks. That low level
    >stuff is the compiler's job. It isn't rocket science to determine the
    >order of bit fields (my day to day platform has preprocessor macros for
    >this) and to use them correctly and portably.


    Controlling the layout and contents of a byte/char/int/word/etc. on a
    bit by bit basis is indispensable in embedded work, where it is
    commonplace to have to read and write hardware registers containing
    several fields defined by bit position and width in bits.

    Not being able to use bitfields in a portable way, (because of the
    implementation dependent aspects,) leaves no choice but to muck about
    with shifts and masks ...
    --
    Roberto Waltman

    [ Please reply to the group,
    return address is invalid ]
     
    Roberto Waltman, Jul 12, 2013
    #8
  9. mathog

    Les Cargill Guest

    Ian Collins wrote:
    > James Harris (es) wrote:
    >> "mathog" <> wrote in message
    >> news:krn2g1$gig$...
    >>> I am having one of those days - what I am doing wrong here?
    >>>
    >>> 1. The Microsoft EMF+ specification, section 2.2.2.19
    >>>
    >>> http://msdn.microsoft.com/en-us/library/cc231004.aspx
    >>>
    >>> says that GraphicsVersion objects are 32 bits as:
    >>>
    >>> 0-19 Metafile Signature
    >>> 20-31 GraphicsVersion enumeration
    >>>
    >>> 2. So I defined what I thought would be the corresponding struct, at
    >>> least for Intel platforms. This assumes bitfields are listed in the
    >>> struct from least to most significant bits, perhaps they are the
    >>> other way
    >>> around? Or is this one of those areas where the compiler can do
    >>> anything
    >>> it wants?
    >>>
    >>> typedef struct {
    >>> unsigned int Signature : 20;
    >>> unsigned int GrfVersion : 12;
    >>> } U_PMF_GRAPHICSVERSION;

    >>
    >> It's safer (more portable and reliable) to define bitfields manually -
    >> i.e.
    >> with masks and shifts. There is some simple C code showing how to
    >> access and
    >> manipulate such fields at
    >>
    >> http://codewiki.wikispaces.com/bitfield_operations.c

    >
    > I really don't understand why people get so hung up about bit fields, or
    > why they'd want to muck about with shifts and masks. That low level
    > stuff is the compiler's job. It isn't rocket science to determine the
    > order of bit fields (my day to day platform has preprocessor macros for
    > this) and to use them correctly and portably.
    >



    I've principally used them for FPGA register maps
    and certain binary comms protocols. In neither case must they
    be fully portable - a different architecture would likely
    mean a different FPGA anyway.

    And for network protocols, it's not hard to have a header file per
    "endianness" permutation.

    Bit fields are way better than bit shifts and macros.

    --
    Les Cargill
     
    Les Cargill, Jul 12, 2013
    #9
  10. mathog

    Les Cargill Guest

    Roberto Waltman wrote:
    > Ian Collins wrote:
    >> I really don't understand why people get so hung up about bit fields, or
    >> why they'd want to muck about with shifts and masks. That low level
    >> stuff is the compiler's job. It isn't rocket science to determine the
    >> order of bit fields (my day to day platform has preprocessor macros for
    >> this) and to use them correctly and portably.

    >
    > Controlling the layout and contents of a byte/char/int/word/etc. on a
    > bit by bit basis is indispensable in embedded work, were is
    > commonplace to have to read and write hardware registers containing
    > several fields defined by bit position and width in bits.
    >
    > Not being able to use bitfields in a portable way, (because of the
    > implementation dependent aspects,) leaves no choice but to muck about
    > with shifts and masks ...



    In general, if the target boards have changed enough to where bit field
    orientation matters, you'll have other portability fun as well.

    > --
    > Roberto Waltman
    >
    > [ Please reply to the group,
    > return address is invalid ]
    >


    --
    Les Cargill
     
    Les Cargill, Jul 12, 2013
    #10
  11. mathog

    James Kuyper Guest

    On 07/12/2013 05:50 PM, Ian Collins wrote:
    > James Harris (es) wrote:

    ....
    >> It's safer (more portable and reliable) to define bitfields manually - i.e.
    >> with masks and shifts. There is some simple C code showing how to access and
    >> manipulate such fields at
    >>
    >> http://codewiki.wikispaces.com/bitfield_operations.c

    >
    > I really don't understand why people get so hung up about bit fields, or
    > why they'd want to muck about with shifts and masks. That low level
    > stuff is the compiler's job. It isn't rocket science to determine the
    > order of bit fields (my day to day platform has preprocessor macros for
    > this) and to use them correctly and portably.


    Well, I don't like using shifts and masks, but I use them because I
    don't know how to use bit-fields to correctly and portably parse an
    externally defined data structure. Would you care to demonstrate how it
    is done? To make things concrete, let's consider the following case:

    The raw data has a record size of 32 bits. The fields in that record
    have lengths of 10, 10, and 12 bits, respectively. I would use the
    following shift and mask instructions to extract them from an unsigned
    char buffer of length 4:

    #include <limits.h>
    #if CHAR_BIT != 8
    #error this code requires CHAR_BIT == 8
    #endif
    unsigned field1 = buffer[0] << 2 | buffer[1] >> 6;
    unsigned field2 = (buffer[1] & 0x3f) << 4 | buffer[2] >> 4;
    unsigned field3 = (buffer[2] & 0x0F) << 8 | buffer[3];
    If I were doing a lot of this, I'd define appropriate macros to simplify
    the extraction of those bit-fields, but this is what those macros would
    expand to.

    What would your code using bit-fields and preprocessor macros to extract
    these fields look like, given that it must be portable to both of the
    following fully-conforming implementations of C (among others)?

    Implementation A uses addressable storage units with a size of 16 bits,
    interprets those units as little-endian 16-bit integers. It forces
    consecutive bit-fields to share a storage unit, even if that means that
    they will have to cross a storage unit boundary. It assigns bit-fields
    to the bits of those 16-bit integers in order from high to low.

    Implementation B uses addressable storage units with a size of 8 bits.
    Bit-fields that are too big to fit in the remaining space of one storage
    unit start in the next storage unit. Within each storage unit, bits are
    assigned to bit-fields in order from low to high.
     
    James Kuyper, Jul 13, 2013
    #11
  12. mathog

    Ian Collins Guest

    Roberto Waltman wrote:
    > Ian Collins wrote:
    >> I really don't understand why people get so hung up about bit fields, or
    >> why they'd want to muck about with shifts and masks. That low level
    >> stuff is the compiler's job. It isn't rocket science to determine the
    >> order of bit fields (my day to day platform has preprocessor macros for
    >> this) and to use them correctly and portably.

    >
    > Controlling the layout and contents of a byte/char/int/word/etc. on a
    > bit by bit basis is indispensable in embedded work, were is
    > commonplace to have to read and write hardware registers containing
    > several fields defined by bit position and width in bits.
    >
    > Not being able to use bitfields in a portable way, (because of the
    > implementation dependent aspects,) leaves no choice but to muck about
    > with shifts and masks ...


    No it doesn't. I've been happily using bit fields for register mappings
    for the past three decades :) Nine times out of ten an embedded project
    uses a single compiler, so even if portability was an issue (which it
    seldom is) it is irrelevant. For driver development on bigger systems,
    the compilers for that system use the same mapping, and the platform
    provides appropriate architecture specific macros. An example from a
    Solaris header:

    struct ip {
    #ifdef _BIT_FIELDS_LTOH
    uchar_t ip_hl:4, /* header length */
    ip_v:4; /* version */
    #else
    uchar_t ip_v:4, /* version */
    ip_hl:4; /* header length */
    #endif

    --
    Ian Collins
     
    Ian Collins, Jul 13, 2013
    #12
  13. mathog

    mathog Guest

    Lew Pitcher wrote:

    > This is not a bit-level mapping of storage; it is a logical mapping against
    > the interpreted value in storage; in other words, for Microsoft C products,
    > the bitfields are interpreted in the same little-endian order that the
    > underlying integer type is interpreted.


    I found the problem, finally. Section 1.3.2 of the EMF+ documentation says:

    Data in the EMF+ metafile records are stored in little-endian format.

    Some computer architectures number bytes in a binary word from left
    to right, which is referred to as big-endian. The byte numbering used
    for bitfields in this specification is big-endian. Other
    architectures number the bytes in a binary word from right to left,
    which is referred to as little-endian. The byte numbering used for
    enumerations, objects, and records in this specification is little-
    endian.

    Why in the world would they use big-endian for bitfields and
    little-endian for everything else???????

    Thanks,

    David Mathog
     
    mathog, Jul 13, 2013
    #13
  14. mathog

    Ian Collins Guest

    James Kuyper wrote:
    > On 07/12/2013 05:50 PM, Ian Collins wrote:
    >> James Harris (es) wrote:

    > ....
    >>> It's safer (more portable and reliable) to define bitfields manually - i.e.
    >>> with masks and shifts. There is some simple C code showing how to access and
    >>> manipulate such fields at
    >>>
    >>> http://codewiki.wikispaces.com/bitfield_operations.c

    >>
    >> I really don't understand why people get so hung up about bit fields, or
    >> why they'd want to muck about with shifts and masks. That low level
    >> stuff is the compiler's job. It isn't rocket science to determine the
    >> order of bit fields (my day to day platform has preprocessor macros for
    >> this) and to use them correctly and portably.

    >
    > Well, I don't like using shifts and masks, but I use them because I
    > don't know how to use bit-fields to correctly and portably parse an
    > externally defined data structure. Would you care to demonstrate how it
    > is done? To make things concrete, let's consider the following case:


    Before we get to that, I'd just like to make it clear that I'm not
    claiming bit-fields are 100% applicable, more like 90+%. In all of the
    real world code I've worked with they have been an appropriate solution.
    I have yet to encounter two compilers for the same platform that use
    different bit-field ordering. I know this isn't guaranteed, but in the
    real world this is one area where common sense prevails.

    > The raw data has a record size of 32 bits. The fields in that record
    > have lengths of 10, 10, and 12 bits, respectively. I would use the
    > following shift and mask instructions to extract them from an unsigned
    > char buffer of length 4:
    >
    > #if CHAR_BIT != 8
    > #error this code requires CHAR_BIT == 8
    > #endif
    > unsigned field1 = buffer[0] << 2 | buffer[1] >> 6;
    > unsigned field2 = (buffer[1] & 0x3f) << 4 | buffer[2] >> 4;
    > unsigned field3 = (buffer[2] & 0x0F) << 8 | buffer[3];
    > If I were doing a lot of this, I'd define appropriate macros to simplify
    > the extraction of those bit-fields, but this is what those macros would
    > expand to.


    In the absence of bit-fields, you should always use functions (or if you
    have a strong stomach, macros) for bit manipulation, otherwise you are
    vulnerable to changes in the layout.

    > What would your code using bit-fields and preprocessor macros to extract
    > these fields look like, given that it must be portable to both of the
    > following fully-conforming implementations of C (among others)?
    >
    > Implementation A uses addressable storage units with a size of 16 bits,
    > interprets those units as little-endian 16-bit integers. It forces
    > consecutive bit-fields to share a storage unit, even if that means that
    > they will have to cross a storage unit boundary. It assigns bit-fields
    > to the bits of those 16-bit integers in order from high to low.
    >
    > Implementation B uses addressable storage units with a size of 8 bits.
    > Bit-fields that are too big to fit in the remaining space of one storage
    > unit start in the next storage unit. Within each storage unit, bits are
    > assigned to bit-fields in order from low to high.


    Given that set of requirements, I would parse the data once into a
    naturally aligned struct and work with that. I would consider this more
    of a data serialisation task.

    --
    Ian Collins
     
    Ian Collins, Jul 13, 2013
    #14
  15. mathog

    mathog Guest

    mathog wrote:
    > I am having one of those days - what I am doing wrong here?


    (sorry, posted this in the wrong place a minute ago, so this is a duplicate)

    I found the problem, finally. Section 1.3.2 of the EMF+ documentation says:

    Data in the EMF+ metafile records are stored in little-endian format.

    Some computer architectures number bytes in a binary word from left
    to right, which is referred to as big-endian. The byte numbering used
    for bitfields in this specification is big-endian. Other
    architectures number the bytes in a binary word from right to left,
    which is referred to as little-endian. The byte numbering used for
    enumerations, objects, and records in this specification is little-
    endian.

    Why in the world would they use big-endian for bitfields and
    little-endian for everything else???????

    Thanks,

    David Mathog
     
    mathog, Jul 13, 2013
    #15
  16. mathog

    Eric Sosman Guest

    On 7/12/2013 5:50 PM, Ian Collins wrote:
    >[...]
    > I really don't understand why people get so hung up about bit fields, or
    > why they'd want to muck about with shifts and masks. That low level
    > stuff is the compiler's job. It isn't rocket science to determine the
    > order of bit fields (my day to day platform has preprocessor macros for
    > this) and to use them correctly and portably.


    Bit-fields are useless as a means of mapping an externally-
    defined format portably.

    This is just a special case of "structs are useless as a
    means of mapping an externally-defined format portably," except
    that when the struct has bit-fields it's even worse.

    For a specified compiler and target you may be privy to
    extra information that allows you to define a struct (with or
    without bit-fields) that matches a particular externally-defined
    format. But don't kid yourself by imagining that the recipe
    for one compiler/target pair will work with the next.

    As for macros -- Well, let's just start and end with the
    observation that the nature and size of the "addressable storage
    unit" that holds bit-fields is entirely the implementation's
    prerogative, and since the implementation is not even obliged
    to document it (it is "unspecified," not "implementation-defined")
    the only way you can define your macros is by hoping the compiler
    tells you more than is required, or by resorting to guesswork
    and hope.

    Structs (with or without bit-fields) tempt you, they seduce
    you, they lead you on and go nudge-nudge-wink-wink to entice
    you into using them to map external formats. But you'll hate
    yourself in the morning, even if you don't find yourself in the
    gutter minus your wallet and watch and plus a venereal disease.

    --
    Eric Sosman
     
    Eric Sosman, Jul 13, 2013
    #16
  17. mathog

    Eric Sosman Guest

    On 7/12/2013 9:01 PM, mathog wrote:
    > mathog wrote:
    >> I am having one of those days - what I am doing wrong here?

    >
    > (sorry, posted this in the wrong place a minute ago, so this is a
    > duplicate)
    >
    > I found the problem, finally. Section 1.3.2 of the EMF+ documentation
    > says:
    >
    > Data in the EMF+ metafile records are stored in little-endian format.
    >
    > Some computer architectures number bytes in a binary word from left
    > to right, which is referred to as big-endian. The byte numbering used
    > for bitfields in this specification is big-endian. Other
    > architectures number the bytes in a binary word from right to left,
    > which is referred to as little-endian. The byte numbering used for
    > enumerations, objects, and records in this specification is little-
    > endian.
    >
    > Why in the world would they use big-endian for bitfields and
    > little-endian for everything else???????


    For portability.

    ;-)

    --
    Eric Sosman
     
    Eric Sosman, Jul 13, 2013
    #17
  18. mathog

    Ian Collins Guest

    Eric Sosman wrote:
    > On 7/12/2013 5:50 PM, Ian Collins wrote:
    >> [...]
    >> I really don't understand why people get so hung up about bit fields, or
    >> why they'd want to muck about with shifts and masks. That low level
    >> stuff is the compiler's job. It isn't rocket science to determine the
    >> order of bit fields (my day to day platform has preprocessor macros for
    >> this) and to use them correctly and portably.

    >
    > Bit-fields are useless as a means of mapping an externally-
    > defined format portably.


    Tell that to the writers of most platforms' IP headers.

    > This is just a special case of "structs are useless as a
    > means of mapping an externally-defined format portably," except
    > that when the struct has bit-fields it's even worse.


    "portability" has many meanings, ranging from "theoretically portable"
    to "works on windows"...

    > For a specified compiler and target you may be privy to
    > extra information that allows you to define a struct (with or
    > without bit-fields) that matches a particular externally-defined
    > format. But don't kid yourself by imagining that the recipe
    > for one compiler/target pair will work with the next.


    Some do (like IP related headers) and some don't. How you tackle
    marshaling them depends on the detail. Would you use masks to extract
    data from and insert data into an IP header, or would you use the
    structs provided by the system?

    > As for macros -- Well, let's just start and end with the
    > observation that the nature and size of the "addressable storage
    > unit" that holds bit-fields is entirely the implementation's
    > prerogative, and since the implementation is not even obliged
    > to document it (it is "unspecified," not "implementation-defined")
    > the only way you can define your macros is by hoping the compiler
    > tells you more than is required, or by resorting to guesswork
    > and hope.


    As I said in another response, in all of the real world situations I have
    seen, common sense prevails here.

    > Structs (with or without bit-fields) tempt you, they seduce
    > you, they lead you on and go nudge-nudge-wink-wink to entice
    > you into using them to map external formats. But you'll hate
    > yourself in the morning, even if you don't find yourself in the
    > gutter minus your wallet and watch and plus a venereal disease.


    :)

    --
    Ian Collins
     
    Ian Collins, Jul 13, 2013
    #18
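To make the "preprocessor macros" approach concrete, here is a rough sketch of the pattern system IP headers tend to use: the bit-field members are declared in one order or the other depending on a byte-order macro. It assumes GCC or Clang, which predefine `__BYTE_ORDER__`, and it assumes the compiler allocates bit-fields in the direction that matches the byte order. That is a compiler convention, not something the C standard promises. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical struct: first octet of an IPv4 header
   (version in the high nibble, header length in the low nibble). */
struct ip_first_octet {
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    unsigned int ihl     : 4;   /* low nibble allocated first */
    unsigned int version : 4;
#elif defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    unsigned int version : 4;   /* high nibble allocated first */
    unsigned int ihl     : 4;
#else
#error "unknown bit-field order; fall back to shifts and masks"
#endif
};

/* Copy one wire octet into the struct and cross-check the
   compiler-specific layout against plain shift/mask arithmetic. */
static struct ip_first_octet decode_first_octet(unsigned char octet)
{
    struct ip_first_octet h;
    memset(&h, 0, sizeof h);
    memcpy(&h, &octet, 1);
    assert(h.version == (unsigned)(octet >> 4));
    assert(h.ihl == (unsigned)(octet & 0x0F));
    return h;
}
```

The asserts inside decode_first_octet are the important part: they catch a compiler whose layout breaks the assumption, rather than silently returning swapped fields.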
  19. Eric Sosman

    Eric Sosman Guest

    On 7/12/2013 9:25 PM, Ian Collins wrote:
    > Eric Sosman wrote:
    >> On 7/12/2013 5:50 PM, Ian Collins wrote:
    >>> [...]
    >>> I really don't understand why people get so hung up about bit fields, or
    >>> why they'd want to muck about with shifts and masks. That low level
    >>> stuff is the compiler's job. It isn't rocket science to determine the
    >>> order of bit fields (my day to day platform has preprocessor macros for
    >>> this) and to use them correctly and portably.

    >>
    >> Bit-fields are useless as a means of mapping an externally-
    >> defined format portably.

    >
    > Tell that to the writers of most platforms' IP headers.
    >
    >> This is just a special case of "structs are useless as a
    >> means of mapping an externally-defined format portably," except
    >> that when the struct has bit-fields it's even worse.

    >
    > "portability" has many meanings, ranging from "theoretically portable"
    > to "works on windows"...
    >
    >> For a specified compiler and target you may be privy to
    >> extra information that allows you to define a struct (with or
    >> without bit-fields) that matches a particular externally-defined
    >> format. But don't kid yourself by imagining that the recipe
    >> for one compiler/target pair will work with the next.

    >
    > Some do (like IP related headers) and some don't. How you tackle
    > marshaling them depends on the detail. Would you use masks to extract
    > data from and insert data into an IP header, or would you use the
    > structs provided by the system?
    >
    >> As for macros -- Well, let's just start and end with the
    >> observation that the nature and size of the "addressable storage
    >> unit" that holds bit-fields is entirely the implementation's
    >> prerogative, and since the implementation is not even obliged
    >> to document it (it is "unspecified," not "implementation-defined")
    >> the only way you can define your macros is by hoping the compiler
    >> tells you more than is required, or by resorting to guesswork
    >> and hope.

    >
    > As I said in another response, in all of the real world situations I have
    > seen, common sense prevails here.
    >
    >> Structs (with or without bit-fields) tempt you, they seduce
    >> you, they lead you on and go nudge-nudge-wink-wink to entice
    >> you into using them to map external formats. But you'll hate
    >> yourself in the morning, even if you don't find yourself in the
    >> gutter minus your wallet and watch and plus a venereal disease.

    >
    > :)


    I hope for your sake it's not penicillin-resistant.

    --
    Eric Sosman
     
    Eric Sosman, Jul 13, 2013
    #19
  20. Joe Pfeiffer

    Joe Pfeiffer Guest

    Ian Collins writes:
    >
    > Before we get to that, I'd just like to make it clear that I'm not
    > claiming bit-fields are 100% applicable, more like 90+%. In all of
    > the real world code I've worked with they have been an appropriate
    > solution. I have yet to encounter two compilers for the same platform
    > that use different bit-field ordering. I know this isn't guaranteed,
    > but in the real world this is one area where common sense prevails.


    Exactly. I don't remember whether it was moving code from 68K to
    Sparc, or from Sun Sparc Solaris to i386 Linux that did it... but
    virtually all my bitfields were broken. I don't think I've used one
    since that particular burning.
     
    Joe Pfeiffer, Jul 13, 2013
    #20
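One way to avoid being burned silently when code moves between platforms is a layout self-test run once at startup. The sketch below is an assumption-laden illustration, not anyone's posted solution: it reorders the original poster's struct so that it happens to match the file bytes on a little-endian target whose compiler allocates bit-fields from the least significant bit (as GCC does on x86), then checks that assumption against the sample bytes from the first post. The function names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* The original poster's struct with the members swapped, so that the
   12-bit version occupies the low bits on an LSB-first compiler. */
typedef struct {
    unsigned int GrfVersion : 12;
    unsigned int Signature  : 20;
} U_PMF_GRAPHICSVERSION;

/* Returns 1 if a raw copy of the little-endian file bytes lands in the
   right fields on this compiler/target, 0 otherwise. */
static int graphicsversion_layout_ok(void)
{
    static const unsigned char sample[4] = { 0x02, 0x10, 0xC0, 0xDB };
    U_PMF_GRAPHICSVERSION v;

    if (sizeof v != sizeof sample)
        return 0;
    memcpy(&v, sample, sizeof v);
    return v.Signature == 0xDBC01u && v.GrfVersion == 0x002u;
}

/* Call once at program start; aborts (unless NDEBUG is defined)
   if the layout assumption does not hold. */
static void require_graphicsversion_layout(void)
{
    assert(graphicsversion_layout_ok());
}
```

If the check fails on a new target, the struct overlay cannot be trusted there and decoding with shifts and masks is the safer choice.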