Help understanding these preprocessor macros

Discussion in 'C Programming' started by drequena@gmail.com, Apr 13, 2005.

  1. Guest

    Hi All,

    I'm not much of an expert at C, but I'm trying to understand some code
    that makes extensive use of the following two preprocessor macros:

    #define WORD(ptr) (((*(octet *)(ptr))<<8)|(*(octet *)((ptr)+1)))
    #define DWORD(ptr) (((*(octet *)(ptr))<<24)|(*(octet *)((ptr)+1)<<16)|\
    (*(octet *)((ptr)+2)<<8)|(*(octet *)((ptr)+3)))

    I'm lost with all that shifting and indirection :-(
    Could some kind soul explain what these macros are doing?

    TIA,
    David
    , Apr 13, 2005
    #1

  2. writes:
    > I'm not much of an expert at C, but I'm trying to understand some code
    > that makes extensive use of the following two preprocessor macros:
    >
    > #define WORD(ptr) (((*(octet *)(ptr))<<8)|(*(octet *)((ptr)+1)))
    > #define DWORD(ptr) (((*(octet *)(ptr))<<24)|(*(octet *)((ptr)+1)<<16)|\
    > (*(octet *)((ptr)+2)<<8)|(*(octet *)((ptr)+3)))
    >
    > I'm lost with all that shifting and indirection :-(
    > Could some kind soul explain what these macros are doing?


    Presumably "octet" is a typedef for an 8-bit type, probably unsigned
    char. (The code probably won't work on a system with CHAR_BIT!=8.)

    WORD retrieves a 16-bit unsigned integer value, stored high-order byte
    first (big-endian, also known as network order), pointed to by ptr.

    DWORD retrieves a 32-bit unsigned integer value, stored
    high-order-byte first (big-endian), pointed to by ptr.

    Let's look at the definition of WORD, working from the inside out.

    (ptr) is a pointer.

    (octet *)(ptr) is ptr converted to a pointer-to-octet.

    (*(octet *)(ptr)) dereferences the converted pointer, yielding an
    octet value (a number in the range 0..255). Let's call this
    HIGH_OCTET.

    (ptr)+1 points one element past the object pointed to by ptr. For
    this to work properly, the actual argument passed to WORD had better
    be of some character pointer type. If it's a void*, you can't legally
    perform arithmetic on it. If it's an int*, for example, adding one
    advances too far. If you wanted to allow WORD() to apply to arbitrary
    pointer-to-void or pointer-to-object types, you'd want to add 1
    *after* converting to (octet*). We'll assume that the argument is a
    pointer-to-character.
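The "add 1 after converting" variant mentioned above could look something like the sketch below. WORD_ANY and read_word are hypothetical names, not part of the original code, and octet is assumed to be unsigned char. The cast to unsigned is an extra precaution so that the shift is not done in a signed int.

```c
typedef unsigned char octet;   /* assumed definition of octet */

/* Hypothetical variant of WORD: convert to (const octet *) first, then
   index, so the offset is always counted in octets whatever pointer type
   the caller passes in. */
#define WORD_ANY(ptr) \
    (((unsigned)((const octet *)(ptr))[0] << 8) | ((const octet *)(ptr))[1])

/* Illustration only: a void * argument works here, whereas the original
   WORD() would attempt arithmetic on a void pointer. */
static unsigned read_word(const void *p)
{
    return WORD_ANY(p);
}
```

With this form an int * argument would also be read as two consecutive bytes, instead of skipping ahead by sizeof(int).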

    (*(octet *)((ptr)+1)) retrieves the octet value from just after the
    location pointed to by ptr. Let's call this LOW_OCTET.

    The whole expression then becomes (HIGH_OCTET<<8|LOW_OCTET).

    So, if ptr is of type unsigned char*, and *ptr==0x12, and
    *(ptr+1)==0x34, then WORD(ptr) will be 0x1234.

    Similarly, if ptr points to a sequence of unsigned bytes with the
    values 0x12, 0x34, 0x56, and 0x78, DWORD(ptr) will be 0x12345678.
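As a sketch of the walk-through above, here are the two macros as posted, together with WORD spelled out one step at a time. The typedef for octet and the helper name word_by_steps are assumptions made for illustration, and int is assumed to be at least 32 bits so that DWORD behaves as described.

```c
typedef unsigned char octet;   /* assumed definition of octet */

/* The two macros as posted. */
#define WORD(ptr) (((*(octet *)(ptr))<<8)|(*(octet *)((ptr)+1)))
#define DWORD(ptr) (((*(octet *)(ptr))<<24)|(*(octet *)((ptr)+1)<<16)|\
(*(octet *)((ptr)+2)<<8)|(*(octet *)((ptr)+3)))

/* WORD written out step by step (hypothetical helper). */
static unsigned word_by_steps(octet *ptr)
{
    unsigned high_octet = *ptr;         /* byte at ptr,     e.g. 0x12 */
    unsigned low_octet  = *(ptr + 1);   /* byte at ptr + 1, e.g. 0x34 */
    return (high_octet << 8) | low_octet;   /* e.g. 0x1200 | 0x34 */
}
```

Given an array holding 0x12, 0x34, 0x56, 0x78, both word_by_steps and WORD give 0x1234, and DWORD gives 0x12345678.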

    Note that if type int is 16 bits, the default integer promotions may
    cause the DWORD() macro to fail, since there's nothing to indicate
    that any of the operands are 32 bits. This could be corrected by
    defining a 32-bit unsigned integer type (or using uint32_t) and
    inserting several casts. But it may not be a problem if the code is
    not required to be portable to such systems.
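One way the suggested correction might look, using uint32_t from <stdint.h> (C99). The function name dword32 is hypothetical. Each octet is widened to 32 bits before it is shifted, so the result no longer depends on the width of int, and the shifts are done in unsigned arithmetic.

```c
#include <stdint.h>

typedef unsigned char octet;   /* assumed definition of octet */

/* Hypothetical replacement for DWORD(): read four octets, high-order
   first, widening each one to uint32_t before shifting. */
static uint32_t dword32(const octet *p)
{
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}
```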

    Both macros could be simplified to convert the pointer to a pointer to
    the appropriate 16-bit or 32-bit type and dereference it, but *only*
    if the machine uses a big-endian representation *and* the byte arrays
    are appropriately aligned.

    I suspect the macros are used in networking code, to extract 16-bit
    and 32-bit unsigned integer values from network-order byte streams.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    We must do something. This is something. Therefore, we must do this.
    Keith Thompson, Apr 14, 2005
    #2

  3. In article <>,
    <> wrote:
    >I'm not much of an expert at C, but I'm trying to understand some code
    >that makes extensive use of the following two preprocessor macros:


    >#define WORD(ptr) (((*(octet *)(ptr))<<8)|(*(octet *)((ptr)+1)))
    >#define DWORD(ptr) (((*(octet *)(ptr))<<24)|(*(octet *)((ptr)+1)<<16)|\
    > (*(octet *)((ptr)+2)<<8)|(*(octet *)((ptr)+3)))


    >I'm lost with all that shifting and indirection :-(
    >Could some kind soul explain what these macros are doing?


    octet *likely* maps to unsigned char. So for WORD, the code
    takes the byte at the first character position, shifts it 8 bits to the
    left, and ORs in the byte at the second position. The result
    is going to be a 16-bit value (in a possibly wider integer type)
    which is the concatenation of the two bytes' bit values.

    The code does not simply cast to a pointer to unsigned short for
    a few reasons:

    1) The pointer might not be properly aligned for an unsigned short;
    2) an unsigned short is not necessarily two positions wide;
    3) If char happens to be more than 8 bits wide in the implementation,
    then grabbing a short would end up with a gap between the two 8
    bit binary parts, whereas the code used will always put the 8 bit
    parts together even if CHAR_BITS is more than 8
    --
    'ignorandus (Latin): "deserving not to be known"'
    -- Journal of Self-Referentialism
    Walter Roberson, Apr 14, 2005
    #3
  4. Walter Roberson writes:
    [snip]
    > The code does not simply cast to a pointer to unsigned short for
    > a few reasons:
    >
    > 1) The pointer might not be properly aligned for an unsigned short;
    > 2) an unsigned short is not necessarily two positions wide;
    > 3) If char happens to be more than 8 bits wide in the implementation,
    > then grabbing a short would end up with a gap between the two 8
    > bit binary parts, whereas the code used will always put the 8 bit
    > parts together even if CHAR_BITS is more than 8


    CHAR_BIT, not CHAR_BITS.

    If CHAR_BIT > 8, and some of the bytes being accessed happen to have
    values greater than 255, there could be some overlap and incorrect
    results. Running this code on a system with CHAR_BIT>8 would require
    some careful thought; in particular, it's not clear how the octet
    stream would be represented on a system with, say, 9-bit bytes.
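If one assumes that each char holds exactly one octet in its low 8 bits (an assumption, and only one of several possible representations on such a system), masking each byte before combining would at least prevent the overlap. The name word_masked is hypothetical.

```c
/* Hypothetical guard for CHAR_BIT > 8: keep only the low 8 bits of each
   byte so that a stray value above 255 cannot spill into the other half.
   Assumes one octet is stored per char. */
static unsigned word_masked(const unsigned char *p)
{
    return ((p[0] & 0xFFu) << 8) | (p[1] & 0xFFu);
}
```

On an ordinary system with CHAR_BIT == 8 the masks change nothing.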

    Also:

    4) The code assumes the data is represented in big-endian byte order;
    if a short (assuming it's 16 bits, and assuming proper alignment) is
    little-endian, casting the pointer to unsigned short* would yield
    incorrect results.
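To illustrate point 4, here is a small sketch comparing the byte-by-byte read with a host-order read. The function names are hypothetical. memcpy stands in for the pointer cast so that alignment is not a concern, but the dependence on the host's byte order remains.

```c
#include <stdint.h>
#include <string.h>

/* Byte-by-byte read: the result is the same on any host. */
static uint16_t read_u16_be(const unsigned char *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}

/* Host-order read: copies the two bytes into a uint16_t as they are,
   so the value depends on the byte order of the machine running it. */
static uint16_t read_u16_host(const unsigned char *p)
{
    uint16_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

For the bytes 0x12, 0x34, read_u16_be always gives 0x1234, while read_u16_host gives 0x1234 on a big-endian host and 0x3412 on a little-endian one.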

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    We must do something. This is something. Therefore, we must do this.
    Keith Thompson, Apr 14, 2005
    #4
  5. mckennan Guest

    Many thanks, Keith and Walter, for your replies.

    Your explanations have been very good, just what I needed :)

    Just for clarification, the code is in fact reading values from the
    header of a file, which are stored in network order (I now assume this
    is the same as big-endian), on PC/Windows, which is not big-endian.

    The definition of octet found elsewhere in the source is:

    typedef unsigned char octet;

    So it seems you're right :)

    Example taken from the source:

    #define HDR_NUMRECORDS 12
    ...
    octet * pFile;
    octet * pRecord;
    int nRecords;
    ...
    nRecords = WORD(pFile+HDR_NUMRECORDS);

    If I understood correctly, this is assigning a 16-bit integer value
    (short int for this platform) to an int, in the inverse "endianness" of
    what is stored at address pFile+HDR_NUMRECORDS.

    right?

    Thanks again,
    David
    mckennan, Apr 14, 2005
    #5
  6. "mckennan" <> writes:
    [...]
    > The definition of octet found elsewhere in the source is:
    >
    > typedef unsigned char octet;
    >
    > So it seems you're right :)
    >
    > Example taken from the source:
    >
    > #define HDR_NUMRECORDS 12
    > ...
    > octet * pFile;
    > octet * pRecord;
    > int nRecords;
    > ...
    > nRecords = WORD(pFile+HDR_NUMRECORDS);
    >
    > If I understood correctly, this is assigning a 16-bit integer value
    > (short int for this platform) to an int, in the inverse "endianness" of
    > what is stored at address pFile+HDR_NUMRECORDS.
    >
    > right?


    Right. More precisely, it's retrieving a 16-bit big-endian value.
    That happens to be the reverse of what your processor uses, but the
    code should work equally well on a big-endian platform.
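A small sketch of that header read, using the typedef, the WORD macro and HDR_NUMRECORDS from the thread. The wrapper function count_records and the sample header contents are made up for illustration.

```c
typedef unsigned char octet;

#define WORD(ptr) (((*(octet *)(ptr))<<8)|(*(octet *)((ptr)+1)))
#define HDR_NUMRECORDS 12

/* Hypothetical wrapper around the assignment from the source:
   reads the 16-bit big-endian record count at offset 12 of the header. */
static int count_records(octet *pFile)
{
    int nRecords = WORD(pFile + HDR_NUMRECORDS);
    return nRecords;
}
```

If the bytes at offsets 12 and 13 are 0x01 and 0x02, count_records returns 0x0102, that is 258, on both little-endian and big-endian hosts.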

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    We must do something. This is something. Therefore, we must do this.
    Keith Thompson, Apr 14, 2005
    #6
