Help understanding these preprocessor macros

drequena

Hi All,

I'm not that expert at C but I'm trying to understand some code that
does extensive use of the following two preprocessor macros:

#define WORD(ptr) (((*(octet *)(ptr))<<8)|(*(octet *)((ptr)+1)))
#define DWORD(ptr) (((*(octet *)(ptr))<<24)|(*(octet *)((ptr)+1)<<16)|\
(*(octet *)((ptr)+2)<<8)|(*(octet *)((ptr)+3)))

I'm lost with all that shifting and indirection :-(
Could some kind soul explain what these macros are doing?

TIA,
David
 
Keith Thompson

I'm not that expert at C but I'm trying to understand some code that
does extensive use of the following two preprocessor macros:

#define WORD(ptr) (((*(octet *)(ptr))<<8)|(*(octet *)((ptr)+1)))
#define DWORD(ptr) (((*(octet *)(ptr))<<24)|(*(octet *)((ptr)+1)<<16)|\
(*(octet *)((ptr)+2)<<8)|(*(octet *)((ptr)+3)))

I'm lost with all that shifting and indirection :-(
Could some kind soul explain what these macros are doing?

Presumably "octet" is a typedef for an 8-bit type, probably unsigned
char. (The code probably won't work on a system with CHAR_BIT!=8.)

WORD retrieves a 16-bit unsigned integer value, stored high-order byte
first (big-endian, also known as network order), pointed to by ptr.

DWORD retrieves a 32-bit unsigned integer value, stored
high-order-byte first (big-endian), pointed to by ptr.

Let's look at the definition of WORD, working from the inside out.

(ptr) is a pointer.

(octet *)(ptr) is ptr converted to a pointer-to-octet.

(*(octet *)(ptr)) dereferences the converted pointer, yielding an
octet value (a number in the range 0..255). Let's call this
HIGH_OCTET.

(ptr)+1 points one element past the object pointed to by ptr. For
this to work properly, the actual argument passed to WORD had better
be of some character pointer type. If it's a void*, you can't legally
perform arithmetic on it. If it's an int*, for example, adding one
advances too far. If you wanted to allow WORD() to apply to arbitrary
pointer-to-void or pointer-to-object types, you'd want to add 1
*after* converting to (octet*). We'll assume that the argument is a
pointer-to-character.
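As a sketch of that fix (the WORD2 name and the macro body are my own variant for illustration, not from the posted code), casting to octet* *before* doing the arithmetic lets the caller pass any object pointer:

```c
typedef unsigned char octet;

/* Hypothetical variant: convert to octet* first, then index, so the
   argument may be any object pointer type, including void *. */
#define WORD2(ptr) ((((const octet *)(ptr))[0] << 8) | \
                     ((const octet *)(ptr))[1])
```

With this shape, WORD2(p) behaves like the original WORD(p) when p is already an octet*, but no longer relies on the argument's pointer arithmetic stepping one byte at a time.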

(*(octet *)((ptr)+1)) retrieves the octet value from just after the
location pointed to by ptr. Let's call this LOW_OCTET.

The whole expression then becomes (HIGH_OCTET<<8|LOW_OCTET).

So, if ptr is of type unsigned char*, and *ptr==0x12, and
*(ptr+1)==0x34, then WORD(ptr) will be 0x1234.

Similarly, if ptr points to a sequence of unsigned bytes with the
values 0x12, 0x34, 0x56, and 0x78, DWORD(ptr) will be 0x12345678.

Note that if type int is 16 bits, the default integer promotions may
cause the DWORD() macro to fail, since there's nothing to indicate
that any of the operands are 32 bits. This could be corrected by
defining a 32-bit unsigned integer type (or using uint32_t) and
inserting several casts. But it may not be a problem if the code is
not required to be portable to such systems.
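One possible shape of that correction (a sketch, assuming C99's <stdint.h> is available; DWORD32 is a made-up name, not from the original source):

```c
#include <stdint.h>

typedef unsigned char octet;

/* Each octet is widened to uint32_t before shifting, so the shifts
   are well defined even where int is only 16 bits. */
#define DWORD32(ptr) (((uint32_t)(*(const octet *)(ptr))     << 24) | \
                      ((uint32_t)(*(const octet *)((ptr)+1)) << 16) | \
                      ((uint32_t)(*(const octet *)((ptr)+2)) <<  8) | \
                       (uint32_t)(*(const octet *)((ptr)+3)))
```

Note that this variant still shares the original's requirement that the argument be a pointer type on which +1 advances one byte.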

Both macros could be simplified to convert the pointer to a pointer to
the appropriate 16-bit or 32-bit type and dereference it, but *only*
if the machine uses a big-endian representation *and* the byte arrays
are appropriately aligned.

I suspect the macros are used in networking code, to extract 16-bit
and 32-bit unsigned integer values from network-order byte streams.
 
Walter Roberson

I'm not that expert at C but I'm trying to understand some code that
does extensive use of the following two preprocessor macros:
#define WORD(ptr) (((*(octet *)(ptr))<<8)|(*(octet *)((ptr)+1)))
#define DWORD(ptr) (((*(octet *)(ptr))<<24)|(*(octet *)((ptr)+1)<<16)|\
(*(octet *)((ptr)+2)<<8)|(*(octet *)((ptr)+3)))
I'm lost with all that shifting and indirection :-(
Could some kind soul explain what these macros are doing?

octet -likely- maps to unsigned char. So for WORD, the code
takes the binary at the first character position, moves it 8 bits to the
left, and or's in the binary at the second position. The result
is going to be a 16 bit value (in a possibly wider integer type)
which is the concatenation of the bit values.

The code does not simply cast to a pointer to unsigned short for
a few reasons:

1) The pointer might not be properly aligned for an unsigned short;
2) an unsigned short is not necessarily two positions wide;
3) If char happens to be more than 8 bits wide in the implementation,
then grabbing a short would end up with a gap between the two 8
bit binary parts, whereas the code used will always put the 8 bit
parts together even if CHAR_BITS is more than 8
 
Keith Thompson

The code does not simply cast to a pointer to unsigned short for
a few reasons:

1) The pointer might not be properly aligned for an unsigned short;
2) an unsigned short is not necessarily two positions wide;
3) If char happens to be more than 8 bits wide in the implementation,
then grabbing a short would end up with a gap between the two 8
bit binary parts, whereas the code used will always put the 8 bit
parts together even if CHAR_BITS is more than 8

CHAR_BIT, not CHAR_BITS.

If CHAR_BIT > 8, and some of the bytes being accessed happen to have
values greater than 255, there could be some overlap and incorrect
results. Running this code on a system with CHAR_BIT>8 would require
some careful thought; in particular, it's not clear how the octet
stream would be represented on a system with, say, 9-bit bytes.

Also:

4) The code assumes the data is represented in big-endian byte order;
if a short (assuming it's 16 bits, and assuming proper alignment) is
little-endian, casting the pointer to unsigned short* would yield
incorrect results.
 
mckennan

Many thanks Keith and Walter for your replies

Your explanations have been very good, just what I needed :)

Just for clarification, the code is in fact reading values from the
header of a file which are stored in network order (now I assume this is
the same as big-endian) on PC/Windows, which is not big-endian.

The definition of octet found elsewhere in the source is:

typedef unsigned char octet;

So it seems you're right :)

Example taken from the source:

#define HDR_NUMRECORDS 12
...
octet * pFile;
octet * pRecord;
int nRecords;
...
nRecords = WORD(pFile+HDR_NUMRECORDS);

If I understood correctly this is assigning a 16-bit integer value
(short int for this platform) to an int, in inverse "endianness" of what
is stored at address pFile+HDR_NUMRECORDS.

right?

Thanks again,
David
 
Keith Thompson

mckennan said:
The definition of octet found elsewhere in the source is:

typedef unsigned char octet;

So it seems you're right :)

Example taken from the source:

#define HDR_NUMRECORDS 12
...
octet * pFile;
octet * pRecord;
int nRecords;
...
nRecords = WORD(pFile+HDR_NUMRECORDS);

If I understood correctly this is assigning a 16 bit integer value
(short int for this platform) to an int in inverse "endianness" of what
is stored at address pFile+HDR_NUMRECORDS.

right?

Right. More precisely, it's retrieving a 16-bit big-endian value.
That happens to be the reverse of what your processor uses, but the
code should work equally well on a big-endian platform.
 
