Help understanding these preprocessor macros

drequena

Hi All,

I'm not that expert at C but I'm trying to understand some code that
does extensive use of the following two preprocessor macros:

#define WORD(ptr) (((*(octet *)(ptr))<<8)|(*(octet *)((ptr)+1)))
#define DWORD(ptr) (((*(octet *)(ptr))<<24)|(*(octet *)((ptr)+1)<<16)|\
(*(octet *)((ptr)+2)<<8)|(*(octet *)((ptr)+3)))

I'm lost with all that shifting and indirection :-(
Could some kind soul explain what these macros are doing?

TIA,
David
 
Keith Thompson

I'm not that expert at C but I'm trying to understand some code that
does extensive use of the following two preprocessor macros:

#define WORD(ptr) (((*(octet *)(ptr))<<8)|(*(octet *)((ptr)+1)))
#define DWORD(ptr) (((*(octet *)(ptr))<<24)|(*(octet *)((ptr)+1)<<16)|\
(*(octet *)((ptr)+2)<<8)|(*(octet *)((ptr)+3)))

I'm lost with all that shifting and indirection :-(
Could some kind soul explain what these macros are doing?

Presumably "octet" is a typedef for an 8-bit type, probably unsigned
char. (The code probably won't work on a system with CHAR_BIT!=8.)

WORD retrieves a 16-bit unsigned integer value, stored high-order byte
first (big-endian, also known as network order), pointed to by ptr.

DWORD retrieves a 32-bit unsigned integer value, stored
high-order-byte first (big-endian), pointed to by ptr.

Let's look at the definition of WORD, working from the inside out.

(ptr) is a pointer.

(octet *)(ptr) is ptr converted to a pointer-to-octet.

(*(octet *)(ptr)) dereferences the converted pointer, yielding an
octet value (a number in the range 0..255). Let's call this
HIGH_OCTET.

(ptr)+1 points one element past the object pointed to by ptr. For
this to work properly, the actual argument passed to WORD had better
be of some character pointer type. If it's a void*, you can't legally
perform arithmetic on it. If it's an int*, for example, adding one
advances too far. If you wanted to allow WORD() to apply to arbitrary
pointer-to-void or pointer-to-object types, you'd want to add 1
*after* converting to (octet*). We'll assume that the argument is a
pointer-to-character.
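As a sketch of that fix (the WORD2 name and the macro body are my own variant for illustration, not from the posted code), casting to octet* *before* doing the arithmetic lets the caller pass any object pointer:

```c
typedef unsigned char octet;

/* Hypothetical variant: convert to octet* first, then index, so the
   argument may be any object pointer type, including void *. */
#define WORD2(ptr) ((((const octet *)(ptr))[0] << 8) | \
                     ((const octet *)(ptr))[1])
```

With this shape, WORD2(p) behaves like the original WORD(p) when p is already an octet*, but no longer relies on the argument's pointer arithmetic stepping one byte at a time.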

(*(octet *)((ptr)+1)) retrieves the octet value from just after the
location pointed to by ptr. Let's call this LOW_OCTET.

The whole expression then becomes (HIGH_OCTET<<8|LOW_OCTET).

So, if ptr is of type unsigned char*, and *ptr==0x12, and
*(ptr+1)==0x34, then WORD(ptr) will be 0x1234.

Similarly, if ptr points to a sequence of unsigned bytes with the
values 0x12, 0x34, 0x56, and 0x78, DWORD(ptr) will be 0x12345678.

Note that if type int is 16 bits, the default integer promotions may
cause the DWORD() macro to fail, since there's nothing to indicate
that any of the operands are 32 bits. This could be corrected by
defining a 32-bit unsigned integer type (or using uint32_t) and
inserting several casts. But it may not be a problem if the code is
not required to be portable to such systems.
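One possible shape of that correction (a sketch, assuming C99's <stdint.h> is available; DWORD32 is a made-up name, not from the original source):

```c
#include <stdint.h>

typedef unsigned char octet;

/* Each octet is widened to uint32_t before shifting, so the shifts
   are well defined even where int is only 16 bits. */
#define DWORD32(ptr) (((uint32_t)(*(const octet *)(ptr))     << 24) | \
                      ((uint32_t)(*(const octet *)((ptr)+1)) << 16) | \
                      ((uint32_t)(*(const octet *)((ptr)+2)) <<  8) | \
                       (uint32_t)(*(const octet *)((ptr)+3)))
```

Note that this variant still shares the original's requirement that the argument be a pointer type on which +1 advances one byte.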

Both macros could be simplified to convert the pointer to a pointer to
the appropriate 16-bit or 32-bit type and dereference it, but *only*
if the machine uses a big-endian representation *and* the byte arrays
are appropriately aligned.

I suspect the macros are used in networking code, to extract 16-bit
and 32-bit unsigned integer values from network-order byte streams.
 
Walter Roberson

I'm not that expert at C but I'm trying to understand some code that
does extensive use of the following two preprocessor macros:
#define WORD(ptr) (((*(octet *)(ptr))<<8)|(*(octet *)((ptr)+1)))
#define DWORD(ptr) (((*(octet *)(ptr))<<24)|(*(octet *)((ptr)+1)<<16)|\
(*(octet *)((ptr)+2)<<8)|(*(octet *)((ptr)+3)))
I'm lost with all that shifting and indirection :-(
Could some kind soul explain what these macros are doing?

octet -likely- maps to unsigned char. So for WORD, the code
takes the binary at the first character position, moves it 8 bits to the
left, and or's in the binary at the second position. The result
is going to be a 16 bit value (in a possibly wider integer type)
which is the concatenation of the bit values.

The code does not simply cast to a pointer to unsigned short for
a few reasons:

1) The pointer might not be properly aligned for an unsigned short;
2) an unsigned short is not necessarily two positions wide;
3) If char happens to be more than 8 bits wide in the implementation,
then grabbing a short would end up with a gap between the two 8
bit binary parts, whereas the code used will always put the 8 bit
parts together even if CHAR_BITS is more than 8
 
Keith Thompson

The code does not simply cast to a pointer to unsigned short for
a few reasons:

1) The pointer might not be properly aligned for an unsigned short;
2) an unsigned short is not necessarily two positions wide;
3) If char happens to be more than 8 bits wide in the implementation,
then grabbing a short would end up with a gap between the two 8
bit binary parts, whereas the code used will always put the 8 bit
parts together even if CHAR_BITS is more than 8

CHAR_BIT, not CHAR_BITS.

If CHAR_BIT > 8, and some of the bytes being accessed happen to have
values greater than 255, there could be some overlap and incorrect
results. Running this code on a system with CHAR_BIT>8 would require
some careful thought; in particular, it's not clear how the octet
stream would be represented on a system with, say, 9-bit bytes.

Also:

4) The code assumes the data is represented in big-endian byte order;
if a short (assuming it's 16 bits, and assuming proper alignment) is
little-endian, casting the pointer to unsigned short* would yield
incorrect results.
 
mckennan

Many thanks Keith and Walter for your replies

Your explanations have been very good, just what I needed :)

Just for clarification, the code is in fact reading values from the
header of a file which are stored in network order (now I assume this is
the same as big-endian) on PC/Windows, which is not big-endian.

The definition of octet found elsewhere in the source is:

typedef unsigned char octet;

So it seems you're right :)

Example taken from the source:

#define HDR_NUMRECORDS 12
...
octet * pFile;
octet * pRecord;
int nRecords;
...
nRecords = WORD(pFile+HDR_NUMRECORDS);

If I understood correctly this is assigning a 16-bit integer value
(short int for this platform) to an int, in inverse "endianness" of what
is stored at address pFile+HDR_NUMRECORDS.

right?

Thanks again,
David
 
Keith Thompson

mckennan said:
The definition of octet found elsewhere in the source is:

typedef unsigned char octet;

So it seems you're right :)

Example taken from the source:

#define HDR_NUMRECORDS 12
...
octet * pFile;
octet * pRecord;
int nRecords;
...
nRecords = WORD(pFile+HDR_NUMRECORDS);

If I understood correctly this is assigning a 16 bit integer value
(short int for this platform) to an int in inverse "endianness" of what
is stored at address pFile+HDR_NUMRECORDS.

right?

Right. More precisely, it's retrieving a 16-bit big-endian value.
That happens to be the reverse of what your processor uses, but the
code should work equally well on a big-endian platform.
 
