Macro expansion surprise.

Kevin D. Quitt · May 6, 2009

I'm at a new job, with a funky coding standard, so don't blame me for that. I'm
also being looked at as the resident C expert after making some coding
recommendations (like do {...}while (0) for multi-line macros).

I was approached on how to swap the bytes (*sigh*) in a short, using a macro.
This particular compiler does not have the NTOH.. macro set, and they want to
implement them, sort of: they want to pass a pointer to what gets swapped. So
here's what I wrote (where UCHAR and USHORT are the expected typedefs):

#define SWAP_BYTES_AT_POINTER(bp) \
do { \
UCHAR c, *p; \
if ((UCHAR *)bp) { \
p = (UCHAR *)bp; \
c = p[1]; \
p[1] = p[0]; \
p[0] = c; \
} \
} while (0)

They were delighted by this and asked for a macro that implement (their way)
NTOHL; I gave them this:

#define SWAP_SHORTS_AT_POINTER(sp) \
do { \
USHORT c, *p; \
if ((USHORT *)sp) { \
p = (USHORT *)bp; \
c = p[1]; \
p[1] = p[0]; \
p[0] = c; \
SWAP_BYTES_AT_POINTER(p); \
SWAP_BYTES_AT_POINTER(p+1); \
} \
} while (0)

They weren't delighted by this because it segfaults at the first invocation of
SWAP_BYTES_AT_POINTER. On a whim, I changed the names of the variables in
SWAP_SHORTS_AT_POINTER, and the segfault went away. The original also blows up
under Windows, using Visual Studio, so I am led to believe that it isn't a
compiler problem.

I can't see why there's a problem. The variables are local in scope and
shouldn't interfere - right? What is the problem?

And for those out there who insist on seeing the work-around, here it is:

#define SWAP_SHORTS_AT_POINTER(sp) \
do { \
UCHAR c, *p; \
if ((UCHAR *)bp) { \
p = (UCHAR *)bp; \
c = p[0]; \
p[0] = p[3]; \
p[3] = c; \
c = p[1]; \
p[1] = p[2]; \
p[2] = c; \
} \
} while (0)

Hallvard B Furuseth · May 6, 2009

Kevin said:
I was approached on how to swap the bytes (*sigh*) in a short, using a macro.
(...)
#define SWAP_SHORTS_AT_POINTER(sp) \
do { \
USHORT c, *p; \
if ((USHORT *)sp) { \

Error: Enclose macro parameters in parenthesis:
if ((USHORT *)(sp)) { \
Remember that sp is not a C variable, it's textually replaced.
So SWAP_SHORTS_AT_POINTER(p + 3) gives 'if ((USHORT *)p + 3) {'.

Also, the 'p' in that macro call is supposed to refer to a p declared
outside of the macro, but since macros just do _textual_ replacement the
(USHORT *)p will refer to the p local to the macro.
Run it through cc -E to see how the compilers sees the expanded macro call.
Fix: Invent namespace rules and use those for variable names inside
macros, e.g. SWAP_SHORTS_c and SWAP_SHORTS_p.

p = (USHORT *)bp; \

Typo: bp should be sp.

SWAP_BYTES_AT_POINTER(p); \
SWAP_BYTES_AT_POINTER(p+1); \

And of course now you get the same name conflict again between the
two macros' "p"s.

As it happens, commonly ntoh* etc are commonly implemented with shifts,
e.g. (((val) >> 8) & 0xff) | (((val) & 0xff) << 8) plus whatever casts
you feel you need.

And remember that casts remove type checking, so you might want a
compromise with something lke like ((val) + 0UL) which converts signed
integer types up to long, to unsigned long. Then cast the final result
to unsigned short or whatever.

user923005 · May 6, 2009

Kevin said:
I'm at a new job, with a funky coding standard, so don't blame me for that. I'm
also being looked at as the resident C expert after making some coding
recommendations (like do {...}while (0) for multi-line macros).

I was approached on how to swap the bytes (*sigh*) in a short, using a macro.
This particular compiler does not have the NTOH.. macro set, and they want to
implement them, sort of: they want to pass a pointer to what gets swapped. So
here's what I wrote (where UCHAR and USHORT are the expected typedefs):

#define SWAP_BYTES_AT_POINTER(bp) \
do { \
UCHAR c, *p; \
if ((UCHAR *)bp) { \
p = (UCHAR *)bp; \
c = p[1]; \
p[1] = p[0]; \
p[0] = c; \
} \
} while (0)

They were delighted by this and asked for a macro that implement (their way)
NTOHL; I gave them this:

#define SWAP_SHORTS_AT_POINTER(sp) \
do { \
USHORT c, *p; \
if ((USHORT *)sp) { \
p = (USHORT *)bp; \
c = p[1]; \
p[1] = p[0]; \
p[0] = c; \
SWAP_BYTES_AT_POINTER(p); \
SWAP_BYTES_AT_POINTER(p+1); \
} \
} while (0)

They weren't delighted by this because it segfaults at the first invocation of
SWAP_BYTES_AT_POINTER. On a whim, I changed the names of the variables in
SWAP_SHORTS_AT_POINTER, and the segfault went away. The original also blows up
under Windows, using Visual Studio, so I am led to believe that it isn't a
compiler problem.

I can't see why there's a problem. The variables are local in scope and
shouldn't interfere - right? What is the problem?

And for those out there who insist on seeing the work-around, here it is:

#define SWAP_SHORTS_AT_POINTER(sp) \
do { \
UCHAR c, *p; \
if ((UCHAR *)bp) { \
p = (UCHAR *)bp; \
c = p[0]; \
p[0] = p[3]; \
p[3] = c; \
c = p[1]; \
p[1] = p[2]; \
p[2] = c; \
} \
} while (0)

This is what I get after expansion:

typedef unsigned char UCHAR;
typedef unsigned short USHORT;

int main(void)
{
short a = 0x1234;
short *pa = &a;
do {
USHORT c,
*p;
if ((USHORT *) pa) {
p = (USHORT *) bp;
c = p[1];
p[1] = p[0];
p[0] = c;
do {
UCHAR c,
*p;
if ((UCHAR *) p) {
p = (UCHAR *) p;
c = p[1];
p[1] = p[0];
p[0] = c;
}
} while (0);
do {
UCHAR c,
*p;
if ((UCHAR *) p + 1) {
p = (UCHAR *) p + 1;
c = p[1];
p[1] = p[0];
p[0] = c;
}
} while (0);
}
} while (0);
return 0;
}

There is no definition for bp, so I guess bp is defined somewhere
inline in your code or it would not even compile.

It's possible I made a typeographical error somewhere.

Hallvard B Furuseth · May 6, 2009

Kevin said:
UCHAR c, *p; \
if ((UCHAR *)bp) { \
p = (UCHAR *)bp; \

....oh, and this evaluates 'bp' twice, so the macro doc must say that the
argument must not have side effects (like ptr++). As does my suggestion
with shifts unless you pass a pointer after all - or just a
'destination' parameter.

In your case though, you can avoid this with
UCHAR c, *p = (UCHAR *)bp; \
if (p) \
or rather
UCHAR SWAP_BYTES_c, *SWAP_BYTES_p = (UCHAR *)(bp); \
if (SWAP_BYTES_p) \
according to the suggestions of my previous message.

Francois Grieu · May 6, 2009

Kevin D. Quitt wrote :

I'm at a new job, with a funky coding standard, so don't blame me for that. I'm
also being looked at as the resident C expert after making some coding
recommendations (like do {...}while (0) for multi-line macros).

I was approached on how to swap the bytes (*sigh*) in a short, using a macro.
This particular compiler does not have the NTOH.. macro set, and they want to
implement them, sort of: they want to pass a pointer to what gets swapped. So
here's what I wrote (where UCHAR and USHORT are the expected typedefs):

#define SWAP_BYTES_AT_POINTER(bp) \
do { \
UCHAR c, *p; \
if ((UCHAR *)bp) { \
p = (UCHAR *)bp; \
c = p[1]; \
p[1] = p[0]; \
p[0] = c; \
} \
} while (0)

This macro is unsafe for two reasons
- (UCHAR *)bp way not be bp cast to UCHAR *
this happens in particular for
SWAP_BYTES_AT_POINTER(p+1)
which the OP uses (see below)
- the argument if evaluated twice, which
stinks if bp is an expression with side
effect.
I suggest:

#define SWAP_BYTES_AT_POINTER(bp) \
do { \
UCHAR c, *p; \
p = (UCHAR *)(bp); \
if (p) { \
c = p[1]; \
p[1] = p[0]; \
p[0] = c; \
} \
} while (0)

They were delighted by this and asked for a macro that implement (their way)
NTOHL; I gave them this:

#define SWAP_SHORTS_AT_POINTER(sp) \
do { \
USHORT c, *p; \
if ((USHORT *)sp) { \
p = (USHORT *)bp; \

Typo: b should be s in the above line

c = p[1]; \
p[1] = p[0]; \
p[0] = c; \
SWAP_BYTES_AT_POINTER(p); \
SWAP_BYTES_AT_POINTER(p+1); \
} \
} while (0)

In addition to the typo, there are the same
problems as above. Further, the second swap
incurs an unwanted test.

> And for those out there who insist on seeing the work-around, here it is:
>
> #define SWAP_SHORTS_AT_POINTER(sp) \
> do { \
> UCHAR c, *p; \
> if ((UCHAR *)bp) { \
> p = (UCHAR *)bp; \

Typo: b should be s in the above two lines

> c = p[0]; \
> p[0] = p[3]; \
> p[3] = c; \
> c = p[1]; \
> p[1] = p[2]; \
> p[2] = c; \
> } \
> } while (0)

Same problems as above, save for the unwanted test.
I suggest:

#define SWAP_SHORTS_AT_POINTER(sp) \
do { \
UCHAR c, *p; \
p = (UCHAR *)(sp); \
if (p) { \
c = p[0]; \
p[0] = p[3]; \
p[3] = c; \
c = p[1]; \
p[1] = p[2]; \
p[2] = c; \
} \
} while (0)

Francois Grieu

Kevin D. Quitt · May 6, 2009

Thank you all. I knew I was being stupid, just not sure how.

Phil Carmody · May 7, 2009

Hallvard said:
Hallvard said:

...oh, and this evaluates 'bp' twice, so the macro doc must say that the
argument must not have side effects (like ptr++). As does my suggestion
with shifts unless you pass a pointer after all - or just a
'destination' parameter.

Click to expand...

I prefer to just have functions to read and write integers in well-defined
formats. For example:

inline unsigned get_le16( void const* in )
{
return ((unsigned char const*) in) [0] +
((unsigned char const*) in) [1] * 0x100;
}

Well, that's going to fail horribly on the DS9000, or at least not
do what you wanted.

inline void set_le16( void* out, unsigned in )
{
((unsigned char*) out) [0] = in & 0xFF;
((unsigned char*) out) [1] = in >> 8 & 0xFF;
}

Why use an unsigned int, when you're only using an unsigned
short's worth of data?

With the swap routines you need to know the byte-order of the host
machine, while with the above functions you don't.

But with yours, you need to know that CHAR_BIT is 8.

The above are also
probably more efficient when you don't need to write the value back to
memory, simply examine it, or when you have a value in a local variable
that you're writing back (not that efficiency is the main purpose of
these).

Phil

Spiros Bousbouras · May 9, 2009

...oh, and this evaluates 'bp' twice, so the macro doc must say that the
argument must not have side effects (like ptr++). As does my suggestion
with shifts unless you pass a pointer after all - or just a
'destination' parameter.

Click to expand...

I prefer to just have functions to read and write integers in well-defined
formats. For example:

inline unsigned get_le16( void const* in )
{
return ((unsigned char const*) in) [0] +
((unsigned char const*) in) [1] * 0x100;
}

Isn't it simpler to write

inline unsigned get_le16( void const* in ) {
return *((unsigned short *) in) & 0xffff ;
}

inline void set_le16( void* out, unsigned in )
{
((unsigned char*) out) [0] = in & 0xFF;
((unsigned char*) out) [1] = in >> 8 & 0xFF;
}

Similarly

inline void set_le16( void* out, unsigned in ) {
unsigned short *p = out ;
*p = in & 0xffff ;
}

Spiros Bousbouras · May 9, 2009

[email protected] (blargg) said:
[email protected] (blargg) said:

I prefer to just have functions to read and write integers in well-defined
formats. For example:

Click to expand...

inline unsigned get_le16( void const* in )
{
return ((unsigned char const*) in) [0] +
((unsigned char const*) in) [1] * 0x100;
}

Click to expand...

Well, that's going to fail horribly on the DS9000, or at least not
do what you wanted.

inline void set_le16( void* out, unsigned in )
{
((unsigned char*) out) [0] = in & 0xFF;
((unsigned char*) out) [1] = in >> 8 & 0xFF;
}

Click to expand...

Why use an unsigned int, when you're only using an unsigned
short's worth of data?

With the swap routines you need to know the byte-order of the host
machine, while with the above functions you don't.

Click to expand...

But with yours, you need to know that CHAR_BIT is 8.

I was assuming that CHAR_BIT == 8 was an implicit premise
throughout the thread. I mean many people posted solutions using
shifts with hard coded number of places and no appearance of
CHAR_BIT. I was also assuming a "benign" architecture. I mean if
you're going to bring DS9000 into the picture I wonder whether
you might get problems with trap representations with all the
&'ing and |'ing happening in some of the other posts.

So my assumptions throughout the thread were:
a) CHAR_BIT == 8
b) All pointers are to unsigned variables with no trap
representations and of sufficient width to allow all
the dereferencing and shifting taking place to be
defined.

James Kuyper · May 9, 2009

Spiros said:
On 7 May, 01:02, (e-mail address removed) (blargg) wrote: ....

inline unsigned get_le16( void const* in )
{
return ((unsigned char const*) in) [0] +
((unsigned char const*) in) [1] * 0x100;
}

Click to expand...

Isn't it simpler to write

inline unsigned get_le16( void const* in ) {
return *((unsigned short *) in) & 0xffff ;
}

Yes, it's simpler, but on many commonly used systems, it will not
produce the same result as blargg's version. On any system where
sizeof(unsigned short)!=2, your version will even access a different
amount of memory than blargg's version.

Spiros Bousbouras · May 9, 2009

Spiros said:
Spiros said:

blargg wrote: [...]

inline unsigned get_le16( void const* in )
{
return ((unsigned char const*) in) [0] +
((unsigned char const*) in) [1] * 0x100;
}

Click to expand...

Click to expand...

Isn't it simpler to write

Click to expand...

inline unsigned get_le16( void const* in ) {
return *((unsigned short *) in) & 0xffff ;

Click to expand...

Even simpler:

inline unsigned get_le16( void const* in ) { return 0; }

Both will produce about as useful results on a big-endian machine.

I could argue this but I see your point.

}

inline void set_le16( void* out, unsigned in )
{
((unsigned char*) out) [0] = in & 0xFF;
((unsigned char*) out) [1] = in >> 8 & 0xFF;
}

Click to expand...

Similarly

Click to expand...

inline void set_le16( void* out, unsigned in ) {
unsigned short *p = out ;
*p = in & 0xffff ;
}

Click to expand...

Why do you think ANDing with 0xFFFF does anything useful in either of these?

Since the value returned by your code only uses the first 16
bits of the "input" I was making sure my code does the same. But
I forgot to take endianess into account. Putting it more into
perspective I hadn't fully appreciated what you meant when you
said "I prefer to just have functions to read and write integers
in well-defined formats". Now I see that the purpose of your
code was to read and store a 16 bits value in little endian
format.

Phil Carmody · May 9, 2009

Spiros said:
Spiros said:

blargg wrote: [...]

inline unsigned get_le16( void const* in )
{
return ((unsigned char const*) in) [0] +
((unsigned char const*) in) [1] * 0x100;
}

Click to expand...

Isn't it simpler to write

inline unsigned get_le16( void const* in ) {
return *((unsigned short *) in) & 0xffff ;

Click to expand...

Even simpler:

inline unsigned get_le16( void const* in ) { return 0; }

His is right 1/256 of the time, yours is right only 1/65536 of the time
;-p

Phil

Old Wolf · May 10, 2009

#define SWAP_BYTES_AT_POINTER(bp) \
do { \
UCHAR c, *p; \
if ((UCHAR *)bp) { \
p = (UCHAR *)bp; \
c = p[1]; \
p[1] = p[0]; \
p[0] = c; \
} \
} while (0)

I think the following is much superior:
#define SWAP_BYTES_AT_POINTER(bp) \
SWAP( ((byte *)(bp))[0], ((byte *)(bp))[1] )

where SWAP is a standard swap macro, e.g.
#define SWAP(X,Y) ((X) ^= (Y), (Y) ^= (X), (X) ^= (Y))

No need to check for null, if someone does it on
null it's their own problem. If you are writing a
function you can check initial conditions, and
use a temporary variable etc., but in a macro I
would keep it concise.

Old Wolf · May 10, 2009

Isn't it simpler to write

inline unsigned get_le16( void const* in ) {
return *((unsigned short *) in) & 0xffff ;

"inline" isn't part of C90. I have to write code
to compile on multiple compilers and this aspect
of it is a real pain. (Some of the systems
support a variant of inline, but they all have
different problems, e.g. inefficient code
generation, spurious warnings about unused code,
symbol clashes, etc.) I have to maintain parallel
macro versions of each inline function.

Old Wolf · May 11, 2009

#define GET_LE16(in) \
( ((uint8_t const*)(in))[1]*0x100u + ((uint8_t const*)(in))[0] )

inline unsigned get_le16( void const* in )
{
return GET_LE16( in ); /* no source code duplication */
}

Good idea!

Need help finding Segmentation fault C++	0	Apr 16, 2022
Lexical Analysis on C++	1	Oct 31, 2023
Macro expansion : Confusion	2	Jun 7, 2009
Function is not worked in C	2	Jun 27, 2023
Help with C negative indexes	2	Feb 29, 2024
Drawing missing in bitmap in a pure C win32 program	4	Jun 3, 2023
How can I fix my pattern coding error in c++	0	Mar 19, 2023
Can't solve problems! please Help	0	Sep 26, 2022

Macro expansion surprise.

Kevin D. Quitt

Hallvard B Furuseth

user923005

Hallvard B Furuseth

Francois Grieu

Kevin D. Quitt

Phil Carmody

Spiros Bousbouras

Spiros Bousbouras

James Kuyper

Spiros Bousbouras

Phil Carmody

Old Wolf

Old Wolf

Old Wolf

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads