Macro expansion surprise.

K

Kevin D. Quitt

I'm at a new job, with a funky coding standard, so don't blame me for that. I'm
also being looked at as the resident C expert after making some coding
recommendations (like do {...}while (0) for multi-line macros).

I was approached on how to swap the bytes (*sigh*) in a short, using a macro.
This particular compiler does not have the NTOH.. macro set, and they want to
implement them, sort of: they want to pass a pointer to what gets swapped. So
here's what I wrote (where UCHAR and USHORT are the expected typedefs):

#define SWAP_BYTES_AT_POINTER(bp) \
do { \
UCHAR c, *p; \
if ((UCHAR *)bp) { \
p = (UCHAR *)bp; \
c = p[1]; \
p[1] = p[0]; \
p[0] = c; \
} \
} while (0)

They were delighted by this and asked for a macro that implement (their way)
NTOHL; I gave them this:

#define SWAP_SHORTS_AT_POINTER(sp) \
do { \
USHORT c, *p; \
if ((USHORT *)sp) { \
p = (USHORT *)bp; \
c = p[1]; \
p[1] = p[0]; \
p[0] = c; \
SWAP_BYTES_AT_POINTER(p); \
SWAP_BYTES_AT_POINTER(p+1); \
} \
} while (0)

They weren't delighted by this because it segfaults at the first invocation of
SWAP_BYTES_AT_POINTER. On a whim, I changed the names of the variables in
SWAP_SHORTS_AT_POINTER, and the segfault went away. The original also blows up
under Windows, using Visual Studio, so I am led to believe that it isn't a
compiler problem.

I can't see why there's a problem. The variables are local in scope and
shouldn't interfere - right? What is the problem?

And for those out there who insist on seeing the work-around, here it is:

#define SWAP_SHORTS_AT_POINTER(sp) \
do { \
UCHAR c, *p; \
if ((UCHAR *)bp) { \
p = (UCHAR *)bp; \
c = p[0]; \
p[0] = p[3]; \
p[3] = c; \
c = p[1]; \
p[1] = p[2]; \
p[2] = c; \
} \
} while (0)
 
H

Hallvard B Furuseth

Kevin said:
I was approached on how to swap the bytes (*sigh*) in a short, using a macro.
(...)
#define SWAP_SHORTS_AT_POINTER(sp) \
do { \
USHORT c, *p; \
if ((USHORT *)sp) { \

Error: Enclose macro parameters in parenthesis:
if ((USHORT *)(sp)) { \
Remember that sp is not a C variable, it's textually replaced.
So SWAP_SHORTS_AT_POINTER(p + 3) gives 'if ((USHORT *)p + 3) {'.

Also, the 'p' in that macro call is supposed to refer to a p declared
outside of the macro, but since macros just do _textual_ replacement the
(USHORT *)p will refer to the p local to the macro.
Run it through cc -E to see how the compilers sees the expanded macro call.
Fix: Invent namespace rules and use those for variable names inside
macros, e.g. SWAP_SHORTS_c and SWAP_SHORTS_p.
p = (USHORT *)bp; \

Typo: bp should be sp.
SWAP_BYTES_AT_POINTER(p); \
SWAP_BYTES_AT_POINTER(p+1); \

And of course now you get the same name conflict again between the
two macros' "p"s.


As it happens, commonly ntoh* etc are commonly implemented with shifts,
e.g. (((val) >> 8) & 0xff) | (((val) & 0xff) << 8) plus whatever casts
you feel you need.

And remember that casts remove type checking, so you might want a
compromise with something lke like ((val) + 0UL) which converts signed
integer types up to long, to unsigned long. Then cast the final result
to unsigned short or whatever.
 
U

user923005

Kevin said:
I'm at a new job, with a funky coding standard, so don't blame me for that. I'm
also being looked at as the resident C expert after making some coding
recommendations (like do {...}while (0) for multi-line macros).

I was approached on how to swap the bytes (*sigh*) in a short, using a macro.
This particular compiler does not have the NTOH.. macro set, and they want to
implement them, sort of: they want to pass a pointer to what gets swapped. So
here's what I wrote (where UCHAR and USHORT are the expected typedefs):

#define SWAP_BYTES_AT_POINTER(bp) \
do { \
UCHAR c, *p; \
if ((UCHAR *)bp) { \
p = (UCHAR *)bp; \
c = p[1]; \
p[1] = p[0]; \
p[0] = c; \
} \
} while (0)

They were delighted by this and asked for a macro that implement (their way)
NTOHL; I gave them this:

#define SWAP_SHORTS_AT_POINTER(sp) \
do { \
USHORT c, *p; \
if ((USHORT *)sp) { \
p = (USHORT *)bp; \
c = p[1]; \
p[1] = p[0]; \
p[0] = c; \
SWAP_BYTES_AT_POINTER(p); \
SWAP_BYTES_AT_POINTER(p+1); \
} \
} while (0)

They weren't delighted by this because it segfaults at the first invocation of
SWAP_BYTES_AT_POINTER. On a whim, I changed the names of the variables in
SWAP_SHORTS_AT_POINTER, and the segfault went away. The original also blows up
under Windows, using Visual Studio, so I am led to believe that it isn't a
compiler problem.

I can't see why there's a problem. The variables are local in scope and
shouldn't interfere - right? What is the problem?

And for those out there who insist on seeing the work-around, here it is:

#define SWAP_SHORTS_AT_POINTER(sp) \
do { \
UCHAR c, *p; \
if ((UCHAR *)bp) { \
p = (UCHAR *)bp; \
c = p[0]; \
p[0] = p[3]; \
p[3] = c; \
c = p[1]; \
p[1] = p[2]; \
p[2] = c; \
} \
} while (0)

This is what I get after expansion:

typedef unsigned char UCHAR;
typedef unsigned short USHORT;

int main(void)
{
short a = 0x1234;
short *pa = &a;
do {
USHORT c,
*p;
if ((USHORT *) pa) {
p = (USHORT *) bp;
c = p[1];
p[1] = p[0];
p[0] = c;
do {
UCHAR c,
*p;
if ((UCHAR *) p) {
p = (UCHAR *) p;
c = p[1];
p[1] = p[0];
p[0] = c;
}
} while (0);
do {
UCHAR c,
*p;
if ((UCHAR *) p + 1) {
p = (UCHAR *) p + 1;
c = p[1];
p[1] = p[0];
p[0] = c;
}
} while (0);
}
} while (0);
return 0;
}

There is no definition for bp, so I guess bp is defined somewhere
inline in your code or it would not even compile.

It's possible I made a typeographical error somewhere.
 
H

Hallvard B Furuseth

Kevin said:
UCHAR c, *p; \
if ((UCHAR *)bp) { \
p = (UCHAR *)bp; \

....oh, and this evaluates 'bp' twice, so the macro doc must say that the
argument must not have side effects (like ptr++). As does my suggestion
with shifts unless you pass a pointer after all - or just a
'destination' parameter.

In your case though, you can avoid this with
UCHAR c, *p = (UCHAR *)bp; \
if (p) \
or rather
UCHAR SWAP_BYTES_c, *SWAP_BYTES_p = (UCHAR *)(bp); \
if (SWAP_BYTES_p) \
according to the suggestions of my previous message.
 
F

Francois Grieu

Kevin D. Quitt wrote :
I'm at a new job, with a funky coding standard, so don't blame me for that. I'm
also being looked at as the resident C expert after making some coding
recommendations (like do {...}while (0) for multi-line macros).

I was approached on how to swap the bytes (*sigh*) in a short, using a macro.
This particular compiler does not have the NTOH.. macro set, and they want to
implement them, sort of: they want to pass a pointer to what gets swapped. So
here's what I wrote (where UCHAR and USHORT are the expected typedefs):

#define SWAP_BYTES_AT_POINTER(bp) \
do { \
UCHAR c, *p; \
if ((UCHAR *)bp) { \
p = (UCHAR *)bp; \
c = p[1]; \
p[1] = p[0]; \
p[0] = c; \
} \
} while (0)

This macro is unsafe for two reasons
- (UCHAR *)bp way not be bp cast to UCHAR *
this happens in particular for
SWAP_BYTES_AT_POINTER(p+1)
which the OP uses (see below)
- the argument if evaluated twice, which
stinks if bp is an expression with side
effect.
I suggest:

#define SWAP_BYTES_AT_POINTER(bp) \
do { \
UCHAR c, *p; \
p = (UCHAR *)(bp); \
if (p) { \
c = p[1]; \
p[1] = p[0]; \
p[0] = c; \
} \
} while (0)
They were delighted by this and asked for a macro that implement (their way)
NTOHL; I gave them this:

#define SWAP_SHORTS_AT_POINTER(sp) \
do { \
USHORT c, *p; \
if ((USHORT *)sp) { \
p = (USHORT *)bp; \
Typo: b should be s in the above line
c = p[1]; \
p[1] = p[0]; \
p[0] = c; \
SWAP_BYTES_AT_POINTER(p); \
SWAP_BYTES_AT_POINTER(p+1); \
} \
} while (0)

In addition to the typo, there are the same
problems as above. Further, the second swap
incurs an unwanted test.
> And for those out there who insist on seeing the work-around, here it is:
>
> #define SWAP_SHORTS_AT_POINTER(sp) \
> do { \
> UCHAR c, *p; \
> if ((UCHAR *)bp) { \
> p = (UCHAR *)bp; \
Typo: b should be s in the above two lines
> c = p[0]; \
> p[0] = p[3]; \
> p[3] = c; \
> c = p[1]; \
> p[1] = p[2]; \
> p[2] = c; \
> } \
> } while (0)

Same problems as above, save for the unwanted test.
I suggest:

#define SWAP_SHORTS_AT_POINTER(sp) \
do { \
UCHAR c, *p; \
p = (UCHAR *)(sp); \
if (p) { \
c = p[0]; \
p[0] = p[3]; \
p[3] = c; \
c = p[1]; \
p[1] = p[2]; \
p[2] = c; \
} \
} while (0)


Francois Grieu
 
P

Phil Carmody

Hallvard said:
...oh, and this evaluates 'bp' twice, so the macro doc must say that the
argument must not have side effects (like ptr++). As does my suggestion
with shifts unless you pass a pointer after all - or just a
'destination' parameter.

I prefer to just have functions to read and write integers in well-defined
formats. For example:

inline unsigned get_le16( void const* in )
{
return ((unsigned char const*) in) [0] +
((unsigned char const*) in) [1] * 0x100;
}

Well, that's going to fail horribly on the DS9000, or at least not
do what you wanted.
inline void set_le16( void* out, unsigned in )
{
((unsigned char*) out) [0] = in & 0xFF;
((unsigned char*) out) [1] = in >> 8 & 0xFF;
}

Why use an unsigned int, when you're only using an unsigned
short's worth of data?
With the swap routines you need to know the byte-order of the host
machine, while with the above functions you don't.

But with yours, you need to know that CHAR_BIT is 8.
The above are also
probably more efficient when you don't need to write the value back to
memory, simply examine it, or when you have a value in a local variable
that you're writing back (not that efficiency is the main purpose of
these).

Phil
 
S

Spiros Bousbouras

...oh, and this evaluates 'bp' twice, so the macro doc must say that the
argument must not have side effects (like ptr++). As does my suggestion
with shifts unless you pass a pointer after all - or just a
'destination' parameter.

I prefer to just have functions to read and write integers in well-defined
formats. For example:

inline unsigned get_le16( void const* in )
{
return ((unsigned char const*) in) [0] +
((unsigned char const*) in) [1] * 0x100;
}

Isn't it simpler to write

inline unsigned get_le16( void const* in ) {
return *((unsigned short *) in) & 0xffff ;
}
inline void set_le16( void* out, unsigned in )
{
((unsigned char*) out) [0] = in & 0xFF;
((unsigned char*) out) [1] = in >> 8 & 0xFF;
}

Similarly

inline void set_le16( void* out, unsigned in ) {
unsigned short *p = out ;
*p = in & 0xffff ;
}
 
S

Spiros Bousbouras

[email protected] (blargg) said:
I prefer to just have functions to read and write integers in well-defined
formats. For example:
inline unsigned get_le16( void const* in )
{
return ((unsigned char const*) in) [0] +
((unsigned char const*) in) [1] * 0x100;
}

Well, that's going to fail horribly on the DS9000, or at least not
do what you wanted.
inline void set_le16( void* out, unsigned in )
{
((unsigned char*) out) [0] = in & 0xFF;
((unsigned char*) out) [1] = in >> 8 & 0xFF;
}

Why use an unsigned int, when you're only using an unsigned
short's worth of data?
With the swap routines you need to know the byte-order of the host
machine, while with the above functions you don't.

But with yours, you need to know that CHAR_BIT is 8.

I was assuming that CHAR_BIT == 8 was an implicit premise
throughout the thread. I mean many people posted solutions using
shifts with hard coded number of places and no appearance of
CHAR_BIT. I was also assuming a "benign" architecture. I mean if
you're going to bring DS9000 into the picture I wonder whether
you might get problems with trap representations with all the
&'ing and |'ing happening in some of the other posts.

So my assumptions throughout the thread were:
a) CHAR_BIT == 8
b) All pointers are to unsigned variables with no trap
representations and of sufficient width to allow all
the dereferencing and shifting taking place to be
defined.
 
J

James Kuyper

Spiros said:
On 7 May, 01:02, (e-mail address removed) (blargg) wrote: ....
inline unsigned get_le16( void const* in )
{
return ((unsigned char const*) in) [0] +
((unsigned char const*) in) [1] * 0x100;
}

Isn't it simpler to write

inline unsigned get_le16( void const* in ) {
return *((unsigned short *) in) & 0xffff ;
}

Yes, it's simpler, but on many commonly used systems, it will not
produce the same result as blargg's version. On any system where
sizeof(unsigned short)!=2, your version will even access a different
amount of memory than blargg's version.
 
S

Spiros Bousbouras

Spiros said:
blargg wrote: [...]
inline unsigned get_le16( void const* in )
{
return ((unsigned char const*) in) [0] +
((unsigned char const*) in) [1] * 0x100;
}
Isn't it simpler to write
inline unsigned get_le16( void const* in ) {
return *((unsigned short *) in) & 0xffff ;

Even simpler:

inline unsigned get_le16( void const* in ) { return 0; }

Both will produce about as useful results on a big-endian machine.

I could argue this but I see your point.
}
inline void set_le16( void* out, unsigned in )
{
((unsigned char*) out) [0] = in & 0xFF;
((unsigned char*) out) [1] = in >> 8 & 0xFF;
}
Similarly

inline void set_le16( void* out, unsigned in ) {
unsigned short *p = out ;
*p = in & 0xffff ;
}

Why do you think ANDing with 0xFFFF does anything useful in either of these?

Since the value returned by your code only uses the first 16
bits of the "input" I was making sure my code does the same. But
I forgot to take endianess into account. Putting it more into
perspective I hadn't fully appreciated what you meant when you
said "I prefer to just have functions to read and write integers
in well-defined formats". Now I see that the purpose of your
code was to read and store a 16 bits value in little endian
format.
 
P

Phil Carmody

Spiros said:
blargg wrote: [...]
inline unsigned get_le16( void const* in )
{
return ((unsigned char const*) in) [0] +
((unsigned char const*) in) [1] * 0x100;
}

Isn't it simpler to write

inline unsigned get_le16( void const* in ) {
return *((unsigned short *) in) & 0xffff ;

Even simpler:

inline unsigned get_le16( void const* in ) { return 0; }

His is right 1/256 of the time, yours is right only 1/65536 of the time
;-p


Phil
 
O

Old Wolf

#define SWAP_BYTES_AT_POINTER(bp) \
    do {                          \
      UCHAR c, *p;                \
      if ((UCHAR *)bp) {          \
        p = (UCHAR *)bp;          \
        c = p[1];                 \
        p[1]  = p[0];             \
        p[0]  = c;                \
      }                           \
    } while (0)

I think the following is much superior:
#define SWAP_BYTES_AT_POINTER(bp) \
SWAP( ((byte *)(bp))[0], ((byte *)(bp))[1] )

where SWAP is a standard swap macro, e.g.
#define SWAP(X,Y) ((X) ^= (Y), (Y) ^= (X), (X) ^= (Y))

No need to check for null, if someone does it on
null it's their own problem. If you are writing a
function you can check initial conditions, and
use a temporary variable etc., but in a macro I
would keep it concise.
 
O

Old Wolf

Isn't it simpler to write

inline unsigned get_le16( void const* in ) {
    return *((unsigned short *) in) & 0xffff ;

"inline" isn't part of C90. I have to write code
to compile on multiple compilers and this aspect
of it is a real pain. (Some of the systems
support a variant of inline, but they all have
different problems, e.g. inefficient code
generation, spurious warnings about unused code,
symbol clashes, etc.) I have to maintain parallel
macro versions of each inline function.
 
O

Old Wolf

    #define GET_LE16(in) \
        ( ((uint8_t const*)(in))[1]*0x100u + ((uint8_t const*)(in))[0] )

    inline unsigned get_le16( void const* in )
    {
        return GET_LE16( in ); /* no source code duplication */
    }

Good idea!
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,768
Messages
2,569,574
Members
45,050
Latest member
AngelS122

Latest Threads

Top