Macro expansion surprise.

Discussion in 'C Programming' started by Kevin D. Quitt, May 6, 2009.

  1. I'm at a new job, with a funky coding standard, so don't blame me for that. I'm
    also being looked at as the resident C expert after making some coding
    recommendations (like do {...}while (0) for multi-line macros).

    I was approached on how to swap the bytes (*sigh*) in a short, using a macro.
    This particular compiler does not have the NTOH.. macro set, and they want to
    implement them, sort of: they want to pass a pointer to what gets swapped. So
    here's what I wrote (where UCHAR and USHORT are the expected typedefs):

    #define SWAP_BYTES_AT_POINTER(bp) \
    do { \
    UCHAR c, *p; \
    if ((UCHAR *)bp) { \
    p = (UCHAR *)bp; \
    c = p[1]; \
    p[1] = p[0]; \
    p[0] = c; \
    } \
    } while (0)

    They were delighted by this and asked for a macro that implement (their way)
    NTOHL; I gave them this:

    #define SWAP_SHORTS_AT_POINTER(sp) \
    do { \
    USHORT c, *p; \
    if ((USHORT *)sp) { \
    p = (USHORT *)bp; \
    c = p[1]; \
    p[1] = p[0]; \
    p[0] = c; \
    SWAP_BYTES_AT_POINTER(p); \
    SWAP_BYTES_AT_POINTER(p+1); \
    } \
    } while (0)

    They weren't delighted by this because it segfaults at the first invocation of
    SWAP_BYTES_AT_POINTER. On a whim, I changed the names of the variables in
    SWAP_SHORTS_AT_POINTER, and the segfault went away. The original also blows up
    under Windows, using Visual Studio, so I am led to believe that it isn't a
    compiler problem.

    I can't see why there's a problem. The variables are local in scope and
    shouldn't interfere - right? What is the problem?

    And for those out there who insist on seeing the work-around, here it is:

    #define SWAP_SHORTS_AT_POINTER(sp) \
    do { \
    UCHAR c, *p; \
    if ((UCHAR *)bp) { \
    p = (UCHAR *)bp; \
    c = p[0]; \
    p[0] = p[3]; \
    p[3] = c; \
    c = p[1]; \
    p[1] = p[2]; \
    p[2] = c; \
    } \
    } while (0)
     
    Kevin D. Quitt, May 6, 2009
    #1
    1. Advertising

  2. Kevin D. Quitt writes:
    > I was approached on how to swap the bytes (*sigh*) in a short, using a macro.
    > (...)
    > #define SWAP_SHORTS_AT_POINTER(sp) \
    > do { \
    > USHORT c, *p; \
    > if ((USHORT *)sp) { \


    Error: Enclose macro parameters in parenthesis:
    if ((USHORT *)(sp)) { \
    Remember that sp is not a C variable, it's textually replaced.
    So SWAP_SHORTS_AT_POINTER(p + 3) gives 'if ((USHORT *)p + 3) {'.

    Also, the 'p' in that macro call is supposed to refer to a p declared
    outside of the macro, but since macros just do _textual_ replacement the
    (USHORT *)p will refer to the p local to the macro.
    Run it through cc -E to see how the compilers sees the expanded macro call.
    Fix: Invent namespace rules and use those for variable names inside
    macros, e.g. SWAP_SHORTS_c and SWAP_SHORTS_p.

    > p = (USHORT *)bp; \


    Typo: bp should be sp.

    > SWAP_BYTES_AT_POINTER(p); \
    > SWAP_BYTES_AT_POINTER(p+1); \


    And of course now you get the same name conflict again between the
    two macros' "p"s.


    As it happens, commonly ntoh* etc are commonly implemented with shifts,
    e.g. (((val) >> 8) & 0xff) | (((val) & 0xff) << 8) plus whatever casts
    you feel you need.

    And remember that casts remove type checking, so you might want a
    compromise with something lke like ((val) + 0UL) which converts signed
    integer types up to long, to unsigned long. Then cast the final result
    to unsigned short or whatever.

    --
    Hallvard
     
    Hallvard B Furuseth, May 6, 2009
    #2
    1. Advertising

  3. Kevin D. Quitt

    user923005 Guest

    Kevin D. Quitt wrote:
    > I'm at a new job, with a funky coding standard, so don't blame me for that. I'm
    > also being looked at as the resident C expert after making some coding
    > recommendations (like do {...}while (0) for multi-line macros).
    >
    > I was approached on how to swap the bytes (*sigh*) in a short, using a macro.
    > This particular compiler does not have the NTOH.. macro set, and they want to
    > implement them, sort of: they want to pass a pointer to what gets swapped. So
    > here's what I wrote (where UCHAR and USHORT are the expected typedefs):
    >
    > #define SWAP_BYTES_AT_POINTER(bp) \
    > do { \
    > UCHAR c, *p; \
    > if ((UCHAR *)bp) { \
    > p = (UCHAR *)bp; \
    > c = p[1]; \
    > p[1] = p[0]; \
    > p[0] = c; \
    > } \
    > } while (0)
    >
    > They were delighted by this and asked for a macro that implement (their way)
    > NTOHL; I gave them this:
    >
    > #define SWAP_SHORTS_AT_POINTER(sp) \
    > do { \
    > USHORT c, *p; \
    > if ((USHORT *)sp) { \
    > p = (USHORT *)bp; \
    > c = p[1]; \
    > p[1] = p[0]; \
    > p[0] = c; \
    > SWAP_BYTES_AT_POINTER(p); \
    > SWAP_BYTES_AT_POINTER(p+1); \
    > } \
    > } while (0)
    >
    > They weren't delighted by this because it segfaults at the first invocation of
    > SWAP_BYTES_AT_POINTER. On a whim, I changed the names of the variables in
    > SWAP_SHORTS_AT_POINTER, and the segfault went away. The original also blows up
    > under Windows, using Visual Studio, so I am led to believe that it isn't a
    > compiler problem.
    >
    > I can't see why there's a problem. The variables are local in scope and
    > shouldn't interfere - right? What is the problem?
    >
    > And for those out there who insist on seeing the work-around, here it is:
    >
    > #define SWAP_SHORTS_AT_POINTER(sp) \
    > do { \
    > UCHAR c, *p; \
    > if ((UCHAR *)bp) { \
    > p = (UCHAR *)bp; \
    > c = p[0]; \
    > p[0] = p[3]; \
    > p[3] = c; \
    > c = p[1]; \
    > p[1] = p[2]; \
    > p[2] = c; \
    > } \
    > } while (0)


    This is what I get after expansion:

    typedef unsigned char UCHAR;
    typedef unsigned short USHORT;

    int main(void)
    {
    short a = 0x1234;
    short *pa = &a;
    do {
    USHORT c,
    *p;
    if ((USHORT *) pa) {
    p = (USHORT *) bp;
    c = p[1];
    p[1] = p[0];
    p[0] = c;
    do {
    UCHAR c,
    *p;
    if ((UCHAR *) p) {
    p = (UCHAR *) p;
    c = p[1];
    p[1] = p[0];
    p[0] = c;
    }
    } while (0);
    do {
    UCHAR c,
    *p;
    if ((UCHAR *) p + 1) {
    p = (UCHAR *) p + 1;
    c = p[1];
    p[1] = p[0];
    p[0] = c;
    }
    } while (0);
    }
    } while (0);
    return 0;
    }

    There is no definition for bp, so I guess bp is defined somewhere
    inline in your code or it would not even compile.

    It's possible I made a typeographical error somewhere.
     
    user923005, May 6, 2009
    #3
  4. Kevin D. Quitt writes:
    > UCHAR c, *p; \
    > if ((UCHAR *)bp) { \
    > p = (UCHAR *)bp; \


    ....oh, and this evaluates 'bp' twice, so the macro doc must say that the
    argument must not have side effects (like ptr++). As does my suggestion
    with shifts unless you pass a pointer after all - or just a
    'destination' parameter.

    In your case though, you can avoid this with
    UCHAR c, *p = (UCHAR *)bp; \
    if (p) \
    or rather
    UCHAR SWAP_BYTES_c, *SWAP_BYTES_p = (UCHAR *)(bp); \
    if (SWAP_BYTES_p) \
    according to the suggestions of my previous message.

    --
    Hallvard
     
    Hallvard B Furuseth, May 6, 2009
    #4
  5. Kevin D. Quitt wrote :
    > I'm at a new job, with a funky coding standard, so don't blame me for that. I'm
    > also being looked at as the resident C expert after making some coding
    > recommendations (like do {...}while (0) for multi-line macros).
    >
    > I was approached on how to swap the bytes (*sigh*) in a short, using a macro.
    > This particular compiler does not have the NTOH.. macro set, and they want to
    > implement them, sort of: they want to pass a pointer to what gets swapped. So
    > here's what I wrote (where UCHAR and USHORT are the expected typedefs):
    >
    > #define SWAP_BYTES_AT_POINTER(bp) \
    > do { \
    > UCHAR c, *p; \
    > if ((UCHAR *)bp) { \
    > p = (UCHAR *)bp; \
    > c = p[1]; \
    > p[1] = p[0]; \
    > p[0] = c; \
    > } \
    > } while (0)


    This macro is unsafe for two reasons
    - (UCHAR *)bp way not be bp cast to UCHAR *
    this happens in particular for
    SWAP_BYTES_AT_POINTER(p+1)
    which the OP uses (see below)
    - the argument if evaluated twice, which
    stinks if bp is an expression with side
    effect.
    I suggest:

    #define SWAP_BYTES_AT_POINTER(bp) \
    do { \
    UCHAR c, *p; \
    p = (UCHAR *)(bp); \
    if (p) { \
    c = p[1]; \
    p[1] = p[0]; \
    p[0] = c; \
    } \
    } while (0)

    > They were delighted by this and asked for a macro that implement (their way)
    > NTOHL; I gave them this:
    >
    > #define SWAP_SHORTS_AT_POINTER(sp) \
    > do { \
    > USHORT c, *p; \
    > if ((USHORT *)sp) { \
    > p = (USHORT *)bp; \

    Typo: b should be s in the above line
    > c = p[1]; \
    > p[1] = p[0]; \
    > p[0] = c; \
    > SWAP_BYTES_AT_POINTER(p); \
    > SWAP_BYTES_AT_POINTER(p+1); \
    > } \
    > } while (0)


    In addition to the typo, there are the same
    problems as above. Further, the second swap
    incurs an unwanted test.

    > And for those out there who insist on seeing the work-around, here it is:
    >
    > #define SWAP_SHORTS_AT_POINTER(sp) \
    > do { \
    > UCHAR c, *p; \
    > if ((UCHAR *)bp) { \
    > p = (UCHAR *)bp; \

    Typo: b should be s in the above two lines
    > c = p[0]; \
    > p[0] = p[3]; \
    > p[3] = c; \
    > c = p[1]; \
    > p[1] = p[2]; \
    > p[2] = c; \
    > } \
    > } while (0)


    Same problems as above, save for the unwanted test.
    I suggest:

    #define SWAP_SHORTS_AT_POINTER(sp) \
    do { \
    UCHAR c, *p; \
    p = (UCHAR *)(sp); \
    if (p) { \
    c = p[0]; \
    p[0] = p[3]; \
    p[3] = c; \
    c = p[1]; \
    p[1] = p[2]; \
    p[2] = c; \
    } \
    } while (0)


    Francois Grieu
     
    Francois Grieu, May 6, 2009
    #5
  6. Thank you all. I knew I was being stupid, just not sure how.
     
    Kevin D. Quitt, May 6, 2009
    #6
  7. Kevin D. Quitt

    Phil Carmody Guest

    (blargg) writes:
    > Hallvard B Furuseth wrote:
    >> Kevin D. Quitt writes:
    >> > UCHAR c, *p; \
    >> > if ((UCHAR *)bp) { \
    >> > p = (UCHAR *)bp; \

    >>
    >> ...oh, and this evaluates 'bp' twice, so the macro doc must say that the
    >> argument must not have side effects (like ptr++). As does my suggestion
    >> with shifts unless you pass a pointer after all - or just a
    >> 'destination' parameter.

    >
    > I prefer to just have functions to read and write integers in well-defined
    > formats. For example:
    >
    > inline unsigned get_le16( void const* in )
    > {
    > return ((unsigned char const*) in) [0] +
    > ((unsigned char const*) in) [1] * 0x100;
    > }


    Well, that's going to fail horribly on the DS9000, or at least not
    do what you wanted.

    > inline void set_le16( void* out, unsigned in )
    > {
    > ((unsigned char*) out) [0] = in & 0xFF;
    > ((unsigned char*) out) [1] = in >> 8 & 0xFF;
    > }


    Why use an unsigned int, when you're only using an unsigned
    short's worth of data?

    > With the swap routines you need to know the byte-order of the host
    > machine, while with the above functions you don't.


    But with yours, you need to know that CHAR_BIT is 8.

    > The above are also
    > probably more efficient when you don't need to write the value back to
    > memory, simply examine it, or when you have a value in a local variable
    > that you're writing back (not that efficiency is the main purpose of
    > these).


    Phil
    --
    Marijuana is indeed a dangerous drug.
    It causes governments to wage war against their own people.
    -- Dave Seaman (sci.math, 19 Mar 2009)
     
    Phil Carmody, May 7, 2009
    #7
  8. On 7 May, 01:02, (blargg) wrote:
    > Hallvard B Furuseth wrote:
    > > Kevin D. Quitt writes:
    > > > UCHAR c, *p; \
    > > > if ((UCHAR *)bp) { \
    > > > p = (UCHAR *)bp; \

    >
    > > ...oh, and this evaluates 'bp' twice, so the macro doc must say that the
    > > argument must not have side effects (like ptr++). As does my suggestion
    > > with shifts unless you pass a pointer after all - or just a
    > > 'destination' parameter.

    >
    > I prefer to just have functions to read and write integers in well-defined
    > formats. For example:
    >
    > inline unsigned get_le16( void const* in )
    > {
    > return ((unsigned char const*) in) [0] +
    > ((unsigned char const*) in) [1] * 0x100;
    > }


    Isn't it simpler to write

    inline unsigned get_le16( void const* in ) {
    return *((unsigned short *) in) & 0xffff ;
    }

    > inline void set_le16( void* out, unsigned in )
    > {
    > ((unsigned char*) out) [0] = in & 0xFF;
    > ((unsigned char*) out) [1] = in >> 8 & 0xFF;
    > }


    Similarly

    inline void set_le16( void* out, unsigned in ) {
    unsigned short *p = out ;
    *p = in & 0xffff ;
    }

    --
    Who's your mama?
     
    Spiros Bousbouras, May 9, 2009
    #8
  9. On 7 May, 03:54, Phil Carmody <> wrote:
    > (blargg) writes:
    >
    > > I prefer to just have functions to read and write integers in well-defined
    > > formats. For example:

    >
    > > inline unsigned get_le16( void const* in )
    > > {
    > > return ((unsigned char const*) in) [0] +
    > > ((unsigned char const*) in) [1] * 0x100;
    > > }

    >
    > Well, that's going to fail horribly on the DS9000, or at least not
    > do what you wanted.
    >
    > > inline void set_le16( void* out, unsigned in )
    > > {
    > > ((unsigned char*) out) [0] = in & 0xFF;
    > > ((unsigned char*) out) [1] = in >> 8 & 0xFF;
    > > }

    >
    > Why use an unsigned int, when you're only using an unsigned
    > short's worth of data?
    >
    > > With the swap routines you need to know the byte-order of the host
    > > machine, while with the above functions you don't.

    >
    > But with yours, you need to know that CHAR_BIT is 8.


    I was assuming that CHAR_BIT == 8 was an implicit premise
    throughout the thread. I mean many people posted solutions using
    shifts with hard coded number of places and no appearance of
    CHAR_BIT. I was also assuming a "benign" architecture. I mean if
    you're going to bring DS9000 into the picture I wonder whether
    you might get problems with trap representations with all the
    &'ing and |'ing happening in some of the other posts.

    So my assumptions throughout the thread were:
    a) CHAR_BIT == 8
    b) All pointers are to unsigned variables with no trap
    representations and of sufficient width to allow all
    the dereferencing and shifting taking place to be
    defined.

    --
    Shakespeare also predicted structured programming:
    "Go to, go to, thou are a foolish fellow. Let me be clear of thee."
    Twelfth Night, Act IV, Scene 1.
    from http://www.iidb.org/vbb/showthread.php?p=3352214#post3352214
     
    Spiros Bousbouras, May 9, 2009
    #9
  10. Kevin D. Quitt

    James Kuyper Guest

    Spiros Bousbouras wrote:
    > On 7 May, 01:02, (blargg) wrote:

    ....
    >> inline unsigned get_le16( void const* in )
    >> {
    >> return ((unsigned char const*) in) [0] +
    >> ((unsigned char const*) in) [1] * 0x100;
    >> }

    >
    > Isn't it simpler to write
    >
    > inline unsigned get_le16( void const* in ) {
    > return *((unsigned short *) in) & 0xffff ;
    > }


    Yes, it's simpler, but on many commonly used systems, it will not
    produce the same result as blargg's version. On any system where
    sizeof(unsigned short)!=2, your version will even access a different
    amount of memory than blargg's version.
     
    James Kuyper, May 9, 2009
    #10
  11. On 9 May, 10:40, (blargg) wrote:
    > Spiros Bousbouras wrote:
    > > blargg wrote:

    > [...]
    > > > inline unsigned get_le16( void const* in )
    > > > {
    > > > return ((unsigned char const*) in) [0] +
    > > > ((unsigned char const*) in) [1] * 0x100;
    > > > }

    >
    > > Isn't it simpler to write

    >
    > > inline unsigned get_le16( void const* in ) {
    > > return *((unsigned short *) in) & 0xffff ;

    >
    > Even simpler:
    >
    > inline unsigned get_le16( void const* in ) { return 0; }
    >
    > Both will produce about as useful results on a big-endian machine.


    I could argue this but I see your point.

    > > }

    >
    > > > inline void set_le16( void* out, unsigned in )
    > > > {
    > > > ((unsigned char*) out) [0] = in & 0xFF;
    > > > ((unsigned char*) out) [1] = in >> 8 & 0xFF;
    > > > }

    >
    > > Similarly

    >
    > > inline void set_le16( void* out, unsigned in ) {
    > > unsigned short *p = out ;
    > > *p = in & 0xffff ;
    > > }

    >
    > Why do you think ANDing with 0xFFFF does anything useful in either of these?


    Since the value returned by your code only uses the first 16
    bits of the "input" I was making sure my code does the same. But
    I forgot to take endianess into account. Putting it more into
    perspective I hadn't fully appreciated what you meant when you
    said "I prefer to just have functions to read and write integers
    in well-defined formats". Now I see that the purpose of your
    code was to read and store a 16 bits value in little endian
    format.

    --
    Some people say there is a God.Some say there is no God.Personally,
    I feel the truth lies probably somewhere between these two extremes.
    J.B. Morton
     
    Spiros Bousbouras, May 9, 2009
    #11
  12. Kevin D. Quitt

    Phil Carmody Guest

    (blargg) writes:
    > Spiros Bousbouras wrote:
    >> blargg wrote:

    > [...]
    >> > inline unsigned get_le16( void const* in )
    >> > {
    >> > return ((unsigned char const*) in) [0] +
    >> > ((unsigned char const*) in) [1] * 0x100;
    >> > }

    >>
    >> Isn't it simpler to write
    >>
    >> inline unsigned get_le16( void const* in ) {
    >> return *((unsigned short *) in) & 0xffff ;

    >
    > Even simpler:
    >
    > inline unsigned get_le16( void const* in ) { return 0; }


    His is right 1/256 of the time, yours is right only 1/65536 of the time
    ;-p


    Phil
    --
    Marijuana is indeed a dangerous drug.
    It causes governments to wage war against their own people.
    -- Dave Seaman (sci.math, 19 Mar 2009)
     
    Phil Carmody, May 9, 2009
    #12
  13. Kevin D. Quitt

    Old Wolf Guest

    On May 7, 7:40 am, Kevin D. Quitt <> wrote:

    > #define SWAP_BYTES_AT_POINTER(bp) \
    >     do {                          \
    >       UCHAR c, *p;                \
    >       if ((UCHAR *)bp) {          \
    >         p = (UCHAR *)bp;          \
    >         c = p[1];                 \
    >         p[1]  = p[0];             \
    >         p[0]  = c;                \
    >       }                           \
    >     } while (0)


    I think the following is much superior:
    #define SWAP_BYTES_AT_POINTER(bp) \
    SWAP( ((byte *)(bp))[0], ((byte *)(bp))[1] )

    where SWAP is a standard swap macro, e.g.
    #define SWAP(X,Y) ((X) ^= (Y), (Y) ^= (X), (X) ^= (Y))

    No need to check for null, if someone does it on
    null it's their own problem. If you are writing a
    function you can check initial conditions, and
    use a temporary variable etc., but in a macro I
    would keep it concise.
     
    Old Wolf, May 10, 2009
    #13
  14. Kevin D. Quitt

    Old Wolf Guest

    On May 9, 2:21 pm, Spiros Bousbouras <> wrote:
    > Isn't it simpler to write
    >
    > inline unsigned get_le16( void const* in ) {
    >     return *((unsigned short *) in) & 0xffff ;


    "inline" isn't part of C90. I have to write code
    to compile on multiple compilers and this aspect
    of it is a real pain. (Some of the systems
    support a variant of inline, but they all have
    different problems, e.g. inefficient code
    generation, spurious warnings about unused code,
    symbol clashes, etc.) I have to maintain parallel
    macro versions of each inline function.
     
    Old Wolf, May 10, 2009
    #14
  15. Kevin D. Quitt

    Old Wolf Guest

    On May 10, 11:42 pm, (blargg) wrote:
    >     #define GET_LE16(in) \
    >         ( ((uint8_t const*)(in))[1]*0x100u + ((uint8_t const*)(in))[0] )
    >
    >     inline unsigned get_le16( void const* in )
    >     {
    >         return GET_LE16( in ); /* no source code duplication */
    >     }


    Good idea!
     
    Old Wolf, May 11, 2009
    #15
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. me
    Replies:
    1
    Views:
    1,134
    Victor Bazarov
    Nov 9, 2004
  2. Benjamin Niemann
    Replies:
    3
    Views:
    335
    Caleb Hattingh
    Aug 26, 2004
  3. Ark

    A question on macro expansion

    Ark, Jul 22, 2004, in forum: C Programming
    Replies:
    3
    Views:
    360
  4. Vittal

    Macro Expansion

    Vittal, Mar 22, 2005, in forum: C Programming
    Replies:
    3
    Views:
    416
    Eric Sosman
    Mar 23, 2005
  5. Dom Gilligan

    Macro expansion of '#__LINE__'?

    Dom Gilligan, Nov 4, 2005, in forum: C Programming
    Replies:
    4
    Views:
    461
    Dom Gilligan
    Nov 4, 2005
Loading...

Share This Page