Unreadable source code

Discussion in 'C Programming' started by Ellixis, May 23, 2004.

  1. Ellixis

    Ellixis Guest

    I have been looking at "sh" source code and have found this strange
    thing:

    /**** syntax.c ****/
    #define ndx(ch) (ch + 1 - CHAR_MIN)
    #define set(ch, val) [ndx(ch)] = val,
    #define set_range(s, e, val) [ndx(s) ... ndx(e)] = val,

    /* character classification table */
    const char is_type[257] = { 0,
    set_range('0', '9', ISDIGIT)
    set_range('a', 'z', ISLOWER)
    set_range('A', 'Z', ISUPPER)
    set('_', ISUNDER)
    set('#', ISSPECL)
    set('?', ISSPECL)
    set('$', ISSPECL)
    set('!', ISSPECL)
    set('-', ISSPECL)
    set('*', ISSPECL)
    set('@', ISSPECL)
    };
    /**** !syntax.c ****/


    /**** gcc -E syntax.c ****/
    const char is_type[257] = { 0,
    [( '0' + 1 - (-0x7f-1) ) ... ( '9' + 1 - (-0x7f-1) ) ] =
    01 ,
    [( 'a' + 1 - (-0x7f-1) ) ... ( 'z' + 1 - (-0x7f-1) ) ] =
    04 ,
    [( 'A' + 1 - (-0x7f-1) ) ... ( 'Z' + 1 - (-0x7f-1) ) ] =
    02 ,
    [( '_' + 1 - (-0x7f-1) ) ] = 010 ,
    [( '#' + 1 - (-0x7f-1) ) ] = 020 ,
    [( '?' + 1 - (-0x7f-1) ) ] = 020 ,
    [( '$' + 1 - (-0x7f-1) ) ] = 020 ,
    [( '!' + 1 - (-0x7f-1) ) ] = 020 ,
    [( '-' + 1 - (-0x7f-1) ) ] = 020 ,
    [( '*' + 1 - (-0x7f-1) ) ] = 020 ,
    [( '@' + 1 - (-0x7f-1) ) ] = 020 ,
    };
    /**** !gcc -E syntax.c ****/

    It compiles without error message or warning. Does somebody have an
    explanation of this portion of source code ?
    Ellixis, May 23, 2004

  2. Chris Torek

    Chris Torek Guest

    Ellixis writes:
    >I have been looking at "sh" source code and have found this strange
    >thing:
    >
    >/**** syntax.c ****/
    >#define ndx(ch) (ch + 1 - CHAR_MIN)
    >#define set(ch, val) [ndx(ch)] = val,


    The first macro offsets its argument by 1 - CHAR_MIN, which is
    typically 1 - (-128) or 1 - 0. In other words, it generally adds
    either 129 or 1. Note that "ch" is not parenthesized in the
    expansion, so the macro misbehaves if the argument is an expression
    using an operator that binds less tightly than "+", such as "&":
    ndx(1 & 3) expands to (1 & 3 + 1 - CHAR_MIN), which parses as
    1 & (3 + 1 - CHAR_MIN) and so does not work "as desired".
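To make the precedence problem concrete, here is a minimal sketch. The name NDX_SAFE and the two wrapper functions are hypothetical and do not come from the sh source; ndx is copied from the post. A compiler may warn about the first expansion, which is the point.

```c
#include <limits.h>

/* Copy of the macro from the post: the argument is not parenthesized. */
#define ndx(ch) (ch + 1 - CHAR_MIN)

/* Hypothetical fixed variant, not from the sh source. */
#define NDX_SAFE(ch) ((ch) + 1 - CHAR_MIN)

/* Expands to (1 & 3 + 1 - CHAR_MIN), which parses as 1 & (4 - CHAR_MIN). */
int unparenthesized_result(void)
{
    return ndx(1 & 3);
}

/* Expands to ((1 & 3) + 1 - CHAR_MIN), which is 2 - CHAR_MIN. */
int parenthesized_result(void)
{
    return NDX_SAFE(1 & 3);
}
```

The first function can only return 0 or 1, because the "&" is applied last. The second returns the intended offset. For plain character constants such as '0' the two macros agree, which is presumably why the original gets away with it.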

    The second macro is designed to use the C99 "designated initializer"
    syntax.

    >#define set_range(s, e, val) [ndx(s) ... ndx(e)] = val,


    In standard C, the expansion of this macro is a syntax error.

    (GCC has an extension in which the "[first ... last]" range
    designator is valid and meaningful, which is why your compile
    succeeds, but this extension is *not* valid C99.)

    >/* character classification table */
    >const char is_type[257] = { 0,
    > set_range('0', '9', ISDIGIT)


    This uses the GCC extension to make sure that is_type[ndx('0')] is set
    to ISDIGIT, is_type[ndx('1')] is set to ISDIGIT, is_type[ndx('2')] is
    set to ISDIGIT, and so on, through is_type[ndx('9')]. Since Standard
    C requires that the integer values of '0' through '9' be contiguous
    and sequential, this always works (provided your compiler implements
    the GCC extension).
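As a rough illustration of what the range designator does, the sketch below builds the digit entries twice: once with the GNU extension and once with a loop that any C99 compiler accepts. NDX, DIGIT_FLAG, and the table and function names are made up for this example (DIGIT_FLAG stands in for ISDIGIT); the first table only compiles where the extension is supported.

```c
#include <limits.h>

#define NDX(ch) ((ch) + 1 - CHAR_MIN)  /* parenthesized variant of ndx */
#define DIGIT_FLAG 01                  /* hypothetical stand-in for ISDIGIT */

/* GNU C only: every index from NDX('0') through NDX('9') gets the flag. */
const char range_table[257] = { [NDX('0') ... NDX('9')] = DIGIT_FLAG };

/* Standard C: build the same entries with a loop at run time. */
char loop_table[257];

void fill_loop_table(void)
{
    for (int c = '0'; c <= '9'; c++)
        loop_table[NDX(c)] = DIGIT_FLAG;
}

/* Returns the number of positions where the two tables differ. */
int table_mismatches(void)
{
    int n = 0;
    for (int i = 0; i < 257; i++)
        if (range_table[i] != loop_table[i])
            n++;
    return n;
}
```

After calling fill_loop_table(), table_mismatches() should report no differences. The trade-off is that the range form gives a constant table built at compile time, while the portable form needs an initialization call.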

    > set_range('a', 'z', ISLOWER)


    This uses the GCC extension to make sure that is_type[ndx('a')] is
    set to ISLOWER, etc., as before. Unlike the digits, however, the
    letters are not required to be contiguous. In EBCDIC they are not,
    so there the range *also* sets is_type[ndx(c)] to ISLOWER for
    various non-letter characters c that lie between 'a' and 'z'; i.e.,
    it does not work on many IBM mainframes. (The code assumes instead
    that you are using ASCII. There *are* GCC ports to IBM mainframes,
    but presumably either the code will be run in "ASCII mode" or else
    this program will never be compiled for use in "EBCDIC mode".)
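One way to avoid depending on the letters being contiguous is to name each letter explicitly instead of using a range. The following is only a sketch of that idea, not what the sh source does; NDX, LOWER_FLAG (standing in for ISLOWER), lower_table, and fill_lower_table are hypothetical names.

```c
#include <limits.h>

#define NDX(ch) ((ch) + 1 - CHAR_MIN)  /* parenthesized variant of ndx */
#define LOWER_FLAG 04                  /* hypothetical stand-in for ISLOWER */

char lower_table[257];

/* Marks exactly the letters listed, whatever their character codes are,
   and returns how many entries were marked. */
int fill_lower_table(void)
{
    static const char letters[] = "abcdefghijklmnopqrstuvwxyz";
    int marked = 0;

    for (const char *p = letters; *p != '\0'; p++) {
        lower_table[NDX(*p)] = LOWER_FLAG;
        marked++;
    }
    return marked;
}
```

Because the loop walks the letters themselves rather than the codes between 'a' and 'z', it marks 26 entries on both ASCII and EBCDIC systems, at the cost of filling the table at run time.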

    [Remainder snipped; it all works along the same lines.]

    The new C99 syntax is, e.g.:

    int a[10] = { [3] = 42, [9] = -1 };
    /* sets a[] to the sequence {0,0,0,42,0,0,0,0,0,-1} */

    and:

    struct blah { int i; double d; char *s; };
    struct blah x = { .s = "hello" };
    /* sets x.i to 0, x.d to 0.0, and x.s to point to "hello" */

    Again, these are called "designated initializers", because you
    designate (name) the element to initialize. (Actually, the draft
    of C99 I use just calls them "designators" and does not have a
    formal term for the designator=value sub-syntax.)
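Two details of the C99 rules are worth checking against the table in the question, which starts with a plain 0 before its first designator. The sketch below uses made-up names (seq, greeting) and shows that elements with no initializer are zero and that a plain value following a designator initializes the next element in sequence.

```c
/* [3] = 42 sets element 3; the plain 43 that follows goes into element 4.
   Elements that are never mentioned are initialized to zero. */
const int seq[10] = { [3] = 42, 43, [9] = -1 };

struct blah { int i; double d; char *s; };

/* Only .s is named, so .i and .d are zero-initialized. */
const struct blah greeting = { .s = "hello" };
```

This is why the original table can mix a leading positional 0 with designated entries: positional and designated initializers may be combined in one list.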
    --
    In-Real-Life: Chris Torek, Wind River Systems
    Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
    email: forget about it http://web.torek.net/torek/index.html
    Reading email is like searching for food in the garbage, thanks to spammers.
    Chris Torek, May 23, 2004
