Re: Parsing with typedefs

Discussion in 'C Programming' started by Mauro Persano, Jul 1, 2003.

  1. Jun,

    "Jun Woong" <> wrote in message news:<bdsfpm$lcc$>...
    > "Mauro Persano" <> wrote in message news:...
    > > In the following two lines:
    > >
    > > typedef int foo;
    > > unsigned foo;

    >
    > Invalid because of redeclaration of an identifier without a linkage.


    Thanks for the comments. But maybe my question was ill-formulated.

    The fragment of the grammar I'm interested in is:

    declaration
    : declaration-specifiers
    | declaration-specifiers init-declarator-list

    declaration-specifiers
    : type-specifier
    | type-specifier declaration-specifiers

    init-declarator-list
    : declarator

    declarator
    : identifier

    type-specifier
    : int
    | typedef-name

    It's up to the lexer to decide whether the second 'foo' is an
    identifier or a typedef-name. When it reaches the second 'foo',
    the lexer must already know that foo is a typedef - otherwise,
    if it found

    foo bar;

    the parser would think 'oh - two consecutive identifiers' and
    bail out with a syntax error. So how does the lexer know that,
    in

    int foo;

    'foo' *must* be an identifier?

    Both interpretations should be equally valid, at least
    from a syntactic standpoint. Or not?

    Thanks again,

    Mauro
    Mauro Persano, Jul 1, 2003
    #1

  2. Chris Torek (Guest)

    In article <>
    Mauro Persano <> writes:
    >Thanks for the comments. But maybe my question was ill-formulated.

    [grammar part snipped]
    >It's up to the lexer to decide whether the second 'foo' is an
    >identifier or a typedef-name. When it reaches the second 'foo',
    >the lexer must already know that foo is a typedef - otherwise,
    >if it found
    >
    > foo bar;
    >
    >the parser would think 'oh - two consecutive identifiers' and
    >bail out with a syntax error. So how does the lexer know that,
    >in
    >
    > int foo;
    >
    >'foo' *must* be an identifier?
    >
    >Both interpretations should be equally valid, at least
    >from a syntactic standpoint. Or not?


    The C standard (or standards, if you use both C89 and C99 :) )
    says what must happen. It does not go so far as to say *how* it
    happens, and it is up to you -- the one implementing a parser --
    to make it happen, somehow.

    In practice, this stuff is *very* tricky with automated tools like
    yacc and lex, and still pretty tricky even with hand-rolled code.
    One gimmick in lex (or equivalent) is something like this:

    <ident-seq> {
        id_info *idp = id_lookup(the_identifier);
        if (idp != NULL && idp->id_type == TYPEDEF_NAME) {
            current_token.tvalue = idp->id_what_it_aliases;
            return TYPE_NAME;
        }
        /* not a typedef name, or not even defined yet */
        current_token.tvalue = idp; /* avoid repeating lookup() */
        return IDENTIFIER;
    }

    But this goes wrong when "re-typedef"ing an identifier, as in this
    (valid) C code fragment:

    typedef int x; { typedef double x;

    so one then adds "downward" information from the grammar into the
    lexer to tell it when to disable typedef lookups, as for the second
    "x" here. One then discovers that the LALR(1) parser generated by
    yacc sometimes, but not always, reads one token of look-ahead, so
    the "downward info" from parser into lexer has to be carefully tweaked.
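As a rough illustration of the idea above, here is a minimal sketch of a scoped symbol table that a lexer could consult, plus a flag standing in for the "downward" information from the parser. Every name in it (classify, declare, push_scope, pop_scope, in_declarator, the TOK_ constants) is hypothetical and invented for this sketch; it is not taken from any real compiler, and it ignores the yacc look-ahead timing problem entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical token kinds and table size, chosen for this sketch only. */
enum token_kind { TOK_IDENTIFIER, TOK_TYPE_NAME };
enum { MAX_SYMS = 64 };

struct symbol {
    const char *name;
    bool is_typedef;   /* true if declared with typedef */
    int depth;         /* block nesting depth of the declaration */
};

static struct symbol syms[MAX_SYMS];
static int nsyms = 0;
static int depth = 0;

/* Enter a new block scope, e.g. on '{'. */
void push_scope(void)
{
    depth++;
}

/* Leave the current block scope, discarding its declarations. */
void pop_scope(void)
{
    assert(depth > 0);
    while (nsyms > 0 && syms[nsyms - 1].depth == depth)
        nsyms--;
    depth--;
}

/* Record a declaration in the current scope. */
void declare(const char *name, bool is_typedef)
{
    assert(nsyms < MAX_SYMS);
    syms[nsyms++] = (struct symbol){ name, is_typedef, depth };
}

/*
 * Called by the lexer for each identifier-shaped token.
 * in_declarator is the assumed "downward" flag: the parser sets it once
 * it has already seen a type specifier, so the name being declared is
 * never looked up as a typedef.
 */
enum token_kind classify(const char *name, bool in_declarator)
{
    if (in_declarator)
        return TOK_IDENTIFIER;
    /* Search innermost declarations first so inner scopes hide outer ones. */
    for (int i = nsyms - 1; i >= 0; i--)
        if (strcmp(syms[i].name, name) == 0)
            return syms[i].is_typedef ? TOK_TYPE_NAME : TOK_IDENTIFIER;
    return TOK_IDENTIFIER;
}
```

With this sketch, the second "x" in a nested redeclaration is classified as a plain identifier because the flag is set, and the outer typedef becomes visible again once the inner scope is popped.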

    (This is one of the many reasons I dislike typedefs in C, in general.
    Not only are they hard for humans to read correctly, they are even
    hard for the computer to read correctly -- although that may be
    mainly because of the yacc-and-lex-derived tools we keep using.)
    --
    In-Real-Life: Chris Torek, Wind River Systems (BSD engineering)
    Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
    email: forget about it http://67.40.109.61/torek/index.html (for the moment)
    Reading email is like searching for food in the garbage, thanks to spammers.
    Chris Torek, Jul 1, 2003
    #2

  3. Jun Woong (Guest)

    "Mauro Persano" <> wrote in message news:...
    > Jun,
    >
    > "Jun Woong" <> wrote in message news:<bdsfpm$lcc$>...
    > > "Mauro Persano" <> wrote in message news:...
    > > > In the following two lines:
    > > >
    > > > typedef int foo;
    > > > unsigned foo;

    > >
    > > Invalid because of redeclaration of an identifier without a linkage.

    >
    > Thanks for the comments. But maybe my question was ill-formulated.
    >
    > The fragment of the grammar I'm interested in is:
    >

    [...]
    >
    > Both interpretations should be equally valid, at least
    > from a syntactic standpoint. Or not?
    >


    Chris Torek already provided an excellent explanation. The syntax
    alone does not say everything needed to parse C programs; the syntax
    plus the semantics do. Allowing an ordinary identifier to appear as a
    type specifier (through the typedef mechanism) is one of the C features
    that, when abused, confuses people, including some implementers.
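To make the "syntax plus semantics" point concrete, here is a small sketch in which the same token shape, a name followed by * and another name, is a declaration in one function and a multiplication in the other. The function names and the typedef T are made up for this example. It also shows that after a type specifier such as int, a name that is a typedef in an outer scope can be redeclared as an ordinary variable.

```c
#include <assert.h>

typedef int T;   /* at file scope, T names a type */

int declaration_case(void)
{
    int n = 7;
    T *p = &n;   /* T is a typedef name: this declares p as pointer to int */
    assert(sizeof(T) == sizeof(int));
    return *p;
}

int expression_case(void)
{
    /* After the type specifier 'int', T is parsed as the identifier being
       declared; this inner T hides the file-scope typedef. */
    int T = 6, p = 7;
    int product;

    product = T * p;   /* T and p are variables: this is a multiplication */
    return product;
}
```

A parser cannot tell these two cases apart from the grammar alone; it has to know, at the point it reads T, what T currently denotes in the enclosing scope.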


    --
    Jun, Woong ()
    Dept. of Physics, Univ. of Seoul
    Jun Woong, Jul 2, 2003
    #3
