is there a command that can take a C source code as input and output a token tree

Discussion in 'C Programming' started by learner1020, Aug 31, 2010.

  1. learner1020

    learner1020 Guest

    I know gcc compiles by converting C source code into a token
    tree, but I don't know if there is a command option to make it output just
    the token tree (in, say, XML format).

    Thanks in advance.
     
    learner1020, Aug 31, 2010
    #1

  2. Nobody

    Nobody Guest

    Re: is there a command that can take a C source code as input and output a token tree

    On Tue, 31 Aug 2010 15:32:01 -0700, learner1020 wrote:

    > I know gcc does compiling by converting a C source code into a token
    > tree, but I don't if there is a command options to make it output just
    > token tree (in, say, xml format).


    Not in XML. You can use e.g. -fdump-tree-original-raw to get the parse
    tree as a list of nodes.
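
A small input file is enough to try the option mentioned above. The following is only a sketch: the file name `sample.c` and the function `square_plus_one` are made up for illustration, and the exact name of the dump file differs between GCC versions.

```c
/* sample.c: hypothetical input file for trying the dump option.
 *
 * Assumed invocation:
 *
 *     gcc -c -fdump-tree-original-raw sample.c
 *
 * GCC should then write a text dump whose name ends in ".original"
 * (the numeric part of the name varies by GCC version). In the raw
 * format each tree node is listed as a numbered entry such as @1, @2.
 */
int square_plus_one(int x)
{
    return x * x + 1;
}
```

Because `-c` is used, the file does not need a `main`; the dump covers whatever functions the translation unit defines.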
     
    Nobody, Sep 1, 2010
    #2

  3. Jorgen Grahn

    Jorgen Grahn Guest

    Re: is there a command that can take a C source code as input and output a token tree

    On Wed, 2010-09-01, Nobody wrote:
    > On Tue, 31 Aug 2010 15:32:01 -0700, learner1020 wrote:
    >
    >> I know gcc does compiling by converting a C source code into a token
    >> tree, but I don't if there is a command options to make it output just
    >> token tree (in, say, xml format).

    >
    > Not in XML. You can use e.g. -fdump-tree-original-raw to get the parse
    > tree as a list of nodes.


    And if I recall correctly there are people experimenting with the gcc
    source code in this area. People are interested in using gcc as a C++
    parser for use in static analysis, because it's so hard to write one
    from scratch. (This might not apply to the C compiler; I don't know
    much about this.)

    /Jorgen

    --
    // Jorgen Grahn <grahn@ Oo o. . .
    \X/ snipabacken.se> O o .
     
    Jorgen Grahn, Sep 1, 2010
    #3
  4. Gene

    Gene Guest

    Re: is there a command that can take a C source code as input and output a token tree

    On Aug 31, 6:32 pm, learner1020 <> wrote:
    > I know gcc does compiling by converting a C source code into a token
    > tree, but I don't if there is a command options to make it output just
    > token tree (in, say, xml format).
    >
    > Thanks in advance.


    If you are not tied to gcc, look at clang. I recall that one of the
    project's threads is to emit abstract syntax trees as XML for C,
    Objective-C, and C++. I don't know where that effort stands. Clang is a
    newer design that had the benefit of "going to school" on gcc and on lots
    of recent research and experience. The code looks much easier to get a
    handle on than gcc's.
     
    Gene, Sep 2, 2010
    #4
  5. BGB / cr88192

    Re: is there a command that can take a C source code as input and output a token tree

    "Jorgen Grahn" <> wrote in message
    news:...
    > On Wed, 2010-09-01, Nobody wrote:
    >> On Tue, 31 Aug 2010 15:32:01 -0700, learner1020 wrote:
    >>
    >>> I know gcc does compiling by converting a C source code into a token
    >>> tree, but I don't if there is a command options to make it output just
    >>> token tree (in, say, xml format).

    >>
    >> Not in XML. You can use e.g. -fdump-tree-original-raw to get the parse
    >> tree as a list of nodes.

    >
    > And if I recall correctly there are people experimenting with the gcc
    > source code in this area. People are interested in using gcc as a C++
    > parser for use in static analysis, because it's so hard to write one
    > from scratch. (This might not apply to the C compiler; I don't know
    > much about this.)
    >


    parsing C is not particularly difficult...

    a few kloc of code can do the trick, although it may be a little work to
    understand how to write it (it helps to first have experience with simpler
    languages, like Scheme and JavaScript, as each will give the experience and
    a foundation to build on).


    (the real evils are deeper in the compiler internals...).

    if my server were up right now (it is down recently because internet
    bandwidth here is too limited and others complain if I "waste" the bandwidth
    over something so trivial as having a webserver running...), I could post a
    link to my parser, which can parse C (and also Java and C#), and emits an
    XML-based AST (not a token-tree / CST though, if this is what the OP
    wanted).


    personally my bias is to avoid things like parser generators, as to me they
    seem like more of a trick to make people *think* they are making the task
    easier for themselves, but setting themselves up for much pain once they get
    past simple languages, and into languages with all sorts of bizarre stuff
    going on (such as tokens which may or may not exist or may be parsed
    differently depending on context, as may exist in languages such as C++ or
    C#, or syntax which is ambiguous apart from knowing prior declarations,
    such as in C and C++, ...).
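
The declaration-dependent ambiguity in C can be seen in a few lines. This is a minimal illustration with made-up names (`T`, `as_declaration`, `as_expression`): the same token sequence `T * x` is a pointer declaration in one scope and a multiplication in another, so a parser has to track which identifiers currently name types.

```c
#include <stddef.h>

typedef int T;   /* at file scope, T names a type */

/* Here "T * x" is a declaration: x is a pointer to int. */
size_t as_declaration(void)
{
    T * x = NULL;
    return sizeof x;        /* size of an int pointer */
}

/* Here T is an ordinary variable that shadows the typedef,
 * so "T * x" is a multiplication expression. */
int as_expression(void)
{
    int T = 6, x = 7;
    return T * x;
}
```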

    personally, I am a fan of hand-written recursive descent, as IME it seems to
    work fairly well, and I just haven't really run into problems where parser
    generators would seem to be the right tool for the job.
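
For readers who have not seen the technique, here is a rough sketch of hand-written recursive descent that builds an AST and prints it as XML. It is not the parser described in this post: it only handles integer arithmetic with `+ - * /` and parentheses, does no real error reporting, and every name in it (`Node`, `parse_to_xml`, the `<num>` and `<binary>` elements) is invented for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct Node {
    char op;                 /* '+', '-', '*', '/' for binary nodes, 'n' for a number */
    long value;              /* used when op == 'n' */
    struct Node *lhs, *rhs;  /* used for binary nodes */
} Node;

static const char *cur;      /* parse cursor into the input string */

static Node *new_node(char op, long value, Node *lhs, Node *rhs) {
    Node *n = malloc(sizeof *n);
    if (!n) abort();
    n->op = op; n->value = value; n->lhs = lhs; n->rhs = rhs;
    return n;
}

static void free_tree(Node *n) {
    if (!n) return;
    free_tree(n->lhs);
    free_tree(n->rhs);
    free(n);
}

static void skip_ws(void) { while (isspace((unsigned char)*cur)) cur++; }

static Node *parse_expr(void);

/* primary := NUMBER | '(' expr ')' */
static Node *parse_primary(void) {
    skip_ws();
    if (*cur == '(') {
        cur++;
        Node *e = parse_expr();
        skip_ws();
        if (*cur == ')') cur++;   /* a missing ')' is silently tolerated here */
        return e;
    }
    char *end;
    long v = strtol(cur, &end, 10);
    cur = end;
    return new_node('n', v, NULL, NULL);
}

/* term := primary (('*' | '/') primary)*   (left-associative) */
static Node *parse_term(void) {
    Node *lhs = parse_primary();
    for (;;) {
        skip_ws();
        if (*cur != '*' && *cur != '/') return lhs;
        char op = *cur++;
        lhs = new_node(op, 0, lhs, parse_primary());
    }
}

/* expr := term (('+' | '-') term)*   (left-associative) */
static Node *parse_expr(void) {
    Node *lhs = parse_term();
    for (;;) {
        skip_ws();
        if (*cur != '+' && *cur != '-') return lhs;
        char op = *cur++;
        lhs = new_node(op, 0, lhs, parse_term());
    }
}

/* Append the XML form of the tree to the NUL-terminated string in buf. */
static void append_xml(const Node *n, char *buf, size_t cap) {
    size_t used = strlen(buf);
    if (n->op == 'n') {
        snprintf(buf + used, cap - used, "<num value=\"%ld\"/>", n->value);
        return;
    }
    snprintf(buf + used, cap - used, "<binary op=\"%c\">", n->op);
    append_xml(n->lhs, buf, cap);
    append_xml(n->rhs, buf, cap);
    used = strlen(buf);
    snprintf(buf + used, cap - used, "</binary>");
}

/* Parse src and write its AST as XML into buf (truncated if buf is too small). */
const char *parse_to_xml(const char *src, char *buf, size_t cap) {
    assert(cap > 0);
    cur = src;
    Node *root = parse_expr();
    buf[0] = '\0';
    append_xml(root, buf, cap);
    free_tree(root);
    return buf;
}
```

Each grammar rule maps to one function, and operator precedence falls out of which function calls which. Calling `parse_to_xml("1 + 2 * 3", buf, sizeof buf)` with a `char buf[256]` nests the `*` node inside the `+` node.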

    a lexer may make sense to generate from a tool, although personally I don't
    really think this is necessary either.
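
As a rough idea of how little a hand-written lexer needs, the sketch below splits input into identifiers, decimal numbers, and single-character punctuation. It is a simplified assumption of what such a lexer might look like, not a C lexer: there are no keywords, string literals, comments, or multi-character operators, and the names `Token`, `next_token`, and `count_tokens` are made up.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

typedef enum { TOK_EOF, TOK_IDENT, TOK_NUMBER, TOK_PUNCT } TokKind;

typedef struct {
    TokKind kind;
    const char *start;  /* first character of the token in the input */
    size_t len;         /* token length in bytes */
} Token;

/* Scan one token starting at *pp and advance *pp past it. */
Token next_token(const char **pp) {
    assert(pp != NULL && *pp != NULL);
    const char *p = *pp;
    while (isspace((unsigned char)*p)) p++;

    Token t = { TOK_EOF, p, 0 };
    if (*p == '\0') {
        /* end of input: kind stays TOK_EOF */
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        t.kind = TOK_IDENT;
        while (isalnum((unsigned char)*p) || *p == '_') p++;
    } else if (isdigit((unsigned char)*p)) {
        t.kind = TOK_NUMBER;
        while (isdigit((unsigned char)*p)) p++;
    } else {
        t.kind = TOK_PUNCT;   /* any other character is a one-byte token */
        p++;
    }
    t.len = (size_t)(p - t.start);
    *pp = p;
    return t;
}

/* Count the tokens in a string, not including the final TOK_EOF. */
size_t count_tokens(const char *src) {
    size_t n = 0;
    while (next_token(&src).kind != TOK_EOF) n++;
    return n;
}
```

A parser would normally call `next_token` in a loop and switch on `kind`; `count_tokens` is only there to show the loop shape.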

    or such...
     
    BGB / cr88192, Sep 2, 2010
    #5