Parsing C Preprocessor files

Discussion in 'Python' started by Bram Stolk, Jun 23, 2004.

  1. Bram Stolk

    Bram Stolk Guest

    Hi there,

    What could I use to parse CPP macros in Python?
    I tried the Parnassus Vaults, and python lib docs, but could not
    find a suitable module.

    Thanks,

    Bram


    --
    ------------------------------------------------------------------------------
    Bram Stolk, VR Engineer.
    SARA Academic Computing Services Amsterdam, PO Box 94613, 1090 GP AMSTERDAM
    email: Phone +31-20-5923059 Fax +31-20-6683167

    "Software is math. Math is not patentable."
    OR
    "Software is literature. Literature is not patentable." -- slashdot comment
    ------------------------------------------------------------------------------
     
    Bram Stolk, Jun 23, 2004
    #1

  2. Peter Hansen

    Peter Hansen Guest

    Bram Stolk wrote:

    > What could I use to parse CPP macros in Python?
    > I tried the Parnassus Vaults, and python lib docs, but could not
    > find a suitable module.


    Does it really need to be in Python? There are probably
    dozens of free and adequate macro preprocessors out there
    already.

    (You might also want to clarify what you mean by "parse"
    in this case... do you mean actually running the whole
    preprocessor over an input file and expanding all macros,
    or do you mean something else?)

    -Peter
     
    Peter Hansen, Jun 23, 2004
    #2

  3. Bram Stolk

    Bram Stolk Guest

    On Wed, 23 Jun 2004 08:32:08 -0400
    Peter Hansen <> wrote:

    > Bram Stolk wrote:
    >
    > > What could I use to parse CPP macros in Python?
    > > I tried the Parnassus Vaults, and python lib docs, but could not
    > > find a suitable module.

    >
    > Does it really need to be in Python? There are probably
    > dozens of free and adequate macro preprocessors out there
    > already.


    I want to trigger Python actions for certain nodes or states in the
    parse tree. I want to traverse this tree, and be able to take
    intelligent actions. For this, I want to use Python.

    > (You might also want to clarify what you mean by "parse"
    > in this case... do you mean actually running the whole
    > preprocessor over an input file and expanding all macros,
    > or do you mean something else?)


    Roughly speaking, I want to be able to identify sections that are
    guarded with #ifdef FOO.
    Because conditionals can be nested, you would have to count the
    ifs/endifs, and additionally, the conditional values may depend on other
    preprocessor commands, e.g. values may have been defined in included
    files.

    If I can traverse the #if/#endif tree in Python, a preprocessor file
    becomes much more manageable.
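
    The "count the ifs/endifs" idea can be sketched with a small stack. This
    is only an illustration: the name guarded_sections and the regular
    expression are assumptions, not something from the thread, and the
    sketch ignores #else, #elif and included files.

```python
import re

# Matches the opening directives #if, #ifdef, #ifndef and the closing #endif.
# Whitespace before and after '#' is tolerated.
DIRECTIVE = re.compile(r"^\s*#\s*(ifdef|ifndef|if|endif)\b\s*(.*)$")


def guarded_sections(lines):
    """Return (condition, first_line, last_line, depth) for each #if block.

    Line numbers are 1-based and point at the directive lines themselves.
    Blocks are reported in the order in which their #endif is seen.
    """
    open_blocks = []  # stack of (condition text, line number of the opener)
    sections = []
    for number, line in enumerate(lines, start=1):
        match = DIRECTIVE.match(line)
        if not match:
            continue  # ordinary line or a directive this sketch ignores
        kind, rest = match.groups()
        if kind == "endif":
            if not open_blocks:
                raise ValueError("line %d: #endif without #if" % number)
            condition, start = open_blocks.pop()
            # Depth 1 means a top-level block.
            sections.append((condition, start, number, len(open_blocks) + 1))
        else:
            open_blocks.append(("#%s %s" % (kind, rest.strip()), number))
    if open_blocks:
        raise ValueError("unterminated %s opened at line %d" % open_blocks[-1])
    return sections


source = ["top", "#if foo", "a", "#ifdef bar", "b", "#endif", "#endif"]
for section in guarded_sections(source):
    print(section)
```

    A flat list of ranges like this is enough for locating guarded sections;
    a nested tree is more convenient once you want to act on the contents.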

    Bram

    > -Peter



     
    Bram Stolk, Jun 23, 2004
    #3
  4. Bram Stolk wrote:
    > Hi there,
    >
    > What could I use to parse CPP macros in Python?
    > I tried the Parnassus Vaults, and python lib docs, but could not
    > find a suitable module.


    I wrote a program called SeeGramWrap. It uses Java and ANTLR to parse
    C files. See

    http://members.tripod.com/~edcjones/SeeGramWrap.2004.03.03.tar.gz
     
    Edward C. Jones, Jun 23, 2004
    #4
  5. Paul McGuire

    Paul McGuire Guest

    "Bram Stolk" <> wrote in message
    news:...
    > Hi there,
    >
    > What could I use to parse CPP macros in Python?
    > I tried the Parnassus Vaults, and python lib docs, but could not
    > find a suitable module.
    >
    > Thanks,
    >
    > Bram
    >


    Try pyparsing, at http://pyparsing.sourceforge.net . The examples include a
    file scanExamples.py, that does some simple C macro parsing. This should be
    pretty straightforward to adapt to matching #ifdef's and #endif's.

    -- Paul
    (I'm sure pyparsing is listed in Vaults of Parnassus. Why did you think it
    would not be applicable?)
     
    Paul McGuire, Jun 23, 2004
    #5
  6. Bram Stolk

    Bram Stolk Guest

    On Wed, 23 Jun 2004 13:58:04 GMT
    "Paul McGuire" <._bogus_.com> wrote:

    > (I'm sure pyparsing is listed in Vaults of Parnassus. Why did you think it
    > would not be applicable?)


    Because I searched for "parser", "macro", "preprocessor", "cpp", and none
    of those searches comes up with "pyparsing". I should have searched for
    "parsing" I guess.

    Bram


     
    Bram Stolk, Jun 23, 2004
    #6
  7. Bram Stolk

    Bram Stolk Guest

    pyHi(),

    I would like to thank the people who responded to my question about
    preprocessor parsing. However, I think I will just roll my own, as I
    found out that it takes a mere 16 lines of code to create an #ifdef tree.

    I simply used a combination of lists and tuples. A tuple denotes a #if
    block (startline,body,endline). A body is a list of lines/tuples.

    This will parse the following text:

    Top level line
    #if foo
    on foo level
    #if bar
    on bar level
    #endif
    #endif
    #ifdef bla
    on bla level
    #ifdef q
    q
    #endif
    #if r
    r
    #endif
    #endif

    into:

    ['Top level line\n', ('#if foo\n', ['on foo level\n', ('#if bar\n', ['on bar level\n'], '#endif\n')], '#endif\n'), ('#ifdef bla\n', ['on bla level\n', ('#ifdef q\n', ['q\n'], '#endif\n'), ('#if r\n', ['r\n'], '#endif\n')], '#endif\n')]

    Which is very suitable for me.

    Code is:

    def parse_block(lines):
        # Consumes `lines` (a list) from the front and returns the nested tree.
        retval = []
        while lines:
            line = lines.pop(0)
            if line.find("#if") != -1:
                headline = line
                b = parse_block(lines)
                endline = lines.pop(0)  # the #endif the inner call pushed back
                retval.append((headline, b, endline))
            else:
                if line.find("#endif") != -1:
                    lines.insert(0, line)  # push back for the caller
                    return retval
                else:
                    retval.append(line)
        return retval

    And pretty printing with indentation is easy:

    def traverse_block(block, indent):
        # Lines keep their own "\n", so print() must not add another one.
        for i in block:
            if isinstance(i, tuple):
                print(indent * "\t" + i[0], end="")
                traverse_block(i[1], indent + 1)
                print(indent * "\t" + i[2], end="")
            else:
                print(indent * "\t" + i, end="")
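
    The same kind of traversal can also report which conditions guard each
    line. A minimal sketch, assuming the (startline, body, endline) tuples
    described above; the name guarded_lines is made up for this example.

```python
def guarded_lines(block, guards=()):
    """Yield (guards, line) for every plain line in the nested tree.

    `guards` is the tuple of enclosing #if/#ifdef headlines, outermost first.
    """
    for item in block:
        if isinstance(item, tuple):
            headline, body, _endline = item
            # Descend into the body with one more guard on the path.
            yield from guarded_lines(body, guards + (headline.strip(),))
        else:
            yield guards, item.strip()


tree = ['Top level line\n',
        ('#if foo\n',
         ['on foo level\n', ('#if bar\n', ['on bar level\n'], '#endif\n')],
         '#endif\n')]

for guards, line in guarded_lines(tree):
    print(" && ".join(guards) or "(unguarded)", "->", line)
```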

    I think extending it with '#else' is trivial. Handling includes and
    expressions is much harder of course, but not immediately required for me.
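
    One possible way to add '#else', as a rough sketch only. The
    five-element tuple (headline, body, elseline, else_body, endline) and
    the names keyword_of and parse_block_with_else are assumptions made for
    this example. Unlike the find()-based code above, it compares the first
    word of each line, so the directive has to start the line.

```python
def keyword_of(line):
    """Return the first whitespace-delimited word of `line`, or ''."""
    words = line.split(None, 1)
    return words[0] if words else ""


def parse_block_with_else(lines):
    """Parse `lines` (a list, consumed from the front) into a nested tree.

    Plain lines stay strings. A conditional becomes (headline, body, endline),
    or (headline, body, elseline, else_body, endline) when #else is present.
    """
    retval = []
    while lines:
        keyword = keyword_of(lines[0])
        if keyword in ("#else", "#endif"):
            return retval  # leave the terminator in place for the caller
        line = lines.pop(0)
        if keyword in ("#if", "#ifdef", "#ifndef"):
            node = [line, parse_block_with_else(lines)]
            if lines and keyword_of(lines[0]) == "#else":
                # Take the #else line first, then parse its body.
                node += [lines.pop(0), parse_block_with_else(lines)]
            node.append(lines.pop(0))  # the matching #endif
            retval.append(tuple(node))
        else:
            retval.append(line)
    return retval


source = ["#if foo\n", "a\n", "#else\n", "b\n", "#endif\n", "c\n"]
print(parse_block_with_else(list(source)))
```

    Unbalanced input (a missing #endif) raises IndexError here; real input
    would need friendlier error handling.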

    Bram

    On Wed, 23 Jun 2004 14:01:51 +0200
    Bram Stolk <> wrote:

    > Hi there,
    >
    > What could I use to parse CPP macros in Python?
    > I tried the Parnassus Vaults, and python lib docs, but could not
    > find a suitable module.
    >


     
    Bram Stolk, Jun 23, 2004
    #7
  8. Nice and simple algorithm, but you should use an iterator to iterate over
    your lines. Otherwise every lines.pop(0) shifts the whole list, which is
    going to be very slow on a big file.

    Instead of :

    > line = lines.pop(0)


    Try :

    lines = iter(list_of_lines)

    Or just pass the file handle; Python will split the lines for you.

    You can replace your "while lines" with a "for" on this iterator. You'll
    need to avoid pushing data back into the list (think about it)...

    Also, '"#if" in line' is prettier than 'line.find("#if") != -1'.
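
    To make the iterator suggestion concrete, here is a minimal sketch. An
    iterator cannot take a line back, so instead of pushing the #endif back
    the recursive call returns it together with the body. The name
    parse_iter and the (items, endline) return shape are assumptions for
    this example, not code from the thread.

```python
def parse_iter(lines):
    """Parse from any iterable of lines; return (items, endline).

    `endline` is the #endif that closed this block, or None at end of input.
    """
    lines = iter(lines)  # a no-op for iterators, so recursion shares one
    items = []
    for line in lines:
        if "#endif" in line:
            return items, line  # hand the terminator back to the caller
        if "#if" in line:
            # The recursive call keeps consuming from the same iterator.
            body, endline = parse_iter(lines)
            items.append((line, body, endline))
        else:
            items.append(line)
    return items, None


text = "top\n#if foo\ninside\n#endif\nbottom\n"
tree, _ = parse_iter(text.splitlines(True))
print(tree)
```

    Passing an open file object instead of the list works the same way,
    since a file iterates over its lines.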

    Another way to do it is without recursion : have an array which is your
    stack, advance one level when you get a #if, go back one level at #endif ;
    no more recursion.


    Have fun !
     
    Pierre-Frédéric Caillaud, Jun 23, 2004
    #8
  9. I thought about it and...

    Here's a stackless version with #include and #if. 20 minutes in the
    making...
    You'll need a pen and paper to figure how the stack works though :) but
    it's fun.
    It uses references...


    file1 = """Top level line
    #if foo
    on foo level
    #if bar
    on bar level
    #endif
    re foo level
    #include file2
    #else
    not foo
    #endif
    top level
    #ifdef bla
    on bla level
    #ifdef q
    q
    #else
    not q
    #endif
    check
    #if r
    r
    #endif
    #endif"""

    file2 = """included file:
    #ifdef stuff
    stuff level
    #endif
    """

    # simple class to process included files
    class myreader(object):
        def __init__(self):
            self.queue = []  # stack of iterators; the last one is read first

        def __iter__(self):
            return self

        # insert an iterable into the current flow
        def insert(self, iterator):
            self.queue.append(iterator)

        def __next__(self):
            while self.queue:
                try:
                    return next(self.queue[-1])
                except StopIteration:
                    self.queue.pop()  # this iterable is finished, throw it away
            raise StopIteration

    reader = myreader()
    reader.insert(iter(file1.split("\n")))

    # stackless parser: an explicit stack replaces the recursion
    result = []
    stack = [result]
    stacktop = stack[-1]

    for line in reader:
        ls = line.strip()
        if ls.startswith("#"):  # factor all # cases for speed
            keyword = ls.split(None, 1)[0]  # first word, split on any whitespace
            if keyword in ("#if", "#ifdef", "#ifndef"):
                body = []
                stacktop.append([line, body])
                stack.append(body)
                stacktop = body
            elif keyword == "#else":
                stack.pop()
                stack[-1][-1].append(line)
                body = []
                stack[-1][-1].append(body)
                stack.append(body)
                stacktop = body
            elif keyword == "#endif":
                stack.pop()
                stack[-1][-1] = tuple(stack[-1][-1] + [line])
                stacktop = stack[-1]  # continue in the enclosing block
            elif keyword == "#include":
                # I don't parse the filename... replace the iter() below by
                # something like open(filename)
                reader.insert(iter(file2.split("\n")))
        else:
            stacktop.append(line)

    def printblock(block, indent=0):
        ind = "\t" * indent
        for elem in block:
            if type(elem) == list:
                printblock(elem, indent + 1)
            elif type(elem) == tuple:
                printblock(elem, indent)
            else:
                print(ind, elem)

    print(result)
    printblock(result)
     
    Pierre-Frédéric Caillaud, Jun 24, 2004
    #9
