Problem of finding function names in any C file

Discussion in 'C Programming' started by athiane, Mar 9, 2006.

  1. athiane

    athiane Guest

    I want a way to parse out all function names that appear in a couple of
    C files.
    When the parsing logic finds a function name in a file, it should print
    out the Function name, line number and file in which the Function was
    found.

    What approach should I follow to tackle this problem?
    Is there any option in the gcc compiler that prints out this
    information ?
     
    athiane, Mar 9, 2006
    #1

  2. athiane

    pemo Guest

    athiane wrote:
    > I want a way to parse out all function names that appear in a couple
    > of C files.
    > When the parsing logic finds a function name in a file, it should
    > print out the Function name, line number and file in which the
    > Function was found.
    >
    > What approach should i follow to tackle this problem ?
    > Is there any option in the gcc compiler that prints out this
    > information ?


    Don't know what you mean by 'parse out' - display in a console, provide
    output/code for 'homework'?


    --
    ==============
    *Not a pedant*
    ==============
     
    pemo, Mar 9, 2006
    #2

  3. athiane

    Micah Cowan Guest

    "pemo" <> writes:

    > athiane wrote:
    > > I want a way to parse out all function names that appear in a couple
    > > of C files.
    > > When the parsing logic finds a function name in a file, it should
    > > print out the Function name, line number and file in which the
    > > Function was found.
    > >
    > > What approach should i follow to tackle this problem ?
    > > Is there any option in the gcc compiler that prints out this
    > > information ?

    >
    > Don't know what you mean by 'parse out' - display in a console, provide
    > output/code for 'homework'?


    'parse out' has a pretty clear meaning, IMHO. It means to recognize it
    apart from everything else in the file in question.

    To answer the OP's question: this is not a trivial problem. In order
    to do what you want, you have to write a program that understands C:
    you'll need a preprocessor, a lexical scanner, and a grammar
    parser. While tools like lex and yacc (flex/bison) make this easier,
    it's overkill.

    Is this a homework assignment? If so, please provide more context: the
    problem as stated is far too difficult to be mere homework. Does the
    source file have to conform to certain additional restrictions?

    If this is solely for your own benefit, I strongly suggest you look
    for programs that already do this (why reinvent the wheel?). The
    standard "ctags" program can do something awfully close to, if not
    exactly what you need. Check out http://ctags.sourceforge.net/. You
    should be able to use grep, perhaps in combination with sed or awk, to
    make it do /exactly/ what you want. For more information on any of
    those tools (all of which are off-topic for this newsgroup), please
    ask at comp.unix.programmer.

    HTH.
     
    Micah Cowan, Mar 9, 2006
    #3
  4. On 2006-03-09, Micah Cowan <> wrote:
    >> athiane wrote:
    >> > I want a way to parse out all function names that appear in a couple
    >> > of C files.
    >> > When the parsing logic finds a function name in a file, it should
    >> > print out the Function name, line number and file in which the
    >> > Function was found.
    >> >
    >> > What approach should i follow to tackle this problem ?
    >> > Is there any option in the gcc compiler that prints out this
    >> > information ?


    > To answer the OP's question: this is not a trivial problem. In order
    > to do what you want, you have to write a program that understands C:
    > you'll need a preprocessor, a lexical scanner, and a grammar
    > parser. While tools like lex and yacc (flex/bison) make this easier,
    > it's overkill.


    I might be missing something, since I haven't given much thought to the
    matter admittedly, but I believe that detecting the functions should be
    quite easy. I remember I did something similar for C++ and it didn't
    work because my simple algorithm was also matching constructors. (so I
    then did a preprocessing stage to collect everything followed by the
    class keyword in a file, and exclude those, but that's way OT now).

    So my idea would be to first pass the source from a pre-processor to get
    rid of macros, and then match every valid symbol (sequence of characters
    and numbers starting from a character), followed by a '('.

    Am I missing something obvious that would ruin this?


    --
    John Tsiombikas (Nuclear / Mindlapse)

    http://nuclear.demoscene.gr/
     
    John Tsiombikas (Nuclear / Mindlapse), Mar 9, 2006
    #4
  5. "John Tsiombikas (Nuclear / Mindlapse)" <> writes:
    > On 2006-03-09, Micah Cowan <> wrote:
    >>> athiane wrote:
    >>> > I want a way to parse out all function names that appear in a couple
    >>> > of C files.
    >>> > When the parsing logic finds a function name in a file, it should
    >>> > print out the Function name, line number and file in which the
    >>> > Function was found.
    >>> >
    >>> > What approach should i follow to tackle this problem ?
    >>> > Is there any option in the gcc compiler that prints out this
    >>> > information ?

    >
    >> To answer the OP's question: this is not a trivial problem. In order
    >> to do what you want, you have to write a program that understands C:
    >> you'll need a preprocessor, a lexical scanner, and a grammar
    >> parser. While tools like lex and yacc (flex/bison) make this easier,
    >> it's overkill.

    >
    > I might be missing something, since I haven't given much thought to the
    > matter admittedly, but I believe that detecting the functions should be
    > quite easy. I remember I did something similar for C++ and it didn't
    > work because my simple algorithm was also matching constructors. (so I
    > then did a preprocessing stage to collect everything followed by the
    > class keyword in a file, and exclude those, but that's way OT now).
    >
    > So my idea would be to first pass the source from a pre-processor to get
    > rid of macros, and then match every valid symbol (sequence of characters
    > and numbers starting from a character), followed by a '('.
    >
    > Am I missing something obvious that would ruin this?


    I think that by "valid symbol (sequence of characters and numbers
    starting from a character)", you really mean (or *should* mean) "valid
    identifier (sequence of letters, digits, and underscores starting with
    a letter or underscore)".

    According to your description, you'd find all function *calls* as well
    as function definitions and declarations; it's hard to tell whether
    that's consistent with the requirements. You'd also find things like
    sizeof(int), unless you filter out keywords.

    You'll also miss some legal occurrences of function names like:
    (printf)("Hello, world\n");
    and catch some occurrences of names of function pointer objects.
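
As a rough sketch of the identifier rule and the keyword filter mentioned above, something like the following C could be used. The function names are invented for this example, and the keyword list is the C99 set. It only classifies a single token: it does nothing about the `(printf)(...)` form or about function pointer objects, which need more than token-level checks.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if s is a letter or underscore followed by letters, digits
   and underscores (basic source characters only), else 0. */
int is_identifier(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    assert(s != NULL);
    if (!(isalpha(*p) || *p == '_'))
        return 0;
    for (p++; *p != '\0'; p++)
        if (!(isalnum(*p) || *p == '_'))
            return 0;
    return 1;
}

/* Returns 1 if s is one of the C99 keywords, else 0. */
int is_c99_keyword(const char *s)
{
    static const char *const keywords[] = {
        "auto", "break", "case", "char", "const", "continue", "default",
        "do", "double", "else", "enum", "extern", "float", "for", "goto",
        "if", "inline", "int", "long", "register", "restrict", "return",
        "short", "signed", "sizeof", "static", "struct", "switch",
        "typedef", "union", "unsigned", "void", "volatile", "while",
        "_Bool", "_Complex", "_Imaginary"
    };
    size_t i;

    for (i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strcmp(s, keywords[i]) == 0)
            return 1;
    return 0;
}

/* A token followed by '(' is only worth reporting if it passes both tests. */
int is_function_name_candidate(const char *s)
{
    return is_identifier(s) && !is_c99_keyword(s);
}
```

With this filter, `sizeof` in `sizeof(int)` is rejected while `printf` is kept.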

    It's probably possible to find all function names using some simplified
    grammar. The problem is simpler if you assume the source is correct
    and don't care about catching syntax errors.

    On the other hand, since full C parsers already exist in the wild,
    adapting one to give you the information you want is probably easier
    than writing a simpler parser that does *only* what you want.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    We must do something. This is something. Therefore, we must do this.
     
    Keith Thompson, Mar 10, 2006
    #5
  6. athiane

    Jaspreet Guest

    athiane wrote:
    > I want a way to parse out all function names that appear in a couple of
    > C files.
    > When the parsing logic finds a function name in a file, it should print
    > out the Function name, line number and file in which the Function was
    > found.
    >
    > What approach should i follow to tackle this problem ?
    > Is there any option in the gcc compiler that prints out this
    > information ?


    Why reinvent the wheel when you have so many wheel manufacturers
    around. You could use any of the freely available parsers (like ctags).
    I have been using cscope though.

    I don't know much of ctags but cscope does exactly that (print function
    name, line number, file name) when it finds a function calling instance
    in a file.

    Would they not serve you or do you have some specific requirements ?
     
    Jaspreet, Mar 10, 2006
    #6
  7. On 2006-03-10, Keith Thompson <> wrote:
    > I think that by "valid symbol (sequence of characters and numbers
    > starting from a character)", you really mean (or *should* mean) "valid
    > identifier (sequence of letters, digits, and underscores starting with
    > a letter or underscore)".


    Yes, that's what I meant, forgot to include the _ and used the wrong
    term.

    > According to your description, you'd find all function *calls* as well
    > as function definitions and declarations; it's hard to tell whether
    > that's consistent with the requirements. You'd also find things like
    > sizeof(int), unless you filter out keywords.
    >
    > You'll also miss some legal occurrences of function names like:
    > (printf)("Hello, world\n");
    > and catch some occurrences of names of function pointer objects.


    True, you are absolutely right, I forgot about sizeof(int) and calls
    through function pointers, and the (printf)("Hello, world\n"); wouldn't
    even cross my mind, I had to look twice to realize you call a function
    there, most unusual but valid :)

    So it is indeed more involved than I thought initially.

    --
    John Tsiombikas (Nuclear / Mindlapse)

    http://nuclear.demoscene.gr/
     
    John Tsiombikas (Nuclear / Mindlapse), Mar 10, 2006
    #7
  8. athiane

    CBFalconer Guest

    "John Tsiombikas (Nuclear / Mindlapse)" wrote:
    > On 2006-03-09, Micah Cowan <> wrote:
    >> athiane wrote:

    >
    >>> I want a way to parse out all function names that appear in a
    >>> couple of C files. When the parsing logic finds a function
    >>> name in a file, it should print out the Function name, line
    >>> number and file in which the Function was found.
    >>>
    >>> What approach should i follow to tackle this problem ?
    >>> Is there any option in the gcc compiler that prints out this
    >>> information ?

    >
    >> To answer the OP's question: this is not a trivial problem. In
    >> order to do what you want, you have to write a program that
    >> understands C: you'll need a preprocessor, a lexical scanner,
    >> and a grammar parser. While tools like lex and yacc (flex/bison)
    >> make this easier, it's overkill.

    >
    > I might be missing something, since I haven't given much thought
    > to the matter admittedly, but I believe that detecting the
    > functions should be quite easy. I remember I did something
    > similar for C++ and it didn't work because my simple algorithm
    > was also matching constructors. (so I then did a preprocessing
    > stage to collect everything followed by the class keyword in a
    > file, and exclude those, but that's way OT now).
    >
    > So my idea would be to first pass the source from a pre-processor
    > to get rid of macros, and then match every valid symbol (sequence
    > of characters and numbers starting from a character), followed by
    > a '('.
    >
    > Am I missing something obvious that would ruin this?


    Don't bother with the macro sweep. Then you will be including
    functional macros. My xrefc program does this, and tags all
    functional refs with a terminal () in the symbol table. That
    distinguishes them from pointer references, but the results are
    adjacent in the output. If, in addition, you organize the code
    properly (definition before use) the first occurrence of the
    function name is that definition. Handy. You don't bother
    following include files.

    --
    "If you want to post a followup via groups.google.com, don't use
    the broken "Reply" link at the bottom of the article. Click on
    "show options" at the top of the article, then click on the
    "Reply" at the bottom of the article headers." - Keith Thompson
    More details at: <http://cfaj.freeshell.org/google/>
    Also see <http://www.safalra.com/special/googlegroupsreply/>
     
    CBFalconer, Mar 10, 2006
    #8
  9. athiane

    Netocrat Guest

    On Fri, 10 Mar 2006 00:45:12 +0000, Keith Thompson wrote:
    > "John Tsiombikas (Nuclear / Mindlapse)" <> writes:

    [re a simple parser to extract a list of function names from C source]
    >> So my idea would be to first pass the source from a pre-processor to get
    >> rid of macros, and then match every valid symbol (sequence of characters
    >> and numbers starting from a character), followed by a '('.
    >>
    >> Am I missing something obvious that would ruin this?

    >
    > I think that by "valid symbol (sequence of characters and numbers
    > starting from a character)", you really mean (or *should* mean) "valid
    > identifier (sequence of letters, digits, and underscores starting with
    > a letter or underscore)".
    >
    > According to your description, you'd find all function *calls* as well
    > as function definitions and declarations; it's hard to tell whether
    > that's consistent with the requirements. You'd also find things like
    > sizeof(int), unless you filter out keywords.
    >
    > You'll also miss some legal occurrences of function names like:
    > (printf)("Hello, world\n");
    > and catch some occurrences of names of function pointer objects.


    You could also get false positives from string literals:
    char help_msg[] = "Declare the prototype as int somefunc(int);";

    --
    http://members.dodo.com.au/~netocrat
     
    Netocrat, Mar 10, 2006
    #9
  10. athiane

    CBFalconer Guest

    Netocrat wrote:
    > Keith Thompson wrote:
    >

    .... snip on parsing function names ...
    >>
    >> You'll also miss some legal occurrences of function names like:
    >> (printf)("Hello, world\n");
    >> and catch some occurrences of names of function pointer objects.

    >
    > You could also get false positives from string literals:
    > char help_msg[] = "Declare the prototype as int somefunc(int);";


    Not if you make the lexical scanner handle complete strings in the
    first place. Pseudocode:

    while (EOF != (ch = getnextch())) {
        switch (chclass(ch)) {
        case alpha:
        case '_': acquireid(); break;
        case '"': acquirestring(); break;
        case ' ': break;
        .....
        }
    }
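
One way the pseudocode above might look as compilable C, as a sketch only: `skip_quoted` and `skip_identifier` stand in for the `acquirestring()` and `acquireid()` steps, and all three function names are assumptions made for this example. It counts identifiers followed by '(' while stepping over string and character literals as whole tokens. Comments are not handled here.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* p points at an opening ' or ". Returns a pointer just past the matching
   closing quote, or to the terminating NUL if the literal is unterminated. */
static const char *skip_quoted(const char *p)
{
    char quote = *p++;

    while (*p != '\0' && *p != quote) {
        if (*p == '\\' && p[1] != '\0')
            p++;                    /* step over an escaped character */
        p++;
    }
    return (*p == quote) ? p + 1 : p;
}

/* Advances past letters, digits and underscores. */
static const char *skip_identifier(const char *p)
{
    while (isalnum((unsigned char)*p) || *p == '_')
        p++;
    return p;
}

/* Counts identifiers followed by '(' that lie outside string and
   character literals. */
size_t count_call_like_outside_literals(const char *src)
{
    const char *p = src;
    size_t count = 0;

    assert(src != NULL);
    while (*p != '\0') {
        unsigned char ch = (unsigned char)*p;

        if (isalpha(ch) || ch == '_') {
            p = skip_identifier(p);
            while (isspace((unsigned char)*p))
                p++;
            if (*p == '(')
                count++;
        } else if (ch == '"' || ch == '\'') {
            p = skip_quoted(p);     /* whole literal consumed as one token */
        } else if (isdigit(ch)) {
            p = skip_identifier(p); /* numeric token such as 0x1F */
        } else {
            p++;
        }
    }
    return count;
}
```

Given the `help_msg` line quoted above, the count is zero, because `somefunc(` is consumed as part of the string literal.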


    --
    "If you want to post a followup via groups.google.com, don't use
    the broken "Reply" link at the bottom of the article. Click on
    "show options" at the top of the article, then click on the
    "Reply" at the bottom of the article headers." - Keith Thompson
    More details at: <http://cfaj.freeshell.org/google/>
    Also see <http://www.safalra.com/special/googlegroupsreply/>
     
    CBFalconer, Mar 10, 2006
    #10
  11. athiane

    Guest

    I want to be able to find function names from any C file and use that
    information to form a hierarchical function call tree, with details on
    which line number, file name the function is in. I would like it to
    print this function call tree to the console prompt or a file.
    Eg;
    C file, test.c:

    1  int func1()
    2  {
    3      func3();
    4  }
    5
    6  int func2()
    7  {
    8      func4();
    9      func5();
    10 }
    11
    12
    13 int main()
    14 {
    15
    16     func1();
    17     func2();
    18
    19     return 0;
    20 }

    The function call tree after processing above C file should be :

    +main() (line=13, file=test.c)
    |_____func1(line=16, file=test.c)
    |     |_____func3(line=3, file=test.c)
    |
    |_____func2(line=17, file=test.c)
          |_____func4(line=8, file=test.c)
          |_____func5(line=9, file=test.c)
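
To make the desired output concrete, here is a small sketch in C of the printing half of the problem only. It assumes the call sites have already been extracted by some other tool and hard-codes the ones from test.c above; the struct and function names are invented for this example, and a recursive call graph would make it loop forever.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TREE_CAP 512                /* illustrative output limit */

struct call { const char *caller; const char *callee; int line; };
struct tree_buf { char text[TREE_CAP]; size_t used; };

/* Call sites copied by hand from the test.c listing above. */
static const struct call calls[] = {
    {"func1", "func3", 3},
    {"func2", "func4", 8},
    {"func2", "func5", 9},
    {"main",  "func1", 16},
    {"main",  "func2", 17},
};

/* Appends one "|_____name(line=..., file=...)" line, indented by depth. */
static void add_line(struct tree_buf *b, int depth, const char *callee, int line)
{
    int n = snprintf(b->text + b->used, TREE_CAP - b->used,
                     "%*s|_____%s(line=%d, file=test.c)\n",
                     depth * 6, "", callee, line);

    if (n > 0 && (size_t)n < TREE_CAP - b->used)
        b->used += (size_t)n;       /* keep the line only if it fit */
    else
        b->text[b->used] = '\0';    /* drop a truncated line */
}

/* Depth-first walk: every call made by caller, then that callee's calls. */
static void add_callees(struct tree_buf *b, const char *caller, int depth)
{
    size_t i;

    for (i = 0; i < sizeof calls / sizeof calls[0]; i++) {
        if (strcmp(calls[i].caller, caller) == 0) {
            add_line(b, depth, calls[i].callee, calls[i].line);
            add_callees(b, calls[i].callee, depth + 1);
        }
    }
}

/* Fills b with the call tree rooted at root, defined on root_line. */
void build_call_tree(struct tree_buf *b, const char *root, int root_line)
{
    int n;

    assert(b != NULL && root != NULL);
    b->used = 0;
    b->text[0] = '\0';
    n = snprintf(b->text, TREE_CAP, "+%s() (line=%d, file=test.c)\n",
                 root, root_line);
    if (n > 0 && n < TREE_CAP)
        b->used = (size_t)n;
    add_callees(b, root, 0);
}
```

Calling `build_call_tree(&buf, "main", 13)` and printing `buf.text` gives the same nesting as the tree above (func3 under func1, func4 and func5 under func2), without the vertical connector bars.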

    If some tool already provides such a listing, it would fit my need.
    Any shortcuts to tackle this problem are welcome.
     
    , Mar 10, 2006
    #11
  12. <> wrote in message
    news:...
    > "Jaspreet" <> wrote in message

    news:...
    > > Why reinvent the wheel when you have so many wheel manufacturers
    > > around. You could use any of the freely available parsers (like ctags).
    > > I have been using cscope though.

    >
    > > I dont know much of ctags but cscope does exactly that (print function
    > > name, line number, file name) when it finds a function calling instance
    > > in a file.

    >
    > I want to be able to find function names from any C file and use that
    > information to form a hierarchical function call tree, with details on
    > which line number, file name the function is in. I would like it to
    > print this function call tree to the console prompt or a file.


    As Jaspreet pointed out, there are a large number of parsers available.
    Some of the specialized ones are check, cproto, cdecl, ctool, cxref, etc. I
    think cxref is one of the ones you want. I recall there being another
    program on DECUS which would build a complete searchable database from your
    source, but I don't know what it was called. Most of these are available
    from comp.sources.unix or DECUS.

    comp.sources.unix, cxref is in volume1:
    http://ftp.sunet.se/pub/usenet/ftp.uu.net/comp.sources.unix/

    DECUS:
    Index (files unavailable) http://www.decus.org/encompass/software/
    Files ftp://ftp.encompassus.org/lib/


    Rod Pemberton
     
    Rod Pemberton, Mar 10, 2006
    #12
  13. athiane

    Guest

    Thank you all very much for the wealth of information!
     
    , Mar 11, 2006
    #13
  14. athiane

    Netocrat Guest

    On Fri, 10 Mar 2006 11:33:05 -0500, CBFalconer wrote:
    > Netocrat wrote:
    >> Keith Thompson wrote:
    >>

    > ... snip on parsing function names ...
    >>>
    >>> You'll also miss some legal occurrences of function names like:
    >>> (printf)("Hello, world\n");
    >>> and catch some occurrences of names of function pointer objects.

    >>
    >> You could also get false positives from string literals:
    >> char help_msg[] = "Declare the prototype as int somefunc(int);";

    >
    > Not if you make the lexical scanner handle complete strings in the first
    > place. Pseudocode:
    >
    > while (EOF != (ch = getnextch())) {
    >     switch (chclass(ch)) {
    >     case alpha:
    >     case '_': acquireid(); break;
    >     case '"': acquirestring(); break;
    >     case ' ': break;
    >     ....
    >     }
    > }


    Sure, there are ways to handle it, but as the suggestion stood there were
    a few holes. You'd also need to make sure that the scanner didn't count
    double-quotes within comments.
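
A sketch of one way to deal with that, in C: blank out comments in a separate pass before looking for names, while treating string and character literals as opaque so that a `/*` inside a string is left alone and a double-quote inside a comment is never taken as the start of a string. The function name `blank_comments` is invented for this example. Newlines are kept so that line numbers stay valid; trigraphs and backslash line continuation are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Overwrites every comment in buf with spaces, in place. Newlines inside
   comments are kept, so the length and line numbering do not change. */
void blank_comments(char *buf)
{
    char *p = buf;

    assert(buf != NULL);
    while (*p != '\0') {
        if (*p == '"' || *p == '\'') {
            /* string or character literal: skip it untouched */
            char quote = *p++;

            while (*p != '\0' && *p != quote) {
                if (*p == '\\' && p[1] != '\0')
                    p++;            /* step over an escaped character */
                p++;
            }
            if (*p == quote)
                p++;
        } else if (p[0] == '/' && p[1] == '*') {
            /* block comment: blank up to and including the closing marker */
            p[0] = p[1] = ' ';
            p += 2;
            while (*p != '\0' && !(p[0] == '*' && p[1] == '/')) {
                if (*p != '\n')
                    *p = ' ';
                p++;
            }
            if (*p != '\0') {
                p[0] = p[1] = ' ';
                p += 2;
            }
        } else if (p[0] == '/' && p[1] == '/') {
            /* C99 line comment: blank up to, not including, the newline */
            size_t n = strcspn(p, "\n");

            memset(p, ' ', n);
            p += n;
        } else {
            p++;
        }
    }
}
```

After this pass, a scanner that only knows about identifiers and string literals no longer sees anything that was inside a comment.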

    --
    http://members.dodo.com.au/~netocrat
     
    Netocrat, Mar 11, 2006
    #14