Removing dead code and unused functions

Discussion in 'C Programming' started by Geronimo W. Christ Esq, Jun 19, 2005.

  1. Are there any scripts or tools out there that could look recursively
    through a group of C/C++ source files, and allow unreferenced function
    calls or values to be easily identified ?

    LXR is handy for indexing source code, and for a given function or
    global variable it can show you all the places where it is referenced.
    It would be really nice to have a tool that would simply list all of the
    unreferenced functions, so that you could go through and remove them.
     
    Geronimo W. Christ Esq, Jun 19, 2005
    #1

  2. Greg (Guest)

    Geronimo W. Christ Esq wrote:
    > Are there any scripts or tools out there that could look recursively
    > through a group of C/C++ source files, and allow unreferenced function
    > calls or values to be easily identified ?
    >
    > LXR is handy for indexing source code, and for a given function or
    > global variable it can show you all the places where it is referenced.
    > It would be really nice to have a tool that would simply list all of the
    > referenced functions, so that you could go through and remove them.


    There is in fact such a tool, it's commonly called a "linker." And the
    list of unreferenced code and data that it strips from a build is
    usually cataloged in a file it can be directed to create. This file is
    commonly called a "link map."

    Greg
     
    Greg, Jun 19, 2005
    #2

  3. Greg wrote:

    >>LXR is handy for indexing source code, and for a given function or
    >>global variable it can show you all the places where it is referenced.
    >>It would be really nice to have a tool that would simply list all of the
    >>referenced functions, so that you could go through and remove them.

    >
    > There is in fact such a tool, it's commonly called a "linker." And the
    > list of unreferenced code and data that it strips from a build is
    > usually cataloged in a file it can be directed to create. This file is
    > commonly called a "link map."


    Got a link ? The GNU linker at least only puts symbols that are included
    into the link map. No mention of it cataloging symbols it excludes.
     
    Geronimo W. Christ Esq, Jun 19, 2005
    #3
  4. On 19/06/2005 17:49, in ,
    « Geronimo W. Christ Esq » <> wrote:

    > Greg wrote:
    >
    >>> LXR is handy for indexing source code, and for a given function or
    >>> global variable it can show you all the places where it is referenced.
    >>> It would be really nice to have a tool that would simply list all of the
    >>> referenced functions, so that you could go through and remove them.

    >>
    >> There is in fact such a tool, it's commonly called a "linker." And the
    >> list of unreferenced code and data that it strips from a build is
    >> usually cataloged in a file it can be directed to create. This file is
    >> commonly called a "link map."

    >
    > Got a link ? The GNU linker at least only puts symbols that are included
    > into the link map. No mention of it cataloging symbols it excludes.


    I'm not sure but "nm" could be useful here.
     
    Jean-Claude Arbaut, Jun 19, 2005
    #4
  5. In article <BEDB6241.5C02%>,
    Jean-Claude Arbaut <> wrote:

    >> Got a link ? The GNU linker at least only puts symbols that are included
    >> into the link map. No mention of it cataloging symbols it excludes.


    >I'm not sure but "nm" could be useful here.


    Linkers typically do not exclude functions in the user program that are
    unused. They only do that with libraries.

    More useful would be one of the many tools that generate call graphs.

    -- Richard
     
    Richard Tobin, Jun 19, 2005
    #5
  6. Jean-Claude Arbaut wrote:

    >>Got a link ? The GNU linker at least only puts symbols that are included
    >>into the link map. No mention of it cataloging symbols it excludes.

    >
    > I'm not sure but "nm" could be useful here.


    This problem can't be appropriately solved with a linker, particularly
    not the GNU linker. GNU ld can only throw out sections, not unused
    functions or global variables; so if you've got a file containing 10
    functions, 9 of which are unused, all ten will still get linked.

    Parsing the source code is the answer, it's just surprising no-one seems
    to have done this yet.
     
    Geronimo W. Christ Esq, Jun 19, 2005
    #6
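To make the section-granularity point in post #6 concrete, here is a minimal sketch with invented names (`util.c`, `used_helper`, `unused_helper` are assumptions for illustration only). Compiled in the usual way, both functions typically land in the single `.text` section of `util.o`, so the linker keeps or drops them together. With GCC's `-ffunction-sections` each function gets its own section, and GNU ld's `--gc-sections` can then discard the unreferenced one; `--print-gc-sections` asks ld to list what it discarded. Whether this works depends on the toolchain and target.

```c
/* util.c: one translation unit holding two external functions.
 * Hypothetical names, used only for illustration. */

/* Assume this one is called from elsewhere in the program. */
int used_helper(int x)
{
    return x + 1;
}

/* Assume nothing in the program ever calls this one. */
int unused_helper(int x)
{
    return x * 2;
}
```

Note that this only reports what the linker dropped from one particular build; it does not edit the source, so the functions still have to be removed by hand.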
  7. Richard Tobin wrote:

    > Linkers typically do not exclude functions in the user program that are
    > unused. They only do that with libraries.
    >
    > More useful would be one of the many tools that generate call graphs.


    Can you think of any examples ?

    I know of runtime tools which do this (could even use gcov at a pinch)
    but in that case you need to come up with a set of test cases which
    exercise the code fully.
     
    Geronimo W. Christ Esq, Jun 19, 2005
    #7
  8. Phlip (Guest)

    Geronimo W. Christ Esq wrote:

    > Are there any scripts or tools out there that could look recursively
    > through a group of C/C++ source files, and allow unreferenced function
    > calls or values to be easily identified ?


    Write unit tests for every feature. Pass all tests between every 1~10 edits.

    Constantly try to remove parameters, variables, lines, methods, classes, and
    modules. If any test fails, hit Undo.

    This process is great for growing code to maximize features and minimize
    lines.

    --
    Phlip
    http://www.c2.com/cgi/wiki?ZeekLand
     
    Phlip, Jun 19, 2005
    #8
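The test-then-delete loop described in post #8 can be sketched in plain C with `assert`. The function `clamp_int` and its test are hypothetical; they only illustrate how a small developer test pins down behaviour before code around it is trimmed.

```c
#include <assert.h>

/* Hypothetical function under test: clamp v into [lo, hi]. */
int clamp_int(int v, int lo, int hi)
{
    if (v < lo) return lo;
    if (v > hi) return hi;
    return v;
}

/* A developer test: it records the current behaviour so that a later
 * deletion or simplification that changes it fails immediately. */
void test_clamp_int(void)
{
    assert(clamp_int(5, 0, 10) == 5);    /* inside the range */
    assert(clamp_int(-3, 0, 10) == 0);   /* below the range */
    assert(clamp_int(42, 0, 10) == 10);  /* above the range */
}
```

A test runner would call `test_clamp_int()` after every small edit; if an assertion fires, the most recent edit is the one to undo.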
  9. Phlip wrote:

    >>Are there any scripts or tools out there that could look recursively
    >>through a group of C/C++ source files, and allow unreferenced function
    >>calls or values to be easily identified ?

    >
    > Write unit tests for every feature. Pass all tests between every 1~10 edits.


    That's what I would do if I had a group of developers and six months to
    do it in. Unfortunately many of us in this post-dotcom age do not work
    to near-infinite budgets.
     
    Geronimo W. Christ Esq, Jun 19, 2005
    #9
  10. jacob navia (Guest)

    Geronimo W. Christ Esq wrote:
    > Are there any scripts or tools out there that could look recursively
    > through a group of C/C++ source files, and allow unreferenced function
    > calls or values to be easily identified ?
    >
    > LXR is handy for indexing source code, and for a given function or
    > global variable it can show you all the places where it is referenced.
    > It would be really nice to have a tool that would simply list all of the
    > referenced functions, so that you could go through and remove them.


    The lcc-win32 IDE will do that. Select Object file cross-reference in
    the analysis menu, then look for symbols that are not referenced anywhere.

    A problem with this approach is that the IDE doesn't recognize functions
    that are referenced in the same file. For instance:

    int foo(void)
    {
        // ...
        return 0;
    }

    int main(void)
    {
        foo();
        return 0;
    }

    The function will appear as not referenced. Besides, the IDE only handles
    C programs (it is a C IDE).

    http://www.cs.virginia.edu/~lcc-win32.
     
    jacob navia, Jun 19, 2005
    #10
  11. Phlip (Guest)

    Geronimo W. Christ Esq wrote:

    > Phlip wrote:
    >
    > >>Are there any scripts or tools out there that could look recursively
    > >>through a group of C/C++ source files, and allow unreferenced function
    > >>calls or values to be easily identified ?

    > >
    > > Write unit tests for every feature. Pass all tests between every 1~10
    > > edits.
    >
    > That's what I would do if I had a group of developers and six months to
    > do it in. Unfortunately many of us in this post-dotcom age do not work
    > to near-infinite budgets.


    Do you have time and resources to debug?

    You can leverage tests, like that, to replace many long hours of debugging
    for a few short minutes writing tests.

    The idea that automated testing requires an "infinite budget" is a myth.

    (And if you indeed have a short deadline, why bother removing harmless but
    unused code?)

    --
    Phlip
    http://www.c2.com/cgi/wiki?ZeekLand
     
    Phlip, Jun 19, 2005
    #11
  12. Tim Prince (Guest)

    Jean-Claude Arbaut wrote:
    >
    >
    > On 19/06/2005 17:49, in ,
    > « Geronimo W. Christ Esq » <> wrote:
    >
    >
    >>Greg wrote:
    >>
    >>
    >>>>LXR is handy for indexing source code, and for a given function or
    >>>>global variable it can show you all the places where it is referenced.
    >>>>It would be really nice to have a tool that would simply list all of the
    >>>>referenced functions, so that you could go through and remove them.
    >>>
    >>>There is in fact such a tool, it's commonly called a "linker." And the
    >>>list of unreferenced code and data that it strips from a build is
    >>>usually cataloged in a file it can be directed to create. This file is
    >>>commonly called a "link map."

    >>
    >>Got a link ? The GNU linker at least only puts symbols that are included
    >>into the link map. No mention of it cataloging symbols it excludes.

    >
    >
    > I'm not sure but "nm" could be useful here.
    >

    In times gone by, the lorder and tsort tools showed which .o files were
    not used, as well as finding a single pass link order, if one exists.
    Now that no one cares about the single pass link, we don't find these
    tools installed automatically.
     
    Tim Prince, Jun 19, 2005
    #12
  13. In article <ygjte.85$>,
    Phlip <> wrote:
    >Do you have time and resources to debug?


    >You can leverage tests, like that, to replace many long hours of debugging
    >for a few short minutes writing tests.


    >The idea that automated testing requires an "infinite budget" is a myth.


    >(And if you indeed have a short deadline, why bother removing harmless but
    >unused code?)


    If you are handed a large program and told to "make it work",
    then the first thing you need to do is bring it under control. Machines
    are a lot faster and more accurate about matters such as which functions
    are potentially callable, so it makes sense to mechanically
    pre-process the code instead of going in and writing tests for
    each section under the assumption that the code will be used.
    One can spend endless hours trying to "fix" a routine that
    isn't even needed. Overview first, -then- ensure that each
    function performs its proper role in the design.

    A program such as 'cscope' can assist in finding unused functions
    and in finding locations from which functions are called.


    >The idea that automated testing requires an "infinite budget" is a myth.


    Well, sure it is: there are only a finite number of states that
    a program can be in on a given system, so the amount of testing
    one has to do has a finite upper bound, not an infinite bound.

    There's the small issue that current scientific thought suggests
    that the Universe will not last long enough to test even fairly
    trivial programs (e.g., it takes 1E21 years to test a program
    with merely two 64-bit floating point numbers if the tests can be
    done at 10 gigaflop).

    But you are absolutely right that that won't require an infinite budget --
    it only requires a budget larger than is likely to be available at
    any time before Homo Sapiens Sapiens die off or evolve into something
    else.
    --
    Would you buy a used bit from this man??
     
    Walter Roberson, Jun 19, 2005
    #13
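The 1E21-year figure in post #13 can be checked with a short calculation. This is a sketch under two assumptions that the post does not spell out: one test per floating-point operation (so 10 gigaflop means 1e10 tests per second), and a 365.25-day year. The helper names are invented for illustration.

```c
#include <assert.h>

/* Years needed to run `cases` tests at `rate` tests per second,
 * using a 365.25-day year. */
double years_to_test(double cases, double rate)
{
    const double seconds_per_year = 365.25 * 24.0 * 3600.0;
    assert(rate > 0.0);
    return cases / rate / seconds_per_year;
}

/* Number of distinct input pairs for two 64-bit values: 2^128. */
double two_pow_128(void)
{
    double n = 1.0;
    for (int i = 0; i < 128; i++)
        n *= 2.0;   /* exact in double: only the exponent changes */
    return n;
}
```

Calling `years_to_test(two_pow_128(), 1e10)` gives roughly 1.08e21 years, which agrees with the order of magnitude quoted in the post.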
  14. CBFalconer (Guest)

    jacob navia wrote:
    > Geronimo W. Christ Esq wrote:
    >
    >> Are there any scripts or tools out there that could look
    >> recursively through a group of C/C++ source files, and allow
    >> unreferenced function calls or values to be easily identified ?
    >>
    >> LXR is handy for indexing source code, and for a given function
    >> or global variable it can show you all the places where it is
    >> referenced. It would be really nice to have a tool that would
    >> simply list all of the referenced functions, so that you could
    >> go through and remove them.

    >
    > The lcc-win32 IDE will do that. Select Object file
    > cross-reference in the analysis menu, then look for symbols that
    > are not referenced anywhere.
    >
    > A problem with this approach is that the IDE doesn't recognize
    > functions that are referenced in the same file. For instance:


    Those shouldn't appear in the first place. They should have been
    declared static and omitted from the .h file.

    --
    Some informative links:
    news:news.announce.newusers
    http://www.geocities.com/nnqweb/
    http://www.catb.org/~esr/faqs/smart-questions.html
    http://www.caliburn.nl/topposting.html
    http://www.netmeister.org/news/learn2quote.html
     
    CBFalconer, Jun 19, 2005
    #14
  15. On 19/06/2005 21:15, Tim Prince wrote:

    > Jean-Claude Arbaut wrote:
    >>
    >>
    >> On 19/06/2005 17:49, in ,
    >> « Geronimo W. Christ Esq » <> wrote:
    >>
    >>
    >>> Greg wrote:
    >>>
    >>>
    >>>>> LXR is handy for indexing source code, and for a given function or
    >>>>> global variable it can show you all the places where it is referenced.
    >>>>> It would be really nice to have a tool that would simply list all of the
    >>>>> referenced functions, so that you could go through and remove them.
    >>>>
    >>>> There is in fact such a tool, it's commonly called a "linker." And the
    >>>> list of unreferenced code and data that it strips from a build is
    >>>> usually cataloged in a file it can be directed to create. This file is
    >>>> commonly called a "link map."
    >>>
    >>> Got a link ? The GNU linker at least only puts symbols that are included
    >>> into the link map. No mention of it cataloging symbols it excludes.

    >>
    >>
    >> I'm not sure but "nm" could be useful here.
    >>

    > In times gone by, the lorder and tsort tools showed which .o files were
    > not used, as well as finding a single pass link order, if one exists.
    > Now that no one cares about the single pass link, we don't find these
    > tools installed automatically.


    I didn't know they showed this information, but it's true that they are not
    very useful nowadays. I think they are still part of the binutils package.
     
    Jean-Claude Arbaut, Jun 19, 2005
    #15
  16. Ben Pope (Guest)

    Walter Roberson wrote:
    > In article <ygjte.85$>,
    > Phlip wrote:
    >
    >>The idea that automated testing requires an "infinite budget" is a myth.

    >
    >
    > Well, sure it is: there are only a finite number of states that
    > a program can be in on a given system, so the amount of testing
    > one has to do has a finite upper bound, not an infinite bound.
    >
    > There's the small issue that current scientific thought suggests
    > that the Universe will not last long enough to test even fairly
    > trivial programs (e.g., it takes 1E21 years to test a program
    > with merely two 64-bit floating point numbers if the tests can be
    > done at 10 gigaflop).


    Then it is clear that you do not understand unit testing.

    Ben
    --
    A7N8X FAQ: www.ben.pope.name/a7n8x_faq.html
    Questions by email will likely be ignored, please use the newsgroups.
    I'm not just a number. To many, I'm known as a String...
     
    Ben Pope, Jun 19, 2005
    #16
  17. Phlip (Guest)

    Walter Roberson wrote:

    > If you are handed a large program and told to "make it work",
    > then the first thing you need to do is bring it under control.


    Read /Working Effectively with Legacy Code/ by Michael Feathers. He's a
    consultant who routinely guides teams thru that exact situation.

    A boss has spent a lot of money to build a codebase, with very little
    return. Then a team must make the code valuable, without wasting more time
    and effort.

    > Machines
    > are a lot faster and more accurate about matters such as which functions
    > are potentially callable, so it makes sense to mechanically
    > pre-process the code instead of going in and writing tests for
    > each section under the assumption that the code will be used.
    > One can spend endless hours trying to "fix" a routine that
    > isn't even needed. Overview first, -then- ensure that each
    > function performs its proper role in the design.
    >
    > A program such as 'cscope' can assist in finding unused functions
    > and in finding locations from which functions are called.


    Yes, automated tools that scan code and interpret it will help. But I don't
    see the relation between "Where the bugs are" and "Where control flow is
    not". The principle "Ain't broke don't fix it" applies here. Dead code ain't
    broke. Bugs will lead to investigation of the live code causing them.

    > >The idea that automated testing requires an "infinite budget" is a myth.

    >
    > Well, sure it is: there are only a finite number of states that
    > a program can be in on a given system, so the amount of testing
    > one has to do has a finite upper bound, not an infinite bound.


    The idea that developer tests should be like quality assurance tests is also
    a myth. Developer tests are little more than the scaffolding used to support
    a building while you build it. Earthquake-proofing the building is an
    orthogonal concern.

    > There's the small issue that current scientific thought suggests
    > that the Universe will not last long enough to test even fairly
    > trivial programs (e.g., it takes 1E21 years to test a program
    > with merely two 64-bit floating point numbers if the tests can be
    > done at 10 gigaflop).


    That's hardly an excuse not to try. The goal is _not_ "prove there are no
    bugs". A math proof is, indeed, NP-incomplete. Tests can get within 99.9% of
    a proof with a trivial effort. The last 0.1% is what costs so much.

    The goal is "prevent 99.9% of bugs". You can get there by running tests
    frequently, and hitting Undo if any test breaks, to back out the most recent
    edit. That's infinitely preferable to debugging.

    --
    Phlip
    http://www.c2.com/cgi/wiki?ZeekLand
     
    Phlip, Jun 19, 2005
    #17
  18. Martijn (Guest)

    >> A problem with this approach is that the IDE doesn't recognize
    >> functions that are referenced in the same file. For instance:

    >
    > Those shouldn't appear in the first place. They should have been
    > declared static and omitted from the .h file.



    And if you do so, most decent compilers (at least GCC does) with the
    appropriate warnings enabled will find unreferenced static functions for
    you.

    --
    Martijn
    http://www.sereneconcepts.nl
     
    Martijn, Jun 19, 2005
    #18
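The advice in posts #14 and #18 can be illustrated with a small file. The file name and function names below are invented. Once file-local helpers are declared `static`, the compiler can see every possible caller, and GCC with `-Wall` (which enables `-Wunused-function`) reports a static function that is defined but never used.

```c
/* shapes.c: hypothetical file, names invented for illustration. */

/* Internal helper that is still in use. */
static int square(int x)
{
    return x * x;
}

/* Internal helper that nothing in this file calls any more.
 * With GCC, -Wall (which enables -Wunused-function) reports it
 * as defined but not used. */
static int cube(int x)
{
    return x * x * x;
}

/* The only function exported from this file. */
int area_of_square(int side)
{
    return square(side);
}
```

This catches only file-local dead code. A function with external linkage, such as `area_of_square`, might be called from another translation unit, so the compiler alone cannot tell whether it is unused.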
  19. Phlip wrote:

    >>That's what I would do if I had a group of developers and six months to
    >>do it in. Unfortunately many of us in this post-dotcom age do not work
    >>to near-infinite budgets.

    >
    > Do you have time and resources to debug?
    >
    > You can leverage tests, like that, to replace many long hours of debugging
    > for a few short minutes writing tests.


    I've got just under a million lines of code here that have just come
    into my possession. I'd love to believe that a few minutes would allow
    me to create a suite of tests proving that the program generated from
    that codebase worked the same before and after any changes, but I remain
    somewhat cynical.

    > The idea that automated testing requires an "infinite budget" is a myth.


    Timescales and budgets do not presently permit me to sit down and write
    tests for a huge body of code which I am not completely familiar with. I
    have no doubts about the wisdom or long term benefits of doing it, but I
    don't possess the resources at the moment.

    > (And if you indeed have a short deadline, why bother removing harmless but
    > unused code?)


    I don't believe I've mentioned anything about a deadline. What I do have
    is a limited resource to work with. I can leverage that resource better
    if I can grasp the code more easily. The code can be grasped more easily
    if the redundant bits of it are removed.
     
    Geronimo W. Christ Esq, Jun 19, 2005
    #19
  20. Walter Roberson wrote:

    <snip> thank you for that articulate contribution, Walter.

    > A program such as 'cscope' can assist in finding unused functions
    > and in finding locations from which functions are called.


    cscope is very handy (as is LXR as I mentioned before). I can indeed go
    through each function manually and determine whether it is needed or
    not. But I figure that the computer should be able to do that for me,
    automatically. Cscope's (or LXR's) generated database contains all the
    information that would be required to do that. It's just odd that no-one
    has attempted to do the kind of source code profiling that I am talking
    about yet, using those databases to generate lists of redundant
    functions (or duplicate code).

    The reason why it has to be automated is because you have to make
    several passes. For example, you could come to function bar() and not
    remove that because it is needed by function foo(). However, only later
    would you find that function foo() is also unused. You would have to
    make a second pass to remove bar(). Take that trivial example and scale
    it up to a source base that has a few tens or hundreds of thousands of
    functions defined within it and you can see the scale of the issue.
     
    Geronimo W. Christ Esq, Jun 19, 2005
    #20
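The multi-pass problem described in post #20 goes away if the question is turned around: instead of repeatedly deleting functions with zero callers, mark everything reachable from the entry point and treat the rest as dead. The sketch below assumes the call graph has already been extracted (for example from a cross-reference database) into a hypothetical `struct call_graph`; all names here are invented for illustration and are not the format of any real tool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_FUNCS 64

/* Hypothetical call graph: calls[i][j] is true when function i
 * contains a reference to function j. In practice this would be
 * filled from a cross-reference database, not written by hand. */
struct call_graph {
    size_t n;                          /* number of functions */
    bool calls[MAX_FUNCS][MAX_FUNCS];
};

/* Mark every function reachable from `root` (for example main).
 * `live` must have room for g->n entries and start out all false. */
void mark_live(const struct call_graph *g, size_t root, bool *live)
{
    size_t stack[MAX_FUNCS];
    size_t top = 0;

    assert(g->n <= MAX_FUNCS && root < g->n);
    live[root] = true;
    stack[top++] = root;

    while (top > 0) {
        size_t f = stack[--top];
        for (size_t callee = 0; callee < g->n; callee++) {
            if (g->calls[f][callee] && !live[callee]) {
                live[callee] = true;   /* pushed at most once */
                stack[top++] = callee;
            }
        }
    }
}

/* Count functions that no chain of calls from `root` ever reaches. */
size_t count_dead(const struct call_graph *g, size_t root)
{
    bool live[MAX_FUNCS] = { false };
    size_t dead = 0;

    mark_live(g, root, live);
    for (size_t f = 0; f < g->n; f++)
        if (!live[f])
            dead++;
    return dead;
}
```

With a graph where `main` calls one helper and an otherwise unreferenced `foo()` calls `bar()`, both `foo` and `bar` come out as dead in a single traversal, even though `bar` still has a caller. Functions reached only through function pointers, callbacks, or exported interfaces would need to be added as extra roots, which this sketch does not attempt.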