[OT] C function call cost.

Discussion in 'C Programming' started by Andrew Au (Newsgroup), Sep 16, 2004.

  1. I know this is OT; comp.lang.c is a very narrow group that talks about
    nothing except standard C. But I really need a group to ask this question
    in, so please redirect me to a *real* newsgroup if it is really OT.

    My question is about function calls.

    In my program, one function is called a great many times. I would like to
    know:

    What is the cost involved in a function call?
    Is there something like C++ inlining in C, so that I can reduce the cost?
    Can I reduce function call cost by reducing the number of arguments, for
    example by making them global?

    If some really short function is the bottleneck, what can I do?

    float utterance_getData(Utterance utterance, int time, int dimension) {
        return utterance->data[time][dimension];
    }
     
    Andrew Au (Newsgroup), Sep 16, 2004
    #1

  2. comp.programming is almost always a safe bet.

    Depends on the implementation, of course. Usually, you need to
    put the arguments somewhere (unless there are very few of them, for
    example, in which case you can just pass them in registers on some
    architectures), and then you need to call the function. If the
    function is complicated, it'll do some setup and cleanup of its
    own. Then you'll need to clean up the stuff you did to pass the
    arguments in the first place.
    This is a question better directed to a group dedicated to your
    platform (comp.os.msdos.programmer? comp.unix.programmer?).

    'inline'. :) In extreme cases, #define. E.g., in your example
    below it might help to write

    #define U_GETDATA(u,tm,dn) ((u)->data[tm][dn])

    You lose the typechecking, but you also lose the function-call overhead.

    On some architectures, yes. But that's almost certainly a Bad Idea.
    For one thing, you lose reentrancy---if you use GlobalDataBuffer1 to
    pass arguments to FunctionFoo, then you can't have two invocations of
    FunctionFoo running at once with different arguments. This may or may
    not be a problem for you.

    'gprof' or another profiler, to make sure it /is/ the bottleneck.
    Then, if it is really short (as your example below), and is /still/
    a bottleneck, it must be because it's in a loop or a recursive
    function call. See if you can lift it out of the loop, or reduce the
    number of times the loop executes. Use a more efficient data structure.
    Cache the results of the function in a table and use table lookup
    instead of direct computation.
    HTH,
    -Arthur
     
    Arthur J. O'Dwyer, Sep 16, 2004
    #2
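A minimal sketch of the two options mentioned in the post above (the macro and a C99 inline function), side by side. The struct layout is an assumption; the thread never shows the real definition of UtteranceStructure, only that Utterance is a pointer to it and that it has a `data` member indexed as `data[time][dimension]`.

```c
/* Hypothetical layout: assumed for illustration only. */
struct UtteranceStructure {
    float **data;   /* data[time][dimension] */
};
typedef struct UtteranceStructure *Utterance;

/* Macro form: no call overhead, but the arguments are not type-checked. */
#define U_GETDATA(u, tm, dn) ((u)->data[tm][dn])

/* C99 inline form: arguments are type-checked, and the compiler is free
   to expand the call in place or to emit an ordinary call. */
static inline float utterance_getData(Utterance utterance, int time, int dimension)
{
    return utterance->data[time][dimension];
}
```

Both forms read the same element; whether either is actually faster than a plain function on a given compiler is something only a measurement can show.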

  3. We don't talk about it because nobody can reasonably say. You need
    to know the exact compiler and version the program was compiled
    with, the processor the code is run on and lots of other information
    which would bore the hell out of most people reading the group and
    would have nothing to do with C but just one special implementation.

    Not officially in the C89 standard but in the "new" C99 standard.
    But many compilers already had extensions before that, allowing you
    to inline functions. See the documentation of your compiler.

    Yes, it could speed up function calls (but again, that's nothing that
    has much to do with C) - but on the other hand it's also easy to imagine
    situations where it actually slows down the call: for example, if the
    function argument is an often-used variable in the caller, it could
    already be in a register when the call comes and then be passed
    directly to the function via the register. But if you make a global
    variable out of it, that value would have to be written to memory first
    and then read back into a register within the function, which probably
    would be slower.

    So the answer is basically mu. If you really need to know if something
    is faster or not, measure it. Or look at the assembler produced
    by the compiler and try to guess from that.

    How do you know that that is a bottleneck? Did you ever run your program
    under a profiler to find out how much time that call takes? And if you're
    really that concerned with speed why do you use such a kind of function
    at all? And why do you use floats? That only requires that the numbers
    have to be converted first to double and then back to float all of the
    time because calculations are always done in double.

    Regards, Jens
     
    Jens.Toerring, Sep 16, 2004
    #3
  4. Tens of instructions, typically. The compiler must assign arguments to
    registers, in some cases push them on the stack, then save the registers it
    has corrupted, make the call, restore corrupted data, and process the return
    value.

    In C99, yes, but you are unlikely to have a C99 compiler. Make sure your
    code compiles under C++ to take advantage of "inline" portably.

    Yes. If you have a recursive function that is called many times, this is a
    common technique. In many cases you waste the time on setting up the global
    that you save on avoiding the parameter, so it really isn't worth it. Note
    that this is micro-optimisation, the last step if the program doesn't run
    quickly enough after the algorithm has been optimised as much as possible.

    Compile under C99 or C++ and declare inline. Use a macro. Manually insert
    the code into the body of the calling function (usually not as big a task as
    it sounds).
     
    Malcolm, Sep 16, 2004
    #4
  5. It IS about the language, so it is on-topic.

    That is system dependent, and the only way to check it (on your
    own system) is to measure it. You can create a function with an
    empty body and call it many times in a loop. Measure the
    execution time of the loop with no function call, then measure it
    again with the function call, and the difference will be the call
    time multiplied by the loop repetitions. This portion of the
    question is marginally OT.

    In C99 you can specify that a function be inlined. This is normally
    only worthwhile for very short functions:

    inline int foo(int arg)
    {
        /* foo body */
    } /* foo */

    You may or may not have a C99 compiler, but many others also have
    the capability. gcc is one. You can measure the effect as above.

    This is a horrible idea, and leads to confused code and errors.

    The urge to perform such optimization should be firmly resisted,
    as probably being worthless. Profiling your code will give you
    clues as to the hot spots if some performance is not
    satisfactory. Without the profile you are just guessing.
     
    CBFalconer, Sep 16, 2004
    #5
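The measurement described in the post above could be sketched roughly as follows. All names here are hypothetical, and the numbers it produces are only indicative: an optimising compiler may remove the empty call altogether, and clock() resolution varies between systems.

```c
#include <time.h>

/* Written on every iteration so the loop itself is not optimised away. */
static volatile unsigned long sink;

/* Hypothetical do-nothing function whose call cost we want to estimate. */
static void empty_call(void)
{
}

/* Returns the CPU seconds spent on `reps` iterations, with or without a
   call to empty_call() in the loop body. */
static double time_loop(unsigned long reps, int with_call)
{
    unsigned long i;
    clock_t start = clock();

    for (i = 0; i < reps; i++) {
        sink = i;
        if (with_call)
            empty_call();
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Estimated cost of one call, in seconds. The result is noisy and can
   even come out negative, so repeat the measurement several times. */
static double estimate_call_cost(unsigned long reps)
{
    double without = time_loop(reps, 0);
    double with = time_loop(reps, 1);

    return (with - without) / (double)reps;
}
```

Calling estimate_call_cost with a repetition count in the tens of millions, several times over, gives a rough per-call figure for one particular compiler, option set, and machine.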
  6. May a C99 compiler ignore the inline keyword if it so chooses (i.e.,
    refuse to attempt to call the function inline)?
     
    Christopher Benson-Manica, Sep 16, 2004
    #6
  7. Of course. C99 says this:

    Making a function an inline function suggests that calls to
    the function be as fast as possible. The extent to
    which such suggestions are effective is
    implementation-defined.
     
    Ben Pfaff, Sep 16, 2004
    #7
  8. Yes. The standard says (C99 6.7.4p5):

    Making a function an inline function suggests that calls to the
    function be as fast as possible. The extent to which such
    suggestions are effective is implementation-defined.

    Specifying that a function is inline is similar to specifying that a
    variable is a register variable, except that inline is more likely to
    be useful with modern compilers. (The common wisdom is that the
    compiler is probably smarter than you in deciding which variables
    should be stored in registers; that's probably less true for deciding
    which functions should be inlined.)
     
    Keith Thompson, Sep 16, 2004
    #8
  9. Of course. A good C compiler will take "inline" as a hint that you want
    calls to this function to be fast, at the possible cost of a bit more
    code. And on that basis it will produce the best possible code. If
    inlining would make your code slower, then a good compiler will not
    inline; for example two calls in a row to an inline function f (int x)
    which contains 10000 lines of code will usually run slower when inlined.

    It will also sometimes inline functions that don't use the inline
    keyword. For example, if you write int f (int x) { return 1; } then a
    good compiler will often replace calls to function f with the evaluation
    of the argument for possible side effects, with the constant 1
    substituted for the result.
     
    Christian Bau, Sep 16, 2004
    #9
  10. "inline" also eliminates function-call overhead (indeed, this is the
    main reason inlining is done). I don't see any compelling reason to use
    a macro as an inline function, unless it's for type polymorphism or
    because you're writing in C90.

    A very bad idea. A better way of achieving this is to put all your
    arguments in a structure (an "info" structure which can hopefully be
    reused elsewhere) and pass a pointer to it. Then the caller is
    responsible for its storage.

    Inline is best - it preserves the original design and is designed for
    just such cases. Even a table lookup is liable to be outperformed by a
    very short inline function. If for some reason the OP doesn't want to do
    inlining, maybe due to code size constraints, this particular case can
    be improved in a sequential loop by caching the partial lookup
    utterance->data[time].
     
    Derrick Coetzee, Sep 16, 2004
    #10
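The row-caching idea from the post above might look something like this. The struct layout and the frame_sum function are assumptions made for illustration; the thread only shows that `utterance->data[time][dimension]` yields a float.

```c
/* Hypothetical layout: assumed for illustration only. */
struct UtteranceStructure {
    float **data;   /* data[time][dimension] */
};
typedef struct UtteranceStructure *Utterance;

/* Sums one time frame across all dimensions. */
static float frame_sum(Utterance utterance, int time, int dimensions)
{
    /* The partial lookup is done once here, not once per element. */
    const float *row = utterance->data[time];
    float sum = 0.0f;
    int d;

    for (d = 0; d < dimensions; d++)
        sum += row[d];
    return sum;
}
```

A compiler may already perform this hoisting on its own when the accessor is inlined, so it is worth profiling before and after.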
  11. This is so completely bound to your implementation that you will need
    to either look at the code generated or run some timing tests.

    There are two pieces of good news.
    1) 'inline' is part of the current C standard and available as an
    extension for many compilers written when it was not yet part of the
    standard.
    2) the cost for calling functions tends to be lower -- sometimes much
    lower -- in C than in C++.

    The cost of passing an argument in C is normally quite small. None of
    the gyrations that C++ might need are necessary. The exceptions tend to
    be where a struct is passed as an argument. The relatively high cost of
    copying structs is why C++ (which uses analogous stuff often) introduced
    call by reference, and why C programmers tend to, unless doing otherwise
    is unavoidable, pass the address of a struct rather than the struct itself.

    Are you sure that this is your function? 'utterance' is used as a
    pointer-to-struct in the code (quite cheap) and seems to be a struct in
    the parameter list (possibly expensive). Of course this should lead to a
    diagnosed mismatch unless you have your diagnostics set too low. Are you
    _sure_ that this code should not be

    float utterance_getData(Utterance *utterance, int time, int dimension) {
        return utterance->data[time][dimension];
    }

    and you could request that it be inlined as well, or #define it as a
    macro instead

    #define utterance_getData(utterance,time,dimension)\
        ((utterance)->data[(time)][(dimension)])
     
    Martin Ambuhl, Sep 17, 2004
    #11
  12. Yes.
     
    Martin Ambuhl, Sep 17, 2004
    #12
  13. What is the cost involved in a function call?

    1.7592837 Euros, plus VAT.

    This is comp.lang.c. We don't do C++ here.
    In C90 there is no way to PREVENT the compiler from inlining in
    standards terms (although there may be ways to make it so difficult
    to inline that few compilers will try).

    Use of global variables may INCREASE the cost, due to (on some processors)
    having to use full-length addresses rather than offsets from a stack
    frame pointer. Or it might not.

    Reducing the number of USELESS arguments will probably reduce the cost.

    This sounds like a good situation where replacing a function call with a
    macro (#define) would be appropriate.

    Gordon L. Burditt
     
    Gordon Burditt, Sep 17, 2004
    #13
  14. Thanks to all; these ideas are all helpful.

    For your information:

    Firstly, I am using gcc 3.3.1 as my C compiler, so I can probably use
    inline functions.

    Secondly, I really did profile my code using gprof; that is why I am sure
    that function is a bottleneck. The short function is called 260835344
    times, so I think even a slight optimization will improve the performance
    by a lot.

    Thirdly, Utterance is in fact defined by typedef struct UtteranceStructure*
    Utterance, therefore I am passing a pointer, not a structure. (You may get
    a hint from utterance->data, where the -> implies that the utterance
    variable is a pointer.)

    Fourth, I dare not try making it a macro because I will lose profiling
    information: I would never know how fast a macro runs (gprof only measures
    function calls, after all).
     
    Andrew Au (Newsgroup), Sep 17, 2004
    #14
  15. If the function is inlined, I don't believe gprof can single it
    out. If it is called that often, it may well be that you
    could reorganize your algorithms, which is usually the best way to
    speed up something. Also, gprof should be able to show you the
    actual time spent in the function, as a percentage of overall
    run-time, for a sufficiently long run. If it isn't a significant
    portion, that optimization won't help.
     
    CBFalconer, Sep 17, 2004
    #15
  16. "inline", if I understand it correctly, is a deceptively named
    modifier from the all but abandoned C99 standard, much like
    "register". What "inline" really means is that you guarantee that you
    will not take the address of that function -- and that's all. Good
    compilers will not take hints from programmers in this way about what
    code is actually inlined or not. Bad compilers may take it as a hint
    about inlining your code but that just reveals the fact that such
    compilers don't know how to clone, constant propagate or inline code
    automatically (and therefore probably have high function call
    overhead, which is what you seem concerned about). Both "register"
    and "inline" should have been unified as a single keyword like
    "noaddress" since that's what they really mean.

    Any good compiler should be able to inline any function which is
    either static or can be cloned as static, is a leaf function (or
    recursively so, by considering any function which only calls inlined
    functions or none at all as a leaf function) and is below some
    threshold in size.
     
    Paul Hsieh, Sep 17, 2004
    #16
  17. Does declaring a function inline prevent the programmer from using its
    address? (I'm guessing this is the case, but I don't know the C99 standard
    well enough to do anything but guess.)
     
    Chris Barts, Sep 17, 2004
    #17
  18. Cheerio,

    I am not entirely sure, but footnote 118) of 6.7.4 seems to suggest
    otherwise, and I have not found anything that says you cannot use the
    address:

    6.7.4 Function specifiers

    Syntax
    1 function-specifier:
          inline

    Constraints
    2 Function specifiers shall be used only in the declaration
    of an identifier for a function.

    3 An inline definition of a function with external linkage shall not
    contain a definition of a modifiable object with static storage
    duration, and shall not contain a reference to an identifier with
    internal linkage.

    4 In a hosted environment, the inline function specifier shall not
    appear in a declaration of main.

    Semantics
    5 A function declared with an inline function specifier is an
    inline function. The function specifier may appear more than once; the
    behavior is the same as if it appeared only once. Making a function an
    inline function suggests that calls to the function be as fast as
    possible.118)
    -----------------------------------------------------------------------
    118) By using, for example, an alternative to the usual function call
    mechanism, such as inline substitution. Inline substitution is not
    textual substitution, nor does it create a new function. Therefore, for
    example, the expansion of a macro used within the body of the function
    uses the definition it had at the point the function body appears, and
    not where the function is called; and identifiers refer to the
    declarations in scope where the body occurs.
    Likewise, the function has a single address, regardless of the number of
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    inline definitions that occur in addition to the external definition.
    -----------------------------------------------------------------------
    The extent to which such suggestions are effective is
    implementation-defined.
    .....



    HTH
    Michael
     
    Michael Mair, Sep 17, 2004
    #18
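As a small illustration of the point in the post above, here is a sketch (with hypothetical function names) in which the address of an inline function is taken and used like any other function pointer. It assumes a C99 compiler.

```c
/* Hypothetical helper; static inline keeps the definition self-contained
   within one translation unit. */
static inline int twice(int x)
{
    return 2 * x;
}

/* Taking the address of an inline function is allowed. Whether a call
   through the pointer is expanded in place is up to the compiler. */
static int apply(int (*fn)(int), int value)
{
    return fn(value);
}
```

With these definitions, `apply(twice, 21)` and a direct `twice(21)` give the same result.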
  19. I agree with you about register, but I don't completely agree with you
    about inline. The compiler is balancing execution time against
    executable size and it might be that for a given function the compiler
    would not inline it because it is called more than once and is large,
    but you want it inlined anyway because you have enough space and speed
    is of a greater concern. Of course, for you to make that decision you
    also have to consider the instruction cache, if there is one.

    Having said that, I don't think I would make much use of inline for the
    work I am currently doing even if all compilers were C99 compliant.
     
    Flash Gordon, Sep 17, 2004
    #19
  20. .... snip reply to Paul Hsieh ...
    I think that is a strategic mistake. Using inline means you can
    freely remove those pesky 3 to 10 or so line phrases from your
    functions and give them a meaningful parameterized name. You no
    longer need added local variables for them. This enhances both
    the readability and reliability of your code, without any runtime
    penalty. Each function then becomes much shorter, deals with
    fewer entities, and is much more easily verified by inspection.
    Using something from the standard library as an example, compare:

    char *str_dup(const char *s)
    {
        char *p;

        /* copy only if the allocation succeeded */
        if ((p = malloc(1 + strlen(s))) != NULL) strcpy(p, s);
        return p;
    }

    which I consider fairly clear and trivially correct, with:

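    char *str_dup(const char *s)
    {
        const char *q;
        char *p, *pp;
        size_t sz;

        /* strlen, embedded: q ends one past the terminating '\0' */
        q = s;
        while (*q++) continue;
        sz = q - s;
        if ((p = malloc(sz)) != NULL) {
            /* strcpy, embedded */
            pp = p;
            while ((*pp++ = *s++) != '\0') continue;
        }
        return p;
    }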

    which I believe to do the same job, with strlen and strcpy
    embedded. Yet this might well generate the identical code to the
    first example, assuming strlen and strcpy were defined as inline
    functions. The readability, and the possibility of error, are now
    much lower and higher respectively.

    A side benefit of using inline will be that you develop more
    useful and reusable functions for your application, which makes
    the whole schmeer clearer, tauter, and more elegant.

    And, using inline, you can always #define it away for porting to
    compilers that don't understand.
     
    CBFalconer, Sep 17, 2004
    #20
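One way to keep inline from breaking older compilers, in the spirit of the last remark in the post above, is to hide the keyword behind a project macro. This is a variant of the idea, not a literal `#define inline`; MY_INLINE and clamp_index are hypothetical names.

```c
/* MY_INLINE is a hypothetical project macro, not a standard name. */
#if defined(__cplusplus) || \
    (defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L)
#define MY_INLINE static inline
#else
#define MY_INLINE static   /* C90: fall back to an ordinary static function */
#endif

/* Example use: a short helper that is a natural inlining candidate. */
MY_INLINE int clamp_index(int i, int limit)
{
    if (i < 0)
        return 0;
    if (i >= limit)
        return limit - 1;
    return i;
}
```

The code behaves the same either way; only the hint to the compiler differs.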