Finding how much time is consumed?

raybakk

Hi there.

If I make a function in C (I actually use GNU right now), is there any
way to find out how many clocksycluses that function takes?

If I divide some numbers, e.g. Var1 = Var2/Var3, is it a fixed amount of
clocksycluses that is used for that division, or does it vary?

Raymond
 
Eric Sosman

Hi there.

If I make a function in C (I actually use GNU right now), is there any
way to find out how many clocksycluses that function takes?

("Clocksycluses?" After some puzzlement the light dawned:
I *think* what you mean is the two-word phrase "clock cycles."
However, the word "clocksycluses" has a certain fascination,
and people may take it up and start using it. You may have
gained immortality by enriching the lexicon!)

C provides the clock() function, which returns the amount
of CPU time consumed by your program since some fixed arbitrary
moment. You use it like this:

#include <stdio.h>
#include <time.h>
...
clock_t t0, t1;
t0 = clock();
do_something();
t1 = clock();
printf ("Used %g CPU seconds\n",
        (t1 - t0) / (double)CLOCKS_PER_SEC);
If I divide some numbers, e.g. Var1 = Var2/Var3, is it a fixed amount of
clocksycluses that is used for that division, or does it vary?

There are at least two problems here. First, the Standard
says nothing about how precise the clock() measurement is, how
rapidly the clock "ticks." On typical systems, the "tick rate"
is somewhere between 18Hz and 1000Hz; 100Hz is a fairly common
value. What this means is that clock() is probably too coarse-
grained to measure the execution time of a few instructions or
even a few tens of instructions; the measured time for something
as short as one division will probably be zero.
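
If clock() is all you have, the usual dodge is to repeat the
operation enough times that the total comfortably spans many ticks
and then divide by the repetition count. A rough sketch (the
repetition count is arbitrary, and the volatile operands are only
there to keep the compiler from throwing the loop away):

#include <stdio.h>
#include <time.h>

int main(void)
{
    volatile double v2 = 355.0, v3 = 113.0, v1 = 0.0;
    long i, reps = 10000000L;   /* arbitrary; enough work to span many ticks */
    clock_t t0, t1;

    t0 = clock();
    for (i = 0; i < reps; i++)
        v1 = v2 / v3;           /* the operation being timed */
    t1 = clock();

    printf("%g CPU seconds total, roughly %g per division\n",
           (t1 - t0) / (double)CLOCKS_PER_SEC,
           (t1 - t0) / (double)CLOCKS_PER_SEC / reps);
    return 0;
}

Note that the loop overhead is folded into the result, so even this
only gives a ballpark figure.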

The second problem is that the C language says nothing about
how much time various operations take. On actual machines, the
time taken for your division will probably be affected by many
influences, such as

- Operand type: floating-point divisions and integer
divisions might run at different speeds

- Operand values: dividing by a denormal might take more
or less time than dividing by a normalized value

- Operand location: there's probably a cascade of different
places the operands might reside (CPU, various caches,
main memory, swap device), all with different speeds

- Interference: the division might compete with other
operations for scarce resources like pipelines, floating-
point units, internal CPU latches, and whatnot

... and, of course, many more. Modern computers are complicated
systems, and it is all but meaningless to speak of "the" amount
of time a single operation takes.
 
Tim Prince

Eric said:
("Clocksycluses?" After some puzzlement the light dawned:
I *think* what you mean is the two-word phrase "clock cycles."
However, the word "clocksycluses" has a certain fascination,
and people may take it up and start using it. You may have
gained immortality by enriching the lexicon!)

C provides the clock() function, which returns the amount
of CPU time consumed by your program since some fixed arbitrary
moment. You use it like this:

#include <stdio.h>
#include <time.h>
...
clock_t t0, t1;
t0 = clock();
do_something();
t1 = clock();
printf ("Used %g CPU seconds\n",
        (t1 - t0) / (double)CLOCKS_PER_SEC);


There are at least two problems here. First, the Standard
says nothing about how precise the clock() measurement is, how
rapidly the clock "ticks." On typical systems, the "tick rate"
is somewhere between 18Hz and 1000Hz; 100Hz is a fairly common
value. What this means is that clock() is probably too coarse-
grained to measure the execution time of a few instructions or
even a few tens of instructions; the measured time for something
as short as one division will probably be zero.
Most platforms do have useful (but non-portable) ways to measure clock
ticks. Often this is done by counting bus clock ticks and multiplying
by a factor burned into the CPU's internal ROM, and it may take an
indeterminate number of ticks just to obtain the result. See the
_rdtsc() macros built into certain compilers for x86 and related
platforms.

There is not likely to be any relationship between native clock ticks
and the integral count returned by clock(); in fact, most
implementations cite POSIX as requiring clock() to advance by some
large increment after each native time interval, so that "posix"
applications can avoid consulting CLOCKS_PER_SEC, which throws away a
number of useful bits.
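
For example, with GCC or Clang on x86 the timestamp counter can be
read through __rdtsc() from <x86intrin.h>. A non-portable sketch (on
recent CPUs the TSC ticks at a constant rate that need not match the
instantaneous core clock, and out-of-order execution blurs very short
measurements):

#include <stdio.h>
#include <x86intrin.h>

int main(void)
{
    volatile double v2 = 355.0, v3 = 113.0, v1;
    unsigned long long c0, c1;

    c0 = __rdtsc();
    v1 = v2 / v3;               /* the code being measured */
    c1 = __rdtsc();

    printf("%llu timestamp ticks (measurement overhead included)\n",
           c1 - c0);
    return 0;
}
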
The second problem is that the C language says nothing about
how much time various operations take. On actual machines, the
time taken for your division will probably be affected by many
influences, such as

- Operand type: floating-point divisions and integer
divisions might run at different speeds

- Operand values: dividing by a denormal might take more
or less time than dividing by a normalized value

- Operand location: there's probably a cascade of different
places the operands might reside (CPU, various caches,
main memory, swap device), all with different speeds

- Interference: the division might compete with other
operations for scarce resources like pipelines, floating-
point units, internal CPU latches, and whatnot

... and, of course, many more. Modern computers are complicated
systems, and it is all but meaningless to speak of "the" amount
of time a single operation takes.

However, there are many architectures where a division stalls the
floating point pipeline for a fixed number of cycles, once it begins
execution, depending on the width of the operands.
 
jmcgill

Hi there.

If I make a function in C (I actually use GNU right now), is there any
way to find out how many clocksycluses that function takes?

If I divide some numbers, e.g. Var1 = Var2/Var3, is it a fixed amount of
clocksycluses that is used for that division, or does it vary?

In a modern processor, with pipelining, forwarding, branch prediction,
and other optimizations in the datapath, it is difficult to determine
the exact number of clock cycles used by a given sequence of
instructions, even at the machine-instruction level. It is relatively
easy to measure "wall time" for a routine, but counting clock cycles
can be very difficult.
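
On a POSIX system, for instance, wall time is just two calls around
the routine; a sketch using clock_gettime() (not standard C, but
widely available; older glibc needs -lrt), with a stand-in workload:

#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <time.h>

static void routine_under_test(void)    /* stand-in for your routine */
{
    volatile double x = 0.0;
    long i;
    for (i = 1; i <= 1000000L; i++)
        x += 1.0 / i;
}

int main(void)
{
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    routine_under_test();
    clock_gettime(CLOCK_MONOTONIC, &t1);

    printf("%.6f wall seconds\n",
           (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9);
    return 0;
}

CLOCK_MONOTONIC is immune to wall-clock adjustments, which is why it
is preferred over gettimeofday() for interval timing.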

Are you doing floating point division in your example? Here's something
to ponder: It's entirely possible that when your machine code runs,
your division has been performed before the statement in your code that
comes "before" the division. That may sound like a preposterous claim,
until you explore how datapath optimizations are done.

It is also quite possible that one run through your routine is evaluated
differently than another.

But this is all architecture-specific, and has nothing at all to do with
C programming, aside from the assumption that you reached your
architecture-specific code by writing C.

It would be more helpful if you explained what you were trying to
determine, and why. Are you trying to optimize something specific, or
are you simply trying to determine how many clock cycles a given set of
instructions will require on your platform? (You will need a debugger,
and an instruction set reference for your processor).

If you are trying to optimize at a higher level than machine code, there
may be decisions you can make in your C design that will generally be
beneficial across platforms, but that's going to fundamentally depend on
what you are doing, and how you are doing it.

Your example fragment doesn't get us anywhere near what we'd need to
actually give you advice.
 
Malcolm

Hi there.

If I make a function in C (I actually use GNU right now), is there any
way to find out how many clocksycluses that function takes?

If I divide some numbers, e.g. Var1 = Var2/Var3, is it a fixed amount of
clocksycluses that is used for that division, or does it vary?
It depends on your computer.
If it is a simple 8-bit embedded processor, each instruction probably
takes a fixed number of cycles, and you can calculate execution time by
counting them.
If it's a modern Pentium VI, Hexium 7, or whatever, it probably has
really complicated scheduling, caching, and multi-tasking code. Whilst
there will be some relationship between wall-clock time and the number
of instructions you give the machine, it won't necessarily be a simple
one.
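
For instance (numbers purely illustrative): on an 8-bit part clocked at
8 MHz, a loop body that the datasheet prices at 12 cycles per iteration
takes 12 / 8,000,000 s = 1.5 microseconds per pass, so 1,000 iterations
take about 1.5 milliseconds.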

You might even be programming a Chinese room. Assembly instructions are
translated into Chinese by a Chinese speaker and put into a room
containing an English speaker who doesn't know Chinese. He does,
however, have instructions for manipulating the symbols, and the answer
comes out in Chinese, which the Chinese speaker understands. He then
goes around madly putting ASCII characters on little wooden boats onto
a flowing channel of water, and you can read them off.

This type of computer has certain philosophical advantages. However it is
not very fast. Which is surprising, seeing that humans can do tasks such as
image recognition much faster and more accurately than computers.
 
Raymond

Thank you all for the answers.

It is actually embedded, yes Malcolm; sorry (to everybody) for not
putting that straight.

I am thinking of taking an FFT of some signals and I am curious how
much time it will take.
Seems like I will start a counter before the function and stop it
after, to get a good average time for the function.
I guess I was curious whether there was a way to simulate the number
of clock cycles used.

I am thinking of using a 32-bit processor in an FPGA (MicroBlaze).
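
The pattern I have in mind looks something like this, where
read_cycle_counter() and my_fft() are just hypothetical stand-ins for
whatever timer the FPGA design exposes and for the FFT routine itself:

#include <stdint.h>
#include <stdio.h>

/* Hypothetical hook: wire this to a free-running hardware counter
   (e.g. a timer peripheral clocked at the CPU frequency). */
extern uint32_t read_cycle_counter(void);

/* Hypothetical FFT under test. */
extern void my_fft(int16_t *samples, int n);

void time_fft(int16_t *samples, int n)
{
    uint32_t c0, c1;

    c0 = read_cycle_counter();
    my_fft(samples, n);
    c1 = read_cycle_counter();

    /* Unsigned subtraction still gives the right answer across a
       single counter wrap-around. */
    printf("FFT took about %lu counter ticks\n",
           (unsigned long)(c1 - c0));
}

Repeating the measurement over many input blocks and averaging should
smooth out cache and interrupt effects.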

Raymond
 
