Execution time of code?


coal

I'm aware of that, but don't see the point here.  Both the Boost and
Ebenezer numbers would be divided by the same constant.  It is simpler,
I think, to just add up the times from clock for each version and then
figure out the ratio.

It is probably clearer to say: add up the results from clock...
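Roughly what I have in mind (just a sketch; runBoostTest and
runEbenezerTest are made-up stand-ins for the actual test code):

#include <ctime>
#include <iostream>

// Placeholder bodies -- the real tests would save/send the list<int>
// with the Boost and Ebenezer versions respectively.
void runBoostTest()
{
    volatile long sink = 0;
    for (long i = 0; i < 20000000; ++i) sink += i;
}

void runEbenezerTest()
{
    volatile long sink = 0;
    for (long i = 0; i < 10000000; ++i) sink += i;
}

// Accumulate clock() ticks over several runs of one version.
std::clock_t totalTicks(void (*test)(), int runs)
{
    std::clock_t total = 0;
    for (int i = 0; i < runs; ++i) {
        std::clock_t start = std::clock();
        test();
        total += std::clock() - start;
    }
    return total;
}

int main()
{
    std::clock_t boostTicks    = totalTicks(runBoostTest, 3);
    std::clock_t ebenezerTicks = totalTicks(runEbenezerTest, 3);

    // Both totals are in the same units (clock ticks), so the
    // CLOCKS_PER_SEC constant cancels out of the ratio.
    std::cout << "ratio: "
              << double(boostTicks) / double(ebenezerTicks) << '\n';
}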
 

Alf P. Steinbach

* (e-mail address removed):
I'm aware of that, but don't see the point here. Both the Boost and
Ebenezer numbers would be divided by the same constant. It is
simpler,
I think, to just add up the times from clock for each version and then
figure out the ratio.

How is that different from what you quoted?

I was commenting on the "problem" with values like "10,000".

If you *really*, actually, have values like those you exemplified, like
"10,000", then that indicates with high probability that somewhere in the
testing code an integer division is used where a floating point division should
have been used.

But I assumed that was not the case, that it was just a case of dramatizing the
effect of low resolution.

If you really, actually have such values, then check out the division of
the 'clock' result.
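E.g., the classic mistake and the fix (a contrived sketch, not your
actual code):

#include <ctime>
#include <iostream>

int main()
{
    std::clock_t ticks = 12345;   // whatever the measurement produced

    // Integer division: truncates, and with CLOCKS_PER_SEC = 1000000
    // happily reports 0 seconds for anything under a second.
    std::cout << ticks / CLOCKS_PER_SEC << '\n';

    // Floating point division: what was presumably intended.
    std::cout << double(ticks) / CLOCKS_PER_SEC << '\n';
}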

(I could document the results from clock, but
for now I just document the ratio.) I use semicolons within a shell
to run each version 3 times in a row. I execute that command twice.
The second group starts up right on the heels of the first. So the
test is run 6 times total. I ignore the first 3 runs/times and do
those just to get the machine ready for the next 3.  Anyway, my
impression, and it seemed like Victor had a similar impression, is
that the output from clock isn't as precise as it could be.  The range
I got earlier from clock, 1.4 - 4.5, leaves quite a bit of room for
manipulation if that is a person's goal.

Uh, the person doing the timing is presumably /not/ your adversary, but yourself?

Anyways, if you have a range of 1.4 to 4.5 for the same code, tested in *nix
(indicated by your comment about semicolons), using 'clock' which in *nix-land
reports processor time, then Something Is Wrong.

Perhaps the integer division issue mentioned above?


Cheers & hth.,

- Alf
 

James Kanze

You have cooties. Ergo, std::clock() lacks sufficient
precision for profiling.

std::clock() has nothing to do with profiling; it's not a
profiler. It's a useful tool for comparing different
implementations of a function, once profiling has determined
which functions need attention.
FWIW, I use it, too, just for first-order approximations.

I don't really know of a better function for what it does, at
least when implemented correctly. It should give you the best
precision available on the machine.
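A quick way to see what you're actually getting (just a sketch): print
CLOCKS_PER_SEC and watch the step size between successive clock()
values.

#include <ctime>
#include <iostream>

int main()
{
    std::cout << "CLOCKS_PER_SEC = " << CLOCKS_PER_SEC << '\n';

    // Busy-loop until clock() changes a few times; the step between
    // successive values is the resolution you actually get.
    std::clock_t last = std::clock();
    for (int changes = 0; changes < 5; ) {
        std::clock_t now = std::clock();
        if (now != last) {
            std::cout << "clock() = " << now << '\n';
            last = now;
            ++changes;
        }
    }
}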
 

James Kanze

On Mar 5, 11:55 pm, "Alf P. Steinbach" <[email protected]> wrote:
[...]
I've done some performance testing on Windows and Linux
-- www.webEbenezer.net/comparison.html.  On Windows I use
clock and on Linux I use gettimeofday. From what I can
tell gettimeofday gives more accurate results than clock
on Linux. Depending on how this thread works out, I may
start using the function Victor mentioned on Windows.
On Unix based machines, clock() and gettimeofday() measure
different things. I use clock() when I want what clock()
measures, and gettimeofday() when I want what gettimeofday()
measures. For comparing algorithms to see which is more
effective, this means clock().
I've just retested the test that saves/sends a list<int> using
clock on Linux. The range of ratios from the Boost version to
my version was between 1.4 and 4.5. The thing about clock is
it returns values like 10,000, 20,000, 30,000, 50,000, 60,000,
etc.

This sounds like a defective (albeit legal) implementation.
Posix requires CLOCKS_PER_SEC to be 1000000 precisely so that
implementations can offer more precision if the system supports
it. Linux does. I'd file a bug report.

Of course, historically, a lot of systems had clocks generated
from the mains, which meant a CLOCKS_PER_SEC of 50 (in Europe)
or 60 (in North America). On such systems, better precision
simply wasn't available, and I've gotten into the habit of not
counting on values of benchmarks that run for less than about 5
minutes. So I would tend not to notice such anomalies as you
describe.
I would be more comfortable with it if I could get it to round
its results less. The range of results with gettimeofday for
the same test is not so wide -- between 2.0 and 2.8. I don't
run other programs while I'm testing besides a shell/vi and
firefox. I definitely don't start or stop any of those
between the tests, so I'm of the opinion that the elapsed time
results are meaningful.

The relative values are probably meaningful if the actual values
are large enough (a couple of minutes, at least) and they are
reproducible. The actual values, not really (but that's
generally not what you're interested in).

In my own tests, with clock(), under both Linux and Solaris, I
generally get differences from one run to the next of
considerably less than 10%. Which is about as accurate as
you're going to get, I think. Under Windows, I have to be more
careful about the surrounding environment, and even then, there
will be an outlier from time to time.
Except for the part about the functions being purely CPU, this
describes my approach/intent.

Again, it depends on what you are trying to measure. If you
want to capture disk transfer speed, for example, then clock()
is NOT the function you want (except under Windows).
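For that case, something like this (a POSIX-only sketch; writeTestFile
is a made-up stand-in for the I/O being measured):

#include <sys/time.h>   // gettimeofday() -- POSIX, not standard C++
#include <cstdio>
#include <iostream>

// Stand-in for the I/O being measured; replace with the real test.
void writeTestFile()
{
    std::FILE* f = std::fopen("/tmp/timing_test.bin", "wb");
    if (!f) return;
    char buffer[4096] = {};
    for (int i = 0; i < 1000; ++i)
        std::fwrite(buffer, 1, sizeof buffer, f);
    std::fclose(f);
}

int main()
{
    timeval start, stop;
    gettimeofday(&start, 0);
    writeTestFile();
    gettimeofday(&stop, 0);

    double elapsed = (stop.tv_sec - start.tv_sec)
                   + (stop.tv_usec - start.tv_usec) / 1000000.0;
    std::cout << "elapsed (wall clock) time: " << elapsed << " s\n";
}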
 

James Kanze

* (e-mail address removed): [...]
I was commenting on the "problem" with values like "10,000".
If you *really*, actually, have values like those you
exemplified, like "10,000", then that indicates with high
probability that somewhere in the testing code an integer
division is used where a floating point division should have
been used.

I've just done a few tests on my Linux machine here, and it does
seem to be an error in the library implementation under Linux.

For some reason, Posix requires that CLOCKS_PER_SEC be 1000000,
regardless of the actual accuracy available. So a machine whose
only timer source is the 50 Hz line frequency (I don't know of
any today, but that used to be a frequent case, many years ago)
will return the values 0, 20000, 40000, etc. (or 0, 10000,
20000, etc., if it triggers on each zero crossing). This sort
of defeats the purpose of CLOCKS_PER_SEC, as defined by the C
standard, but Posix does occasionally get confused. And of
course, on a machine which does support more precision (i.e. all
modern machines), you should get it, at least from a QoI point
of view.
But I assumed that was not the case, that it was just a case
of dramatizing the effect of low resolution.
If you really, actually have such values, then check out the
division of the 'clock' result.
Uh, the person doing the timing is presumably /not/ your
adversary, but yourself?
Anyways, if you have a range of 1.4 to 4.5 for the same code,
tested in *nix (indicated by your comment about semicolons),
using 'clock' which in *nix-land reports processor time, then
Something Is Wrong.

Yes. And that is probably true even if clock only has a
resolution of 10 ms. You don't bench a single run; you bench a
large number of runs, in a loop. For any significant
measurements, I would expect a total measured time of something
like 5 minutes, at least. Any decent benchmark harness should
be able to handle this sort of stuff.
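The basic idea, stripped down (a sketch only; doOneRun is a made-up
stand-in for whatever is being measured):

#include <ctime>
#include <iostream>

// Stand-in for one execution of the code under test.
void doOneRun()
{
    volatile long sink = 0;
    for (long i = 0; i < 1000000; ++i)
        sink += i;
}

int main()
{
    // Keep doubling the iteration count until the total measured time
    // dwarfs the resolution of clock().  (Use minutes, not one second,
    // for measurements you intend to publish.)
    double const minSeconds = 1.0;
    long iterations = 1;
    double seconds = 0.0;
    for (;;) {
        std::clock_t start = std::clock();
        for (long i = 0; i < iterations; ++i)
            doOneRun();
        seconds = double(std::clock() - start) / CLOCKS_PER_SEC;
        if (seconds >= minSeconds)
            break;
        iterations *= 2;
    }
    std::cout << iterations << " iterations in " << seconds << " s, "
              << seconds / iterations << " s per iteration\n";
}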
 

Alf P. Steinbach

* Jeff Schwab:
What's that got to do with clock frequency? (And why use the generator
frequency? Since we have triphase power, couldn't the grid be used to
generate 150 or 180 Hz signals?)

Hum, this is REALLY off-topic. But as I recall, in Windows the 'clock'
resolution has to do with ordinary Windows timer resolution which again, if I
recall this correctly, and I think I do, has to do with the wiring of the very
first IBM PC's timer chip, which as I recall had three timers on the chip, and
it was sort of 52 interrupts per second.

Let me check with gOOgle, just wait a moment...

Ah, not quite, it interrupted every 55 msec, that is about 18.2 times per
second. I remembered that about three channels correctly, though. :)

And doesn't seem to be connected to Windows timer resolution after all, dang!
But while in this really off-topic mode, that search found a useful article, <url:
www.microsoft.com/technet/sysinternals/information/HighResolutionTimers.mspx>.


How so? Even the slowest processors I've ever seen had clock speeds on
the order of KHz. If you run slowly enough, weird stuff can happen;
capacitors leak voltage, and stored values flip.

I'm too lazy to check the value of CLOCKS_PER_SEC with Windows compilers.


Cheers,

- Alf
 

coal

On Mar 6, 8:50 am, (e-mail address removed) wrote:
    [...]
I've just retested the test that saves/sends a list<int> using
clock on Linux.  The range of ratios from the Boost version to
my version was between 1.4 and 4.5.  The thing about clock is
it returns values like 10,000, 20,000, 30,000, 50,000, 60,000,
etc.

This sounds like a defective (albeit legal) implementation.
Posix requires CLOCKS_PER_SEC to be 1000000 precisely so that
implementations can offer more precision if the system supports
it.  Linux does.  I'd file a bug report.

Of course, historically, a lot of systems had clocks generated
from the mains, which meant a CLOCKS_PER_SEC of 50 (in Europe)
or 60 (in North America).  On such systems, better precision
simply wasn't available, and I've gotten into the habit of not
counting on values of benchmarks that run for less than about 5
minutes.  So I would tend not to notice such anomalies as you
describe.
I would be more comfortable with it if I could get it to round
its results less.  The range of results with gettimeofday for
the same test is not so wide -- between 2.0 and 2.8.  I don't
run other programs while I'm testing besides a shell/vi and
firefox.  I definitely don't start or stop any of those
between the tests, so I'm of the opinion that the elapsed time
results are meaningful.

The relative values are probably meaningful if the actual values
are large enough (a couple of minutes, at least) and they are
reproducible.  The actual values, not really (but that's
generally not what you're interested in).

My testing till now has been of tests that take less than a
second. I'll add a loop to the tests to make them last for
several minutes.
Again, it depends on what you are trying to measure.  If you
want to capture disk transfer speed, for example, then clock()
is NOT the function you want (except under Windows).

All of my tests measure time to marshal data to disk or from a
disk. I'll test the longer running versions on Linux with
gettimeofday.


Brian Wood
Ebenezer Enterprises
www.webEbenezer.net
 

coal

All of my tests measure time to marshal data to disk or from a
disk.  I'll test the longer running versions on Linux with
gettimeofday.

I added the following loop to the Linux versions of the tests
that send a list<int>:

for (int reps = 1; reps <= elements/100; ++reps) {
    // ... the existing save/send of the list<int> goes here ...
}


elements is read from the command line and controls how
many ints are added to the list. I tested with values of
200,000 and 300,000 for elements. I tested the Ebenezer
version and then the Boost Serialization version and
went back and forth like that. Here are the results in
seconds.

input     Boost Serialization    C++ Middleware Writer
------------------------------------------------------
200000            169                      84
200000            102                      81
200000            173                      82
200000            101                      82
200000            103                      82

300000            367                     187
300000            295                     187
300000            296                     189
300000            228                     188
300000            300                     188


The output files are a little over 1.6 billion bytes when the
input is 200,000, and a little over 3.6 billion bytes when the
input is 300,000.  I had to use O_LARGEFILE with open()
in the Ebenezer version to get the correct results
when the input was 300,000. All of the Ebenezer results
are with the version that uses O_LARGEFILE even though
it wasn't needed when the input was 200,000.
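Roughly what that change looks like (a simplified sketch, not the
actual Ebenezer code; "output.bin" is a made-up name, and compiling
with -D_FILE_OFFSET_BITS=64 is the other common way to get large-file
support):

#ifndef _GNU_SOURCE
#define _GNU_SOURCE 1     // exposes O_LARGEFILE in <fcntl.h> on glibc
#endif
#include <fcntl.h>
#include <unistd.h>
#include <cstdio>

int main()
{
    // Without large-file support, a 32-bit build fails with EFBIG once
    // the file grows past 2 GB.
    int fd = open("output.bin",
                  O_WRONLY | O_CREAT | O_TRUNC | O_LARGEFILE, 0644);
    if (fd == -1) {
        std::perror("open");
        return 1;
    }
    // ... write the marshalled data here ...
    close(fd);
    return 0;
}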

The ratios from these tests are lower than the 2.4 to
2.7 that are posted on the website.  I'm not convinced,
though, that the posted ratios are inaccurate.  Those
tests may reflect more typical usage than these.
Finally, I think it's worth noting that the Ebenezer
results were more stable than the Boost Serialization
results.


Brian Wood
Ebenezer Enterprises
www.webEbenezer.net
 

James Kanze

Of course!

Well, it seems "of course" to those of us who actually lived
it. :)
What's that got to do with clock frequency? (And why use the
generator frequency? Since we have triphase power, couldn't
the grid be used to generate 150 or 180 Hz signals?)

I don't know what the grid could have been used to generate, but
most computers then (and now) got their power from a standard
wall plug, which delivered single-phase 110 V, 60 Hz in North
America, and 220 V, 50 Hz in Europe.
How so? Even the slowest processors I've ever seen had clock
speeds on the order of KHz. If you run slowly enough, weird
stuff can happen; capacitors leak voltage, and stored values
flip.

Back then, quartz clocks were a luxury; the CPU "clock" was often
generated by an RC feedback to a Schmitt trigger, with a
precision of well under 10%.  Whereas the mains (at least
in France) was guaranteed by the electric company to have
4320000 +/- 0.5 oscillations in each 24 hour period.  (Also, a
lot of machines at the time could run in pure step-by-step mode,
with each clock pulse being triggered manually.  Dynamic RAM had
just been invented, and still wasn't too widespread, and
magnetic core memory doesn't flip at whim.)
 
