How to measure elapsed time under Windows?

Grant Edwards · Feb 9, 2010

What's the correct way to measure small periods of elapsed
time. I've always used time.clock() in the past:

start = time.clock()
[stuff being timed]
stop = time.clock()

delta = stop-start

However on multi-processor machines that doesn't work.
Sometimes I get negative values for delta. According to
google, this is due to a bug in Windows that causes the value
of time.clock() to be different depending on which core in a
multi-core CPU you happen to be on. [insert appropriate
MS-bashing here]

Is there another way to measure small periods of elapsed time
(say in the 1-10ms range)?

Is there a way to lock the python process to a single core so
that time.clock() works right?

Jean-Michel Pichavant · Feb 9, 2010

Grant said:
What's the correct way to measure small periods of elapsed
time. I've always used time.clock() in the past:

start = time.clock()
[stuff being timed]
stop = time.clock()

delta = stop-start

However on multi-processor machines that doesn't work.
Sometimes I get negative values for delta. According to
google, this is due to a bug in Windows that causes the value
of time.clock() to be different depending on which core in a
multi-core CPU you happen to be on. [insert appropriate
MS-bashing here]

Is there another way to measure small periods of elapsed time
(say in the 1-10ms range)?

Is there a way to lock the python process to a single core so
that time.clock() works right?

Did you try with the datetime module ?

import datetime
t0 = datetime.datetime.now()
t1 = t0 - datetime.datetime.now()
t1.microseconds
Out[4]: 644114

JM

Grant Edwards · Feb 9, 2010

Grant said:
Grant said:

What's the correct way to measure small periods of elapsed
time. I've always used time.clock() in the past:

start = time.clock()
[stuff being timed]
stop = time.clock()

delta = stop-start

However on multi-processor machines that doesn't work.
Sometimes I get negative values for delta. According to
google, this is due to a bug in Windows that causes the value
of time.clock() to be different depending on which core in a
multi-core CPU you happen to be on. [insert appropriate
MS-bashing here]

Is there another way to measure small periods of elapsed time
(say in the 1-10ms range)?

Is there a way to lock the python process to a single core so
that time.clock() works right?

Click to expand...

Did you try with the datetime module ?

No. What mechanism does it use to get the current date/time?

import datetime
t0 = datetime.datetime.now()
t1 = t0 - datetime.datetime.now()
t1.microseconds
Out[4]: 644114

That looks very broken to me. I need to measure stuff in the
1-20ms range, and the smallest value you can measure using the
method above appears to be 640ms. Thats almost 2/3 of a second.

Grant Edwards · Feb 9, 2010

Did you try with the datetime module ?

Click to expand...

No. What mechanism does it use to get the current date/time?

import datetime
t0 = datetime.datetime.now()
t1 = t0 - datetime.datetime.now()
t1.microseconds
Out[4]: 644114

Click to expand...

That looks very broken to me. I need to measure stuff in the
1-20ms range, and the smallest value you can measure using the
method above appears to be 640ms. Thats almost 2/3 of a second.

Duh. It just occurred to me that was done interactively. I'll
give the datetime module a try once I reboot my test machine
back into Windows.

Grant Edwards · Feb 9, 2010

Grant said:
Grant said:

What's the correct way to measure small periods of elapsed
time. I've always used time.clock() in the past:

start = time.clock()
[stuff being timed]
stop = time.clock()

delta = stop-start

However on multi-processor machines that doesn't work.
Sometimes I get negative values for delta. According to
google, this is due to a bug in Windows that causes the value
of time.clock() to be different depending on which core in a
multi-core CPU you happen to be on. [insert appropriate
MS-bashing here]

Is there another way to measure small periods of elapsed time
(say in the 1-10ms range)?

Is there a way to lock the python process to a single core so
that time.clock() works right?

Click to expand...

Did you try with the datetime module ?

import datetime
t0 = datetime.datetime.now()
t1 = t0 - datetime.datetime.now()
t1.microseconds
Out[4]: 644114

Doesn't work. datetime.datetime.now has granularity of
15-16ms.

Intervals much less that that often come back with a delta of
0. A delay of 20ms produces a delta of either 15-16ms or
31-32ms

Grant Edwards · Feb 9, 2010

Grant said:
Grant said:

What's the correct way to measure small periods of elapsed
time. I've always used time.clock() in the past:

start = time.clock()
[stuff being timed]
stop = time.clock()

delta = stop-start

However on multi-processor machines that doesn't work.
Sometimes I get negative values for delta. According to
google, this is due to a bug in Windows that causes the value
of time.clock() to be different depending on which core in a
multi-core CPU you happen to be on. [insert appropriate
MS-bashing here]

Is there another way to measure small periods of elapsed time
(say in the 1-10ms range)?

Is there a way to lock the python process to a single core so
that time.clock() works right?

Click to expand...

Click to expand...

Did you try with the datetime module ?

import datetime
t0 = datetime.datetime.now()
t1 = t0 - datetime.datetime.now()
t1.microseconds
Out[4]: 644114

Click to expand...

Doesn't work. datetime.datetime.now has granularity of
15-16ms.

time.time() exhibits the same behavior, so I assume that
datetime.datetime.new() ends up making the same libc/system
call as time.time(). From what I can grok of the datetime
module source code, it looks like it's calling gettimeofday().

I can't find any real documentation on the granularity of Win32
gettimeofday() other than a blog post that claims it is 10ms
(which doesn't agree with what my tests show).

Gabriel Genellina · Feb 9, 2010

En Tue, 09 Feb 2010 13:10:56 -0300, Grant Edwards

What's the correct way to measure small periods of elapsed
time. I've always used time.clock() in the past:

However on multi-processor machines that doesn't work.
Sometimes I get negative values for delta. According to
google, this is due to a bug in Windows that causes the value
of time.clock() to be different depending on which core in a
multi-core CPU you happen to be on. [insert appropriate
MS-bashing here]

I'm not sure you can blame MS of this issue; anyway, this patch should fix
the problem:
http://support.microsoft.com/?id=896256

Is there another way to measure small periods of elapsed time
(say in the 1-10ms range)?

No that I know of. QueryPerformanceCounter (the function used by
time.clock) seems to be the best timer available.

Is there a way to lock the python process to a single core so
that time.clock() works right?

Interactively, from the Task Manager:
http://www.microsoft.com/resources/...ll/proddocs/en-us/taskman_assign_process.mspx

In code, using SetProcessAffinityMask and related functions:
http://msdn.microsoft.com/en-us/library/ms686223(VS.85).aspx

Paul McGuire · Feb 10, 2010

Is there another way to measure small periods of elapsed time
(say in the 1-10ms range)?

I made repeated calls to time.clock() in a generator expression, which
is as fast a loop I can think of in Python. Then I computed the
successive time deltas to see if any granularities jumped out. Here
are the results:

import time
from itertools import groupby

# get about 1000 different values of time.clock()
ts = set(time.clock() for i in range(1000))

# sort in ascending order
ts = sorted(ts)

# compute diffs between adjacent time values
diffs = [j-i for i,j in zip(ts[:-1],ts[1:])]

# sort and group
diffs.sort()
diffgroups = groupby(diffs)

# print the distribution of time differences in microseconds
for i in diffgroups: print "%3d %12.6f" % (len(list(i[1])), i[0]*1e6)

Click to expand...

Click to expand...

....
25 2.234921
28 2.234921
242 2.514286
506 2.514286
45 2.793651
116 2.793651
1 3.073016
8 3.073016
6 3.352381
4 3.631746
3 3.911112
1 3.911112
5 4.190477
2 4.469842
1 6.146033
1 8.660319
1 9.777779
1 10.895239
1 11.174605
1 24.304765
1 41.904767

There seems to be a step size of about .28 microseconds. So I would
guess time.clock() has enough resolution. But also beware of the
overhead of the calls to clock() - using timeit, I find that each call
takes about 2 microseconds (consistent with the smallest time
difference in the above data set).

-- Paul

Dennis Lee Bieber · Feb 10, 2010

Doesn't work. datetime.datetime.now has granularity of
15-16ms.

Intervals much less that that often come back with a delta of
0. A delay of 20ms produces a delta of either 15-16ms or
31-32ms

WinXP uses an ~15ms time quantum for task switching. Which defines
the step rate of the wall clock output...

http://www.eggheadcafe.com/software/aspnet/35546579/the-quantum-was-not-182.aspx
http://www.eggheadcafe.com/software/aspnet/32823760/how-do-you-set-timeslici.aspx

http://www.lochan.org/2005/keith-cl/useful/win32time.html

Paul McGuire · Feb 10, 2010

WinXP uses an ~15ms time quantum for task switching. Which defines
the step rate of the wall clock output...

http://www.eggheadcafe.com/software...om/software/aspnet/32823760/how-do-you-set-ti...

http://www.lochan.org/2005/keith-cl/useful/win32time.html

Gabriel Genellina reports that time.clock() uses Windows'
QueryPerformanceCounter() API, which has much higher resolution than
the task switcher's 15ms. QueryPerformanceCounter's resolution is
hardware-dependent; using the Win API, and a little test program, I
get this value on my machine:
Frequency is 3579545 ticks/sec
Resolution is 0.279365114840015 microsecond/tick

-- Paul

Grant Edwards · Feb 10, 2010

En Tue, 09 Feb 2010 13:10:56 -0300, Grant Edwards

What's the correct way to measure small periods of elapsed
time. I've always used time.clock() in the past:

However on multi-processor machines that doesn't work.
Sometimes I get negative values for delta. According to
google, this is due to a bug in Windows that causes the value
of time.clock() to be different depending on which core in a
multi-core CPU you happen to be on. [insert appropriate
MS-bashing here]

Click to expand...

I'm not sure you can blame MS of this issue; anyway, this
patch should fix the problem:
http://support.microsoft.com/?id=896256

I'm curious why it wouldn't be Microsoft's fault, because

A) Everything is Microsoft's fault.

B) If a patch to MS Windows fixes the problem, how is it not a
problem in MS Windows?

Interactively, from the Task Manager:
http://www.microsoft.com/resources/...ll/proddocs/en-us/taskman_assign_process.mspx

Thanks. That looks a bit easier than disabling the second core
(which is what I ended up doing).

In code, using SetProcessAffinityMask and related functions:
http://msdn.microsoft.com/en-us/library/ms686223(VS.85).aspx

With help from google and some old mailing list posting I might
even try that.

Gabriel Genellina · Feb 10, 2010

En Wed, 10 Feb 2010 13:15:22 -0300, Grant Edwards

En Tue, 09 Feb 2010 13:10:56 -0300, Grant Edwards

What's the correct way to measure small periods of elapsed
time. I've always used time.clock() in the past:
However on multi-processor machines that doesn't work.
Sometimes I get negative values for delta. According to
google, this is due to a bug in Windows that causes the value
of time.clock() to be different depending on which core in a
multi-core CPU you happen to be on. [insert appropriate
MS-bashing here]

Click to expand...

I'm not sure you can blame MS of this issue; anyway, this
patch should fix the problem:
http://support.microsoft.com/?id=896256

Click to expand...

I'm curious why it wouldn't be Microsoft's fault, because

A) Everything is Microsoft's fault.

B) If a patch to MS Windows fixes the problem, how is it not a
problem in MS Windows?

I won't argue against A) because its truthness (?) is self-evident

99% of my code does not run in Python 3.x; I may fix it and it will
eventually run fine, but that doesn't mean it's *my* fault.

The original problem was with the RDTSC instruction on multicore CPUs;
different cores may yield different results because they're not
synchronized at all times.

Windows XP was launched in 2001, and the first dual core processors able
to execute Windows were AMD Opteron and IBM Pentium D, both launched
around April 2005 (and targeting the server market, not the home/desktop
market of Windows XP).
How could MS know in 2001 of a hardware issue that would happen four years
in the future?
Guido seems very jealous of his time machine and does not lend it to
anyone.

Grant Edwards · Feb 10, 2010

In code, using SetProcessAffinityMask and related functions:
http://msdn.microsoft.com/en-us/library/ms686223(VS.85).aspx

That solves the problem. If I don't set the process affinity
mask, I regularly get measurements that are off by 6ms. I
presume 6ms is the skew between the two cores' performance
counter values. I don't know if that difference is stable or
how it varies, but now it doesn't matter.

Grant Edwards · Feb 22, 2010

In Forth we add a small machine code routine that executes the
RDTSC instruction. (I used that to play music on a couple of
mechanical instruments in real time.) It just counts the (3
Ghz) clock cycles in a 64 bit timer.

Click to expand...

That's what clock.clock() does, except that it converts it into
a floating point value in seconds.

Subtract two samples and you're done.

Click to expand...

Nope. It would fail the same way that clock.clock() does on a
multi-core Windows machine.

Is there a mechanism in Python to do something similar,
embedded assembler or something?

Click to expand...

You'd get the same results as using clock.clock(). Just
different format/units.

(This is not a general solution, but at least it would work on
Windows, that is i86 only.)

Click to expand...

It fails on Windows for the same reason that clock.clock()
fails: the counters read by the RDTSC instruction are not
synchronized between the different cores.

Albert van der Horst · Feb 22, 2010

com/

Gabriel Genellina reports that time.clock() uses Windows'
QueryPerformanceCounter() API, which has much higher resolution than
the task switcher's 15ms. QueryPerformanceCounter's resolution is
hardware-dependent; using the Win API, and a little test program, I
get this value on my machine:
Frequency is 3579545 ticks/sec
Resolution is 0.279365114840015 microsecond/tick

In Forth we add a small machine code routine that executes the
RDTSC instruction. (I used that to play music on a couple of
mechanical instruments in real time.)
It just counts the (3 Ghz) clock cycles in a 64 bit timer.
Subtract two samples and you're done.

Is there a mechanism in Python to do something similar,
embedded assembler or something?

(This is not a general solution, but at least it would work on
Windows, that is i86 only.)

Gabriel Genellina · Feb 24, 2010

En Sat said:
Not true. The synchronization issue has two causes: initial
synchronization at boot time, and power management making microscopic
adjustments to the clock rate. In Windows NT 4, Microsoft took extra
pains
to adjust the cycle counters on multiprocessor computers during boot so
that the processors started out very close together. Once set, they
naturally remained in lock step, until aggressive power management
because
more prevalent. In XP, they stopped trying to align at boot time.

Thanks for the detailed explanation!

logging module -- better timestamp accuracy on Windows	1	Feb 15, 2011
Determining actual elapsed (wall-clock) time	13	Jul 2, 2005
FAQ 8.17 How can I measure time under a second?	0	Mar 26, 2011
Finding cpu time spent on my program	3	Feb 5, 2007
How to use multiple instances of the same COM object at the sametime	0	Mar 22, 2013
Anomaly in time.clock()	12	Mar 17, 2008
elapsed time 0 with std::cin	12	Nov 23, 2007
How to debug a problem with python crashing under windows	1	Feb 4, 2011

How to measure elapsed time under Windows?

Grant Edwards

Jean-Michel Pichavant

Grant Edwards

Grant Edwards

Grant Edwards

Grant Edwards

Gabriel Genellina

Paul McGuire

Dennis Lee Bieber

Paul McGuire

Grant Edwards

Gabriel Genellina

Grant Edwards

Grant Edwards

Albert van der Horst

Gabriel Genellina

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads