Premature wakeup of time.sleep()

Erich Schreiber · Sep 12, 2005

In the Python Library Reference, the explanation of the time.sleep()
function reads, among other things:

The actual suspension time may be less than that requested because
any caught signal will terminate the sleep() following execution
of that signal's catching routine. Also, the suspension time may
be longer than requested by an arbitrary amount because of the
scheduling of other activity in the system.

I don't understand the first part of this passage with the premature
wakeup. What signals would that be?

I've written a script that tries to bench the responsiveness of a
virtual Linux server. My script sleeps for a random time and on waking
up calculates the difference of the proposed and actual wakeup time.

The essential code fragment is

while True:
    ts = tDelay()
    t1 = time.time()
    time.sleep(ts)
    t2 = time.time()
    twu = str(datetime.datetime.utcfromtimestamp(t1 + ts))
    logResults(LOGFILE, twu, ts, int((t2-t1-ts)*1000))

Here tDelay() returns a (logarithmically) randomly distributed real
number in the range [0.01, 1200], which causes the process to sleep
for between 10 ms and 20 minutes.
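
For readers who want to reproduce the measurement, here is a minimal sketch of the delay helper and of one measurement step. It assumes that "logarithmically randomly distributed" means log-uniform (uniform in log space). The names t_delay and measure_once are hypothetical and do not come from the original script.

```python
import math
import random
import time

def t_delay(lo=0.01, hi=1200.0):
    """Log-uniform delay in seconds (assumption: uniform in log space)."""
    return math.exp(random.uniform(math.log(lo), math.log(hi)))

def measure_once(ts):
    """Sleep for ts seconds and return the wake-up error in milliseconds."""
    t1 = time.time()          # wall-clock time before the sleep
    time.sleep(ts)
    t2 = time.time()          # wall-clock time after the sleep
    return (t2 - t1 - ts) * 1000.0
```

Calling measure_once(t_delay()) in a loop and logging the result reproduces the shape of the loop above. Because time.time() reads the wall clock, any adjustment of that clock during the sleep also shows up in the result.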

In the logs I see about 1% of the wake-up delays being negative,
from -1 ms to about -20 ms, somewhat correlated with the duration of the
sleep: 20-minute sleeps tend to wake up earlier than sub-second
sleeps. Can somebody explain this to me?

Regards,

Erich Schreiber

Steve Horsley · Sep 12, 2005

Erich said:
In the Python Library Reference, the explanation of the time.sleep()
function reads, among other things:

I don't understand the first part of this passage with the premature
wakeup. What signals would that be?

That would be signals from the OS, for instance if you type ^C
into the Python console, or kill the process from another command
line.

I've written a script that tries to bench the responsiveness of a
virtual Linux server. My script sleeps for a random time and on waking
up calculates the difference of the proposed and actual wakeup time.

The essential code fragment is

while True:
    ts = tDelay()
    t1 = time.time()
    time.sleep(ts)
    t2 = time.time()
    twu = str(datetime.datetime.utcfromtimestamp(t1 + ts))
    logResults(LOGFILE, twu, ts, int((t2-t1-ts)*1000))

Here tDelay() returns a (logarithmically) randomly distributed real
number in the range [0.01, 1200], which causes the process to sleep
for between 10 ms and 20 minutes.

In the logs I see about 1% of the wake-up delays being negative,
from -1 ms to about -20 ms, somewhat correlated with the duration of the
sleep: 20-minute sleeps tend to wake up earlier than sub-second
sleeps. Can somebody explain this to me?

I think the sleep times are quantised to the granularity of the
system clock, which varies from OS to OS. From memory, Windows 95
has a 55 ms timer, NT is less (19 ms?), Linux and Solaris 1 ms. All
this is from memory, and really comes from discussions I have
seen about sleep and time in Java, but I am guessing that there
are similarities.

Steve

Peter Hansen · Sep 13, 2005

Steve said:
I think the sleep times are quantised to the granularity of the system
clock, which varies from OS to OS. From memory, Windows 95 has a 55 ms
timer, NT is less (19 ms?), Linux and Solaris 1 ms. All this is from [...]

For the record, the correct value for NT/XP family is about 15.6 ms
(possibly exactly 64 per second and thus 15.625ms, but offhand I don't
recall, though I'm sure Google does).

-Peter

Nick Craig-Wood · Sep 13, 2005

Erich Schreiber said:
In the Python Library Reference, the explanation of the time.sleep()
function reads, among other things:

I don't understand the first part of this passage with the premature
wakeup. What signals would that be?

That would be if someone sent your process a signal. Say you pressed
CTRL-C: that generates the INT signal, which Python translates to the
KeyboardInterrupt exception, and which does interrupt the sleep()
system call.

This probably isn't happening to you though.
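
A minimal sketch of such an interruption, assuming a Unix system (SIGALRM and signal.setitimer are not available on Windows). The names Woken, on_alarm and interrupted_sleep are hypothetical. Note that since Python 3.5 time.sleep() resumes sleeping after a handler returns normally, so the sleep ends early only when the handler raises an exception, as the default SIGINT handler does with KeyboardInterrupt.

```python
import signal
import time

class Woken(Exception):
    """Raised by the handler so that the sleep is abandoned (hypothetical name)."""

def on_alarm(signum, frame):
    # Raising here is what cuts the sleep short on Python 3.5 and later.
    raise Woken()

def interrupted_sleep(requested=1.0, alarm_after=0.05):
    """Return how long time.sleep(requested) actually lasted, in seconds."""
    old_handler = signal.signal(signal.SIGALRM, on_alarm)
    signal.setitimer(signal.ITIMER_REAL, alarm_after)   # deliver SIGALRM soon
    start = time.monotonic()
    try:
        time.sleep(requested)
    except Woken:
        pass                                            # woken prematurely
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)         # cancel any pending timer
        signal.signal(signal.SIGALRM, old_handler)      # restore previous handler
    return time.monotonic() - start
```

With the defaults, the returned duration should be close to alarm_after rather than to the requested second.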

In the logs I see about 1% of the wake-up delays being negative,
from -1 ms to about -20 ms, somewhat correlated with the duration of the
sleep: 20-minute sleeps tend to wake up earlier than sub-second
sleeps. Can somebody explain this to me?

Sleep under Linux has a granularity of the timer interrupt rate (known
as HZ in Linux-speak; the ticks themselves are counted in jiffies).

Typically this is 100 Hz, which is a granularity of 10 ms. So I'd
expect your sleeps to be no more accurate than +/- 10 ms. +/- 20 ms
doesn't seem unreasonable either.

(Linux 2.4 was fond of 100 Hz. It's more configurable in 2.6, so it could
be 1000 Hz. It's likely to be 100 Hz or less in a virtual private server.)

I'm not sure of the best way of finding out your HZ; here is one (enter
it all on one line):

start=`grep timer /proc/interrupts | awk '{print $2}'`; sleep 1;
end=`grep timer /proc/interrupts | awk '{print $2}'`; echo $(($end-$start))

Which prints a number about 1000 on my 2.6 machine.
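
The effect of that granularity can also be observed from Python itself. The sketch below requests many very short sleeps and records by how much each one overshoots. The function name sleep_overshoot_ms is hypothetical. It is an illustration only: on kernels with high-resolution timers the overshoot may be far smaller than one HZ tick.

```python
import time

def sleep_overshoot_ms(requested=0.001, samples=20):
    """Return the sorted overshoots (in ms) of time.sleep(requested)."""
    results = []
    for _ in range(samples):
        start = time.perf_counter()            # high-resolution monotonic clock
        time.sleep(requested)
        elapsed = time.perf_counter() - start
        results.append((elapsed - requested) * 1000.0)
    return sorted(results)
```

If the overshoots cluster just below a fixed value (for example 10 ms), that value is a plausible estimate of the effective timer granularity on the machine.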

Bengt Richter · Sep 13, 2005

For the record, the correct value for NT/XP family is about 15.6 ms
(possibly exactly 64 per second and thus 15.625ms, but offhand I don't
recall, though I'm sure Google does).

What "correct value" are you referring to? The clock chip interrupts,
clock chip resolution, or OS scheduling/dispatching?

For NT4 at least, I don't know of anything near 15.6 ms ;-)

Speaking generally (based on what I've done myself implementing
time-triggered wakeups of suspended tasks in a custom kernel eons ago),
a sleep-a-while function is not as desirable as a sleep-until function,
because the first is relative and needs to get current time and
add the delta and then use the sleep-until functionality. So if there
is a timing glitch during the sleep-a-while call, there will be a glitch
in the wake-up time.
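
A sleep-until function can be sketched in Python on top of time.sleep(). This is only an illustration of the idea, not how any particular kernel does it. The name sleep_until is hypothetical, and the deadline is assumed to be expressed on the time.monotonic() clock.

```python
import time

def sleep_until(deadline):
    """Sleep until time.monotonic() reaches deadline; return the lateness in seconds."""
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break                  # deadline reached, or already past on entry
        time.sleep(remaining)      # loop again in case the sleep ended early
    return time.monotonic() - deadline
```

Because the loop re-checks the clock after every sleep, a premature wake-up only costs another short sleep, and the returned lateness is never negative. A deadline that is already in the past returns at once.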

Once you have a wakeup time, you can put your sleeper in a time-ordered queue,
at which time you might or might not check if wake-up time is already past,
which could happen. So you can put the task in a ready queue for the OS to
run again when it gets around to it, or since the sleeper is the caller, you
have the option of a hot return as if the sleep call hadn't happened.

Assuming you queue the task, when is the queue going to be checked to see
if it's time to wake anyone up? Certainly when the next OS time slice is
up. That might be ~55ms or 10ms (NT4) or 1ms (modern OS & processor). At that
point (roughly on the strike of e.g. 10ms, depending on how many bad high-priority
interrupt service routines there are that glitch it by suspending interrupts
too long or queueing too much super-priority deferred processing), we look at
the wake-up queue. And a number of tasks might get wakened. Some will have had
wake-up times that were really just microseconds after they were queued, but it
has taken 10ms (or whatever) for the OS to finish another task's time slice and
check, so the effect is to wake up on the OS basic slicing time. Waking up is
only being queued for resumption however, so there might be more delay, depending
on relative priorities etc. Some OS's might look at the time queue at every opportunity,
meaning any time there is any interrupt. The check can be fast, since it is just
a comparison of current time vs the first wakeup time (with the right hardware,
this check can be done in the hardware, and it can interrupt when a specific
wakeup time passes. (You'd think CPUs would have this built in by now)).
But in any case, waking up is only being put in the ready-to-resume
queue, not necessarily being run immediately, so you have variability in the timing.
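
The tick-driven wake-up queue described above can be modelled in a few lines. This is a toy simulation under simplifying assumptions (a fixed tick, no priorities, no interrupt latency). The name simulate_wakeups is hypothetical.

```python
import heapq

def simulate_wakeups(wake_times_ms, tick_ms=10):
    """Return (requested, actual) wake-up times when the queue is checked only on ticks."""
    queue = list(wake_times_ms)
    heapq.heapify(queue)                     # time-ordered queue of sleepers
    woken = []
    now = 0
    while queue:
        now += tick_ms                       # the next timer tick arrives
        while queue and queue[0] <= now:     # compare "now" with the earliest wake-up time
            requested = heapq.heappop(queue)
            woken.append((requested, now))   # readied on the tick edge
    return woken
```

For example, simulate_wakeups([3, 10, 12, 27]) returns [(3, 10), (10, 10), (12, 20), (27, 30)]: every sleeper is readied on a tick boundary, up to one tick after the time it asked for.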

Note that when you wake up on the OS time-slicing interval edge, you are synchronized
with that, and if a number of tasks get readied at the "same" time, they will run
in succession. Sometimes a pattern of succession will be stable for a long time
due to the order of calls and the structure of queues etc., so that there may be
stable but different relative delays observed in the multiple tasks.

Add in adaptive priority tuning for foreground/background tasks, etc., and you
can see why sleep results could be less predictable as you move away from a
dedicated-machine context for your program.

Unless I am mistaken, the clock chip commonly used since the early PC days derives
from IBM's decision to buy a commodity crystal-based oscillator from some CRT/TV
context and divide it down rather than spend more to spec something with a nice
10**something Hz frequency that most programmers would likely have preferred.

For a PC, the Linux kernel sources will have a constant "CLOCK_TICK_RATE"; e.g.,
see
http://lxr.linux.no/source/include/asm-i386/timex.h

where it has

#ifdef CONFIG_X86_ELAN
# define CLOCK_TICK_RATE 1189200 /* AMD Elan has different frequency! */
#else
# define CLOCK_TICK_RATE 1193182 /* Underlying HZ */
#endif

The second define is typical for i386 but was apparently revised
from the rounder number 1193180 at some point. IIRC the number in my old
Compaq (16 MHz 386 back then ;-) BIOS spec technical docs ended in zero ;-)

Anyway, if you run that into a 16-bit counter and trigger an interrupt every
time the counter turns over, you get the old familiar 18.206512 Hz (1193182/2**16)
or 54.9254 ms tick of DOS and early Windows. Or you could trigger at some other count
for other size ticks.
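
The arithmetic can be checked directly. The sketch below uses the CLOCK_TICK_RATE value from the header excerpt. The helper names tick_for_count and count_for_rate are hypothetical, and the rounding in count_for_rate is an illustrative choice rather than what any particular kernel does.

```python
CLOCK_TICK_RATE = 1193182          # Hz, from the timex.h excerpt

def tick_for_count(count):
    """Return (interrupt rate in Hz, tick length in ms) for a given counter reload value."""
    rate_hz = CLOCK_TICK_RATE / count
    return rate_hz, 1000.0 / rate_hz

def count_for_rate(target_hz):
    """Nearest integer counter value that gives roughly target_hz interrupts per second."""
    return round(CLOCK_TICK_RATE / target_hz)

rate_hz, tick_ms = tick_for_count(2 ** 16)
print(f"{rate_hz:.6f} Hz, {tick_ms:.4f} ms")
```

This prints "18.206512 Hz, 54.9254 ms" for the full 16-bit rollover. For a 100 Hz tick, count_for_rate(100) gives 11932, which yields a rate slightly below 100 Hz because the oscillator frequency is not a multiple of 100.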

BTW, the CMOS clock that keeps time by battery power while your PC power is off
is a separate thing, which has a resolution of 1 second. So even though the OS
may keep time with the timer chip, the time it starts counting from at boot
falls on a whole second, and could be a second off even if the CMOS clock was
dead accurate, unless the OS goes to the net for a more accurate time. Usually
drift swamps the second resolution in short order though (a problem for
networked parallel makes BTW?)

Enough rambling ...

Regards,
Bengt Richter

Peter Hansen · Sep 13, 2005

Bengt said:
What "correct value" are you referring to? The clock chip interrupts,
clock chip resolution, or OS scheduling/dispatching?

The resolution of the call to time.time(), basically.

For NT4 at least, I don't know of anything near 15.6 ms ;-)

Apparently my belief that XP was the same as NT is incorrect then. The
value is definitely correct for XP (on my machine anyway ;-) ).
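
The resolution of time.time() on a given machine can be estimated by spinning until the returned value changes. This is a rough sketch, and the name observed_step is hypothetical. For comparison, 64 ticks per second would correspond to 1000/64 = 15.625 ms.

```python
import time

def observed_step(clock=time.time, samples=5):
    """Smallest positive difference seen between consecutive readings of clock."""
    steps = []
    for _ in range(samples):
        t0 = clock()
        t1 = clock()
        while t1 <= t0:            # spin until the clock value advances
            t1 = clock()
        steps.append(t1 - t0)
    return min(steps)
```

The standard library also reports a nominal figure through time.get_clock_info("time").resolution, which can be compared with the observed step.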

-Peter
