Queue qsize = unreliable?

  • Thread starter James R. Saker Jr.

James R. Saker Jr.

I see per pydoc that Queue.Queue()'s .qsize is allegedly unreliable:

| qsize(self)
| Return the approximate size of the queue (not reliable!).

Any thoughts on why this is unreliable (and more curiously, why it would
be put in there as an unreliable function?) Rather than roll my own
threaded fifo class, it would seem prudent to use Python's built-in
Queue but the warning signs on a rather necessary function seem curious.

Jamie
 
Michael Hudson

James R. Saker Jr. said:
I see per pydoc that Queue.Queue()'s .qsize is allegedly unreliable:

| qsize(self)
| Return the approximate size of the queue (not reliable!).

Any thoughts on why this is unreliable (and more curiously, why it would
be put in there as an unreliable function?) Rather than roll my own
threaded fifo class, it would seem prudent to use Python's built-in
Queue but the warning signs on a rather necessary function seem curious.

Well, by the time you examine (or even get!) the answer, it might be
wrong. Kinda hard to avoid this in the setting of multiple threads!
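
A minimal sketch of that effect, written for Python 3, where the module is named queue rather than Queue. The variable names are only illustrative, and the join() is there to make the interleaving deterministic; in a real program the other thread could run at any point after the snapshot is taken.

```python
import queue      # named Queue in Python 2, as used in this thread
import threading

q = queue.Queue()
q.put("a")

snapshot = q.qsize()   # correct at the instant it was read

# Another thread changes the queue before we act on the snapshot.
producer = threading.Thread(target=q.put, args=("b",))
producer.start()
producer.join()

# The earlier answer no longer matches the queue's current size.
stale = snapshot != q.qsize()
```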

Cheers,
mwh
 
Peter Hansen

James said:
I see per pydoc that Queue.Queue()'s .qsize is allegedly unreliable:

| qsize(self)
| Return the approximate size of the queue (not reliable!).

Any thoughts on why this is unreliable (and more curiously, why it would
be put in there as an unreliable function?) Rather than roll my own
threaded fifo class, it would seem prudent to use Python's built-in
Queue but the warning signs on a rather necessary function seem curious.

(Why do you think this function is necessary? It's probably
rare to really need it, except perhaps during debugging... )

Anyway, the reason it's called "unreliable", though the term
"inaccurate" might be more correct, is because while you are
getting the size of the queue, it might be updated such that
the new size is smaller or larger, by one or more items, than the
value that is about to be returned to you. In effect, the value is
guaranteed accurate only for the precise instant in time, now
passed, that it was determined, but by the time the calling
routine actually sees the value the size could be anything.
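
One common way to live with this, sketched here under Python 3 naming (queue instead of Queue), is to skip the size check and simply attempt the operation, handling queue.Empty if nothing is there. The helper names racy_take and take_if_available are hypothetical, and returning None for "nothing available" assumes None is never a real item on the queue.

```python
import queue  # named Queue in Python 2, as used in this thread


def racy_take(q):
    # Check-then-act: another consumer may empty the queue between
    # the qsize() call and get_nowait(), which then raises queue.Empty.
    if q.qsize() > 0:
        return q.get_nowait()
    return None


def take_if_available(q):
    # Attempt the get and handle failure; no stale size is consulted.
    try:
        return q.get_nowait()
    except queue.Empty:
        return None
```

With a single thread both helpers behave the same; the difference only shows up when several consumers share the queue.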

Note also the latest docs at docs.python.org, which state the
case a little more clearly.

"""Return the approximate size of the queue. Because of
multithreading semantics, this number is not reliable. """

-Peter
 
Grant Edwards

Peter Hansen said:
(Why do you think this function is necessary? It's probably
rare to really need it, except perhaps during debugging... )

Anyway, the reason it's called "unreliable", though the term
"inaccurate" might be more correct, is because while you are
getting the size of the queue, it might be updated such that
the new size is smaller or larger, by one or more items, than the
value that is about to be returned to you.

I don't think that's any reason to call the function either
unreliable or inaccurate. If you're operating in a
multi-threaded environment, such a statement is trivially true
about anything that accesses shared data.

For example: time.time() needs a disclaimer that it is
unreliable, since the result it returns is incorrect by the
time you get around to using it...
 
Jacob Hallen

Peter Hansen said:
(Why do you think this function is necessary? It's probably
rare to really need it, except perhaps during debugging... )

Anyway, the reason it's called "unreliable", though the term
"inaccurate" might be more correct, is because while you are
getting the size of the queue, it might be updated such that
the new size is smaller or larger, by one or more items, than the
value that is about to be returned to you. In effect, the value is
guaranteed accurate only for the precise instant in time, now
passed, that it was determined, but by the time the calling
routine actually sees the value the size could be anything.

Note also the latest docs at docs.python.org, which state the
case a little more clearly.

"""Return the approximate size of the queue. Because of
multithreading semantics, this number is not reliable. """

And an inaccurate number is still quite usable. If you are
the only one reading from a queue, you know that the number
is at least as large as indicated. If you read from multiple
queues, you can use the indicated number to pick which queue
to handle first.

This allows you to do polling and fair scheduling and some other
neat tricks.

The same sort of tricks work for writers as well.
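
A rough sketch of the two reader-side tricks, assuming Python 3 naming (queue instead of Queue) and assuming the calling thread is the only consumer of these queues. The helper names pick_fullest and drain_snapshot are made up for illustration.

```python
import queue  # named Queue in Python 2, as used in this thread


def pick_fullest(queues):
    # Use the approximate sizes to decide which queue to service first.
    # The sizes may be out of date, but they are good enough for scheduling.
    return max(queues, key=lambda q: q.qsize())


def drain_snapshot(q):
    # Single-consumer assumption: nobody else removes items, so at least
    # qsize() items are present and these get() calls will not block.
    n = q.qsize()
    return [q.get() for _ in range(n)]
```

If a second consumer were added, drain_snapshot could block inside get(), so the lower-bound reasoning only holds for a sole reader.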

Jacob Hallén
