Producer/consumer Queue "trick"

Discussion in 'Python' started by Evan Simpson, Jan 14, 2005.

  1. Evan Simpson

    Evan Simpson Guest

    WEBoggle needs a new game board every three minutes. Boards take an
    unpredictable (much less than 3min, but non-trivial) amount of time to
    generate. The system is driven by web requests, and I don't want the
    request that happens to trigger the need for the new board to have to
    pay the time cost of generating it.

    I set up a producer thread that does nothing but generate boards and put
    them into a length-two Queue (blocking). At the rate that boards are
    pulled from the Queue, it's almost always full, but my poor consumer
    thread was still being blocked for "a long time" each time it fetched a
    board.

    At this point I realized that q.get() on a full Queue immediately wakes
    up the producer, which has been blocked waiting to add a board to the
    Queue. It sets about generating the next board, and the consumer
    doesn't get to run again until the producer blocks again or is preempted.

    The solution was simple: have the producer time.sleep(0.001) when
    q.put(board) returns.

    Cheers,

    Evan @ 4-am
     
    Evan Simpson, Jan 14, 2005
    #1
    1. Advertising

  2. Evan Simpson

    Jon Perez Guest

    I don't get it.

    If the consumer and the producer are separate threads,
    why does the consumer thread block when the producer
    thread is generating a new board? Or why does it
    take forever for the producer thread to be pre-empted?

    Also, I don't understand why the solution works.
    How does sleeping for .001 seconds right after putting
    a new board on the queue solve the problem?

    Evan Simpson wrote:
    > WEBoggle needs a new game board every three minutes. Boards take an
    > unpredictable (much less than 3min, but non-trivial) amount of time to
    > generate. The system is driven by web requests, and I don't want the
    > request that happens to trigger the need for the new board to have to
    > pay the time cost of generating it.
    >
    > I set up a producer thread that does nothing but generate boards and put
    > them into a length-two Queue (blocking). At the rate that boards are
    > pulled from the Queue, it's almost always full, but my poor consumer
    > thread was still being blocked for "a long time" each time it fetched a
    > board.
    >
    > At this point I realized that q.get() on a full Queue immediately wakes
    > up the producer, which has been blocked waiting to add a board to the
    > Queue. It sets about generating the next board, and the consumer
    > doesn't get to run again until the producer blocks again or is preempted.
    >
    > The solution was simple: have the producer time.sleep(0.001) when
    > q.put(board) returns.
     
    Jon Perez, Jan 15, 2005
    #2
    1. Advertising

  3. Evan Simpson

    Paul Rubin Guest

    Evan Simpson <> writes:
    > wakes up the producer, which has been blocked waiting to add a board
    > to the Queue. It sets about generating the next board, and the
    > consumer doesn't get to run again until the producer blocks again or
    > is preempted.


    That's weird. Preemption should happen every few dozen milliseconds
    unless you've purposely increased the preemption delay.
     
    Paul Rubin, Jan 15, 2005
    #3
  4. On Fri, 14 Jan 2005 16:26:02 -0600, Evan Simpson wrote:

    > WEBoggle needs a new game board every three minutes. Boards take an
    > unpredictable (much less than 3min, but non-trivial) amount of time to
    > generate.


    I gotta ask, why?

    Looking over your information about "how to play", my only guess is that
    you're generating all possible words that may exist in the board, at the
    time of board generation.

    But it also looks like you score once at the end (as you have to anyhow in
    order to cancel out words found by multiple people, according to the rules
    of Boggle).

    If both of these statements are true, the generation of all possible words
    is a waste of time. For each word given by a human, looking it up in the
    dict and verifying it is on the board at score time should be a lot
    faster, and not need any pre-processing.

    If you're generating stats about what percentage of possible words were
    found, that can also be done during game play without loss by handing
    outh the board, and *then* finding all words. You still have a threading
    problem, but now instead of dealing with human response times, you've got
    a three minute deadline which ought to be enough.

    (The other thing I can think of is that you are trying to verify that the
    board contains some minimal number of words, in which case I submit that
    boards with only 20-ish words is just part of the game :) I've never sat
    down and really studied the Boggle dice, but I've always expected/hoped
    that there is at least one or two dice with all vowels; even so the odds
    of no vowels are small and easily algorithmically discarded. )

    To be clear, I'm mostly curious what's going on, but it is possible that
    the fundamental problem may be algorithmic, and since I don't know what's
    going on it's worth checking.

    Also, entirely separate plug, you may be interested in my XBLinJS project:
    http://www.jerf.org/resources/xblinjs . (I'm expecting to release 0.2
    either today or tomorrow, which will have vastly more documentation and
    more online examples.) It'd help pull out relevant code so that it is
    easily re-usable in any future gaming projects by others.
     
    Jeremy Bowers, Jan 15, 2005
    #4
  5. Evan Simpson

    Nick Coghlan Guest

    Paul Rubin wrote:
    > Evan Simpson <> writes:
    >
    >>wakes up the producer, which has been blocked waiting to add a board
    >>to the Queue. It sets about generating the next board, and the
    >>consumer doesn't get to run again until the producer blocks again or
    >>is preempted.

    >
    >
    > That's weird. Preemption should happen every few dozen milliseconds
    > unless you've purposely increased the preemption delay.


    To me, it smells like a call into a C extension which isn't releasing the GIL
    before starting a time-consuming operation. After getting bitten by this a
    couple of times, I now make sure to release the GIL as part of my SWIG wrapper
    (since the code I'm wrapping knows nothing of Python, and sure as heck doesn't
    need to be holding the GIL!).

    Cheers,
    Nick.

    --
    Nick Coghlan | | Brisbane, Australia
    ---------------------------------------------------------------
    http://boredomandlaziness.skystorm.net
     
    Nick Coghlan, Jan 15, 2005
    #5
  6. Evan Simpson

    Evan Simpson Guest

    I should clarify up front that I may have given an overblown sense of
    how long the producer thread typically takes to generate a board; It's
    usually a few tenths of a second, up to a few seconds for especially
    fecund boards.

    My concern was that even a few seconds is long enough for fifty requests
    to get piled up, and I was experiencing mysterious breakdowns where
    Apache was suddenly totally clogged up and taking *minutes* to respond.

    Jeremy Bowers wrote:
    > Looking over your information about "how to play", my only guess is that
    > you're generating all possible words that may exist in the board, at the
    > time of board generation.


    Yep. I do this in order to minimize the cost of the most common request
    that WEBoggle handles, which is checking a submitted word to see whether
    it is a valid word on the board, a valid word not on the board, or an
    invalid word. With the current code, this averages 1ms.

    > But it also looks like you score once at the end (as you have to anyhow in
    > order to cancel out words found by multiple people, according to the rules
    > of Boggle).


    WEBoggle is a little different than regular Boggle, in that your score
    is the plain sum of the scores for all of the words that you found, with
    no cancellation. I'm planning to add a "vs." feature eventually that
    will involve cancellation, but even then I'll retain the immediate
    feedback upon guessing a word.

    In addition, many players enjoy seeing the list of "Words not found by
    anyone".

    > (The other thing I can think of is that you are trying to verify that the
    > board contains some minimal number of words, in which case I submit that
    > boards with only 20-ish words is just part of the game :) I've never sat
    > down and really studied the Boggle dice, but I've always expected/hoped
    > that there is at least one or two dice with all vowels; even so the odds
    > of no vowels are small and easily algorithmically discarded. )


    Believe it or not, before I added code to filter them out, I was
    generating enough boards with *zero* valid words on them to get complaints.

    > Also, entirely separate plug, you may be interested in my XBLinJS project


    Very nifty! Well beyond my current needs, but good to know about.

    Cheers,

    Evan @ 4-am
     
    Evan Simpson, Jan 17, 2005
    #6
  7. Evan Simpson

    Evan Simpson Guest

    Jon Perez wrote:
    > If the consumer and the producer are separate threads,
    > why does the consumer thread block when the producer
    > thread is generating a new board? Or why does it
    > take forever for the producer thread to be pre-empted?
    >
    > Also, I don't understand why the solution works.
    > How does sleeping for .001 seconds right after putting
    > a new board on the queue solve the problem?


    I'm guessing at how things work, and may be totally wrong, but here's
    what I think happens: In the Queue get() code, the consumer releases
    the 'fsema' lock. Directly or indirectly, this wakes up and hands
    control to the producer thread, which was blocked trying to acquire
    'fsema'. The sleep() hands control back to the scheduler immediately,
    which appears to wake up the consumer and let it get on with things.

    It doesn't take "forever" for the producer to be preempted, just the
    normal preemption interval. I was bothered, though, by the idea of
    having it take even a few dozen milliseconds out of the middle of a
    request that's at most a millisecond or two away from finishing anyway.

    Cheers,

    Evan @ 4-am
     
    Evan Simpson, Jan 17, 2005
    #7
  8. Evan Simpson

    Evan Simpson Guest

    Nick Coghlan wrote:
    > Paul Rubin wrote:
    >
    >> That's weird. Preemption should happen every few dozen milliseconds
    >> unless you've purposely increased the preemption delay.

    >
    >
    > To me, it smells like a call into a C extension which isn't releasing
    > the GIL before starting a time-consuming operation.


    Sorry, I think I misled you both -- "a long time" is actually the few
    dozen milliseconds of a normal preemption, which is only long relative
    to the single millisecond that the consumer typically takes to handle a
    request.

    Cheers,

    Evan @ 4-am
     
    Evan Simpson, Jan 17, 2005
    #8
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Mark McKay
    Replies:
    0
    Views:
    444
    Mark McKay
    Dec 9, 2003
  2. Buck Turgidson

    Simple Producer/Consumer Thread Question

    Buck Turgidson, Feb 17, 2004, in forum: Java
    Replies:
    5
    Views:
    537
    Tony Dahlman
    Feb 21, 2004
  3. Usenet Poster!!!
    Replies:
    4
    Views:
    1,827
    Eric Sosman
    Sep 30, 2004
  4. Jeff
    Replies:
    4
    Views:
    675
    xarax
    Oct 22, 2004
  5. Kris
    Replies:
    0
    Views:
    486
Loading...

Share This Page