how to write thread-safe module ? and pytz

Discussion in 'Python' started by nicolas_riesch, Aug 10, 2005.

  1. Does anyone know if the module pytz
    (http://sourceforge.net/projects/pytz/) is thread-safe?
    I have not seen it explicitly stated, and just wanted to be sure, as I
    want to use it.

    That's because in the file pytz/tzinfo.py, I see the global variables
    _timedelta_cache, _datetime_cache, _ttinfo_cache, which are
    dictionaries used as caches.
    I always thought that you must protect shared data with locks when
    multithreading, but I don't see any lock anywhere in pytz.
    However, pytz seems to work well with multiple threads creating various
    timezone objects at the same time.
    I don't understand where the trick is, that allows multiple threads to
    access this module without any locking and that all this seems to work
    without any problem...

    Does this mean that there is a means to write a module that is
    thread-safe, without importing the threading module and creating locks
    ?

    Or have I not understood something ...?

    Can someone give me a hint ?
    nicolas_riesch, Aug 10, 2005
    #1

  2. nicolas_riesch wrote:
    > Does someone know if the module pytz
    > (http://sourceforge.net/projects/pytz/) is thread-safe ?


    On that, I don't know.

    > That's because in the file pytz/tzinfo.py, I see global variables
    > _timedelta_cache, _datetime_cache, _ttinfo_cache, which are
    > dictionaries and are used as cache.
    > I always thought that you must protect shared data with locks when
    > multithreading, but I don't see any lock anywhere in pytz.


    Definitely stick with what you always thought.

    > However, pytz seems to work well with multiple threads creating various
    > timezone objects at the same time.


    Buggy threading can seem to work well for a long time. It may be
    a billion-to-one shot that a thread switch happens at just the
    wrong time.

    > I don't understand where the trick is, that allows multiple threads to
    > access this module without any locking and that all this seems to work
    > without any problem...
    >
    > Does this mean that there is a means to write a module that is
    > thread-safe, without importing the threading module and creating locks
    > ?
    >
    > Or have I not understood something ...?
    >
    > Can someone give me a hint ?


    In the current Python implementation, more things are atomic
    than the language guarantees to be atomic. Programmers should
    not depend on that behavior. Again, I don't know anything about
    pytz, but we wouldn't bother with locks and semaphores and such
    if we could make the problems go away just by ignoring them.


    --
    --Bryan
    Bryan Olson, Aug 10, 2005
    #2

  3. nicolas_riesch wrote:
    > Does someone know if the module pytz
    > (http://sourceforge.net/projects/pytz/) is thread-safe ?
    > I have not seen it explicitly stated, and just wanted to be sure, as I
    > want to use it.
    >
    > That's because in the file pytz/tzinfo.py, I see global variables
    > _timedelta_cache, _datetime_cache, _ttinfo_cache, which are
    > dictionaries and are used as cache.
    > I always thought that you must protect shared data with locks when
    > multithreading, but I don't see any lock anywhere in pytz.


    Dictionaries (and probably most other Python types that are implemented
    in C) are inherently thread safe.

    This applies only to the individual methods of dictionaries. The
    following code would still require a lock:
        if mydict.has_key(keyval):
            variable = mydict[keyval]
    because a second thread could delete the entry between the calls to
    has_key and __getitem__.

        mydict[keyval] = mydict.get(keyval, 0) + 1
    is also a candidate for problems, because the read and the write are
    two separate operations.
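
    A minimal sketch of how the two snippets above could be made safe,
    written for current Python 3 (where has_key no longer exists). The
    names counts, counts_lock, increment, lookup and run_demo are
    hypothetical and chosen only for illustration. The lock covers the
    whole read-modify-write, and the test-then-index pair is collapsed
    into a single get() call.

```python
import threading

counts = {}                      # shared dictionary (hypothetical name)
counts_lock = threading.Lock()   # guards every read-modify-write on counts


def increment(key):
    # The get and the store are two separate dictionary operations,
    # so the lock has to cover both of them.
    with counts_lock:
        counts[key] = counts.get(key, 0) + 1


def lookup(key, default=None):
    # A single get() call replaces the membership test followed by
    # indexing, so no other thread can delete the key in between.
    return counts.get(key, default)


def run_demo(n_threads=8, n_increments=1000):
    # Start from an empty dictionary, then hammer one key from several threads.
    counts.clear()

    def worker():
        for _ in range(n_increments):
            increment("hits")

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counts["hits"]
```

    With the lock in place, run_demo() should always return
    n_threads * n_increments. Without it, updates can be lost.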

    > However, pytz seems to work well with multiple threads creating various
    > timezone objects at the same time.


    'Seems to work' is never a good argument with regard to threads.
    Especially if you're testing on a single CPU machine.

    Daniel
    Daniel Dittmar, Aug 10, 2005
    #3
  4. Daniel Dittmar wrote:

    > Dictionaries (and probably most other Python types that are implemented
    > in C) are inherently thread safe.


    That sounds like a dangerous assumption to me.

    Are you relying on the Global Interpreter Lock?
    Is it guaranteed?
    Does that safety transfer to Jython?
    How can I tell if any particular object is thread-safe?

    I don't know the answers to these questions, and I have the
    feeling that it is probably best to play safe and always use your
    own explicit locking.

    The Cog
    Cantankerous Old Git, Aug 11, 2005
    #4
  5. nicolas_riesch wrote:
    > Does someone know if the module pytz
    > (http://sourceforge.net/projects/pytz/) is thread-safe ?
    > I have not seen it explicitly stated, and just wanted to be sure, as I
    > want to use it.


    pytz is thread safe.

    > That's because in the file pytz/tzinfo.py, I see global variables
    > _timedelta_cache, _datetime_cache, _ttinfo_cache, which are
    > dictionaries and are used as cache.


    > I always thought that you must protect shared data with locks when
    > multithreading, but I don't see any lock anywhere in pytz.
    > However, pytz seems to work well with multiple threads creating various
    > timezone objects at the same time.
    > I don't understand where the trick is, that allows multiple threads to
    > access this module without any locking and that all this seems to work
    > without any problem...


    Thanks to the global interpreter lock, with the Python builtin types you
    only need to maintain a lock if there is a race condition and you care
    about its effects. For example, the following is thread-safe code:

    >>> from threading import Thread
    >>> import time
    >>> stack = []
    >>> stack2 = []
    >>> def doit(i):
    ...     stack.append(i)
    ...     time.sleep(0.1)
    ...     stack2.append(stack.pop())
    ...
    >>> threads = [Thread(target=doit, args=(i,)) for i in range(0,100)]
    >>> for t in threads: t.start()
    ...
    >>> for t in threads: t.join()
    ...
    >>> len(stack2)
    100
    >>> stack2
    [99, 95, 98, 94, 93, 97, 92, 91, 96, 88, 87, 86, 85, 84, 83, 90, 79, 78, 77,
    76, 74, 73, 72, 71, 70, 75, 82, 81, 80, 89, 69, 67, 66, 65, 64, 68, 60, 59,
    58, 57, 56, 55, 63, 62, 61, 49, 54, 53, 52, 51, 46, 45, 44, 50, 48, 47, 29,
    28, 35, 34, 33, 43, 42, 41, 40, 39, 38, 32, 37, 31, 30, 36, 27, 26, 25, 24,
    23, 22, 21, 20, 19, 18, 17, 12, 16, 15, 14, 13, 11, 10, 9, 8, 7, 6, 4, 3, 2,
    1, 0, 5]

    Note that the value being appended to 'stack2' might not be the value that
    was appended to 'stack' in any particular thread. In this case we don't
    care, but it is the sort of thing you might need to watch out for.

    In the code you mention in pytz, there *is* a race condition. However, if
    this condition occurs, the side effects are so trivial that locking is not
    worth the trouble. For example, if a number of threads call
    memorized_timedelta(seconds=60) simultaneously, there is a slight chance
    that each thread will get a different timedelta instance. This is
    extremely unlikely, and the rest of the code doesn't care at all. If pytz
    compared the timedeltas using 'is' instead of '==' at any point, it would
    be a bug (but it doesn't, so it isn't).
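
    A rough sketch of the kind of lock-free memoization described above.
    This is not the pytz source: the function name
    memorized_timedelta_sketch and its body are assumptions made for
    illustration, and only the cache name _timedelta_cache comes from
    this thread. It shows why the race is harmless as long as callers
    compare with '==' and never with 'is'.

```python
from datetime import timedelta

_timedelta_cache = {}   # module-level cache shared by all threads


def memorized_timedelta_sketch(seconds):
    # Hypothetical reconstruction for illustration, not the pytz code.
    try:
        return _timedelta_cache[seconds]
    except KeyError:
        # Two threads can both reach this point for the same key.
        # Each builds its own timedelta and the later store overwrites
        # the earlier one, so the threads may hold different instances.
        delta = timedelta(seconds=seconds)
        _timedelta_cache[seconds] = delta
        return delta


# Two separately built timedeltas stand in for what two racing
# threads could end up holding.
from_thread_a = timedelta(seconds=60)
from_thread_b = timedelta(seconds=60)
values_match = from_thread_a == from_thread_b      # equal by value
same_object = from_thread_a is from_thread_b       # distinct objects
```

    Here values_match is True while same_object is False, which is the
    whole extent of the damage the race can do in this sketch.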

    So you can write thread safe Python code without locks provided you are
    using the builtin types, and keep a close eye out for race conditions. This
    might sound error prone, but it is quite doable provided the critical areas
    that are accessing shared objects are kept isolated, short and simple.

    Here is a thread-unsafe example. The mistake is assuming that the length
    of stack will not change after checking it. Also, because we don't use the
    atomic stack.pop(), two threads might add the same value to stack2:

    >>> from threading import Thread
    >>> import time
    >>> stack = range(0, 50)
    >>> stack2 = []
    >>> def doit():
    ...     if len(stack) > 0:
    ...         stack2.append(stack[-1])
    ...         time.sleep(0.1)
    ...         del stack[-1]
    ...
    >>> threads = [Thread(target=doit) for i in range(0, 100)]
    >>> for t in threads: t.start()
    ...
    Exception in thread Thread-249:
    Traceback (most recent call last):
      File "/usr/lib/python2.4/threading.py", line 442, in __bootstrap
        self.run()
      File "/usr/lib/python2.4/threading.py", line 422, in run
        self.__target(*self.__args, **self.__kwargs)
      File "<stdin>", line 5, in doit
    IndexError: list assignment index out of range
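
    For comparison, a sketch of how the unsafe example above could be
    repaired without a lock, written for Python 3. It assumes CPython,
    where list.pop() and list.append() each complete as a single
    operation. The length check and the removal are merged into one
    pop(), and an empty stack is handled by catching IndexError instead
    of testing len() first.

```python
from threading import Thread
import time

stack = list(range(0, 50))   # list(...) so items can be removed in Python 3
stack2 = []


def doit():
    try:
        # Test-and-remove in one list operation: no other thread can
        # take the same element between a check and a delete.
        value = stack.pop()
    except IndexError:
        return               # stack was already empty, nothing to move
    time.sleep(0.01)
    stack2.append(value)


threads = [Thread(target=doit) for i in range(0, 100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

    After the threads finish, stack is empty and stack2 holds each of
    the 50 values exactly once, in an order that varies from run to run.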



    --
    Stuart Bishop
    http://www.stuartbishop.net/

    Stuart Bishop, Aug 14, 2005
    #5
  6. Thank you very much for all your explanations!
    Your pytz module is great!
    nicolas_riesch, Aug 15, 2005
    #6
