multithreading and shared dictionary

  • Thread starter Stéphane Ninin

Stéphane Ninin

Hello,

Probably a stupid question, but I am not a multithreading expert...

I want to share a dictionary between several threads.
Actually, I will wrap the dictionary in a class
and want to protect the "sensitive accesses" with locks.

The problem is I am not sure which concurrent access to the dictionary
could cause a problem.

I assume that two writes to the same location would be a problem,
but what if one thread does

mydict['a'] = something

and another thread:

mydict['b'] = something else

Is a lock required in such a case ?

Thanks for any comments on this.
 
placid

Stéphane Ninin said:
Hello,

Probably a stupid question, but I am not a multithreading expert...

I want to share a dictionary between several threads.
Actually, I will wrap the dictionary in a class
and want to protect the "sensitive accesses" with locks.

The problem is I am not sure which concurrent access to the dictionary
could cause a problem.

I assume that two writes to the same location would be a problem,
but what if one thread does

mydict['a'] = something

and another thread:

mydict['b'] = something else

Is a lock required in such a case ?

I don't think you need a lock in these cases, because mydict['a'] and
mydict['b'] refer to different memory locations (I may be wrong, because I
don't know the Python implementation).

Just a recommendation: you could also use a Queue object to control
write access to the dictionary.
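
A minimal sketch of the Queue idea, assuming a single dedicated writer thread owns the dictionary and every other thread only submits (key, value) requests. The names run_writer_demo, writer, producer and STOP are hypothetical, and the module is spelled queue in Python 3 (Queue in Python 2):

```python
import queue
import threading


def run_writer_demo():
    shared = {}               # only the writer thread ever modifies this dict
    requests = queue.Queue()  # thread-safe channel carrying (key, value) updates
    STOP = object()           # hypothetical sentinel telling the writer to exit

    def writer():
        # Apply updates one at a time, in the order they were queued.
        while True:
            item = requests.get()
            if item is STOP:
                break
            key, value = item
            shared[key] = value

    def producer(name, count):
        # Producers never touch the dict; they only enqueue requests.
        for i in range(count):
            requests.put((f"{name}-{i}", i))

    writer_thread = threading.Thread(target=writer)
    writer_thread.start()

    producers = [threading.Thread(target=producer, args=(n, 100)) for n in ("a", "b")]
    for t in producers:
        t.start()
    for t in producers:
        t.join()

    requests.put(STOP)    # queued after all updates, so it is processed last
    writer_thread.join()  # after this, reading the dict from here is safe
    return shared
```

This serializes writes without an explicit lock around the dictionary, but readers in other threads would still need their own synchronization while the writer is running.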

-Cheers
 
Istvan Albert

Stéphane Ninin said:
Is a lock required in such a case ?

I believe that assignment is atomic and would not need a lock.

Yet most of the other dictionary use cases are not thread-safe. For
example, I suspect that you'd get an error if you were iterating through
the dictionary while another thread inserted a value.
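
The iteration failure can be reproduced without any threads: in this sketch the loop body itself plays the role of the "other thread" that inserts a value. The function names are hypothetical. Iterating over a snapshot of the keys avoids the error; in a threaded program the snapshot would presumably be taken while holding the same lock the writers use.

```python
def iterate_while_inserting():
    d = {"a": 1, "b": 2}
    try:
        for key in d:
            d[key + "_copy"] = d[key]  # insert while the iteration is in progress
    except RuntimeError as exc:
        return str(exc)                # the dict detects the size change
    return None


def iterate_over_snapshot():
    d = {"a": 1, "b": 2}
    for key in list(d):                # list(d) copies the keys before looping
        d[key + "_copy"] = d[key]
    return d
```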

i.
 
Alex Martelli

Istvan Albert said:
I believe that assignment is atomic and would not need a lock.

Wrong, alas: each assignment *could* cause the dictionary's internal
structures to be reorganized (rehashed) and impact another assignment
(or even 'get'-access).
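
One rough way to see that plain assignments reorganize the dictionary is to watch its reported memory size change as keys are added. This is only a sketch for CPython: it assumes sys.getsizeof reflects the internal table size, and the exact thresholds are version-dependent. The function name count_resizes is hypothetical.

```python
import sys


def count_resizes(n_items=1000):
    d = {}
    sizes = [sys.getsizeof(d)]
    for i in range(n_items):
        d[i] = None                  # an ordinary assignment
        size = sys.getsizeof(d)
        if size != sizes[-1]:        # the internal table was reallocated
            sizes.append(size)
    return len(sizes) - 1            # number of observed size changes
```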


Alex
 
placid

Alex said:
Wrong, alas: each assignment *could* cause the dictionary's internal
structures to be reorganized (rehashed) and impact another assignment
(or even 'get'-access).


Oh yeah, you are correct. I forgot about possible rehashing after an
assignment, so that means you do need a lock to allow only one thread
at a time to access the dictionary. Again, I recommend using a Queue
for this. If you need help with using the Queue object, first see
http://www.google.com/notebook/public/14017391689116447001/BDUsxIgoQkcSO3bQh
or contact me.
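
For the lock-based approach, here is a minimal sketch of a dictionary wrapped in a class, as the original question describes. LockedDict and its method names are hypothetical; the only library pieces used are threading.Lock and threading.Thread.

```python
import threading


class LockedDict:
    """Hypothetical wrapper: every access to the inner dict holds one lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def set(self, key, value):
        with self._lock:
            self._data[key] = value

    def get(self, key, default=None):
        with self._lock:
            return self._data.get(key, default)

    def snapshot(self):
        # Return a copy so callers can iterate without holding the lock.
        with self._lock:
            return dict(self._data)


def demo():
    shared = LockedDict()
    t1 = threading.Thread(target=shared.set, args=("a", "something"))
    t2 = threading.Thread(target=shared.set, args=("b", "something else"))
    for t in (t1, t2):
        t.start()
    for t in (t1, t2):
        t.join()
    return shared.snapshot()
```

Only one thread at a time can be inside set, get or snapshot, so a rehash triggered by one assignment cannot overlap with another access made through the wrapper.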

-Cheers
 
K.S.Sreeram

Alex said:
Wrong, alas: each assignment *could* cause the dictionary's internal
structures to be reorganized (rehashed) and impact another assignment
(or even 'get'-access).

But won't the GIL be held when the rehash occurs?

Regards
Sreeram


 
Marc 'BlackJack' Rintsch

K.S.Sreeram said:
But won't the GIL be held when the rehash occurs?

If there is a GIL then maybe yes. But the GIL is an implementation
detail. It's not in Jython nor IronPython and maybe not forever in
CPython. Don't know about PyPy.

Ciao,
Marc 'BlackJack' Rintsch
 
K.S.Sreeram

Marc said:
If there is a GIL then maybe yes. But the GIL is an implementation
detail. It's not in Jython nor IronPython and maybe not forever in
CPython. Don't know about PyPy.

Just wondering: is this simply a case of defensive programming, or is it
an error (say, if we're targeting only CPython)?
For instance, what happens when there's a dictionary key with a custom
__hash__ or __eq__ method?

It's probably wise to simply use a lock and relieve ourselves of the
burden of thinking about these cases. But it is still worth knowing...
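
On the custom __hash__ / __eq__ question, a small sketch shows that a dictionary insert or lookup calls back into Python-level code supplied by the key. NoisyKey and trace_dict_access are hypothetical names. Because those methods run ordinary Python code in the middle of the dictionary operation, it seems unsafe to assume the operation as a whole is a single indivisible step, even under CPython.

```python
class NoisyKey:
    """Hypothetical key type that records when the dict calls its methods."""

    calls = []

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        NoisyKey.calls.append("hash")
        return hash(self.name)

    def __eq__(self, other):
        NoisyKey.calls.append("eq")
        return isinstance(other, NoisyKey) and self.name == other.name


def trace_dict_access():
    NoisyKey.calls.clear()
    d = {NoisyKey("a"): 1}     # inserting hashes the key
    value = d[NoisyKey("a")]   # looking up hashes the new key, then compares it
    return value, list(NoisyKey.calls)
```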

Regards
Sreeram


 
K.S.Sreeram

Alex said:
Wrong, alas: each assignment *could* cause the dictionary's internal
structures to be reorganized (rehashed) and impact another assignment
(or even 'get'-access).

(been thinking about this further...)

Dictionary get/set operations *must* be atomic, because Python makes
extensive internal use of dicts.

Consider two threads A and B, which are independent except for the fact
that they reside in the same module.

def thread_A():
    global foo
    foo = 1

def thread_B():
    global bar
    bar = 2

These threads create entries in the same module's dict, and they *might*
execute at the same time. Requiring a lock in this case is very
non-intuitive, and my conclusion is that dict get/set operations are
indeed atomic (thanks to the GIL).
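
The premise that both functions write into the same module dictionary can be checked with a short sketch. run_both is a hypothetical helper. It only shows where the assignments end up once both threads have finished; it does not by itself demonstrate that the two writes were atomic.

```python
import threading


def thread_A():
    global foo
    foo = 1


def thread_B():
    global bar
    bar = 2


def run_both():
    threads = [threading.Thread(target=thread_A),
               threading.Thread(target=thread_B)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    module_dict = globals()  # the dict that backs this module's global names
    return module_dict["foo"], module_dict["bar"]
```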

Regards
Sreeram


 
Istvan Albert

Marc said:
It's not in Jython nor IronPython and maybe not forever in
CPython.

Whether or not a feature is present in Jython or IronPython does not
seem relevant, after all these languages emulate Python, one could
argue that it only means that this emulation is incomplete. Same for
changes way out in the future that may or may not materialize.

i.
 
Alex Martelli

K.S.Sreeram said:
Consider two threads A and B, which are independent except for the fact
that they reside in the same module.

def thread_A():
    global foo
    foo = 1

def thread_B():
    global bar
    bar = 2

These threads create entries in the same module's dict, and they *might*
execute at the same time. Requiring a lock in this case is very
non-intuitive, and my conclusion is that dict get/set operations are
indeed atomic (thanks to the GIL).

Well then, feel free to code under such assumptions (as long as you're
not working on any project in which I have any say:) -- depending on
subtle (and entirely version-dependent) considerations connected to
string interning, dict size, etc, you _might_ never run into failing
cases (as long as, say, every key in play is a short internable string
[never an instance of a _subtype_ of str of course], every value an int
small enough to be kept immortally in the global small-ints cache, etc,
etc)... why, dicts that forever remain below N entries for sufficiently
small (and version-dependent) N may in fact never be resized at all.

Most of us prefer to write code that will keep working a bit more
robustly (e.g. when the Python interpreter is upgraded from 2.5 to 2.6,
which might change some of these internal implementation details), and
relying on subtle reasoning about what might be "very non-intuitive" is
definitely counterproductive; alas, testing is not a great way to
discover "race conditions", deadlocks, etc, so threading-related errors
must be avoided ``beforehand'', by taking a very conservative stance
towards what operations might or might not happen to be
atomic/threadsafe unless specifically guaranteed by the Language
Reference (or the specific bits of the Library Reference).
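
As an illustration of the conservative stance, here is a sketch of a shared counter kept in a dictionary. The update is a read-modify-write made of several steps (look up, add, store), so it is guarded by a lock instead of relying on any one step being atomic. The function name and parameters are hypothetical.

```python
import threading


def count_with_lock(n_threads=4, n_increments=10000):
    counts = {"hits": 0}
    lock = threading.Lock()

    def worker():
        for _ in range(n_increments):
            # Hold the lock across the whole read-modify-write sequence.
            with lock:
                counts["hits"] = counts["hits"] + 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counts["hits"]
```

With the lock in place the final count does not depend on how the interpreter happens to schedule the threads, which is the property that testing alone cannot establish.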


Alex
 
Alex Martelli

Istvan Albert said:
Whether or not a feature is present in Jython or IronPython does not
seem relevant, after all these languages emulate Python, one could

Not at all, but rather: these _implementations_ are implementations of
the language Python; no "emulation" is involved at all.

Istvan Albert said:
argue that it only means that this emulation is incomplete. Same for
changes way out in the future that may or may not materialize.

What Python's semantics are is supposedly defined by the (normative)
Language Reference. "One could argue" anything one wishes, of course,
but the existence of freedom of speech does not necessarily make such
arguments sensible nor at all useful.


Alex
 
K.S.Sreeram

Alex said:
Well then, feel free to code under such assumptions (as long as you're
not working on any project in which I have any say:)

Hey, I would *never* write code which depends on such intricate
implementation details! Nonetheless, it's good to *know* what's going on
inside. As they say: Knowledge is Power!

Regards
Sreeram


 
Alex Martelli

K.S.Sreeram said:
Hey, I would *never* write code which depends on such intricate
implementation details! Nonetheless, it's good to *know* what's going on
inside. As they say: Knowledge is Power!

Since "all abstractions leak" (Spolsky), it's indeed worthwhile knowing
what goes on "below" the abstraction (to be wary in advance about such
possible "leaks") -- but studying the sources (and the rich notes that
accompany them) is my favorite approach towards such knowledge!-)


Alex
 
Aahz

Istvan Albert said:
Whether or not a feature is present in Jython or IronPython does not
seem relevant, after all these languages emulate Python, one could
argue that it only means that this emulation is incomplete. Same for
changes way out in the future that may or may not materialize.

The GIL is *NOT* part of the Python language spec; it is considered a
CPython implementation detail. Other implementations are free to use
other mechanisms -- and they do.
 
