removeall() in list

C

castironpi

Any ideas for a thread-safe list.removeall( X ): removing all
occurrences of X within list L, when L might be modified concurrently?

Sincerely,
Aaron
 
P

Paul Rubin

Any ideas for a thread-safe list.removeall( X ): removing all
occurrences of X within list L, when L might be modified concurrently?

That way lies madness. Do something sensible instead. Put a lock
around the list, or put all mutators for the list into a single
thread, or whatever. Don't do what you're describing.
 
C

castironpi

That way lies madness. Do something sensible instead. Put a lock
around the list, or put all mutators for the list into a single
thread, or whatever. Don't do what you're describing.

This function just wants X out of the list. It doesn't matter if this
happens before, during, or after something else; so long as it happens.
 
P

Paul Rubin

This function just wants X out of the list. It doesn't matter if this
happens before, during, or after something else; so long as it happens.

If you're asking in a general way how to do something like that,
there are several basic methods:

1. Put a single thread in charge of the list, and communicate with it
by message passing through Queues. To get X out of the list, you'd
send the mutator thread a message asking for removal. The mutator
thread would loop reading and processing messages from the queue,
blocking when no requests are pending. This is sort of the preferred
Python style and is pretty simple to get correct, but if there are
many such objects you can end up with more threads than you really
want.

2. Associate a lock with the list. Anything wanting to access the
list should acquire the lock, do its stuff, then release the lock.
This gets confusing after a while.

3. Various other schemes involving finer grained locks etc. that
get horrendously error prone (race conditions etc).

There is probably a good tutorial somewhere about programming with
threads. It's sometimes a tricky subject, so it's worth taking
the time to study various approaches rather than re-inventing the wheeel.
 
C

castironpi

2. Associate a lock with the list. Anything wanting to access the
list should acquire the lock, do its stuff, then release the lock.
This gets confusing after a while.

To keep it generic, how about:

listA.op( insert, x )
listA.op( remove, x )
 
C

castironpi

To keep it generic, how about:

listA.op( insert, x )
listA.op( remove, x )

However, in reality, your rock and hard place are:
listA.op( listA.insert, x )
listA.op( listA.remove, x )

or

listA.op( 'insert', x )
listA.op( 'remove', x )
 
P

Paul Rubin

To keep it generic, how about:

listA.op( insert, x )
listA.op( remove, x )

Sure, there are various ways you can make the code look uniform. What
gets messy is if you want to (say) operate on several lists at the
same time, which means you need to hold multiple locks simultaneously,
and some other thread is also trying to do the same thing. If you
acquire the locks in the wrong order, you can get a situation where
both threads deadlock forever.
 
C

castironpi

Sure, there are various ways you can make the code look uniform. What
gets messy is if you want to (say) operate on several lists at the
same time, which means you need to hold multiple locks simultaneously,
and some other thread is also trying to do the same thing. If you
acquire the locks in the wrong order, you can get a situation where
both threads deadlock forever. And:
However, in reality, your rock and hard place are:
listA.op( listA.insert, x )
listA.op( listA.remove, x )

or

listA.op( 'insert', x )
listA.op( 'remove', x )

For a standard library, you may not want to be exposing the potential
for deadlock-- not that users can't do that themselves.

lockerA.op( listA.extend, [ x ] )
lockerB.op( listB.reverse )
def thA():
gui.listbox.append( lockerA.op( listA.pop ) )
 
C

castironpi

Sure, there are various ways you can make the code look uniform. What
gets messy is if you want to (say) operate on several lists at the
same time, which means you need to hold multiple locks simultaneously,
and some other thread is also trying to do the same thing. If you
acquire the locks in the wrong order, you can get a situation where
both threads deadlock forever. And:
However, in reality, your rock and hard place are:
listA.op( listA.insert, x )
listA.op( listA.remove, x )

listA.op( 'insert', x )
listA.op( 'remove', x )

For a standard library, you may not want to be exposing the potential
for deadlock-- not that users can't do that themselves.

lockerA.op( listA.extend, [ x ] )
lockerB.op( listB.reverse )
def thA():
gui.listbox.append( lockerA.op( listA.pop ) )

Could you:

lockerA= Locker( listA, listB )
lockerA.op( listB.reverse )
lockerA.op( listA.pop )

Where lockerA ops acquire the locks on all its threads?
 
P

Paul Rubin

Could you:

lockerA= Locker( listA, listB )
lockerA.op( listB.reverse )
lockerA.op( listA.pop )

Where lockerA ops acquire the locks on all its threads?

I don't understand that question. The main thing to understand is
that multi-threaded programming is complicated (especially if you're
after high performance), and very difficult to get right without
knowing exactly what you're doing. The generally preferred approach
in Python is to keep things simple at some performance cost.

Your Locker approach above looks sort of reasonable if you can be
absolutely sure that nothing else can mess with listA or listB
concurrently with those locker operations. Normally you would put
listA and listB into a single object along with a lock, then do
operations on that object.

You might check the Python Cookbook for some specific recipes and
sample code for this stuff. If you've used Java, Python's general
threading mechanisms are similar, but they are in the library rather
than built into the language (i.e. there is no "synchronized"
keyword, you have to do that locking explicitly).

What is the actual application, if you don't mind saying? Are you
sure that you really need concurrency?
 
C

castironpi

I don't understand that question. The main thing to understand is
that multi-threaded programming is complicated (especially if you're
after high performance), and very difficult to get right without
knowing exactly what you're doing. The generally preferred approach
in Python is to keep things simple at some performance cost.

Your Locker approach above looks sort of reasonable if you can be
absolutely sure that nothing else can mess with listA or listB
concurrently with those locker operations. Normally you would put
listA and listB into a single object along with a lock, then do
operations on that object.

You might check the Python Cookbook for some specific recipes and
sample code for this stuff. If you've used Java, Python's general
threading mechanisms are similar, but they are in the library rather
than built into the language (i.e. there is no "synchronized"
keyword, you have to do that locking explicitly).

What is the actual application, if you don't mind saying? Are you
sure that you really need concurrency?

I'm writing an NxN observer pattern, mostly for my own personal
exploration. Two threads -might- be calling 'Disconnect' at the same
time, and I can't even guarantee that the function runs properly.

for emelem in [ e for e in emlist if e.func is func ]:
try:
emlist.remove( emelem )
except ValueError:
pass
 
C

castironpi

I don't understand that question. The main thing to understand is
that multi-threaded programming is complicated (especially if you're
after high performance), and very difficult to get right without
knowing exactly what you're doing. The generally preferred approach
in Python is to keep things simple at some performance cost.
Your Locker approach above looks sort of reasonable if you can be
absolutely sure that nothing else can mess with listA or listB
concurrently with those locker operations. Normally you would put
listA and listB into a single object along with a lock, then do
operations on that object.
You might check the Python Cookbook for some specific recipes and
sample code for this stuff. If you've used Java, Python's general
threading mechanisms are similar, but they are in the library rather
than built into the language (i.e. there is no "synchronized"
keyword, you have to do that locking explicitly).
What is the actual application, if you don't mind saying? Are you
sure that you really need concurrency?

I'm writing an NxN observer pattern, mostly for my own personal
exploration. Two threads -might- be calling 'Disconnect' at the same
time, and I can't even guarantee that the function runs properly.

for emelem in [ e for e in emlist if e.func is func ]:
try:
emlist.remove( emelem )
except ValueError:
pass

Though:

class A:
@synch( 'lockA' )
def Connect( self, target ): ...
@synch( 'lockA' )
def Disconnect( self, target ): ...

where decorator 'synch' is defined as:

take the 'lockA' attribute of the first function parameter, call
acquire(), call the function, call release(); has the right idea.
Another is on __init__, go through and wrap every member function in
self.lockA.acquire() and .release().
 
P

Paul Rubin

I think the Pythonic way to do this is have the targets communicate
with the observers through queues rather than with callbacks.
 
F

Fredrik Lundh

I'm writing an NxN observer pattern, mostly for my own personal
exploration. Two threads -might- be calling 'Disconnect' at the same
time, and I can't even guarantee that the function runs properly.

for emelem in [ e for e in emlist if e.func is func ]:
try:
emlist.remove( emelem )
except ValueError:
pass

so use a lock. it's a whopping two lines of code:

creation:

lock = threading.Lock()

usage:

with lock:
for emelem in ...
...

more here:

http://effbot.org/zone/thread-synchronization.htm

and btw, looping over a list to figure out what you want to remove from
that list is a bit pointless. better just create a new list:

with lock:
# get rid of all func instances
emlist = [e for e in emlist if e.func is not func]

an alternative approach would be to replace emlist with a dictionary,
keyed on func objects. that'll let you remove all items associated with
a given function with a single atomic operation:

del emdict[func]

</F>
 
C

castironpi

I'm writing an NxN observer pattern, mostly for my own personal
exploration. Two threads -might- be calling 'Disconnect' at the same
time, and I can't even guarantee that the function runs properly.
for emelem in [ e for e in emlist if e.func is func ]:
try:
emlist.remove( emelem )
except ValueError:
pass

so use a lock. it's a whopping two lines of code:

creation:

lock = threading.Lock()

usage:

with lock:
for emelem in ...
...

more here:

http://effbot.org/zone/thread-synchronization.htm

and btw, looping over a list to figure out what you want to remove from
that list is a bit pointless. better just create a new list:

with lock:
# get rid of all func instances
emlist = [e for e in emlist if e.func is not func]

an alternative approach would be to replace emlist with a dictionary,
keyed on func objects. that'll let you remove all items associated with
a given function with a single atomic operation:

del emdict[func]

</F>

-> so use a lock. it's a whopping two lines of code:
Yes.
1) I'm wondering what the specifics are on [Rubin]:
list should acquire the lock, do its stuff, then release the lock.
This gets confusing after a while.<<

I considered these suggestions early on. They apply often but I ruled
them out for reasons:

2) List is referenced by others; concurrent modifications may be going
on; can not replace it. Can I make asynchronous modifications and
merge the changes, SCM-style?
3) Dictionary returns non-static order; order is important. Create a
int-func tuple and sort dictionary results? Perhaps. That's sounding
pretty good.
 
C

castironpi

(e-mail address removed) writes:

1. Put a single thread in charge of the list, and communicate with it
by message passing through Queues. To get X out of the list, you'd
send the mutator thread a message asking for removal. The mutator
thread would loop reading and processing messages from the queue,
blocking when no requests are pending. This is sort of the preferred
Python style and is pretty simple to get correct, but if there are
many such objects you can end up with more threads than you really
want.

I've heard this called 'fire and forget'. You can insure that
mutations are honored one-at-a-time and in the order received. How
do you make a -read- operation; wait for queued mutations, that is
lock for a turn on the queue? Can you optionally read whatever the
state is, regardless of what's happened in the meantime? Thing is,
one thread needs its -own- preceding operations completed before a
reading operation.
 
C

castironpi

I've heard this called 'fire and forget'. You can insure that
mutations are honored one-at-a-time and in the order received. How
do you make a -read- operation; wait for queued mutations, that is
lock for a turn on the queue? Can you optionally read whatever the
state is, regardless of what's happened in the meantime? Thing is,
one thread needs its -own- preceding operations completed before a
reading operation.

Brainstorm.

First, isolation of problem: Terminates at 2000 or so, on my
computer.

import thread
import time
import random
counter= 0
def simplecounter():
global counter
ret= counter
counter+= 1
time.sleep( random.uniform( 0, .001 ) )
return counter
glist= []
def th3():
while 1:
ret= simplecounter()
glist.append( ret )
print ret,
assert glist== range( 1, len( glist )+1 )
thread.start_new_thread( th3, () )
time.sleep(1)
thread.start_new_thread( th3, () )
time.sleep( 1000 )

Second, the thoughts: 'with a.callbacklock():' looks best currently.

'''multithreading ideas:
1. Put a single thread in charge
a.k.a. fire and forget.
- Lots of extra threads
+ But most are blocking most of the time
+ honored one-at-a-time, and in order received
+ ...optionally including read-access, blocking on
to get return values
a. synchronous callbacks, for read-access
+ multi-step, user-definitionized operations
- one consumer can hang an entire object
i. with a.callbacklock():?
+ only call acquire() from curr. thread, enqueue
lock obj., released from producer thread "soon"
using message-queue semantics
b. mini-transaction, guarantees all and only
consumer's ops occur in succession
- can't do anything amidst an indivdual locking
- no multi-step ops
2. Lock mutation and/or all operations
a. Locker.op
b. with Locker.withop
- In Python, other programmers always have access
to your data; nothing guarantees they'll use "with locker"
+ User-definitioning quite easy
3. @mutation decorator
def mutation( func ):
def newfunc( self, *a, **k ):
self.lock.acquire()
func( *a, **k )
self.lock.release()
4. List-only solution:
Use a dictionary, map item to its index.
To retrieve, sort on value, not key
'''
 
R

Rhamphoryncus

I don't understand that question. The main thing to understand is
that multi-threaded programming is complicated (especially if you're
after high performance), and very difficult to get right without
knowing exactly what you're doing. The generally preferred approach
in Python is to keep things simple at some performance cost.
Your Locker approach above looks sort of reasonable if you can be
absolutely sure that nothing else can mess with listA or listB
concurrently with those locker operations. Normally you would put
listA and listB into a single object along with a lock, then do
operations on that object.
You might check the Python Cookbook for some specific recipes and
sample code for this stuff. If you've used Java, Python's general
threading mechanisms are similar, but they are in the library rather
than built into the language (i.e. there is no "synchronized"
keyword, you have to do that locking explicitly).
What is the actual application, if you don't mind saying? Are you
sure that you really need concurrency?

I'm writing an NxN observer pattern, mostly for my own personal
exploration. Two threads -might- be calling 'Disconnect' at the same
time, and I can't even guarantee that the function runs properly.

for emelem in [ e for e in emlist if e.func is func ]:
try:
emlist.remove( emelem )
except ValueError:
pass

Is there a reason you're using a list, rather than a dict? Note that
each call to list.remove() is O(n), whereas deleting a key from a dict
is O(1).
 
C

castironpi

I'm writing an NxN observer pattern, mostly for my own personal
exploration. Two threads -might- be calling 'Disconnect' at the same
time, and I can't even guarantee that the function runs properly.
for emelem in [ e for e in emlist if e.func is func ]:
try:
emlist.remove( emelem )
except ValueError:
pass

Is there a reason you're using a list, rather than a dict? Note that
each call to list.remove() is O(n), whereas deleting a key from a dict
is O(1).

4. List-only solution:
Use a dictionary, map item to its index.
To retrieve, sort on value, not key

Sure: you need to maintain order of the things you're guarding. I
need this to happen first, then this second. My application is event
callbacks, that you register in a certain order; think GUIs and the
Gamma et al. Chain of Responsibility pattern. This is a circumstance;
in a lot of applications order doesn't matter and you can use a hash.

But even so, you still can't check membership during concurrent
operation. In other words, to remove all elements E such that E.func
is FuncF, the simple
for e in setofe[:]:
if e.func is not FuncF: continue
setofe.remove( e )
still doesn't work.
 
P

Paul Rubin

2) List is referenced by others; concurrent modifications may be going
on; can not replace it. Can I make asynchronous modifications and
merge the changes, SCM-style?

Nothing else should have direct access to the list.
3) Dictionary returns non-static order; order is important. Create a
int-func tuple and sort dictionary results? Perhaps. That's sounding
pretty good.

If the list is not too long, that is reasonable. If it -is- long, the
remove operations can be slow, but that's ok if there's not too many.
If the list is long -and- there's a lot of adds/removes, maybe you
want something like an AVL tree.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,432
Messages
2,571,680
Members
48,796
Latest member
Greg L.

Latest Threads

Top