Thread Locking issue - Can't allocate lock (sem_init fail)

jamskip

Hi all,

I have a peculiar problem with a multithreaded program of mine
(actually I've sort of inherited it). Before I show you the error,
here's a little background. It's a program to check that email
addresses are valid, and its main task is to verify the domain names.

Here's the basic functionality:

* The prog has a list of domains it has seen before which is read into
memory (the 'rollover').
* A new list of emails is read in from a file (to a queue) and is
checked against the rollover.
* If we've seen the domain before then update the existing entry.
* If we've not seen the domain before, add it.

The program is multithreaded to speed up the processing; there are
input and output Queues.

Now, each domain entry is a class object containing various bits of
info. Each domain object also has its own lock, so that only one
thread can modify each domain at a time.
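A minimal sketch of that design (the names are my own, not from the
real jess.py, which I haven't seen):

```python
import threading

class Domain(object):
    """One entry per domain, each with its own lock, as described above."""
    def __init__(self, name):
        self.name = name
        self.seen = 0
        self.lock = threading.Lock()  # a new OS-level lock per domain

class DomainStore(object):
    def __init__(self):
        self.domains = {}
        self.store_lock = threading.Lock()  # guards the dict itself

    def update(self, name):
        with self.store_lock:
            domain = self.domains.get(name)
            if domain is None:  # first time we've seen this domain: add it
                domain = self.domains[name] = Domain(name)
        with domain.lock:       # per-domain lock for the actual update
            domain.seen += 1
        return domain
```

Note that with a million distinct domains this allocates a million
separate locks, which is exactly where a per-lock OS resource could
run out.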

I'm load-testing the program with a sample of 1 million email
addresses, and when I hit about the 500,000 mark I get a locking
error...

sem_init: No space left on device
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/local/lib/python2.4/threading.py", line 442, in __bootstrap
    self.run()
  File "/usr/local/lib/python2.4/threading.py", line 422, in run
    self.__target(*self.__args, **self.__kwargs)
  File "jess.py", line 250, in worker
    record.result = function( id, record.value )
  File "jess.py", line 291, in action
    found_domain = domains.addNewDomain( domain_name )
  File "jess.py", line 123, in addNewDomain
    self.domain_store.append( self.Domain( name = name ) )
  File "jess.py", line 46, in __init__
    self.lock = Lock()
error: can't allocate lock

Googling for this sort of error doesn't yield any results, and I can't
find any information about limits on the number of locks you can have
in Python. The 'No space left on device' message suggests a memory
issue, but I doubt that, since it's running on a Linux server with 4
cores and 16GB of RAM. It seems more likely that some internal Python
limit has been hit (sem_init - semaphore initialisation?). Does anyone
know more about threading internals and any internal limits?

For the time being, I've implemented locking at a higher level, which
will reduce performance but keep the number of lock objects to a
minimum.

Thanks
 
Philip Semanchuk

Hi Jamskip,
I don't work with threading code but I have been working with
semaphores for my IPC extensions. sem_init() is a call to create a
semaphore (http://linux.die.net/man/3/sem_init). If it is failing,
then I'd guess you're trying to create an awful lot of semaphores
(intentionally or otherwise) and that you're hitting some internal
limit.

I would not be too quick to assume that the number of semaphores one
can create is bounded by the amount of RAM in your system. I don't
think they're simple chunks of malloc-ed memory. They're probably
represented in a kernel data structure somewhere that's hardcoded to
some generous but fixed value.

Please note that this is all speculation on my part. I think that the
Python threading implementation would use the "local" (i.e. not
process-shared) semaphores which can be allocated on the process'
heap. This would seem only RAM-limited, but I'll bet it isn't.

You might want to start debugging by tracking exactly how many locks
you're creating. If the number is really big, start investigating
kernel semaphore limits and how they're set.


Good luck
Philip
 
MRAB

Philip said:
> You might want to start debugging by tracking exactly how many locks
> you're creating. If the number is really big, start investigating
> kernel semaphore limits and how they're set.
You're creating a lock for each _domain_? Sounds like overkill to me.
Many domains means many locks, and by the sounds of it, too many for
the system.
 
Bryan Olson

> The program is multithreaded to speed up the processing...there are
> input and output Queues.

It's not the major point here, but are you aware of Python's GIL?
> Now, each domain entry is an class object containing various bits of
> info. Each domain class also has its own lock, so that only one thread
> can modify each domain at a time.
>
> I'm load-testing the program with a sample of 1 million email
> addresses and when i hit about the 500,000 mark i get a locking
> error...

Does that correspond to creating 500,000 locks? For sure?
> sem_init: No space left on device [...]
> error: can't allocate lock

What happens if you try to allocate a million locks, as in:

from threading import Lock
locks = []
for i in range(10**6):
    try:
        locks.append(Lock())
        locks[-1].acquire()
    except:
        print "Failed after %d locks." % i
        raise

I tried the above on Windows XP and Ubuntu 8.04 (Python 2.6 and 2.5
respectively), and it ran on both without the exception.
> Googling for this sort of error doesn't yield any results, and i can't
> find any information about limits to the number of locks you can have
> in Python. The 'No space left on device' message indicates a memory
> issue, however i doubt this since its running on a linux server with 4
> cores and 16GB ram. It seems more like an internal Python limit has
> been hit (sem_init - semaphore initialisation?). Does anyone know more
> about threading internals and any internal limits?

Python threading is implemented on top of whatever thread facility the
target OS provides. Creating a Python lock may or may not use up some
limited kernel resource. Given my experiments, limited as they were, I
do not think this is a Python limit. It could be an issue with your
platform, or a defect in the code you inherited.
> For the timebeing, I've implemented locking at a higher level which
> will reduce performance but keep the number of lock objects to a
> minumum.

That's probably a reasonable strategy whether or not you can create a
million locks.
 
