Python multithreading problem

abhinav · Mar 26, 2006

//A CRAWLER IMPLEMENTATION
Please run this program from the shell and then under the control of the debugger.
When it is run normally, the program does not terminate. It never gets
past the condition if c<5:, so it keeps running indefinitely.
But if it is run under the control of the debugger, the program
terminates when the condition if c<5: becomes false.
I think this problem may be due to multithreading. Please help.

from sgmllib import SGMLParser
import threading
import re
import urllib
import pdb
import time
class urlist(SGMLParser):
    def reset(self):
        SGMLParser.reset(self)
        self.list=[]

    def start_a(self,attr):
        href=[v for k,v in attr if k=="href"]
        if href:
            self.list.extend(href)
mid=2
c=0
class mythread(threading.Thread):
    stdmutex=threading.Lock()
    global threads
    threads=[]
    def __init__(self,u,myid):
        self.u=u
        self.myid=myid
        threading.Thread.__init__(self)
    def run(self):
        global c
        global mid
        if c<5:
            self.stdmutex.acquire()
            self.usock=urllib.urlopen(self.u)
            self.p=urlist()
            self.s=self.usock.read()
            self.p.feed(self.s)
            self.usock.close()
            self.p.close()
            c=c+1
            fname="/root/" + str(c) + ".txt"
            self.f=open(fname,"w")
            self.f.write(self.s)
            self.f.close()
            print c
            print self.p.list
            print self.u
            print self.myid
            for j in self.p.list:
                k=re.search("^https?:",j)
                if k:
                    i=mythread(j,mid)
                    i.start()
                    threads.append(i)
                    mid=mid+1
                    self.stdmutex.release()

if __name__=="__main__":
    thread=mythread("http://www.google.co.in/",1)
    thread.start()
    threads.append(thread)
    for thread in threads:
        thread.join()
    print "main thread exits"

Dennis Lee Bieber · Mar 26, 2006

            self.list.extend(href)
mid=2
c=0
class mythread(threading.Thread):
    stdmutex=threading.Lock()
    global threads
    threads=[]

Move that line out -- initialize all your globals (ugh) in the same
spot...

self.stdmutex.acquire()

There is NO actual self.stdmutex (using self. implies that a copy
exists for each instance), you should be using a class level reference,
or __init__ should make self.stdmutex a reference to the class level...

It may be inherited, but it isn't safe, in my mind..
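
For anyone who wants to check how the lookup behaves, here is a minimal sketch in Python 3, using a hypothetical Worker class as a stand-in for mythread (the name is an assumption, not from the post). It suggests that self.stdmutex resolves to the single class-level lock unless an instance assigns its own attribute of the same name.

```python
import threading

class Worker(threading.Thread):  # hypothetical stand-in for mythread
    stdmutex = threading.Lock()  # one lock object, stored on the class

    def run(self):
        pass  # no work is needed for this demonstration

a = Worker()
b = Worker()

# Attribute lookup falls back to the class, so both instances see the same lock.
shared = a.stdmutex is b.stdmutex is Worker.stdmutex

# Assigning through an instance creates a per-instance attribute that
# shadows the class-level lock for that instance only.
b.stdmutex = threading.Lock()
shadowed = b.stdmutex is not Worker.stdmutex
still_shared = a.stdmutex is Worker.stdmutex
```

Writing mythread.stdmutex explicitly in run() would make the sharing visible to the reader and avoids accidental shadowing.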

for j in self.p.list:
    k=re.search("^https?:",j)
    if k:
        i=mythread(j,mid)
        i.start()
        threads.append(i)
        mid=mid+1
        self.stdmutex.release()

That doesn't look good either -- you are releasing the lock from
inside a loop -- and "k" never changes, so you keep releasing the same
lock and spawning new threads...

Maybe something like:

for j in ...:
    k = re...
    if k:
        t = mythread(j, mid)
        t.start()    # start() returns None, so keep the thread object itself
        threads.append(t)
        mid = mid + 1
        break
self.stdmutex...

Serge Orlov · Mar 27, 2006

abhinav said:
//A CRAWLER IMPLEMENTATION
Please run this program from the shell and then under the control of the debugger.
When it is run normally, the program does not terminate. It never gets
past the condition if c<5:, so it keeps running indefinitely.

How do you know? Have you waited *infinitely*?

if c<5:
    self.stdmutex.acquire()

The problem is that you have a lot of threads that have already checked the
c < 5 condition but have not acquired the lock yet. Besides, you have another
problem: if a thread raises an exception, you don't release the lock.
Why don't you use the Queue module for sane thread management?

Serge.
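
A rough sketch of the Queue approach suggested above, under stated assumptions: it is written for Python 3, where the Queue module is named queue, and it replaces the network fetch with a hypothetical in-memory link table (FAKE_LINKS and fetch_links are made up for illustration) so it can run offline. The page limit is tested and updated while the lock is held, and the try/finally keeps the queue bookkeeping correct even if a fetch raises.

```python
import queue
import threading

MAX_PAGES = 5      # same limit as the c < 5 test in the original post
NUM_WORKERS = 4    # assumed pool size, not taken from the thread

# Hypothetical in-memory "web" so the sketch runs without network access.
FAKE_LINKS = {
    "http://a.example/": ["http://b.example/", "http://c.example/"],
    "http://b.example/": ["http://d.example/", "http://e.example/"],
    "http://c.example/": ["http://f.example/", "http://g.example/"],
}

def fetch_links(url):
    """Stand-in for urlopen plus parsing; returns the links found at url."""
    return FAKE_LINKS.get(url, [])

def crawl(start_url):
    todo = queue.Queue()
    todo.put(start_url)
    lock = threading.Lock()
    fetched = []                      # URLs processed, guarded by lock

    def worker():
        while True:
            url = todo.get()
            try:
                if url is None:       # sentinel: time to stop
                    return
                with lock:            # test and update under one lock hold
                    if len(fetched) >= MAX_PAGES:
                        continue
                    fetched.append(url)
                for link in fetch_links(url):
                    todo.put(link)
            finally:
                todo.task_done()      # runs even if fetch_links raises

    workers = [threading.Thread(target=worker) for _ in range(NUM_WORKERS)]
    for w in workers:
        w.start()
    todo.join()                       # wait until every queued URL is handled
    for _ in workers:
        todo.put(None)                # one sentinel per worker
    for w in workers:
        w.join()
    return fetched

pages = crawl("http://a.example/")
```

A fixed pool of workers also avoids starting one new thread per discovered link, which is what the posted run() method does.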

abhinav · Mar 27, 2006

Thanks, guys. I solved the problem by moving self.stdmutex.acquire()
before if c<5:
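
A minimal sketch of that fix in Python 3, with hypothetical names (claim_slot, count, and LIMIT are assumptions standing in for the logic around the global c). The lock is taken before the test, and the with statement releases it on every path, including the branch where the limit has already been reached and any exception.

```python
import threading

LIMIT = 5
lock = threading.Lock()
count = 0                  # plays the role of the global c

def claim_slot():
    """Return True if the caller may fetch one more page."""
    global count
    with lock:             # acquired before the test, released on any exit
        if count < LIMIT:
            count += 1
            return True
        return False

results = []

def worker():
    results.append(claim_slot())   # list.append is thread-safe in CPython

threads = [threading.Thread(target=worker) for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()

granted = results.count(True)
```

With a manual acquire() placed before the test, the matching release() has to run when the test fails as well, otherwise every later thread blocks on the lock.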
