why am I getting a segmentation fault?

Jay donnell · Jan 21, 2005

I have a short multi-threaded script that checks web images to make
sure they are still there. I get a segmentation fault everytime I run
it and I can't figure out why. Writing threaded scripts is new to me so
I may be doing something wrong that should be obvious

google messes up the python code so here is a link to it.

http://kracomp.com/~jay/py.txt

This is the output of the script.
[jay@localhost scripts]$ ./py.py
update item set goodImage = 'yes' where productId='12603'
update item set goodImage = 'yes' where productId='18272'
update item set goodImage = 'yes' where productId='1927'
update item set goodImage = 'no' where productId='12709'
update item set goodImage = 'yes' where productId='32087'
update item set goodImage = 'no' where productId='25803'
Segmentation fault

Thanks in advance.

Paul McGuire · Jan 21, 2005

Jay donnell said:
I have a short multi-threaded script that checks web images to make
sure they are still there. I get a segmentation fault everytime I run
it and I can't figure out why. Writing threaded scripts is new to me so
I may be doing something wrong that should be obvious

<snip>

Here is a code excerpt from your link (the main routine, omits the class
definition for ImageChecker, which extends threading.Thread):

db = MySQLdb.connect(host="localhost", user="xxx", passwd="xxx", db="xxx")
cursor = db.cursor()
query = "select * from item order by rand() limit 0, 100"
#query = "select * from item"
cursor.execute(query)
result = cursor.fetchall()

maxThreads = 5

for r in result:
while(threading.activeCount() > maxThreads):
pass
flag='good'
#pass
#print str(r[0]) + ', ' + str(r[7])
tmp = r[7].split('/')
filename = tmp[-1]
#print 'filename ' + filename

filename = '/tmp/'+filename

threadList = []

#r[7] is the url of the image
#r[0] is the id for the row
imageChecker = ImageChecker(r[7], filename, r[0])
imageChecker.start()
threadList.append(imageChecker)

----------------------------------------------
1. What happens after you fall out of the loop "for r in result"? Shouldn't
you wait for the remaining threads to finish? Perhaps you need another
busy-loop to wait for the threads to finish, something like
while threading.activeCount() > 0: pass
2. Is this the best way to busy-wait? What about some kind of
thread.join()? At least throw a sleep call in there or something, or this
loop will churn and churn, diverting CPU from your threads that are actually
trying to do some real work.
3. I find it easier to work with named variables than numeric subscripts.
At the top of your for loop, try something like:
id,height,width,numBytes,whatever,slkdjf1,slkdjf2,url = r
This way you have much more meaningful names than r[0] and r[7], which you
later have to comment to explain whats going on!
4. filename=r[7].split('/')[-1] is not terribly portable. See if there is a
standard module for parsing filespecs (I'll bet there is).

-- Paul

Jeff Shannon · Jan 21, 2005

Paul said:
4. filename=r[7].split('/')[-1] is not terribly portable. See if there is a
standard module for parsing filespecs (I'll bet there is).

Indeed there is -- os.path. In particular, os.path.basename() seems
to do exactly that snippet is intending, in a much more robust (and
readable) fashion.

Jeff Shannon
Technician/Programmer
Credit International

John Machin · Jan 21, 2005

Jay said:
I have a short multi-threaded script that checks web images to make
sure they are still there. I get a segmentation fault everytime I run
it and I can't figure out why. Writing threaded scripts is new to me so
I may be doing something wrong that should be obvious

def run(self):
try:
self.site = urllib.urlopen(self.url)
self.f=open(self.filename, 'w')
self.im = Image.open(self.filename)
self.size = self.im.size
self.flag = 'yes'
self.q = "yadda yadda"

That's SIX names that don't need to be attributes of the object; they
are used _only_ in this method, so they can be local to the method,
saving you a whole lot of typing "self." and saving puzzlement &
careful scrutiny by those trying to read your code.

Back to your real problem: does it work when you set maxThreads to 1?
did it work before you added the threading code? what does your
debugger tell you about the location of the seg fault?

MOST IMPORTANTLY, sort this mess out:

self.q = "update item set goodImage = '" + self.flag + "' where
productId='" + str(self.id) + "'"
print self.q
self.cursor.execute(query) ###############################

### "query" is a global variable[YUK!] (see below) which isn't going to
do what you want. Looks like you meant "self.q".

self.db.close()

db = MySQLdb.connect(host="localhost", user="xxx", passwd="xxx",
db="xxx")
cursor = db.cursor()
query = "select * from item order by rand() limit 0, 100"

### Have you looked in your database to see if the script has actually
updated item.goodImage? Do you have a test plan?

Jay donnell · Jan 22, 2005

Thank you.

I made all the changes you recommended and everything seems to be
working.

Jay donnell · Jan 22, 2005

### Have you looked in your database to see if the >script has
actually

updated item.goodImage? Do you have a test plan?

Thank you for the help. Sorry for the messy code. I was under time
constraints. I had class, and I was rushing to get this working before
class. I should waited a day and read over it before I asked. Sorry
again.

John Machin · Jan 22, 2005

Jay said:
Thank you for the help. Sorry for the messy code. I was under time
constraints. I had class, and I was rushing to get this working before
class. I should waited a day and read over it before I asked. Sorry
again.

Apologise to _yourself_ for the messy code. If 'query' had been safely
tucked away as a local variable instead of a global, that problem
(typing 'query' instead of 'self.q') would have caused an exception on
the first time around.

A few points: Yes, don't rush. I've-forgotten-whom wrote something like
"Don't program standing up". Good advice.

Build things a piece at a time. Build your tests at the same time or
earlier. In this case, a stand-alone method or function that checked
that one file was OK, would have been a reasonable place to start. Then
add the code to query the db. Then add the threading stuff, gingerly.
If you build and test incrementally, then a segfault or other disaster
is highly likely to have been caused by the latest addition.

AND SOME MORE CRUFT:
def __init__(self, url, filename, id):
self.t = time.time() <<<<<<<<<======= never used again
threading.Thread.__init__(self)
self.db = MySQLdb.connect(host="localhost", user="xxx",
passwd="xxx", db="xxx")
# create a cursor
self.cursor = db.cursor() <<<=== should be self.db.cursor()
#### picks up the *GLOBAL* 'db'
self.url = url
self.filename = filename
self.id = id
===========================
threadList = []
[snip]
threadList.append(imageChecker)
===>>>> that doesn't achieve much!
============================

N.B. You still have two problems: (1) Your script as you said now
"seems to work". That doesn't sound like you have a test plan. (2) You
have shuffled your code around and the segfault went away; i.e. you
waved a dead chicken and the volcano stopped erupting. Most of the
changes suggested by others and myself were of a style/clarity nature.
The self.q/query thing would have caused a select instead an update;
hardly segfault territory. I wouldn't expect that busy-wait loop to
cause a segfault. You still don't know what caused the segfault. That
means you don't know how to avoid it in the future. You are still
living in the shadow of the volcano. Will the chicken trick work next
time?

Looking forward to the next episode,
John

I am trying to detect Which image id="" was clicked ?	22	Jan 3, 2023
Debugging a segmentation fault	2	Jul 8, 2010
segmentation fault sometimes	6	Sep 10, 2009
Segmentation Fault	29	Sep 1, 2009
GET NEIL DEGRASSES TYSON, I ripped a hole with this one...	0	Nov 10, 2022
Embedded Ruby 1.8.7â€”Segmentation fault	1	Aug 16, 2010
Survey details won't go through using php, ajax, Mysql	0	Oct 26, 2023
'invalid token' error and segmentation fault (XML issue?)	4	Aug 4, 2011

why am I getting a segmentation fault?

Jay donnell

Paul McGuire

Jeff Shannon

John Machin

Jay donnell

Jay donnell

John Machin

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads