why am I getting a segmentation fault?

Discussion in 'Python' started by Jay donnell, Jan 21, 2005.

  1. Jay  donnell

    Jay donnell Guest

    I have a short multi-threaded script that checks web images to make
    sure they are still there. I get a segmentation fault everytime I run
    it and I can't figure out why. Writing threaded scripts is new to me so
    I may be doing something wrong that should be obvious :(

    google messes up the python code so here is a link to it.

    http://kracomp.com/~jay/py.txt

    This is the output of the script.
    [jay@localhost scripts]$ ./py.py
    update item set goodImage = 'yes' where productId='12603'
    update item set goodImage = 'yes' where productId='18272'
    update item set goodImage = 'yes' where productId='1927'
    update item set goodImage = 'no' where productId='12709'
    update item set goodImage = 'yes' where productId='32087'
    update item set goodImage = 'no' where productId='25803'
    Segmentation fault



    Thanks in advance.
    Jay donnell, Jan 21, 2005
    #1
    1. Advertising

  2. Jay  donnell

    Paul McGuire Guest

    "Jay donnell" <> wrote in message
    news:...
    > I have a short multi-threaded script that checks web images to make
    > sure they are still there. I get a segmentation fault everytime I run
    > it and I can't figure out why. Writing threaded scripts is new to me so
    > I may be doing something wrong that should be obvious :(
    >

    <snip>

    Here is a code excerpt from your link (the main routine, omits the class
    definition for ImageChecker, which extends threading.Thread):

    db = MySQLdb.connect(host="localhost", user="xxx", passwd="xxx", db="xxx")
    cursor = db.cursor()
    query = "select * from item order by rand() limit 0, 100"
    #query = "select * from item"
    cursor.execute(query)
    result = cursor.fetchall()

    maxThreads = 5

    for r in result:
    while(threading.activeCount() > maxThreads):
    pass
    flag='good'
    #pass
    #print str(r[0]) + ', ' + str(r[7])
    tmp = r[7].split('/')
    filename = tmp[-1]
    #print 'filename ' + filename

    filename = '/tmp/'+filename

    threadList = []

    #r[7] is the url of the image
    #r[0] is the id for the row
    imageChecker = ImageChecker(r[7], filename, r[0])
    imageChecker.start()
    threadList.append(imageChecker)

    ----------------------------------------------
    1. What happens after you fall out of the loop "for r in result"? Shouldn't
    you wait for the remaining threads to finish? Perhaps you need another
    busy-loop to wait for the threads to finish, something like
    while threading.activeCount() > 0: pass
    2. Is this the best way to busy-wait? What about some kind of
    thread.join()? At least throw a sleep call in there or something, or this
    loop will churn and churn, diverting CPU from your threads that are actually
    trying to do some real work.
    3. I find it easier to work with named variables than numeric subscripts.
    At the top of your for loop, try something like:
    id,height,width,numBytes,whatever,slkdjf1,slkdjf2,url = r
    This way you have much more meaningful names than r[0] and r[7], which you
    later have to comment to explain whats going on!
    4. filename=r[7].split('/')[-1] is not terribly portable. See if there is a
    standard module for parsing filespecs (I'll bet there is).

    -- Paul
    Paul McGuire, Jan 21, 2005
    #2
    1. Advertising

  3. Jay  donnell

    Jeff Shannon Guest

    Paul McGuire wrote:

    > 4. filename=r[7].split('/')[-1] is not terribly portable. See if there is a
    > standard module for parsing filespecs (I'll bet there is).


    Indeed there is -- os.path. In particular, os.path.basename() seems
    to do exactly that snippet is intending, in a much more robust (and
    readable) fashion.

    Jeff Shannon
    Technician/Programmer
    Credit International
    Jeff Shannon, Jan 21, 2005
    #3
  4. Jay  donnell

    John Machin Guest

    Jay donnell wrote:
    > I have a short multi-threaded script that checks web images to make
    > sure they are still there. I get a segmentation fault everytime I run
    > it and I can't figure out why. Writing threaded scripts is new to me

    so
    > I may be doing something wrong that should be obvious :(
    >


    def run(self):
    try:
    self.site = urllib.urlopen(self.url)
    self.f=open(self.filename, 'w')
    self.im = Image.open(self.filename)
    self.size = self.im.size
    self.flag = 'yes'
    self.q = "yadda yadda"

    That's SIX names that don't need to be attributes of the object; they
    are used _only_ in this method, so they can be local to the method,
    saving you a whole lot of typing "self." and saving puzzlement &
    careful scrutiny by those trying to read your code.

    Back to your real problem: does it work when you set maxThreads to 1?
    did it work before you added the threading code? what does your
    debugger tell you about the location of the seg fault?

    MOST IMPORTANTLY, sort this mess out:

    self.q = "update item set goodImage = '" + self.flag + "' where
    productId='" + str(self.id) + "'"
    print self.q
    self.cursor.execute(query) ###############################

    ### "query" is a global variable[YUK!] (see below) which isn't going to
    do what you want. Looks like you meant "self.q".

    self.db.close()

    db = MySQLdb.connect(host="localhost", user="xxx", passwd="xxx",
    db="xxx")
    cursor = db.cursor()
    query = "select * from item order by rand() limit 0, 100"


    ### Have you looked in your database to see if the script has actually
    updated item.goodImage? Do you have a test plan?
    John Machin, Jan 21, 2005
    #4
  5. Jay  donnell

    Jay donnell Guest

    Thank you.

    I made all the changes you recommended and everything seems to be
    working.
    Jay donnell, Jan 22, 2005
    #5
  6. Jay  donnell

    Jay donnell Guest

    >### Have you looked in your database to see if the >script has
    actually
    >updated item.goodImage? Do you have a test plan?


    Thank you for the help. Sorry for the messy code. I was under time
    constraints. I had class, and I was rushing to get this working before
    class. I should waited a day and read over it before I asked. Sorry
    again.
    Jay donnell, Jan 22, 2005
    #6
  7. Jay  donnell

    John Machin Guest

    Jay donnell wrote:
    > >### Have you looked in your database to see if the >script has

    > actually
    > >updated item.goodImage? Do you have a test plan?

    >
    > Thank you for the help. Sorry for the messy code. I was under time
    > constraints. I had class, and I was rushing to get this working

    before
    > class. I should waited a day and read over it before I asked. Sorry
    > again.


    Apologise to _yourself_ for the messy code. If 'query' had been safely
    tucked away as a local variable instead of a global, that problem
    (typing 'query' instead of 'self.q') would have caused an exception on
    the first time around.

    A few points: Yes, don't rush. I've-forgotten-whom wrote something like
    "Don't program standing up". Good advice.

    Build things a piece at a time. Build your tests at the same time or
    earlier. In this case, a stand-alone method or function that checked
    that one file was OK, would have been a reasonable place to start. Then
    add the code to query the db. Then add the threading stuff, gingerly.
    If you build and test incrementally, then a segfault or other disaster
    is highly likely to have been caused by the latest addition.

    AND SOME MORE CRUFT:
    def __init__(self, url, filename, id):
    self.t = time.time() <<<<<<<<<======= never used again
    threading.Thread.__init__(self)
    self.db = MySQLdb.connect(host="localhost", user="xxx",
    passwd="xxx", db="xxx")
    # create a cursor
    self.cursor = db.cursor() <<<=== should be self.db.cursor()
    #### picks up the *GLOBAL* 'db'
    self.url = url
    self.filename = filename
    self.id = id
    ===========================
    threadList = []
    [snip]
    threadList.append(imageChecker)
    ===>>>> that doesn't achieve much!
    ============================

    N.B. You still have two problems: (1) Your script as you said now
    "seems to work". That doesn't sound like you have a test plan. (2) You
    have shuffled your code around and the segfault went away; i.e. you
    waved a dead chicken and the volcano stopped erupting. Most of the
    changes suggested by others and myself were of a style/clarity nature.
    The self.q/query thing would have caused a select instead an update;
    hardly segfault territory. I wouldn't expect that busy-wait loop to
    cause a segfault. You still don't know what caused the segfault. That
    means you don't know how to avoid it in the future. You are still
    living in the shadow of the volcano. Will the chicken trick work next
    time?

    Looking forward to the next episode,
    John
    John Machin, Jan 22, 2005
    #7
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Alex Hunsley
    Replies:
    17
    Views:
    843
  2. Vishal Grover
    Replies:
    10
    Views:
    526
    Vishal Grover
    Sep 11, 2003
  3. fudmore
    Replies:
    9
    Views:
    394
    Jonathan Turkanis
    Jan 20, 2004
  4. fudmore
    Replies:
    0
    Views:
    895
    fudmore
    May 10, 2006
  5. Mr. SweatyFinger
    Replies:
    2
    Views:
    1,669
    Smokey Grindel
    Dec 2, 2006
Loading...

Share This Page