How to cleanly pause/stop a long running function?

Discussion in 'Python' started by Basilisk96, May 12, 2007.

  1. Basilisk96

    Basilisk96 Guest

    Suppose I have a function that may run for a long time - perhaps from
    several minutes to several hours. An example would be this file
    processing function:

    import os
    def processFiles(startDir):
        for root, dirs, files in os.walk(startDir):
            for fname in files:
                if fname.lower().endswith(".zip"):
                    # ... do interesting stuff with the file here ...

    Imagine that there are thousands of files to process. This could take
    a while. How can I implement this so that the caller can pause or
    interrupt this function, and resume its program flow? Doing a Ctrl+C
    interrupt would be a not-so-clean-way of performing such a thing, and
    it would quit the application altogether. I'd rather have the function
    return a status object of what it has accomplished thus far.

    I have heard about threads, queues, and asynchronous programming, but
    am not sure which is appropriate for this and how to apply it. Perhaps
    the above function should be a method of a class that inherits from
    the appropriate handler class? Any help will be appreciated.

    -Basilisk96
     
    Basilisk96, May 12, 2007
    #1

  2. Adam Atlas

    Adam Atlas Guest

    On May 12, 4:51 pm, Basilisk96 wrote:
    > Suppose I have a function that may run for a long time - perhaps from
    > several minutes to several hours. An example would be this file
    > processing function:
    >
    > import os
    > def processFiles(startDir):
    >     for root, dirs, files in os.walk(startDir):
    >         for fname in files:
    >             if fname.lower().endswith(".zip"):
    >                 # ... do interesting stuff with the file here ...
    >
    > Imagine that there are thousands of files to process. This could take
    > a while. How can I implement this so that the caller can pause or
    > interrupt this function, and resume its program flow? Doing a Ctrl+C
    > interrupt would be a not-so-clean-way of performing such a thing, and
    > it would quit the application altogether. I'd rather have the function
    > return a status object of what it has accomplished thus far.
    >
    > I have heard about threads, queues, and asynchronous programming, but
    > am not sure which is appropriate for this and how to apply it. Perhaps
    > the above function should be a method of a class that inherits from
    > the appropriate handler class? Any help will be appreciated.
    >
    > -Basilisk96


    Consider using generators.
    http://docs.python.org/tut/node11.html#SECTION00111000000000000000000

    This way, whatever part of your program calls this function can
    completely control the iteration. Maybe you can have it yield status
    information each time.
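A minimal sketch of the generator idea, assuming the per-file work can sit just before the `yield`. The names `iter_zip_files` and `process_some` are made up for illustration and are not from the original post:

```python
import itertools
import os


def iter_zip_files(start_dir):
    """Yield the path of each .zip file under start_dir, one at a time."""
    for root, dirs, files in os.walk(start_dir):
        for fname in files:
            if fname.lower().endswith(".zip"):
                path = os.path.join(root, fname)
                # ... do interesting stuff with the file here ...
                yield path  # hand control (and a bit of status) to the caller


def process_some(file_iter, limit):
    """Pull at most `limit` results from file_iter and return them as a list.

    The generator keeps its position, so a later call continues where this
    one stopped.
    """
    return list(itertools.islice(file_iter, limit))
```

The caller creates the generator once (`gen = iter_zip_files(some_dir)`) and then decides how far to advance it: `process_some(gen, 100)` handles the next hundred files and returns, and calling it again picks up at file 101. Not calling it again is the "stop".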
     
    Adam Atlas, May 13, 2007
    #2

  3. Michael Tobis

    > Doing a Ctrl+C
    > interrupt would be a not-so-clean-way of performing such a thing, and
    > it would quit the application altogether. I'd rather have the function
    > return a status object of what it has accomplished thus far.


    In case you are unaware, you can explicitly handle ^C in your Python
    code: look up the KeyboardInterrupt exception.
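As a rough sketch of what that could look like, here is one way to catch the interrupt and return a status object instead of letting the application quit. The `process_files` name, the `handle` callback and the layout of the status dictionary are assumptions made for this example:

```python
import os


def process_files(start_dir, handle=lambda path: None):
    """Process .zip files under start_dir.

    On Ctrl+C, stop and return what has been done so far instead of exiting.
    `handle` is a hypothetical per-file callback standing in for the real work.
    """
    status = {"processed": [], "interrupted": False}
    try:
        for root, dirs, files in os.walk(start_dir):
            for fname in files:
                if fname.lower().endswith(".zip"):
                    path = os.path.join(root, fname)
                    handle(path)
                    status["processed"].append(path)
    except KeyboardInterrupt:
        # Ctrl+C raises KeyboardInterrupt in the main thread; catching it
        # here lets the function return normally with a partial result.
        status["interrupted"] = True
    return status
```

A file that was being handled when the interrupt arrived is not recorded as processed, so the caller may want to treat it as unfinished.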

    mt
     
    Michael Tobis, May 13, 2007
    #3
  4. Steven D'Aprano

    On Sat, 12 May 2007 13:51:05 -0700, Basilisk96 wrote:

    > Suppose I have a function that may run for a long time - perhaps from
    > several minutes to several hours. An example would be this file
    > processing function:
    >
    > import os
    > def processFiles(startDir):
    >     for root, dirs, files in os.walk(startDir):
    >         for fname in files:
    >             if fname.lower().endswith(".zip"):
    >                 # ... do interesting stuff with the file here ...
    >
    > Imagine that there are thousands of files to process. This could take
    > a while. How can I implement this so that the caller can pause or
    > interrupt this function, and resume its program flow?


    I don't think there really is what I would call a _clean_ way, although
    people may disagree about what's clean and what isn't.

    Here's a way that uses global variables, with all the disadvantages that
    entails:

    import os

    last_dir_completed = None
    restart = object()  # a unique sentinel object

    def processFiles(startDir):
        global last_dir_completed
        if startDir is restart:
            # resume from the last directory that was completed
            startDir = last_dir_completed
        for root, dirs, files in os.walk(startDir):
            for fname in files:
                if fname.lower().endswith(".zip"):
                    pass  # ... do interesting stuff with the file here ...
            last_dir_completed = root



    Here's another way, using a class. Probably not the best way, but a way.

    class DirLooper(object):
        def __init__(self, startdir):
            self.status = "new"
            self.startdir = startdir
            self.root = startdir
        def run(self):
            if self.status == "new":
                self.loop(self.startdir)
            elif self.status == "finished":
                print("nothing to do")
            else:
                # interrupted earlier: carry on from the last directory reached
                self.loop(self.root)
        def loop(self, where):
            self.status = "started"
            # self.root is rebound on every iteration, so it records progress
            for self.root, dirs, files in os.walk(where):
                pass  # blah blah blah...
            self.status = "finished"
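One possible way to flesh out the class approach is to keep the `os.walk` iterator on the instance and check a stop flag between directories. This is only a sketch: `ResumableWalker`, its `stop()` method and the `on_dir` hook are hypothetical names, not part of the post above.

```python
import os


class ResumableWalker(object):
    """Walk a tree one directory at a time; stop() pauses, run() resumes."""

    def __init__(self, start_dir):
        self._walker = os.walk(start_dir)  # keeps its position between runs
        self.status = "new"
        self.visited = []
        self._stop_requested = False

    def stop(self):
        """Ask run() to return after the directory it is currently on."""
        self._stop_requested = True

    def run(self, on_dir=None):
        if self.status == "finished":
            return self.status
        self._stop_requested = False
        self.status = "started"
        for root, dirs, files in self._walker:
            self.visited.append(root)
            if on_dir is not None:
                on_dir(self, root, files)  # hypothetical per-directory hook
            if self._stop_requested:
                self.status = "paused"
                return self.status
        self.status = "finished"
        return self.status
```

Because the iterator is stored rather than recreated, a second `run()` continues with the next unvisited directory instead of starting a new walk. Here `stop()` is called from the hook; calling it from another thread or a GUI callback would follow the same pattern.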


    Here's another way, catching the interrupt:

    def processFiles(startDir):
        try:
            for root, dirs, files in os.walk(startDir):
                pass  # blah blah blah ...
        except KeyboardInterrupt:
            do_something_with_status()


    You can fill in the details :)


    As for which is "better", I think the solution using a global variable is
    the worst, although it has the advantage of being easy to implement. I
    think you may need to try a few different implementations and judge for
    yourself.


    --
    Steven.
     
    Steven D'Aprano, May 13, 2007
    #4