Graceful failures

Discussion in 'Python' started by Jacob H, Dec 29, 2003.

  1. Jacob H

    Jacob H Guest

    Hello all,

    I'm nearing the completion of my first graphical console game and my
    thoughts have turned to the subject of gracefully handling runtime
    errors. During development I like to not handle exceptions, so that
    program execution will halt and I can immediately read the traceback
    to see what's up. Once the bugs are more or less worked out, I have a
    system ready wherein any exceptions are caught and an attempt is made
    to write the traceback object to a log file. This system suits me for
    the upcoming testing phase - if my beta testers do anything to make
    the game crash, I can read about the error and its causes in the log
    later.
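
    The logging setup described above might look roughly like the following in modern Python. This is only a sketch: the file name `crash.log` and the helper names are made up for illustration, and the poster's actual system is not shown in the thread.

```python
import sys
import traceback
from datetime import datetime

LOG_PATH = "crash.log"  # hypothetical file name, not from the original post


def log_uncaught(exc_type, exc_value, exc_tb, path=None):
    """Append the formatted traceback of an exception to a log file."""
    text = "".join(traceback.format_exception(exc_type, exc_value, exc_tb))
    with open(path or LOG_PATH, "a", encoding="utf-8") as log:
        log.write(f"--- {datetime.now().isoformat()} ---\n{text}\n")
    return text


def install_crash_logger():
    """Route every uncaught exception through log_uncaught."""
    # sys.excepthook is called with (type, value, traceback) just before
    # the interpreter exits because of an unhandled exception.
    sys.excepthook = log_uncaught
```

    Leaving `install_crash_logger()` uncalled during development keeps the default behaviour of printing the traceback and halting.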

    But what about the eventual production version that I distribute to
    the public? My current understanding is that there's always a chance a
    computer program can fail in some way at runtime, disk access being an
    example. Suppose the startup code for my game fails to load one of the
    images from disk. I can't think of any reasonable way to rescue
    things, and under such conditions I would prefer that the program
    gracefully exits. The aforementioned traceback logging is fine for me
    as the developer, but useless to the end user of the production
    version. So - I'm real curious - what are the canonical ways that a
    computer application should gracefully fail?

    I'm somewhat reluctant to write a bunch of code for pretty windows
    that pop up with some message to the effect of, "internal error, game
    exiting." My main reservation here is it doesn't seem it can be done
    cross platform, and since my game is in a graphical console there's no
    place for stdout to write. However, if this is the best way to go
    about error handling, I'm willing to write the code. And I would
    appreciate advice about how to handle the cross platform problem. :)

    Thanks in advance for any help!

    Jake
    Jacob H, Dec 29, 2003
    #1

  2. On Mon, 29 Dec 2003 14:14:43 -0800, Jacob H wrote:
    > I'm somewhat reluctant to write a bunch of code for pretty windows that
    > pop up with some message to the effect of, "internal error, game
    > exiting." My main reservation here is it doesn't seem it can be done
    > cross platform, and since my game is in a graphical console there's no
    > place for stdout to write. However, if this is the best way to go about
    > error handling, I'm willing to write the code. And I would appreciate
    > advice about how to handle the cross platform problem. :)


    I can't speak for any "canonical" way of doing things, but I do have some
    thoughts on the matter from my experience in customer service and some
    computer tutoring I used to do. Maybe they can be of some use to you.
    Forgive me as I ramble for a minute, but I haven't spent time collecting
    these thoughts.

    Most computer users react to error messages with fear and panic. They
    feel like the computer is telling them that *they* have done something wrong.
    This misconception is compounded by the generally unintelligible error
    messages thrown by most programs. Computers have a way of making even
    some of the most intelligent people feel stupid.

    The second most common type of user understands that they didn't
    necessarily do anything wrong, and just wishes the program would work.
    They might thumb through the manual to see if it has anything on the
    matter. If they are given an obvious way to find a solution to the
    problem, they will take it, but they generally won't work that hard.

    The rest of us will dig in and try to solve the problem. This could be
    anything from using google, to posting on usenet, to reading a disassembly
    of the core dump.


    My "dream error dialog box" would do the following:

    Tell the user in a very neutral and nontechnical manner that it has
    encountered a problem and tell the user what it was trying to do when it
    failed.

    for instance:
    """
    Foo has encountered a problem.
    Foo was unable to load the necessary images to continue.
    """

    The program should avoid words like "error." You might consider having
    the program keep a stack of descriptions of what it's trying to accomplish
    at any given moment. That allows you to differentiate between being
    unable to open the configuration file with the intent of initially loading
    the configuration and opening the configuration file with the intent of
    reverting to original settings.

    Next, tell the user whether or not they can continue and what some of the
    consequences of continuing might be.

    The error dialog box should allow the user the option of viewing technical
    details, but should clearly label them as technical details.

    If possible, the user should be presented with possible courses of action
    to correct the problem.
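
    The pieces of that dialog can be kept apart in code before any window is drawn, which also sidesteps the cross-platform question until the last step. A minimal sketch, assuming a hypothetical `ProblemReport` class whose field names are invented here:

```python
from dataclasses import dataclass, field


@dataclass
class ProblemReport:
    app_name: str
    activity: str                 # what the program was trying to do, in plain words
    can_continue: bool
    suggestions: list = field(default_factory=list)
    technical_details: str = ""   # traceback, file paths; shown only on request

    def user_message(self):
        """Build the neutral, nontechnical text shown first."""
        lines = [
            f"{self.app_name} has encountered a problem.",
            f"{self.app_name} was unable to {self.activity}.",
        ]
        if self.can_continue:
            lines.append(f"You can keep using {self.app_name}.")
        else:
            lines.append(f"{self.app_name} cannot continue and will close.")
        lines.extend(self.suggestions)
        return "\n".join(lines)
```

    Whatever toolkit ends up displaying it, the dialog would show `user_message()` by default and reveal `technical_details` only behind a clearly labeled button.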

    Finally, I really, really wish that users were presented with the option
    of being automagically taken to the technical support website where
    they're allowed to see and discuss with others how to solve their
    particular problem. This is where I get a little bit hazy in my ideas. I
    envision a website/forum where a user can read and discuss possible
    solutions to their problem *and* similar problems. The proper web page
    could be located from the stack trace. If they're the first one to
    encounter that particular stack trace, then tell them that they're the
    first, politely ask them to describe the problem (even if the
    description's useless to you, it makes the user feel included) and tell
    them that your development staff has been notified and will be making
    contact with them soon. Offer to keep them updated via email when other
    people post about having the same problem.

    This type of system could be very useful to your development team and tech
    support staff by allowing you to identify, track and fix the real life errors
    that your users are encountering.

    anyways...

    HTH

    Sam Walters

    P.S.

    If you haven't already, look at the Interface Hall of Shame to see what
    *not* to do:
    http://digilander.libero.it/chiediloapippo/Engineering/iarchitect/shame.htm
    Samuel Walters, Dec 31, 2003
    #2

  3. It is quite a coincidence that a thread like this pops up right when I
    feel I have to start one - with almost exactly the same subject - myself.

    Error handling can easily be the hardest task in a program, and that's why
    it's being neglected most of the time.

    On Wed, 31 Dec 2003 05:28:08 +0000, Samuel Walters wrote:

    > My "dream error dialog box" would do the following:
    >
    > Tell the user in a very neutral and nontechnical manner that it has
    > encountered a problem and tell the user what it was trying to do when it
    > failed.
    >
    > for instance:
    > """
    > Foo has encountered a problem.
    > Foo was unable to load the necessary images to continue.
    > """


    I would like it much more detailed, like:

    """
    Foo has encountered a problem. It was trying to load a necessary image
    from the file '/usr/share/Foo/images/up.png'. This file does apparently
    not exist. The problem may be caused by a broken installation of Foo.
    Alas, Foo cannot be continued and will be closed.
    """

    (I'm not the best error message designer, but I hope you get the point:
    Tell exactly *what* is missing, so an even moderately experienced user can
    try to fix it. I *hate* messages like "Cannot find image". What image?
    Where is it supposed to be?)

    > The error dialog box should allow the user the option of viewing
    > technical details, but should clearly label them as technical details.


    Which means the details I want go there - fine with me.

    > If possible, the user should be presented with possible courses of
    > action to correct the problem.


    +1

    > Finally, I really, really wish that users were presented with the option
    > of being automagically taken to the technical support website where
    > they're allowed to see and discuss with others how to solve their
    > particular problem.

    [snip]

    This, of course, is a bit overkill for a little freeware program, but
    sounds good for a "big" application with a hefty price tag.

    Handling every conceivable error right is quite a challenge and lots of
    work, especially in the test department.

    Hans-Joachim Widmaier
    Hans-Joachim Widmaier, Dec 31, 2003
    #3
  4. |Thus Spake Hans-Joachim Widmaier On the now historical date of Wed, 31
    Dec 2003 18:19:00 +0100|
    > It is quite a coincidence that a thread like this pops up right when I
    > feel I have to start one - with almost exactly the same subject -
    > myself.
    >
    > Error handling can easily be the hardest task in a program, and that's
    > why it's being neglected most of the time.


    Agreed. Another problem is that programmers often forget what it's like
    for people who don't understand computers. It's hard to overestimate the
    apprehension many users feel towards computers. I've seen very competent
    and intelligent people frozen with fear in front of a word processor
    because they were afraid that they'd break the computer.


    > I would like it much more detailed, like:


    Most people that will read this post would prefer a more detailed message.
    Of course one should tailor the system to the program's target audience,
    but let's assume, for the sake of this particular discussion, that the
    errors will mostly be read by non-technically oriented people. I think
    that detailed error reports should always be readily available, but should
    not necessarily be the first thing presented.


    > """
    > Foo has encountered a problem. It was trying to load a necessary image
    > from the file '/usr/share/Foo/images/up.png'. This file does apparently
    > not exist. The problem may be caused by a broken installation of Foo.
    > Alas, Foo cannot be continued and will be closed. """


    That's a good message. I would avoid putting raw filenames in the
    non-technical description. (Unless, of course, the user was trying to
    open a document, then it's okay to list the path.) I would also avoid
    using words like "broken." Both of these things look normal to you and
    me, but to a lot of people they're Scary Things(tm). By simply hiding the
    filename behind a button that says "Technical Details" you have said "Hey,
    you might not understand this, but that's OK because it's meant for the
    geeks." Heck, they might take a look at it and feel proud that they
    understand more of it than they thought they would.

    I like the increased specificity of your error message. I might suggest:

    """
    Foo has encountered a problem. It was trying to load necessary image
    files. At least one image apparently does not exist. Uninstalling and then
    reinstalling Foo may correct this problem. Unfortunately, Foo cannot be
    continued and will be closed. """


    > (I'm not the best error message designer, but I hope you get the point:


    Nor am I. Most of the reason I offered up my opinion is so that other
    people might clue me into new ideas on the matter.


    > Which means the details I want go there - fine with me.


    As most people like you and me wouldn't mind one extra click to get at the
    juicy details.


    > This, of course, is a bit overkill for a little freeware program, but
    > sounds good for a "big" application with a hefty price tag.


    It makes as much sense as installing something like bugzilla. (more on
    that in a minute)

    > Handling every conceivable error right is quite a challenge and lots of
    > work, especially in the test department.


    Let me digress for a moment and explain how I stumbled into this idea of
    allowing the user to jump to an online information system based on their
    particular error. It might make more sense if you know where it came
    from. Or, at least, you can pin-point the critical flaw in my logic.

    My last development job was with a large company that preprocessed medical
    insurance claims. Essentially, we took on contracts to collect insurance
    claims from doctors, verify that the data was well-formatted (there are
    several hundred formats for medical insurance claims) translate them into
    the single format requested by our client, transmit them to the client,
    receive the response and deliver the responses back to the doctors. Karl
    Marx would have hated us. We were nothing but big-time middle-men. When
    asked where I worked, I used to say "I work for a huge quasi-governmental
    corporation that exists solely to shuffle paper."

    The method for transferring data was plain old 56k modems and a simple
    BBS. Our competitive advantage in the market was that we handled all
    end-user support and that we offered a pretty gui program tailored
    specifically for the medical field. (At this point, anyone who's worked
    in that field knows exactly who I worked for.) The gui program was
    essentially a fancy modem driver that allowed the user to track which
    files had been sent and pair them with the proper responses.
    Unfortunately the code for it was one huge tangled mess. It was the first
    program I was asked to make changes to and, as I hadn't mastered the
    Borland C++ debugger, I resorted to the tried and true method of keeping a
    log file while debugging. Every time that the program began an action, I
    had it write and flush, in plain English, what it was attempting to do. I
    found that when I sent the product on to QA, that having that log was
    invaluable. After some discussion with our tech support department, we
    decided to keep the code in for the production version. It was a smashing
    success. The front line tech support people were able to get much more
    reliable information about how the user's particular error came about.
    They started a database around the log-file so that they could reference
    the solutions for people with the same or similar problems. Most
    importantly, for me at least, I knew exactly where the end users were
    having problems. I didn't have to deal with errors like "Version Foo
    failed an assertion on line X in module Bar." I was able to implement
    more graceful error handling.
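
    The write-and-flush habit described above was done in C++ at the time; in Python it could be sketched as below. The class name `ActionLog` and the line format are assumptions for illustration only.

```python
import os


class ActionLog:
    """Plain-language log that is pushed to disk before each action starts."""

    def __init__(self, path):
        self._file = open(path, "a", encoding="utf-8")

    def begin(self, description):
        self._file.write(f"BEGIN: {description}\n")
        self._file.flush()              # empty Python's buffer into the OS
        os.fsync(self._file.fileno())   # ask the OS to write it to disk

    def close(self):
        self._file.close()
```

    Flushing before the action runs is the point: if the program dies mid-action, the last line of the log still names what it was attempting.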

    So, I read the OP and thought to myself: Couldn't python keep a stack of
    the 'goals' it's attempting to achieve, then if an exception was thrown or
    an assertion failed, couldn't this become part of the information about the
    error? You would push goals onto the stack, then pop them off if the
    program was reasonably certain that the action wasn't going to cause a
    problem. Then, couldn't that information be used to see if other users
    were having the same or similar problems? Couldn't that information be
    used to register and track problems via a hybrid of a message-board and
    bugzilla? Well, I don't see why not. In fact, since Python's exception
    system is so wonderful you could have the error dialog marshal error codes
    and basic stack-trace info into the get portion of a url and probably even
    make it open a browser with the press of a button. The user wouldn't have
    to know that all this went on in the background. All they would know is "I
    pressed a button and it took me to a place where people wanted to help me
    make it work."
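
    Packing the error information into the query string of a URL and handing it to the browser can be sketched with the standard library alone. The support address, the parameter names, and the choice of the last three stack frames are all assumptions made up for this example; no such site exists in the thread.

```python
import traceback
import webbrowser
from urllib.parse import urlencode

SUPPORT_URL = "https://support.example.com/problems"  # hypothetical address


def support_url(exc, goals=(), base=SUPPORT_URL):
    """Encode the exception type, a short stack summary and goals as a URL."""
    frames = traceback.extract_tb(exc.__traceback__)
    # Only function names and line numbers are sent, not paths or values.
    where = [f"{frame.name}:{frame.lineno}" for frame in frames[-3:]]
    query = urlencode({
        "error": type(exc).__name__,
        "where": ",".join(where),
        "goal": " > ".join(goals),
    })
    return f"{base}?{query}"


def open_support_page(exc, goals=()):
    """Open the user's default browser on the matching support page."""
    return webbrowser.open(support_url(exc, goals))
```

    The dialog's "get help" button would call `open_support_page()`; the server side that maps a stack summary to a discussion page is the part that remains blue sky.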

    I want to work on something like that someday. I've already got a couple
    of projects on my plate, so at the moment this is blue sky thinking, but
    don't you think it would be nice if things could work this way? Maybe
    when my current project becomes stable, I'll see if I can add this as a
    feature.

    Sam Walters

    --
    Never forget the halloween documents.
    http://www.opensource.org/halloween/
    """ Where will Microsoft try to drag you today?
    Do you really want to go there?"""
    Samuel Walters, Dec 31, 2003
    #4
