Real-time developers/designers: Can abort() be used to fail-fast in a safety-critical system?

Discussion in 'C++' started by Marc, Dec 17, 2010.

  1. Marc

    Marc Guest

    Here is an opportunity to shine. I only seek answers from very
    experienced real-time safety-critical system designers and implementors.

    Can you convince me that abort() can be used to fail-fast in a
    safety-critical system?

    If you say "it depends", explain, but don't stay in theory land or "it's
    a team process"-land, as only true usage/end-product counts this time.
    Any and all real examples that you have implemented in safety-critical
    systems are fair game. You did the ejection seat system design and coding
    for the F15? Great! YOU are the one I would like an answer from and such
    others. The more responses, the better, as long as they are from a top gun
    in the field.

    Can you provide an actual example that you implemented and were
    responsible for? Long-term full-time real-time developers of
    safety-critical systems at the level of designer/architect of entire
    systems or major safety-critical subsystems as well as being the
    low-level implementor of many such things for many years would help
    weight your answer. Please don't answer if you have just read about it or
    are theorizing and have not many years of guru-level experience designing
    and implementing safety-critical real-time systems or if you simply
    worked on such a project without being the technical and responsible
    lead. Full-time and many years of real-time safety-critical
    implementation experience only please. Don't be one of those who has 20
    years of experience but repeated year one 20 times. I know that it is
    rare when experience counts, but this time it does. <wink>. This is not a
    job interview or screening.

    In helping you answer this question to my satisfaction, expansion of
    instruction-level code and an actual use case would be "a picture that
    says a thousand words", but don't let that prevent your own approach. The
    use case is so important and C or C++ are both fine.

    (I realize I should have asked this in another forum, but since I started
    it here in another thread, I will try and finish it here too if
    possible.)
    Marc, Dec 17, 2010
    #1

  2. Marc

    Seebs Guest

    Re: Real-time developers/designers: Can abort() be used to fail-fast in a safety-critical system?

    On 2010-12-17, Marc <> wrote:
    > Here is an opportunity to shine. I only seek answers from very
    > experienced real-time safety-critical system designers and implementors.


    Ah, well, my answer will be of no interest to you then, but maybe someone
    else will care.

    > Can you convince me that abort() can be used to fail-fast in a
    > safety-critical system?


    I can't. And, for that matter, I'd argue that this isn't just because
    you're very demanding in qualifications, but because it Just Ain't So.

    > Please don't answer if you have just read about it or
    > are theorizing and have not many years of guru-level experience designing
    > and implementing safety-critical real-time systems or if you simply
    > worked on such a project without being the technical and responsible
    > lead.


    I thought about this request, and decided to refer you to Arkell v.
    Pressdram.

    I'm not coming to this from the position of a mythical guru in
    safety-critical systems, whose twenty years of experience could be
    largely outdated now, but from the position of someone who knows a
    decent bit about C and C implementations.

    I suppose someone could in theory develop a C implementation in which
    abort() could be a reasonable choice for such a thing, but it wouldn't
    be something they'd be expected to do for standards conformance, and
    it wouldn't be a likely implementation choice for most systems.

    -s
    --
    Copyright 2010, all wrongs reversed. Peter Seebach /
    http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
    http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!
    I am not speaking for my employer, although they do rent some of my opinions.
    Seebs, Dec 17, 2010
    #2

  3. Marc

    Guest

    Re: Real-time developers/designers: Can abort() be used to fail-fast in a safety-critical system?

    On Dec 16, 8:21 pm, "Marc" <> wrote:
    > Here is an opportunity to shine. I only seek answers from very
    > experienced real-time safety-critical system designers and implementors.
    >
    > Can you convince me that abort() can be used to fail-fast in a
    > safety-critical system?
    >
    > If you say "it depends", explain, but don't stay in theory land or "it's
    > a team process"-land, as only true usage/end-product counts this time.
    > Any and all real examples that you have implemented in safety-critical
    > systems are fair game. You did the ejection seat system design and coding
    > for the F15? Great! YOU are the one I would like an answer from and such
    > others. The more responses, the better, as long as they are from a top gun
    > in the field.
    >
    > Can you provide an actual example that you implemented and were
    > responsible for? Long-term full-time real-time developers of
    > safety-critical systems at the level of designer/architect of entire
    > systems or major safety-critical subsystems as well as being the
    > low-level implementor of many such things for many years would help
    > weight your answer. Please don't answer if you have just read about it or
    > are theorizing and have not many years of guru-level experience designing
    > and implementing safety-critical real-time systems or if you simply
    > worked on such a project without being the technical and responsible
    > lead. Full-time and many years of real-time safety-critical
    > implementation experience only please. Don't be one of those who has 20
    > years of experience but repeated year one 20 times. I know that it is
    > rare when experience counts, but this time it does. <wink>. This is not a
    > job interview or screening.
    >
    > In helping you answer this question to my satisfaction, expansion of
    > instruction-level code and an actual use case would be "a picture that
    > says a thousand words", but don't let that prevent your own approach. The
    > use case is so important and C or C++ are both fine.
    >
    > (I realize I should have asked this in another forum, but since I started
    > it here in another thread, I will try and finish it here too if
    > possible.)



    Well, if you're using MISRA, rule 126 specifically prohibits the use
    of abort().
    , Dec 17, 2010
    #3
  4. Marc

    Goran Pusic Guest

    Re: Real-time developers/designers: Can abort() be used to fail-fast in a safety-critical system?

    On Dec 17, 3:21 am, "Marc" <> wrote:
    > Here is an opportunity to shine. I only seek answers from very
    > experienced real-time safety-critical system designers and implementors.


    I am not that person, so no answer from me to you.

    > Can you convince me that abort() can be used to fail-fast in a
    > safety-critical system?


    IMO, this is a mighty vague question and a "guru" that does give an
    answer to it, bloody isn't.

    What does "fail-fast" mean? What is the system in question? What about
    hooking on SIGABRT? What speed do you need? What speed can you achieve
    on your system with some example uses? What are abort() speed
    guarantees __on your implementation__? You're talking about real-time;
    which flavor? "hard", where you control perf aspect of every single
    artifact; or "soft" which in itself is too vague to answer a question?

    Perhaps what abort() is supposed to do is already way too slow on
    hardware or implementation you're using. Did you even measure
    anything? abort() should close open file streams. How much time does
    that take __on your system__ (depends on the number of handles, you
    know)? Do you care if they are not closed? Do you have a system where
    they stay open after you crash (OS doesn't clean up after a process
    crash)? If yes, and you restart the process, you will eventually run
    out of resources. Or do you re-boot the system after the crash? If so,
    you don't care about those handles and for speed reasons you could
    avoid abort.
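
    To make the SIGABRT hook concrete, here is a minimal sketch, assuming a
    hosted POSIX system (fork, pipe, sigaction are available). The names
    on_abort, notify_fd and abort_in_child are made up for illustration. A
    child process installs a handler, calls abort(), and the parent (playing
    a supervisor) checks both that the handler ran and that the child still
    died from SIGABRT. It says nothing about timing; that has to be measured
    on the target.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical: write end of a pipe to a supervising process. */
static int notify_fd = -1;

/* SIGABRT handler. Only async-signal-safe calls (write) are used here. */
static void on_abort(int sig)
{
    const char mark = 'A';
    (void)sig;
    if (notify_fd >= 0 && write(notify_fd, &mark, 1) != 1) {
        /* Nothing more can safely be done inside a signal handler. */
    }
    /* Returning does not cancel abort(); the process still terminates. */
}

/*
 * Calls abort() in a child process with on_abort installed.
 * Returns the signal that terminated the child, or -1 on error.
 * *handler_ran is set to 1 if the handler's marker byte arrived.
 */
int abort_in_child(int *handler_ran)
{
    int fds[2];
    int status = 0;
    char c = 0;
    pid_t pid;

    assert(handler_ran != NULL);
    *handler_ran = 0;
    if (pipe(fds) != 0) {
        return -1;
    }
    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                 /* child: the process that fails fast */
        struct sigaction sa;
        close(fds[0]);
        notify_fd = fds[1];
        sa.sa_handler = on_abort;
        sa.sa_flags = 0;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGABRT, &sa, NULL);
        abort();
    }
    close(fds[1]);                  /* parent: plays the supervisor */
    if (read(fds[0], &c, 1) == 1 && c == 'A') {
        *handler_ran = 1;
    }
    close(fds[0]);
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

    Calling abort_in_child(&ran) from a test program should report SIGABRT
    as the terminating signal with ran set to 1 on such a system. Whether
    that is acceptable in a real-time design still depends on how long the
    handler and the rest of abort() take on the actual implementation.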

    Frankly, if OP had an idea/opinion/experience about things above __on
    his target system__, he would not be asking here.

    Methinks this question is more of a clueless shot in the dark than
    anything else.

    Goran.
    Goran Pusic, Dec 17, 2010
    #4
  5. Marc

    Chris H Guest

    In message <ieehg6$tli$>, Marc <>
    writes
    >Here is an opportunity to shine. I only seek answers from very
    >experienced real-time safety-critical system designers and implementors.


    Then you are probably in the wrong newsgroup.
    Try the York safety group or similar.

    >Can you convince me that abort() can be used to fail-fast in a
    >safety-critical system?


    No.

    >If you say "it depends", explain,


    Ok... It depends entirely on the specific context in your application.
    There are far too many variables to give a generic answer.

    >Can you provide an actual example that you implemented and were
    >responsible for?


    I doubt anyone would do that in a public space.


    --
    \/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
    \/\/\/\/\ Chris Hills Staffs England /\/\/\/\/
    \/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
    Chris H, Dec 17, 2010
    #5
  6. Marc

    Chris H Guest

    In message <
    ..com>, "" <> writes
    >On Dec 16, 8:21 pm, "Marc" <> wrote:
    >> Here is an opportunity to shine. I only seek answers from very
    >> experienced real-time safety-critical system designers and implementors.
    >>
    >> Can you convince me that abort() can be used to fail-fast in a
    >> safety-critical system?
    >>
    >> If you say "it depends", explain, but don't stay in theory land or "it's
    >> a team process"-land, as only true usage/end-product counts this time.
    >> Any and all real examples that you have implemented in safety-critical
    >> systems are fair game. You did the ejection seat system design and coding
    >> for the F15? Great! YOU are the one I would like an answer from and such
    >> others. The more responses, the better, as long as they are from a top gun
    >> in the field.
    >>
    >> Can you provide an actual example that you implemented and were
    >> responsible for? Long-term full-time real-time developers of
    >> safety-critical systems at the level of designer/architect of entire
    >> systems or major safety-critical subsystems as well as being the
    >> low-level implementor of many such things for many years would help
    >> weight your answer. Please don't answer if you have just read about it or
    >> are theorizing and have not many years of guru-level experience designing
    >> and implementing safety-critical real-time systems or if you simply
    >> worked on such a project without being the technical and responsible
    >> lead. Full-time and many years of real-time safety-critical
    >> implementation experience only please. Don't be one of those who has 20
    >> years of experience but repeated year one 20 times. I know that it is
    >> rare when experience counts, but this time it does. <wink>. This is not a
    >> job interview or screening.
    >>
    >> In helping you answer this question to my satisfaction, expansion of
    >> instruction-level code and an actual use case would be "a picture that
    >> says a thousand words", but don't let that prevent your own approach. The
    >> use case is so important and C or C++ are both fine.
    >>
    >> (I realize I should have asked this in another forum, but since I started
    >> it here in another thread, I will try and finish it here too if
    >> possible.)

    >
    >
    >Well, if you're using MISRA, rule 126 specifically prohibits the use
    >of abort().


    Of course MISRA-C:98 Rule 126 could be deviated if you have grounds to
    do it. Read the notes under the rule or better still use the 2004
    version of MISRA.

    --
    \/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
    \/\/\/\/\ Chris Hills Staffs England /\/\/\/\/
    \/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
    Chris H, Dec 17, 2010
    #6
  7. Marc

    James Kanze Guest

    Re: Real-time developers/designers: Can abort() be used to fail-fast in a safety-critical system?

    On Dec 17, 2:21 am, "Marc" <> wrote:
    > Here is an opportunity to shine. I only seek answers from very
    > experienced real-time safety-critical system designers and implementors.


    > Can you convince me that abort() can be used to fail-fast in a
    > safety-critical system?


    You've already had the answer, several times. From people (like
    myself) with real experience in real-time safety-critical
    systems.

    [...]
    > Can you provide an actual example that you implemented and were
    > responsible for?


    Locomotive brake system. We didn't use abort, because it wasn't
    present (no underlying OS to return to); we did the equivalent,
    however, shutting the system down as rapidly as possible.
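
    As a rough sketch of what such an "equivalent of abort" can look like
    without an OS: drive the outputs to the safe state, record why, and stop.
    This is not the actual brake code. The output latch brake_output, the
    choice of "brake applied" as the safe state, and the names fail_fast,
    halt_hook and fail_fast_demo are all assumptions for illustration. The
    halt is reached through a function pointer only so that the sketch can
    be exercised on a hosted system with setjmp/longjmp.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdint.h>

/* Hypothetical output latch; on real hardware a memory-mapped register. */
enum { BRAKE_RELEASED = 0, BRAKE_APPLIED = 1 };
static volatile uint8_t brake_output = BRAKE_RELEASED;
static volatile uint32_t last_fault = 0u;

typedef void (*halt_fn)(void);

/* Production halt: spin forever (a hardware watchdog may then reset us). */
static void halt_forever(void)
{
    for (;;) { }
}

static halt_fn halt_hook = halt_forever;

/* Fail fast: safe state first, then record the reason, then stop. */
void fail_fast(uint32_t fault_code)
{
    brake_output = BRAKE_APPLIED;
    last_fault = fault_code;
    halt_hook();
    for (;;) { }    /* never return, even if the hook does */
}

/* ---- hosted demo only: replace the halt with a longjmp ---- */
static jmp_buf demo_env;

static void halt_by_longjmp(void)
{
    longjmp(demo_env, 1);
}

/* Returns 1 if fail_fast() reached the halt hook, 0 otherwise. */
int fail_fast_demo(uint32_t fault_code)
{
    int halted = 0;

    assert(halt_hook == halt_forever);  /* production hook expected */
    brake_output = BRAKE_RELEASED;
    halt_hook = halt_by_longjmp;
    if (setjmp(demo_env) == 0) {
        fail_fast(fault_code);
    } else {
        halted = 1;
    }
    halt_hook = halt_forever;
    return halted;
}
```

    The ordering is the point of the sketch: the safe state is commanded
    before anything else, so even if recording the fault or halting goes
    wrong, the output has already been written.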

    Other more or less critical systems I've worked on (electric
    power distribution, and a lot of telephone routing systems)
    behaved similarly.

    --
    James Kanze
    James Kanze, Dec 17, 2010
    #7
  8. Marc

    Seebs Guest

    Re: Real-time developers/designers: Can abort() be used to fail-fast in a safety-critical system?

    On 2010-12-17, Chris H <> wrote:
    > In message <ieehg6$tli$>, Marc <>
    > writes
    >>Can you convince me that abort() can be used to fail-fast in a
    >>safety-critical system?


    > No.


    I was thinking about this, and I've concluded that the answer is
    almost certainly "yes". If you read carefully, you will note that his
    question is not "Can abort() be reasonably and successfully used
    to fail-fast in a safety-critical system without violating requirements
    or specifications."

    There are two obvious ways to get to a "yes" answer. One is to observe
    that the OP never specified that the usage had to be successful, correct,
    or acceptable to the client, or not result in people dying. The other
    is to observe that the OP is apparently a bit on the careless side and
    much impressed by Credentials in and of themselves. Thus, I would argue
    both that the answer to the question literally asked is "yes", and that
    even if it weren't, it would be easy for someone to convince the OP that
    it was.

    -s
    --
    Copyright 2010, all wrongs reversed. Peter Seebach /
    http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
    http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!
    I am not speaking for my employer, although they do rent some of my opinions.
    Seebs, Dec 17, 2010
    #8
  9. Marc

    Adam Skutt Guest

    Re: Real-time developers/designers: Can abort() be used to fail-fast in a safety-critical system?

    On Dec 16, 9:21 pm, "Marc" <> wrote:
    >
    > (I realize I should have asked this in another forum, but since I started
    > it here in another thread, I will try and finish it here too if
    > possible.)


    Starting another topic asking the exact same question in this forum
    will not get you the answers you seek, for the reasons I already
    outlined to you before. Like it or not, you don't need to directly
    converse with an expert to get the answers you seek.

    There's an abundance of literature on the subject all over the
    Internet. Look it up. Though if you can't understand the basic
    precept, "Life-critical covers a huge field of products and what is
    acceptable coding entirely depends on which products you're talking
    about," one begins to wonder if such a simple task is entirely beyond
    you.

    Adam
    Adam Skutt, Dec 19, 2010
    #9
  10. Marc

    Adam Skutt Guest

    Re: Real-time developers/designers: Can abort() be used to fail-fast in a safety-critical system?

    On Dec 17, 3:35 am, Goran Pusic <> wrote:
    > abort() should close open file streams. How much time does
    > that take __on your system__ (depends on the number of handles, you
    > know)?

    No, the only thing abort() has to do is never return (in ANSI C).
    Anything else is best effort behavior, like most forms of process
    termination in C / C++ (including the actual termination of the
    process itself, perhaps strangely enough). Several implementations of
    UNIX flush stdio streams only. POSIX used to mandate that
    implementations effect fclose(); this was reduced to 'may effect
    fclose()' for ANSI C compatibility.

    I've worked on multiple embedded platforms where abort() was really
    just a way to call the processor-specific halt instruction. You
    didn't even get SIGABRT, because the platform lacked signal handling.
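
    One way to see how little abort() promises beyond not returning is to
    compare it with exit(). The sketch below assumes a hosted POSIX system
    (fork and pipe); cleanup_marker, marker_fd and cleanup_ran_in_child are
    made-up names. A child registers a cleanup function with atexit() and
    then ends either through exit(0) or through abort(); the parent reports
    whether the cleanup ran.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical: pipe write end, set only in the child process. */
static int marker_fd = -1;

/* Cleanup registered with atexit(); leaves a one-byte trace if it runs. */
static void cleanup_marker(void)
{
    const char mark = 'X';
    if (marker_fd >= 0 && write(marker_fd, &mark, 1) != 1) {
        /* Ignored: this is only a demonstration trace. */
    }
}

/*
 * Runs a child that registers cleanup_marker and then terminates via
 * abort() (use_abort != 0) or exit(0) (use_abort == 0).
 * Returns 1 if the cleanup ran, 0 if it did not, -1 on error.
 */
int cleanup_ran_in_child(int use_abort)
{
    int fds[2];
    int status = 0;
    char c = 0;
    ssize_t n;
    pid_t pid;

    assert(marker_fd == -1);        /* the parent never owns the marker */
    if (pipe(fds) != 0) {
        return -1;
    }
    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                 /* child */
        close(fds[0]);
        marker_fd = fds[1];
        if (atexit(cleanup_marker) != 0) {
            _exit(2);
        }
        if (use_abort) {
            abort();                /* abnormal termination */
        }
        exit(0);                    /* normal termination */
    }
    close(fds[1]);                  /* parent */
    n = read(fds[0], &c, 1);        /* returns 0 if the child wrote nothing */
    close(fds[0]);
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    return (n == 1 && c == 'X') ? 1 : 0;
}
```

    On such a system the exit(0) path runs the registered cleanup and the
    abort() path skips it, so any shutdown action that must happen cannot
    be left to atexit() if the process may end through abort().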

    Adam
    Adam Skutt, Dec 19, 2010
    #10
  11. Marc

    James Kanze Guest

    Re: Real-time developers/designers: Can abort() be used to fail-fast in a safety-critical system?

    On Dec 19, 4:11 am, (Gordon Burditt) wrote:

    [...]
    > If you're controlling the fuel rods in a nuclear reactor, killing
    > the power to the rod controls (which should let the rods fall back
    > in by gravity) and shutting down may be sufficient. Shutting down
    > the coolant pumps, however, is *not* acceptable, if those are also
    > under control by the same system.


    Critical systems are, by definition, systems; the software is
    only a part. Continuing operation in case of an error is not an
    option; the software might actually generate commands to put the
    rods out. And of course, any system in which the failure of one
    piece of software would cause the coolant system to fail would
    not be acceptable.

    --
    James Kanze
    James Kanze, Dec 20, 2010
    #11
  12. Re: Real-time developers/designers: Can abort() be used to fail-fast in a safety-critical system?

    On Dec 20, 10:21, James Kanze <> wrote:
    > On Dec 19, 4:11 am, (Gordon Burditt) wrote:
    >
    >     [...]
    >
    > > If you're controlling the fuel rods in a nuclear reactor, killing
    > > the power to the rod controls (which should let the rods fall back
    > > in by gravity) and shutting down may be sufficient.  Shutting down
    > > the coolant pumps, however, is *not* acceptable, if those are also
    > > under control by the same system.

    >
    > Critical systems are, by definition, systems; the software is
    > only a part.  Continuing operation in case of an error is not an
    > option; the software might actually generate commands to put the
    > rods out.  And of course, any system in which the failure of one
    > piece of software would cause the coolant system to fail would
    > not be acceptable.


    This is a question of fail-safe. The OP's question is rather about
    fail-fast.

    Whether the backup system is mechanical (such as a safe state of the
    system) or a continuation of service (duplication, load balancing ...)
    or none at all (blue screen) is IMHO outside the scope of this
    thread.

    I have heard of systems where the buggy system continues to live and
    may be reused when its state converges to the backup system but I
    don't know how they work (i.e. decide the transitional buggy state has
    passed).

    --
    Michael
    Michael Doubez, Dec 20, 2010
    #12