Signal dispositions

Leet Jon

Hi what's the reason for having the default disposition for SIGSEGV,
SIGFPE, SIGBUS etc to be terminating the program, when these signals can
just be ignored by the program? Many programs crash with SIGSEGV -
they'd be much less flakey if the default was to try to carry on.

~Jon~
 
Joachim Schmitz

Leet Jon said:
Hi what's the reason for having the default disposition for SIGSEGV,
SIGFPE, SIGBUS etc to be terminating the program, when these signals can
just be ignored by the program? Many programs crash with SIGSEGV -
they'd be much less flakey if the default was to try to carry on.

Carry on with corrupted data? No, that's not a sane default.

Bye, Jojo
 
Keith Thompson

Leet Jon said:
Hi what's the reason for having the default disposition for SIGSEGV,
SIGFPE, SIGBUS etc to be terminating the program, when these signals can
just be ignored by the program? Many programs crash with SIGSEGV -
they'd be much less flakey if the default was to try to carry on.

The default handling for signals is implementation-defined (C99
7.14p4), so you might get better answers in comp.unix.programmer than
here in comp.lang.c. (I just noticed the cross-post; I've set
followups to comp.unix.programmer.)

However, letting a program continue running by default after a
catastrophic data-corrupting failure would not be a good idea. If a
program dies immediately after "an invalid access to storage" (which
is all the C standard says about SIGSEGV), then you have a good chance
of diagnosing and correcting the problem before putting the code into
production. If the error is ignored, the program will very likely
continue to corrupt your data in subtle ways; tracking it down and
fixing it is going to be difficult if the error occurs at a customer
site, or even during an important demo.
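
A minimal sketch of what "default disposition" means in practice, assuming a POSIX system (the C standard itself leaves the default implementation-defined). The helper name signal_that_killed_child is made up for this illustration. It forks a child, raises the signal there with SIG_DFL in place, and reports how the child ended:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that raises `sig` with the default
 * disposition in place. Returns the number of the signal that terminated
 * the child, 0 if the child exited normally, or -1 on error. */
int signal_that_killed_child(int sig)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        sigset_t set;

        /* Undo anything inherited from the parent: an ignored
         * disposition or a blocked signal would hide the default. */
        signal(sig, SIG_DFL);
        sigemptyset(&set);
        sigaddset(&set, sig);
        sigprocmask(SIG_UNBLOCK, &set, NULL);

        raise(sig);
        _exit(0);   /* reached only if the signal did not terminate us */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On a POSIX system, where the default action for SIGSEGV and SIGFPE is abnormal termination, signal_that_killed_child(SIGSEGV) should report SIGSEGV.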
 
Leet Jon

However, letting a program continue running by default after a
catastrophic data-corrupting failure would not be a good idea. If a
program dies immediately after "an invalid access to storage" (which
is all the C standard says about SIGSEGV), then you have a good chance
of diagnosing and correcting the problem before putting the code into
production. If the error is ignored, the program will very likely
continue to corrupt your data in subtle ways; tracking it down and
fixing it is going to be difficult if the error occurs at a customer
site, or even during an important demo.

I believe you are completely wrong on this point. Very often a SIGSEGV
will be caused by (say) a single bad array access - the consequences
will be highly localized, and carrying on with the program will not
cause any significant problems.

Who wants their customer to run their program and have it just crash
with a segfault? That hardly comes across as professional. Better to try
your best to carry on and weather the storm than to just dump the user
with a crash.

I can understand that for debugging purposes you might want to have
SIGSEGV etc. generate a core file, but in production code the default
should be for these signals to be ignored.
 
santosh

Leet said:
Hi what's the reason for having the default disposition for SIGSEGV,
SIGFPE, SIGBUS etc to be terminating the program,

That's not specified by the C Standard but is a matter for
implementation.

when these signals can just be ignored by the program?

Do you routinely ignore early signs of serious illness? No? Then why
should one ignore signs of erroneous conditions in a program and allow
it to result in erroneous output?

Many programs crash with SIGSEGV - they'd be much less flakey if the
default was to try to carry on.

Try a little demo yourself. Write a data processing program of any kind
and deliberately code in a bounds violation condition. Then make sure
to catch SIGSEGV and continue execution. Observe if the end result is
what the program is supposed to do.

Signals indicate exceptional situations that the program must imminently
address. Ignoring a signal, regardless of whether it's because of
program error or a normal but exceptional condition, is only likely to
break the program further.
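
The experiment suggested above can be sketched roughly as follows, assuming a POSIX system. All names here are invented for the illustration. The child installs a handler that simply returns, then performs a store through a null pointer. POSIX leaves the behavior after such a return undefined; on common platforms the faulting instruction is simply restarted and faults again, so the handler bails out after three deliveries to keep the demo finite:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t fault_count = 0;

/* Count each delivery, then return as if nothing had happened.
 * Give up after three deliveries so the demo terminates. */
static void count_and_return(int sig)
{
    (void)sig;
    if (++fault_count >= 3)
        _exit(3);
}

/* Hypothetical helper. Returns the child's exit status: 3 means the
 * handler ran three times for the same store, 0 means no fault was
 * raised at all, -1 means an error or an unexpected termination. */
int faults_seen_when_handler_returns(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        struct sigaction sa;
        memset(&sa, 0, sizeof sa);
        sa.sa_handler = count_and_return;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGSEGV, &sa, NULL);
        sigaction(SIGBUS, &sa, NULL);   /* some systems report SIGBUS instead */

        alarm(5);   /* safety net in case the child gets stuck */

        /* Deliberate invalid access; the qualifiers keep the compiler
         * from optimizing the store away. */
        volatile int *volatile bad = NULL;
        *bad = 42;
        _exit(0);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The point of the sketch is that "catch it and continue" does not skip the bad access; the program makes no progress past it unless the handler does something much more invasive.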
 
Al Balmer

I believe you are completely wrong on this point. Very often a SIGSEGV
will be caused by (say) a single bad array access - the consequences
will be highly localized, and carrying on with the program will not
cause any significant problems.

How on earth would you know what the consequences might be? If the
program in question is calculating my paycheck, I don't want any bad
array access to be ignored.

What kind of programs do you write? Games?

Who wants their customer to run their program and have it just crash
with a segfault? That hardly comes across as professional.

What's not professional is writing code that causes segfaults.

Better to try
your best to carry on and weather the storm than to just dump the user
with a crash.

I can understand that for debugging purposes you might want to have
SIGSEGV etc. generate a core file, but in production code the default
should be for these signals to be ignored.

In production code, those signals should never be generated. If they
are, they should crash, so that the user can complain, and someone can
fix it.
 
Shadowman

Leet said:
I believe you are completely wrong on this point. Very often a SIGSEGV
will be caused by (say) a single bad array access - the consequences
will be highly localized, and carrying on with the program will not
cause any significant problems.

Who wants their customer to run their program and have it just crash
with a segfault? That hardly comes across as professional. Better to try
your best to carry on and weather the storm than to just dump the user
with a crash.

I can understand that for debugging purposes you might want to have
SIGSEGV etc. generate a core file, but in production code the default
should be for these signals to be ignored.

OK, but you really haven't made a case that it would be worthwhile to
change the default behavior. What's wrong with supplying your own
signal handler when you want something else?
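
For reference, opting in to a different policy per program is only a few lines with sigaction(). This is a sketch under the assumption of a POSIX system; the names and the exit status 70 are arbitrary choices for the example, not anything mandated:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define FAULT_EXIT_STATUS 70   /* arbitrary status chosen for this sketch */

/* The program's own policy: exit with a recognizable status
 * instead of being terminated by the signal. */
static void exit_on_fault(int sig)
{
    (void)sig;
    _exit(FAULT_EXIT_STATUS);   /* _exit() is async-signal-safe */
}

/* Install the policy for the three signals named in the question.
 * Returns 0 on success, -1 on failure. */
int install_fault_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = exit_on_fault;
    sigemptyset(&sa.sa_mask);

    if (sigaction(SIGSEGV, &sa, NULL) != 0 ||
        sigaction(SIGBUS, &sa, NULL) != 0 ||
        sigaction(SIGFPE, &sa, NULL) != 0)
        return -1;
    return 0;
}

/* Hypothetical demo: run the policy in a child and report how it ended. */
int child_status_after_fault(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        if (install_fault_handler() != 0)
            _exit(1);
        raise(SIGSEGV);   /* stands in for a real fault */
        _exit(0);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the override lives in the program that wants it, every other program keeps the terminating default.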
 
Martin Vuille

Hi what's the reason for having the default disposition for
SIGSEGV, SIGFPE, SIGBUS etc to be terminating the program,
when these signals can just be ignored by the program?

In spite of what you seem to believe, an application is not
allowed to ignore the signal, and is very limited in how it can
handle the signal. To quote POSIX/SUSv3:

"The behavior of a process is undefined after it ignores a SIGFPE,
SIGILL, SIGSEGV, or SIGBUS signal that was not generated by kill(),
sigqueue(), or raise()."

and

"The behavior of a process is undefined after it returns normally
from a signal-catching function for a SIGBUS, SIGFPE, SIGILL, or
SIGSEGV signal that was not generated by kill(), sigqueue(), or
raise()."

MV
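
The carve-out in those quotes can be illustrated with a small sketch. The function name is made up for the example. Ignoring a SIGSEGV that the program generated itself with raise() is well-defined; it is the hardware-generated case that POSIX leaves undefined, and the sketch deliberately does not attempt that one:

```c
#include <signal.h>

/* Hypothetical helper. Returns 1 if the process is still running after
 * a raise()-generated SIGSEGV was ignored, 0 if raise() reported an
 * error, and -1 if the disposition could not be changed. */
int survives_ignored_raise(void)
{
    void (*old)(int) = signal(SIGSEGV, SIG_IGN);
    if (old == SIG_ERR)
        return -1;

    /* Generated by raise(), so ignoring it is defined behavior:
     * the signal is simply discarded. */
    int rc = raise(SIGSEGV);

    signal(SIGSEGV, old);   /* put the previous disposition back */
    return rc == 0 ? 1 : 0;
}
```

Nothing comparable is promised for a SIGSEGV caused by an actual invalid access, which is the situation the original question is about.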
 
jameskuyper

Leet said:
I believe you are completely wrong on this point. Very often a SIGSEGV
will be caused by (say) a single bad array access - the consequences
will be highly localized, and carrying on with the program will not
cause any significant problems.

In general, if that bad array access is a write, it may completely
mess up some other part of the program. Also, code sufficiently
defective to generate a bad array access is extremely unlikely to
generate only one such access; they usually produce large numbers of
them.

Who wants their customer to run their program and have it just crash
with a segfault?

Given the choice between crashing, and continuing to run, I strongly
prefer the crash. If someone desperately needs that program to be
running, they presumably need it to run correctly, and that's highly
unlikely after an ignored SIGSEGV signal.
 
Keith Thompson

Leet Jon said:
I believe you are completely wrong on this point. Very often a SIGSEGV
will be caused by (say) a single bad array access - the consequences
will be highly localized, and carrying on with the program will not
cause any significant problems.

Who wants their customer to run their program and have it just crash
with a segfault? That hardly comes across as professional. Better to try
your best to carry on and weather the storm than to just dump the user
with a crash.

Better to continue with bad data? Better to corrupt the user's
important files than to crash and leave them in their initial state?
Better to continue operating incorrectly and produce wrong answers, as
long as it *looks* good?

I don't think so.

Your goal should be for your code never to produce a SIGSEGV in the
first place. Since all software has bugs, that's not always
achievable, but you should certainly want to *know* when the program
produces SIGSEGV (or any other signal that indicates a problem).

I can understand that for debugging purposes you might want to have
SIGSEGV etc. generate a core file, but in production code the default
should be for these signals to be ignored.

At the very least, you might consider handling the signal and logging
the error (and, preferably, cleanly shutting down the program). If
you just ignore it, then you might never know that there's a problem
-- except that users will decide that your software is unreliable.

What you advocate is the equivalent of putting a piece of black tape
over the oil warning light on your car's dashboard. It makes for a
more pleasant driving experience -- until your engine seizes up and
leaves you stranded in the middle of nowhere.
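
A sketch of the "log it, then still die" approach, assuming a POSIX system. The names are hypothetical. The handler writes one line with write() and then re-raises the signal; SA_RESETHAND has already restored the default disposition, so the process is still terminated by the signal and the parent (or a core file) still shows what happened:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Descriptor the handler logs to; set before the handler is installed. */
static volatile sig_atomic_t log_fd = STDERR_FILENO;

static void log_then_die(int sig)
{
    static const char msg[] = "fatal signal received, terminating\n";

    /* write() and raise() are async-signal-safe; printf() is not. */
    if (write(log_fd, msg, sizeof msg - 1) < 0) {
        /* nothing sensible can be done about a failed log write here */
    }

    /* SA_RESETHAND restored SIG_DFL when this handler was entered.
     * Depending on whether the signal is blocked while the handler
     * runs, the default action fires right here or as soon as the
     * handler returns. */
    raise(sig);
}

/* Returns 0 on success, -1 on failure (same as sigaction()). */
int install_logging_handler(int sig, int fd)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = log_then_die;
    sa.sa_flags = SA_RESETHAND;
    sigemptyset(&sa.sa_mask);
    log_fd = fd;
    return sigaction(sig, &sa, NULL);
}

/* Hypothetical demo: returns 1 if the child both logged something and
 * was terminated by `sig`, 0 if not, -1 on error. */
int logged_and_killed_by(int sig)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        close(fds[0]);
        if (install_logging_handler(sig, fds[1]) != 0)
            _exit(2);
        raise(sig);   /* stands in for a real fault */
        _exit(0);
    }

    close(fds[1]);
    char buf[64];
    ssize_t n = read(fds[0], buf, sizeof buf);
    close(fds[0]);

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return n > 0 && WIFSIGNALED(status) && WTERMSIG(status) == sig;
}
```

The design choice is that the log line is extra information on top of the crash, not a replacement for it.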
 
Barry Margolin

In general, if that bad array access is a write, it may completely
mess up some other part of the program.

If the write gets a SIGSEGV, it doesn't actually write anything. The
meaning of that signal is that you tried to write to a nonexistent
virtual address. So if by "mess up some other part of the program" you
meant that it would overwrite that part's data, that obviously can't
happen.

On the other hand, if some other part of the program was expecting to
read what you wrote, it will certainly be messed up by the lack of that
data.

Given the choice between crashing, and continuing to run, I strongly
prefer the crash. If someone desperately needs that program to be
running, they presumably need it to run correctly, and that's highly
unlikely after an ignored SIGSEGV signal.

Agreed. Almost any time a program gets one of these signals, it means
it has a serious bug. It's better to find out that it's broken than to
pretend it isn't.
 
Logan Shaw

Leet said:
I believe you are completely wrong on this point. Very often a SIGSEGV
will be caused by (say) a single bad array access - the consequences
will be highly localized, and carrying on with the program will not
cause any significant problems.

And very often this will not be the case and quitting before more damage
happens is the best thing.

For example, consider a program which moves a tree of files from one
filesystem to another by copying them and then deleting the originals
once the copy finishes successfully. And suppose you get a SIGSEGV
while building the list of files to copy. If you ignore it, you might
end up making copies of half the files, then deleting all the originals!

If you really want to recover from local faults, then use a language that
does bounds checking on arrays and pointer/reference dereferencing and
throws exceptions when these things happen. Then if you know such
errors really won't corrupt the state of the larger program and that the
fault is really localized, you can write an exception handler to do the
error recovery and contain the fault within whatever bounds you've
pre-determined it actually *can* be confined within.
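
C has no exceptions, but the same containment idea can be approximated by checking bounds explicitly and returning an error the caller must handle. This is only a sketch with made-up names, not a substitute for a language that checks every access:

```c
#include <stddef.h>

typedef enum {
    ACCESS_OK = 0,
    ACCESS_OUT_OF_RANGE = 1
} access_status;

/* Read items[index] into *out only if the index is in range.
 * On failure *out is left untouched, so the fault stays confined
 * to this call and the caller decides how to recover. */
access_status checked_get(const int *items, size_t count,
                          size_t index, int *out)
{
    if (items == NULL || out == NULL || index >= count)
        return ACCESS_OUT_OF_RANGE;

    *out = items[index];
    return ACCESS_OK;
}
```

The difference from ignoring SIGSEGV is that the error is detected before any invalid access happens, at a point where the program state is still known.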

Who wants their customer to run their program and have it just crash
with a segfault?

I'd much rather the customer encounter a segfault, file a bug report,
and give me a chance to fix it than I would have it just silently fail
and let the error continue, corrupting data or whatever else for who
knows how many years upon years. There was a trend in business a
decade or two ago called "total quality management" (or TQM), and the
basic idea was that when faults happen, you should not whitewash over
them, and you should instead stop what you're doing and not proceed
until you've corrected the problem. This was carried a little too far
(like most trendy business ideas), but there is some merit to this
approach. Ignoring failures just (a) causes problems and (b) encourages
people to stop caring about whether they cause failures.

- lOGAN
 
Almond

You simply have to fix a bug.
There is no way to know if your illegal access is "acceptable"
or not.

 
Rainer Weikusat

Leet Jon said:
I believe you are completely wrong on this point. Very often a SIGSEGV
will be caused by (say) a single bad array access - the consequences
will be highly localized, and carrying on with the program will not
cause any significant problems.

Who wants their customer to run their program and have it just crash
with a segfault? That hardly comes across as professional. Better to try
your best to carry on and weather the storm than to just dump the user
with a crash.

I can understand that for debugging purposes you might want to have
SIGSEGV etc. generate a core file, but in production code the default
should be for these signals to be ignored.

Let me paraphrase this ('my' here is supposed to mean 'Leet Jon'):
My code is full of occasionally occurring invalid memory accesses,
which I am too lazy to debug, even if I could. But my customers have no
way of knowing this, except that, unfortunately, these invalid memory
accesses lead to the kernel terminating the process. Since they cannot
possibly tell whether some output of the program was generated by
the algorithms they think it performs on the data
they fed to it, or was instead calculated from left-over register
contents from arbitrary functions (which could not be replaced because
of faulting load instructions, with intermediate results having
vanished into nowhere land because the stores intended to save them
faulted too, and not even taking into account that the control flow has
been mostly unpredictable due to corrupted stack frames), they would
just happily accept it. I am convinced it works most of the time.

Now, for the sake of the argument, let's swap 'program' with 'electric
device', 'invalid memory access' with 'improperly isolated flow of
current' and 'works most of the time' with 'only kills someone every now
and then'.

Except for 'traditional lenience' with respect to software, there is no
'functional' difference.
 
Leet Jon

In production code, those signals should never be generated. If they
are, they should crash, so that the user can complain, and someone can
fix it.

Perhaps you are unaware that some C code is run in safety-critical
environments - having a program that dumps core at the drop of a hat
rather than carrying on running could literally be the difference
between life and death. OK, so *maybe* the error condition causing the
SIGSEGV will propagate and bring the program down later, but taking that
chance is a better option than immediately failing.

~Jon~
 
Joachim Schmitz

Leet Jon said:
Perhaps you are unaware that some C code is run in safety-critical
environments - having a program that dumps core at the drop of a hat
rather than carrying on running could literally be the difference
between life and death. OK, so *maybe* the error condition causing the
SIGSEGV will propagate and bring the program down later, but taking that
chance is a better option than immediately failing.

I beg your pardon? A program in a safety-critical environment should be
tested properly so that SIGSEGVs simply don't happen.
Also, I'd rather have the program crash and a human operator take over control.

Example: a flight autopilot. The program gets a SIGSEGV but continues without
telling anybody, and the plane crashes into a mountain as a result of its
wrong calculations. Alternative: the program abends, the system tells the
pilot about it, and the pilot takes over control.

Bye, Jojo
 
Keith Thompson

Leet Jon said:
Perhaps you are unaware that some C code is run in safety-critical
environments - having a program that dumps core at the drop of a hat
rather than carrying on running could literally be the difference
between life and death. OK, so *maybe* the error condition causing the
SIGSEGV will propagate and bring the program down later, but taking that
chance is a better option than immediately failing.

And in those unusual circumstances, you might consider changing the
disposition for SIGSEGV to something other than terminating the
program (though if at all possible, you should log the error for later
analysis).

This does not argue for changing the *default* disposition.
 
Joachim Schmitz

Keith Thompson said:
And in those unusual circumstances, you might consider changing the
disposition for SIGSEGV to something other than terminating the
program (though if at all possible, you should log the error for later
analysis).

Which is something the operating system should do for you.

Bye, Jojo
 
Tor Rustad

Leet said:
Perhaps you are unaware that some C code is run in safety-critical
environments - having a program that dumps core at the drop of a hat
rather than carrying on running could literally be the difference
between life and death.

In safety-critical systems, you don't want to depend on a module in an
undefined state. In a fault-tolerant design, you avoid depending on
single point of failure modules.

How are you supposed to detect a HW fault in time if you ignore
signals/exceptions?

The way this usually works is that faults are not ignored. When one is
detected, the module is taken down by a monitor program, and some error
recovery can be performed by restarting the module. If that fails, the
module is shut down for good.

The system continues working by resuming processing on independent HW,
from a well-defined state.
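
The monitor-and-restart scheme can be sketched on a single POSIX machine with fork() and waitpid(). All names below are hypothetical, and failover to independent hardware is outside what a short example can show. Each run of the module starts in a fresh child, so a restart really does begin from a well-defined state:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef int (*module_fn)(void);

/* Run `module` in a child process. If it dies abnormally, restart it,
 * up to `max_restarts` times. Returns 0 once a run exits cleanly and
 * -1 when the module is shut down for good (or on a system error).
 * The total number of runs is stored in *attempts when it is non-NULL. */
int supervise(module_fn module, int max_restarts, int *attempts)
{
    int runs = 0;
    int result = -1;

    while (runs <= max_restarts) {
        pid_t pid = fork();
        if (pid < 0)
            break;
        if (pid == 0)
            _exit(module() == 0 ? 0 : 1);

        runs++;

        int status = 0;
        if (waitpid(pid, &status, 0) < 0)
            break;

        if (WIFEXITED(status) && WEXITSTATUS(status) == 0) {
            result = 0;   /* clean run, nothing more to do */
            break;
        }
        /* The module faulted or failed: the fault is recorded by the
         * monitor, never ignored, and the loop restarts the module. */
    }

    if (attempts != NULL)
        *attempts = runs;
    return result;
}

/* Two stand-in modules for the demo. */
int healthy_module(void) { return 0; }
int faulty_module(void)  { raise(SIGSEGV); return 0; }
```

With max_restarts set to 2, a module that always faults is run three times and then given up on, while a healthy module runs once.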

OK, so *maybe* the error condition causing the
SIGSEGV will propagate and bring the program down later, but taking that
chance is a better option than immediately failing.

Hopefully, you are not programming a nuclear plant control system.
 
Golden California Girls

Leet said:
Perhaps you are unaware that some C code is run in safety-critical
environments - having a program that dumps core at the drop of a hat
rather than carrying on running could literally be the difference
between life and death. OK, so *maybe* the error condition causing the
SIGSEGV will propagate and bring the program down later, but taking that
chance is a better option than immediately failing.

~Jon~

The error handler, which you must provide, will bring whatever it controls
into a normal or safe state that won't cause death, ring a klaxon, turn on
a warning light, and then crash. It should arrange to make itself not
restartable until it has been pulled from the working environment and placed on
a test bench! Anything else would be criminal.

If that means that the anti-lock brake system light stays on with the check
vehicle light flashing and you are operating on analog backup only then that is
what it means! (To put this into context)


You might obtain a copy of National Bureau of Standards (NBS) Computer Science
and Technology series, Special Publication 500-75, February 1981 "Validation,
Verification, and Testing of Computer Software" by W. Richards Adrion, Martha A.
Branstad, John C. Cherniavsky. Library of Congress Card Number 80-600199. I'm
sure other publications have followed this, but you will get a sense of what the
responsibility of the programmer is to design a test suite to prove the program
works as expected under all conditions expected and unexpected.
 
