Paul Hsieh said:
On Jan 18, 12:34 pm, William Ahern <
[email protected]>
You want bugs to be found during *development time* not just as soon
as circumstances arise. Your compiler is unlikely to set a policy for
buffer overrun checking that is relevant to shipping, but instead is
focused on debugging.
I disagree. I want bugs found as soon as possible. And I always ship
compiled code with debugging symbols turned on. I won't budge on this. I'll
make the release engineer's life a living hell if I have to.
For example on my Windows system, since I have installed Visual
Studio, any application that causes an OS detected memory fault is
trapped by the debugger -- this is interesting and yes indeed, I *DO*
have the skills to reverse engineer and literally fix the code in
assembly language myself, but I really don't find this to be worth my
time (I would rather just send an irate letter to the developer, or
use another application).
So, you send an irate letter to the developer. You'll say something like,
"It don't work". Or, probably as a fellow engineer, you'll try to be more
specific. But, 9 times out of 10, from the developer's perspective, any
complaint is roughly equivalent to "it don't work". Then, said engineer will
need to track down the source of the problem. To debug the issue, so to
speak.
(Granted, most of my experience has been with appliances, and not desktop
software. And maybe this is where things diverge, but I tend to think
not.)
So, say this pretend function returned with an error code because it was
passed a NULL pointer. If you're lucky, the symptom betrays this bug. But,
more often than not, the effect pops up somewhere else. (If it abort()s, it's
only mildly better, IMO.)
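
To make the scenario concrete, here is a minimal sketch in C. The names
copy_name, rename_record and struct record are hypothetical, invented only
for illustration. The helper reports a NULL argument through its return
code, the caller ignores that code, and the only visible symptom is a stale
name that surfaces later and somewhere else.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record type, for illustration only. */
struct record { char name[16]; };

/* Hypothetical "defensive" helper: a NULL argument is swallowed and
   reported only through the return code. */
int copy_name(char *dst, size_t dstlen, const char *src)
{
    if (dst == NULL || src == NULL || dstlen == 0)
        return -1;                  /* caller's bug, quietly reported */
    strncpy(dst, src, dstlen - 1);
    dst[dstlen - 1] = '\0';
    return 0;
}

/* Hypothetical caller that forgets to check the return code. */
void rename_record(struct record *r, const char *new_name)
{
    copy_name(r->name, sizeof r->name, new_name);   /* result ignored */
}

/* Usage sketch: the NULL goes unnoticed here; all that remains is a
   record that still carries its old name. */
void demo(void)
{
    struct record r = { "old" };
    rename_record(&r, NULL);
    assert(strcmp(r.name, "old") == 0);
}
```

Nothing fails at the call that passed NULL. Whoever later reads r.name sees
"old" and has to work backwards from there.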
Now, poor developer is enlisted to fix the product. More than likely there's
more than 1 bug, and more than 1 bug which might possibly cause the
symptom(s). So, he investigates. Ah, bug! Fix. Give to customer. "Try this".
Customer thinks, "Speedy service, I like these people". A week or two
passes. Same thing happens. Calls back. "It's broken! I thought you guys
fixed this?"
Repeat, ad nauseam.
Now, if the darned program had simply core dumped at the pointer dereference,
and debug symbols were turned on, then the chances of fixing this bug on the
first try are much, much better. And even though most developers will
[rightly] cringe at the notion of asking poor customer for a core file, or
getting their hands dirty, so to speak, at the end of the day the problem is
fixed.
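
A rough sketch of that setup, assuming a POSIX system where core files are
governed by RLIMIT_CORE. The names enable_core_dumps, record_id and
struct record are hypothetical. The first function raises the soft core
limit to the hard limit at startup; the second has no "helpful" NULL check,
so on the usual hosted platforms a NULL argument faults at the dereference
itself (formally undefined behavior in C, in practice a SIGSEGV).

```c
#include <assert.h>
#include <sys/resource.h>

/* Hypothetical record type, for illustration only. */
struct record { int id; };

/* Raise the soft core-file size limit to the hard limit so a crash can
   leave a core file. Returns 0 on success, -1 on failure. */
int enable_core_dumps(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;
    rl.rlim_cur = rl.rlim_max;
    return setrlimit(RLIMIT_CORE, &rl);
}

/* No defensive check: a NULL r faults on this line, so the core file
   points at the place where the bad pointer was first used. */
int record_id(const struct record *r)
{
    return r->id;
}

/* Usage sketch: call once at startup; valid records are unaffected. */
void startup_demo(void)
{
    struct record r = { 7 };
    int rc = enable_core_dumps();

    assert(rc == 0);
    (void)rc;
    assert(record_id(&r) == 7);
}
```

Built with debug symbols (for example -g with gcc or clang), the resulting
core file can be opened in a debugger such as gdb and a backtrace will name
the faulting line. Whether a core file is actually written also depends on
system configuration outside the program's control.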
Customers just want stuff fixed. Fixed now. Fixed today. They don't want 6
weeks of release engineering and testing. They don't want the equivalent of
"try this".
The real reason bugs get shipped is not just because code is not
tested, but also because it is not tested in enough foreign
scenarios. This can be very expensive for a small development house
(which I would prefer to support, versus some behemoth developer).
I.e., the "crash-fast" policy basically fails for shipping
applications.
Clearly we disagree.
In my experience, limping along is worse. It creates this illusion that
things work, but it's oh-so funky. Things work, they don't work. Customer
spends ages and ages on the support line. The first 3 calls the support
technician can't find any issue. "Works for me", he tells the customer.
"I'll keep an eye on it". Or, "I'll see what I can do". As if....
Frustration. That's all that's created by hiding bugs. It wastes my time, it
wastes the customer's time.
There's much to be said for graceful failure. Even limping along. But,
usually you don't do this at so low a level. And nearly always you don't do
this at the expense of prolonging the pain.
The closer a bug's effects are to the source of the issue, not only does it
help the developer, it helps the customer. Customers are savvy. I won't
bother throwing anecdotes about Microsoft software in here, but customers
know what bugs are, and they know how to work around them. If they do X, and
Y happens, then they'll stop doing X until the product is fixed. If they
do X but Y might happen, or Z might happen. Or if W or X might cause Y or Z.
That's sheer frustration. It leaves everybody involved powerless when effect
is remote from cause. It's incumbent on developers to mitigate this kind of
behavior.
Very rarely is there anything to be gained by hiding a bug with obfuscation
and superfluous "helpful" code.
I have found that for supporting legacy, or decrepit old code which is
buggy, something as simple as an "is this pointer valid" function to be
very effective in whipping such code into some state of usability.
(This is platform specific, of course.) It seems that what Michael B.
Allen is doing might have a similar effect. It's very hard for me to
frown upon that knowing first hand how useful such a thing is. If the
application is written in the right way, often it's possible to survive
and continue to be useful even in light of a detected error that would
otherwise be fatal if unhandled.
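
For readers wondering what such an "is this pointer valid" function might
look like, here is one platform-specific sketch. It is not the code either
poster used; it is a commonly seen POSIX trick, and is_readable_byte is a
hypothetical name. It asks the kernel to copy one byte from the pointer into
a pipe and treats an EFAULT error (as Linux and similar systems report for
an unmapped address) as "not readable". It only says the byte was mapped at
the moment of the probe, not that the pointer refers to the object the
caller expects.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Returns 1 if one byte at p could be read by the kernel, 0 if the
   kernel reported EFAULT, -1 if the probe itself could not be run. */
int is_readable_byte(const void *p)
{
    int fds[2];
    int result;
    ssize_t n;

    if (pipe(fds) != 0)
        return -1;

    do {
        n = write(fds[1], p, 1);    /* kernel copies from p, or fails */
    } while (n < 0 && errno == EINTR);

    if (n == 1)
        result = 1;
    else if (n < 0 && errno == EFAULT)
        result = 0;
    else
        result = -1;

    close(fds[0]);
    close(fds[1]);
    return result;
}

/* Usage sketch: a stack variable is readable, a NULL pointer is not
   (assuming page zero is unmapped, as on typical systems). */
void probe_demo(void)
{
    int x = 42;

    assert(is_readable_byte(&x) == 1);
    assert(is_readable_byte(NULL) == 0);
}
```

Each probe costs several system calls, so this suits propping up legacy code
at a few suspect call sites rather than checking every pointer.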
Except, when this is proper to do, you know it. You know it, because all the
alternatives suck. In those cases you don't think to yourself, "Hmmmm, is
this a good idea?" You think, "Forgive me for I have sinned." Then hopefully
you leave a comment saying, "It's not my fault! I had to do it, and here's
why...."
What I gleaned from the OP's proposition was an attempt at considering such
usage normative. But when it's the right thing to do, it's not a matter of
policy; it's sheer logic. You have no practicable alternatives, ergo...
Bugs are a fact of life in code, and being overly vigilant to the
standard can be counter-productive to dealing with this.
Agreed.
- Bill