Motivation of software professionals

  • Thread starter Stefan Kiryazov

Brian

Free (as in beer) software brings up an interesting set of arguments. If
I understand your point correctly, it's this: if a product is free, how can
one possibly sue its maker for flaws in it? Correct me if I'm wrong.

I have my own thoughts on this topic but I simply want to make sure what
we're discussing.

Imagine driving by a house and seeing a car in front with
this sign -- "Free car." It is your responsibility to
check out the car. If I were interested in that car, I'd
talk to the giver of the car, check out the car for
myself (is it stolen?) and then either drive it carefully
to a mechanic or have a mechanic come to the car. After
that I'd be the only one that rides in the car for a
month or two to be more certain that it is in fact a safe
car. As long as the giver reveals any known problems
about the car to me, I don't think there's any basis for
suing him if the car is later found to have a serious problem.


Brian Wood
http://webEbenezer.net
(651) 251-9384
 

Seebs

Where do you get your conclusions that there was much software out there
that was worth re-writing eighteen years ahead of time? Remember to
allow for compound interest on the money invested on that development...

I'm using tens of thousands of lines of code right now that are over twenty
years old. It's called "OS X", and it contains a large hunk of code that
was written either at Berkeley in the 80s or at NeXT in the 80s.

We're still using classes with names like "NSString", where "NS" stands for
NeXTStep. You know, the company that folded something like fifteen years
ago, and wrote this stuff originally prior to 1990?

Heck, I have code *I personally wrote* 19 years ago and which I still use.
It was last touched in any way in 1998, so far as I can tell. It's been
untouched ever since, *because it works correctly*.

And honestly, here's the big thing:

In reality, I do not think that writing things correctly costs that much
more. Because, see, it pays off in debugging time. The rule of thumb they
use at $dayjob is that in the progression from development to testing and
then to users, the cost of fixing a bug goes up by about a factor of 10
at each level. That seems plausible to me. So if I spend an extra day
on a project fixing a couple of bugs, those bugs would probably cost about
10 days total people-time if caught in testing, and 100 if caught by
our customers. And 1000+ if caught by their customers.

This is why I have successfully proposed "this is too broken to fix, let's
build a new one from scratch" on at least one occasion, and gotten it done.

-s
 

Bo Persson

MarkusSchaber said:
Hi,



Maybe not the locksmith itself, but there are insurance companies
which calculate how high the risk is, and they take that liability.

For locks, cars, even airplanes, insurance companies do that all the
time. But there are only a few cases where this is done for
software.

Is it? What about the software that controls the locks, cars, and
airplanes?


Bo Persson
 

Lew

Andy said:
Pretty well everything I saw back in 1982 was out of use by 1999. How
much software do you know that made the transition?

Pretty much everything I saw back in 1982 is in production to this day, never
mind 1999.

Pretty much everything that had Y2K issues in 1999 had been in production
since the 1980s or earlier. By the 90s, more software was written without
that bug.

Again, why do you think Y2K was such an issue, if affected software had gone
out of production by then?
Let's see.. Operating systems. The PC world was... umm.. CP/M 80? Maybe
MS-Dos 1.0? And by 1999 I was working on drivers for Windows 2000.
That's at least two, maybe three depending how you count it, ground-up
re-writes of the OS.

PCs were not relevant in 1982. PCs largely didn't have Y2K issues; it was
mainly a mainframe issue.
With that almost all the PC apps had gone from 8 bit versions in 64kb of
RAM to 16-bit DOS to Windows 3.1 16-bit with non-preemptive multitasking
and finally to a 32-bit app with multi-threading and pre-emptive
multitasking running in hundreds of megs.

Largely irrelevant to the discussion of Y2K issues, which were a mainframe
issue for the most part.

PCs were not in common use in 1982.
OK, so how about embedded stuff? That dot-matrix printer became a
laserjet. The terminal concentrator lost its RS232 ports, gained a
proprietary LAN, then lost that and got ethernet. And finally
evaporated in a cloud of client-server computing smoke.

Not relevant to the discussion of Y2K issues.
I'm not so up on the mainframe world - but I'll be surprised if the
change from dumb terminals to PC clients didn't have a pretty major
effect on the software down the back.

This was mainframe stuff. Most PC software didn't have Y2K bugs, and there
weren't PCs in common use in 1982.

PCs have had negligible effect on mainframe applications, other than to
provide new ways of feeding them.
Where do you get your conclusions that there was much software out there
that was worth re-writing eighteen years ahead of time? Remember to
allow for compound interest on the money invested on that development...

Software development costs are inversely proportional to the fourth power of
the time allotted. That's way beyond the inflation rate.

Y2K repair costs were inflated by the failure to deal with them early, not
reduced.

The point of my example wasn't that Y2K should have been handled earlier, but
that the presence of the bug was not due to developer fault but management
decision, a point you ignored.
 

Flash Gordon

Andy said:
Pretty well everything I saw back in 1982 was out of use by 1999. How
much software do you know that made the transition?

OK, so how about embedded stuff? That dot-matrix printer became a
laserjet. The terminal concentrator lost its RS232 ports, gained a
proprietary LAN, then lost that and got ethernet. And finally
evaporated in a cloud of client-server computing smoke.

I know there is software flying around today that is running on Z80
processors (well, the military variant of them) and the plan in the late
90s was for it to continue for another 20 years (I don't know the
details, but a customer signed off on some form of ongoing support
contract). Admittedly the software I used was not doing date processing
(apart from the test rigs, which used the date on printouts, which I
tested to "destruction" which turned out to be 2028).

So yes, software from the 80s is still in active use today in the
embedded world and planned to be in use for a long time to come.
I'm not so up on the mainframe world - but I'll be surprised if the
change from dumb terminals to PC clients didn't have a pretty major
effect on the software down the back.

Where do you get your conclusions that there was much software out there
that was worth re-writing eighteen years ahead of time? Remember to
allow for compound interest on the money invested on that development...

Remember to allow for the fact that software does continue to be used
for a *long* time in some industries.
 

Wojtek

Lew wrote :
The point of my example wasn't that Y2K should have been handled earlier, but
that the presence of the bug was not due to developer fault but management
decision, a point you ignored.

At the time (70's etc) hard drive space was VERY expensive. All sorts
of tricks were being used to save that one bit of storage. Remember
COBOL's packed decimal?

So the decision to drop the century from the date was not only based on
management but on hard economics.

Which, I will grant, is not a technical decision, though the solution
was...

And at the time Y2K was created it was not a bug. It was a money saving
feature. Probably worth many millions.
 

Lew Pitcher

Wojtek wrote:

At the time (70's etc) hard drive space was VERY expensive. All sorts
of tricks were being used to save that one bit of storage. Remember
COBOL's packed decimal?

Packed decimal (the COBOL COMP-3 datatype) wasn't a "COBOL" thing; it was an
IBM S370 "mainframe" thing. IBM's 370 instruction set included a large
number of operations on "packed decimal" values, including data conversions
to and from fixed-point binary, and math operations. IBM's COBOL took
advantage of these facilities with the (non-ANSI) COMP-3 datatype.
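
For anyone who hasn't met packed decimal, the rough C sketch below (invented
names, not taken from any IBM or COBOL source, no range checking) mimics the
COMP-3 layout: two decimal digits per byte with a sign nibble at the end, so
a 6-digit YYMMDD date packs into 4 bytes while an 8-digit YYYYMMDD needs 5.

#include <stdio.h>

/*
 * Illustration only: pack an unsigned decimal value the way S/370 packed
 * decimal (COBOL COMP-3) lays it out -- two digits per byte, with the low
 * nibble of the last byte holding the sign (0xC = positive).
 */
static void pack_decimal(unsigned long value, unsigned char *out, int outlen)
{
    int i;

    out[outlen - 1] = 0x0C;                         /* positive sign nibble */
    out[outlen - 1] |= (value % 10) << 4;           /* last digit           */
    value /= 10;
    for (i = outlen - 2; i >= 0; i--) {
        out[i] = value % 10;  value /= 10;          /* low nibble           */
        out[i] |= (value % 10) << 4;  value /= 10;  /* high nibble          */
    }
}

static void show(const char *label, const unsigned char *buf, int len)
{
    int i;
    printf("%s:", label);
    for (i = 0; i < len; i++)
        printf(" %02X", buf[i]);
    printf("  (%d bytes)\n", len);
}

int main(void)
{
    unsigned char yymmdd[4], yyyymmdd[5];

    pack_decimal(991231UL, yymmdd, 4);        /* 99-12-31   */
    pack_decimal(19991231UL, yyyymmdd, 5);    /* 1999-12-31 */
    show("YYMMDD   packed", yymmdd, 4);       /* prints 09 91 23 1C    */
    show("YYYYMMDD packed", yyyymmdd, 5);     /* prints 01 99 91 23 1C */
    return 0;
}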

As for Y2K, there was no "space advantage" in using COMP-3, nor was there an
overriding datatype-reason to store dates in COMP-3. While "space
requirements" are often given as the reason for the Y2K truncated dates,
the truncation usually boiled down to three different reasons:
1) "That's what the last guy did" (maintaining existing code and design
patterns),
2) "We'll stop using this before it becomes an issue" (code longevity), and
3) "We will probably rewrite this before it becomes an issue"
(designer/programmer "laziness").

Space requirements /may/ have been the initial motivation for truncated
dates, but that motivation ceased being an issue in the 1970's, with
cheap(er) high(er) density data storage.

FWIW: I spent 30+ years designing, writing, and maintaining S370 Assembler
and COBOL programs for a financial institution. I have some experience in
both causing and fixing the "Y2K bug".
So the decision to drop the century from the date was not only based on
management but on hard economics.

Which, I will grant, is not a technical decision, though the solution
was...

And at the time Y2K was created it was not a bug.

I agree.
It was a money saving feature. Probably worth many millions.

I disagree. It was a money-neutral feature (as far as it was a feature) that
would have (and ultimately did) cost millions to change.

Alone, it didn't save much: there's enough wasted space at the end of each
of those billions of mainframe records (alignment issues, don't you know)
to easily have accommodated two more digits (one 8-bit byte) in each
critical date recorded.

The cost would have been in time and manpower (identifying, coding, testing,
& conversion) to expand those date fields after the fact. And, that's
exactly where the Y2K costs wound up. /That's/ the expense that management
didn't want in the 70's and 80's (and got with interest in the 90's).
 

Seebs

Space requirements /may/ have been the initial motivation for truncated
dates, but that motivation ceased being an issue in the 1970's, with
cheap(er) high(er) density data storage.

Furthermore, what with the popularity of 30-year mortgages, people were
dealing with Y2K in or before 1970...

-s
 

Nick Keighley

I am not an expert at law, so I cannot reason about justification or
necessity. However, I do recall quite a few "mishaps" and software
bugs that cost both money and lives.
Let's see: a) Mariner I, b) 1982, an F-117 crashed, can't recall if
the pilot made it, c) the NIST has estimated that software bugs cost
the US economy $59 billion annually, d) 1997, radar software
malfunction led to a Korean jet crash and 225 deaths, e) 1995, a
flight-management system presents conflicting information to the
pilots of an American Airlines jet, who got lost, crashed into a
mountain, leading to the deaths of 159 people, f) the crash of Mars
Polar Lander, etc. Common sense tells me that certain people bear
responsibility over those accidents.
http://catless.ncl.ac.uk/risks


How can anybody ignore this? Do more people have to die for us to
start educating software engineers about responsibility, liability,
consequences? Right now, CS students learn that an error in their
program is easily solved by adding carefully placed printf()'s or
running inside a debugger, and that the worst consequence if the TA
discovers a bug in their project solution is maybe 1/10 lesson
credits.

I was exposed to the same mentality, but it's totally fucked up.



So what? We already know how to write more reliable software, it's
just that we don't care.
 

Arved Sandstrom

Leif said:
Not really. Remember, you can pack 256 years into a single 8 bit byte if
you want to, but in most cases of the Y2K problem people had stored a
resolution of 100 years into two bytes -- quite wasteful of space.

In some cases it came from a too tight adherence to the manual business
process that was modeled -- remember the paper forms with "19" pre-printed
and then two digits worth of space to fill out? Those got computerised
and the two-digit year tagged along.

In other cases it boiled down to "this is how we've always done it."

As to the first case, one large app I am familiar with is a J2EE
modernization of a legacy mainframe program. The business users, for
decades, have produced lots of paper, mostly for auditing reasons. All
the printed output has gone into folders. When the mainframe program was
written it included the folder concept, although the application did not
need it at all. Even better, when the application was modernized into J2EE
a few years ago, the developers at the time made strong representations to
the business to leave the folder concept, and all of its associated baggage,
completely out. No such luck - it's still there. The folder concept has to
be maintained in the application and carried through its transactions, all
the folder-related book-keeping is duplicated in the application, and none
of it matters to the proper functioning of the application. In fact it's
completely irrelevant, period, but try selling that.

This kind of thing happens all the time, and it results from a failure
to identify and model the real business processes, as opposed to the
things that simply look like business processes. Back when they had
typewriters the folders made sense...like in the '60's.

AHS
 

Arved Sandstrom

Leif said:
Imagine the cook at a soup kitchen storing raw and fried
chicken in the same container.

Or imagine a company giving away a free game as a marketing
stunt and the game turns out to have been infected with a virus
that formats the users' hard-drives.

Or imagine the author of an open-source product not paying
sufficient attention and accepting a patch from a third party
which turns out to have included a backdoor, providing full
access to any system where the program is running.

This is what I am getting at, although we need to have Brian's example
as a baseline. In this day and age, however, I'm not convinced that a
person could even give away a free car (it wouldn't be free in any case,
it would still get taxed, and you'd have to transfer title) and be
completely off the hook, although 99 times out of 100 I'd agree with
Brian that it's not a likely scenario for lawsuits.

With software the law is immature. To my way of thinking there are some
implied obligations that come into effect as soon as a software program
is published, regardless of price. Despite all the "legal" disclaimers
to the effect that all the risk is assumed by the user of the free
software, the fact is that the author would not make the program
available unless he believed that it worked, and unless he believed that
it would not cause harm. This is common sense.

I don't know if there is a legal principle attached to this concept, but
if not I figure one will get identified. Simply put, the act of
publishing _is_ a statement of fitness for use by the author, and to
attach completely contradictory legal disclaimers to the product is
somewhat absurd.

It's early days, and clearly software publishers are able to get away
with this for now. But things may change.

AHS
 

Michael Foukarakis

You say that like the developers were at fault.  I cannot tell you how many
times I've seen management overrule developers who wanted to make things
right.  It's been the overwhelming majority of the time, though.  I recall a manager in
1982 refusing to let a team fix the Y2K bug in the project.  Many good
developers have grown resigned to the policies and have given up pushing for
quality.  Many more use stealth quality - they simply don't tell management
they're doing things in an unauthorized way that's better than the official
process.  Only rarely in the last thirty years have I encountered
management alignment with known best practices.

If management overrules developers there should also be a clear,
concise, legal way of assuming responsibility. I'm not blaming
developers; I'm saying they shouldn't be exempt from law (which they
aren't) but they should also be aware of that (which they aren't).
Nearly all projects I've worked on involved many programmers, dozens even.
Parts are written independently of each other, often over a period of years.
Often each part tests perfectly in isolation and only reveals bugs emergently
under production conditions.

Same old, same old. Bugs of that type emerge when a module is used in
a way not compliant to its interface specification. There's still
someone to blame - the moron that didn't RTFM.
Many of those projects had large test teams.  Products have passed all the
tests, yet still failed to meet spec in production.

There's an easy explanation for that. Most of the time, software is
written to satisfy tests, particularly so in TDD. "Our software passes
the tests, because it was made to pass the tests. Ergo, it works." and
then they gasp in amazement at the first bugs.
Sometimes the provided test environment differed significantly from the
production environment.

And dozens of other factors that must be taken into account.
Carelessness leads to errors. Sometimes fatal ones.
Before you make the developer liable, you'd better darn well be certain the
developer is actually the one at fault.

I've already made clear that this is neither my job, nor my interest.
That's why law officials, judges, law systems, etc. exist. But it's a
reality we have to educate ourselves about.
 

Martin Gregorie

No, it was a bug that wasted a byte and threw away data. And it's still
a bug - some of the "solutions" adopted by the industry just shifted the
problem on a little, by using a "century window" technique. That will
catch up with us eventually.
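
As an illustration of that "century window" workaround, here is a minimal C
sketch; the pivot value of 70 is an arbitrary choice for the example, not
taken from any particular product.

#include <stdio.h>

/*
 * Sketch of a "century window": two-digit years at or above the pivot are
 * read as 19xx, the rest as 20xx.  The pivot (70 here) is arbitrary for
 * this example.  The bug isn't removed, only deferred -- this code starts
 * misreading dates again in 2070.
 */
static int expand_two_digit_year(int yy)
{
    const int pivot = 70;
    return (yy >= pivot) ? 1900 + yy : 2000 + yy;
}

int main(void)
{
    int samples[] = { 99, 70, 69, 28, 0 };
    size_t i;

    for (i = 0; i < sizeof samples / sizeof samples[0]; i++)
        printf("%02d -> %d\n", samples[i], expand_two_digit_year(samples[i]));
    return 0;
}
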
Let's not forget that up to some time in the '90s COBOL could not read the
century, which created a blind spot about four digit years in many IT
people, COBOL being the language of choice for many mainframe systems
(and a lot of minicomputers too, thanks to the quality of the Microfocus
implementation).

Until CODASYL changed the language spec, some time in the mid '90s, the
only way you could get the date from the OS was with the "ACCEPT CURRENT-
DATE FROM DATE." where CURRENT-DATE could only be defined as a six digit
field:

01  CURRENT-DATE.
    05  CD-YY    pic 99.
    05  CD-MM    pic 99.
    05  CD-DD    pic 99.
 

Seebs

With software the law is immature. To my way of thinking there are some
implied obligations that come into effect as soon as a software program
is published, regardless of price. Despite all the "legal" disclaimers
to the effect that all the risk is assumed by the user of the free
software, the fact is that the author would not make the program
available unless he believed that it worked, and unless he believed that
it would not cause harm. This is common sense.

Common sense has the interesting attribute that it is frequently totally
wrong.

I have published a fair amount of code which I was quite sure had at
least some bugs, but which I believed worked well enough for recreational
use or to entertain. Or which I thought might be interesting to someone
with the time or resources to make it work. Or which I believed worked in
the specific cases I'd had time to test.

I do believe that software will not cause harm *unless people do something
stupid with it*. Such as relying on it without validating it.
I don't know if there is a legal principle attached to this concept, but
if not I figure one will get identified. Simply put, the act of
publishing _is_ a statement of fitness for use by the author, and to
attach completely contradictory legal disclaimers to the product is
somewhat absurd.

I don't agree. I think it is a reasonable *assumption*, in the lack of
evidence to the contrary, that the publication is a statement of *suspected*
fitness for use. But if someone disclaims that, well, you should assume that
they have a reason to do so.

Such as, say, knowing damn well that it is at least somewhat buggy.

Wind River Linux 3.0 shipped with a hunk of code I wrote, which is hidden
and basically invisible in the infrastructure. We are quite aware that it
had, as shipped, at least a handful of bugs. We are pretty sure that these
bugs have some combination of the following attributes:

1. Failure will be "loud" -- you can't fail to notice that a particular
failure occurred, and the failure will call attention to itself in some
way.
2. Failure will be "harmless" -- operation of the final system image
built in the run which triggered the failure will be successful because
the failure won't matter to it.
3. Failure will be caught internally and corrected.

So far, out of however many users over the last year or so, plus huge amounts
of internal use, we've not encountered a single counterexample. We've
encountered bugs which had only one of these traits, or only two of them,
but we have yet to find an example of an installed system failing to operate
as expected as a result of a bug in this software. (And believe me, we
are looking!)

That's not to say it's not worth fixing these bugs; I've spent much of my
time for the last couple of weeks doing just that. I've found a fair number
of them, some quite "serious" -- capable of resulting in hundreds or thousands
of errors... All of which were caught internally and corrected.

The key here is that I wrote the entire program with the assumption that I
could never count on any other part of the program working. There's a
client/server model involved. The server is intended to be robust against
a broad variety of misbehaviors from the clients, and indeed, it has been
so. The client is intended to be robust against a broad variety of
misbehavior from the server, and indeed, it has been so. At one point in
early testing, a fairly naive and obvious bug resulted in the server
coredumping under fairly common circumstances. I didn't notice this for two
or three weeks because the code to restart the server worked consistently.
In fact, I only actually noticed it when I noticed the segfault log messages
on the console...

A lot of planning goes into figuring out how to handle bad inputs, how
to fail gracefully if you can't figure out how to handle bad inputs, and so
on. Do enough of that carefully enough and you have software that is at
least moderately durable.
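
As a generic sketch of that style (not the code being discussed; the "port"
field is an invented example), the C below checks every incoming field and
turns anything unexpected into a reportable error rather than trusting it.

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Minimal sketch of "never trust the other side": validate each field of an
 * incoming request, and make every rejection loud and diagnosable instead of
 * letting bad input propagate.
 */
enum parse_result { PARSE_OK, PARSE_BAD_FORMAT, PARSE_OUT_OF_RANGE };

static enum parse_result parse_port(const char *text, unsigned *port_out)
{
    char *end;
    unsigned long value;

    if (text == NULL || *text == '\0')
        return PARSE_BAD_FORMAT;

    errno = 0;
    value = strtoul(text, &end, 10);
    if (errno != 0 || *end != '\0')
        return PARSE_BAD_FORMAT;          /* overflow or trailing junk  */
    if (value < 1 || value > 65535)
        return PARSE_OUT_OF_RANGE;        /* well-formed but not a port */

    *port_out = (unsigned)value;
    return PARSE_OK;
}

int main(void)
{
    const char *inputs[] = { "8080", "99999", "80x", "", NULL };
    unsigned port;
    size_t i;

    for (i = 0; i < sizeof inputs / sizeof inputs[0]; i++) {
        enum parse_result r = parse_port(inputs[i], &port);
        if (r == PARSE_OK)
            printf("accepted port %u\n", port);
        else
            fprintf(stderr, "rejected %s input: \"%s\"\n",
                    r == PARSE_BAD_FORMAT ? "malformed" : "out-of-range",
                    inputs[i] ? inputs[i] : "(null)");
    }
    return 0;
}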

-s
p.s.: For the curious: It's something similar-in-concept to the "fakeroot"
tool used on Debian to allow non-root users to create tarballs or disk images
which contain filesystems with device nodes, root-owned files, and other
stuff that allows a non-root developer to do system development for targeting
of other systems. It's under GPLv2 right now, and I'm doing a cleanup pass
after which we plan to make it available more generally under LGPL. When
it comes out, I will probably announce it here, because even though it is
probably the least portable code I have EVER written, there is of course a
great deal of fairly portable code gluing together the various non-portable
bits, and some of it's fairly interesting.
 

Nick Keighley

Same old, same old. Bugs of that type emerge when a module is used in
a way not compliant to its interface specification. There's still
someone to blame - the moron that didn't RTFM.

cf. Ariane 5

you're assuming there *is* an interface specification. And that it is
unambiguous. I submit that unless these things are written *very*
carefully there are going to be odd interactions between sub-systems.


whilst test teams are good they are only half (or less) of the
solution.


the testing was inadequate then. System test is supposed to test
compliance with the requirement.
There's an easy explanation for that.
maybe


Most of the time, software is
written to satisfy tests, particularly so in TDD. "Our software passes
the tests, because it was made to pass the tests. Ergo, it works." and
then they gasp in amazement at the first bugs.

there is confusion between two types of testing. TDD is about
producing "an executable specification". You come close to "proving"
that the software does what you expect of it.
Of course "what you expect" ain't necessarily what the customer asked
for (and probably a million miles away from what he /wanted/!). The
System Test people do black box testing (no access to internals) and
demonstrate that it meets the requirement. The customer then witnesses
a System Acceptance Test (often a cut-down version of System test plus
some goodies of his own (sometimes just ad hoc "what does this do
then?")).

Skipping either of these leads to problems. TDD type tests don't test
against the requirement (Agile people often purport to despise formal
requirements http://en.wikipedia.org/wiki/Big_Design_Up_Front). And they
often run in non-production environments. Maybe even on the wrong
hardware. Relying only on System Test leads to subtle internal
faults. "it goes wrong when we back up exactly 32 characters on a
message of exactly this size". Systems put together out of untested
components are a ***GIT*** to debug.
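
A toy C illustration of the "executable specification" idea (the function and
its cases are invented here, not taken from anyone's project): the test states,
in runnable form, what the code is expected to do, while saying nothing about
whether that expectation matches what the customer actually required -- which
is what System and Acceptance testing are for.

#include <assert.h>
#include <stdio.h>

/* clamp() and its expected cases are made up purely for this illustration. */
static int clamp(int value, int lo, int hi)
{
    if (value < lo) return lo;
    if (value > hi) return hi;
    return value;
}

int main(void)
{
    assert(clamp(5, 0, 10) == 5);    /* in range: unchanged  */
    assert(clamp(-3, 0, 10) == 0);   /* below range: clamped */
    assert(clamp(42, 0, 10) == 10);  /* above range: clamped */
    puts("unit-level 'specification' holds");
    return 0;
}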


oh yes. Sometimes we don't see some of the hardware until we are on a
customer site.
 

Nick Keighley

I'm terribly sorry, but I didn't get your point, if there was one.
Seriously, no irony at all. Care to elaborate?

oh, sorry. You were listing "software bugs that cost both money and
lives", I thought your list was a bit light (Ariane and Therac spring
to mind immediately). I thought you might not have come across the
RISKs forum that discusses many computer related (and often software
related) bugs.
 

Martin Gregorie

the testing was inadequate then. System test is supposed to test
compliance with the requirement.
Quite. System tests should at least be written by the designers, and
preferably by the commissioning users.

Module tests should NOT be written by the coders.
The System Test people do black box
testing (no access to internals) and demonstrate that it meets the
requirement. The customer then witnesses a System Acceptance Test (often
a cut-down version of System test plus some goodies of his own
(sometimes just ad hoc "what does this do then?")).
These are the only tests that really count apart from performance testing.

It's really important that the project manager keep an eye on all levels
of testing and especially on how the coders design unit tests or it can
all turn to worms.
 

Lew Pitcher

Let's not forget that up to some time in the '90s COBOL could not read the
century,

.... using its built-in date functions ...
which created a blind spot about four digit years in many IT
people, COBOL being the language of choice for many mainframe systems
(and a lot of minicomputers too, thanks to the quality of the Microfocus
implementation).

Until CODASYL changed the language spec, some time in the mid '90s, the
only way you could get the date from the OS

.... through the COBOL language itself ...
was with the "ACCEPT CURRENT-
DATE FROM DATE."
[snip]

Which is why "ACCEPT CURRENT-DATE" wasn't used very much in my shop.

Rather, we read current-date using an external utility module ("TDDATE")
which retrieved the full current date /and time/, and even adjusted it (if
required by options) for start-of-business-day (i.e. dates changed at 8AM,
not at midnight).

I don't doubt that many other serious COBOL shops had similar facilities.
 
