There are always reasons for software to have faults. An early example, from
the late ’60s, is something I did. Here’s the story.
I was taking a course on machine architecture and programming, and we had an
exercise writing an IBM 360 assembler program. I forget what the program was
supposed to do, but I wrote it and submitted it to run overnight, as was
the practice in those days. Next morning I was called into the Computer Center’s
office and got a serious tongue lashing for having destroyed all the output
produced by all the programs run on the University’s computers the previous day.
It turned out that what had happened was the following:
1. I had typed “4” instead of “3” (or maybe the other way around) in one
column of one of my punch cards.
2. My program crashed and the computer proceeded to do a program dump.
3. The usual very small print line limit for student programs was ignored for
program dumps, on the theory that you really want to know what went wrong.
4. The effect of my mistyping was that a method/function stack frame was linked
to itself as its own “parent”. So the printout was infinitely long.
5. The disk software assumed that the original small limit on how much output a
program produced was still in effect, and had no check on exceeding the
capacity of the disk drive.
6. The disk filled up and the system kept trying to write. The system crashed.
7. The computer staff (by now it was about 2 a.m.) rebooted the system
and it crashed again: the disk drive software was corrupted at this point.
8. After a few attempts at that, they finally reformatted the disk, and
rebooted the machine. And all the output on that disk was lost.
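
To make steps 4 and 5 concrete, here is a minimal Python sketch of the kind of
check that was missing. It is not the original IBM 360 code, which I no longer
have; the names Frame, dump_frames and max_lines are all hypothetical. It walks
a chain of stack frames the way a dump routine might, and stops if a frame
shows up twice or if the output passes a line limit.

```python
class Frame:
    """A hypothetical stack frame: a name plus a link to its caller's frame."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


def dump_frames(frame, max_lines=1000):
    """Walk the parent links and return one output line per frame.

    The walk stops at a frame that was already visited (a cycle) or once
    max_lines frame lines have been produced, so a corrupted chain cannot
    generate unbounded output.
    """
    lines = []
    seen = set()
    while frame is not None:
        if id(frame) in seen:
            lines.append(f"cycle detected at frame {frame.name!r}; dump stopped")
            break
        if len(lines) >= max_lines:
            lines.append("line limit reached; dump truncated")
            break
        seen.add(id(frame))
        lines.append(f"frame {frame.name}")
        frame = frame.parent
    return lines


# A healthy chain: helper was called from main.
main = Frame("main")
helper = Frame("helper", parent=main)

# The effect of the typo: a frame that is its own parent.
broken = Frame("broken")
broken.parent = broken
```

Calling dump_frames(helper) visits two frames and stops at the end of the
chain. Calling dump_frames(broken) stops at the second visit to the same frame
instead of printing forever. Either check alone would have bounded the output;
having both means neither piece has to rely on an assumption made by the other.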
The thing is that there were two conflicting assumptions made in the design of
two separate pieces of system software. And otherwise smart people hadn’t
thought about such a conflict. The further assumption that an undergraduate
should never make a typo compounded the problem. I forget what they eventually
did: I think checks were added. Their first idea for a solution was to not let
me run any more programs on the computer, but they fortunately rethought that
one. At the time I didn’t think that I was the culprit.
The point is that large systems are always made up of multiple components, made
by multiple people, with multiple assumptions. And there’s always room for
error. That said, a lot can be done about that, and errors can be minimized.
But they are intrinsic to how things work.
Sam.
------------
Sam Wilmott
sam(_at_)wilmott(_dot_)ca
www.wilmott.ca
“We have met the enemy and he is us.” — Walt Kelly
--~----------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
--~--