SOFTWARE HORROR STORIES
The time is now
- The Mars Climate Orbiter crashed in September
1999 because of a "silly mistake": wrong units in a program. Story Story Report
- The 1988 shooting down of an Airbus A300 by the
USS Vincennes was attributed to the cryptic and misleading output
displayed by the tracking software. Story
- Death resulted from inadequate testing of the
London Ambulance Service software. Story
- Several 1985-7 deaths of cancer patients were
due to overdoses of radiation resulting from a race condition between
concurrent tasks in the Therac-25 software. Report
- Errors in medical software have caused deaths.
Details in B.W. Boehm, "Software and its Impact: A Quantitative
Assessment," Datamation, 19(5), 48-59(1973).
- An Airbus A320 crashes at an air show. Story
- A China Airlines Airbus Industrie A300 crashes
on April 26, 1994 killing 264. Recommendations include software
modifications. Summary
- The British destroyer H.M.S. Sheffield was sunk
in the Falkland Islands war. According to one report, the ship's radar
warning systems were programmed to identify the Exocet missile as
"friendly" because the British arsenal includes the Exocet's homing
device, and so allowed the missile to reach its target, namely the
Sheffield. From "The development of software for ballistic-missile
defense," by H. Lin, Scientific American, vol. 253, no. 6 (Dec.
1985), p. 48.
- An error in an aircraft design program
contributed to several serious air crashes. From P. Naur and B.
Randell, eds., Software Engineering: Report on a Conference
Sponsored by the NATO Science Committee, Brussels, NATO Scientific
Affairs Division, 1968, p. 121.
- An Air New Zealand airliner crashed into an
Antarctic mountain; its crew had not been told that the input data to
its navigational computer, which described its flight plan, had been
changed. From "The development of software for ballistic-missile
defense," by H. Lin, Scientific American, vol. 253, no. 6 (Dec.
1985), p. 52.
- The Ariane 5 satellite launcher malfunction was
caused by a faulty software exception routine resulting from a bad
64-bit floating point to 16-bit integer conversion. Report
- During the maiden flight of the Discovery space
shuttle, 30 seconds of (non-critical) real-time telemetry data was lost
due to a problem in the requirement stage of the software development
process. Story
- A train stopped in the middle of nowhere
(on London's Docklands Light Railway) because planned station locations
changed after the software was deployed and there was reluctance to
change the software. Story
- The Dallas/Fort Worth air-traffic system began
spitting out gibberish in the Fall of 1989 and controllers had to track
planes on paper. "Ghost in the Machine," Time Magazine, Jan.
29, 1990. p. 58. Story
- Several Space Shuttle missions have been delayed
due to hardware/software interaction problems. Story
- An airplane software control returned
inappropriate responses to pilot inquiries during abnormal flight
conditions. Story
- The Pathfinder reset problem. Story More
- An Iraqi Scud missile hit Dhahran barracks,
leaving 28 dead and 98 wounded. The incoming missile was not detected
by the Patriot defenses, whose clock had drifted 0.36 seconds during the
4-day continuous siege, the error increasing with elapsed time since
the system was turned on. This software flaw prevented real-time
tracking. The specifications called for aircraft speeds, not Mach 6
missiles, for 14-hour continuous performance, not 100. Patched software
arrived via air one day later. From ACM SIGSOFT Software
Engineering Notes, vol. 16, no. 3. See Story More More More
- Bug-infested [air traffic control software] was
scoured by software experts at Carnegie-Mellon and the Massachusetts
Institute of Technology to determine whether it could be salvaged or
had to be canceled outright. Story
- Were a missile to approach at a certain tricky
angle (all) 27 programs would fail to shoot it down. Story
- The Apollo 8 spacecraft erased part of the
computer's memory. From G. J. Myers, Software Reliability:
Principles & Practice, p. 25.
- Eighteen errors were detected during the 10-day
flight of Apollo 14. From G. J. Myers, Software Reliability:
Principles & Practice, p. 25.
- A 1963 NORAD exercise was incapacitated because
a software error caused the incorrect routing of radar information.
From G. J. Myers, Software Reliability: Principles & Practice,
p. 25.
- The U.S. Strategic Air Command's 465L Command
System, even after being operational for 12 years, still averaged one
software failure per day. From G. J. Myers, Software Reliability:
Principles & Practice, p. 25.
- An error in a single FORTRAN statement resulted
in the loss of the first American probe to Venus. From G. J. Myers, Software
Reliability: Principles & Practice, p. 25.
- On June 3, 1980, the North American Aerospace
Defense Command (NORAD) reported that the U.S. was under missile
attack. The report was traced to a faulty computer circuit that
generated incorrect signals. If the developers of the software
responsible for processing these signals had taken into account the
possibility that the circuit could fail, the false alert might not have
occurred. From "The development of software for ballistic-missile
defense," by H. Lin, Scientific American, vol. 253, no. 6 (Dec.
1985), p. 48.
- The manned space capsule Gemini V missed its
landing point by 100 miles because its guidance program ignored the
motion of the earth around the sun. From "The development of software
for ballistic-missile defense," by H. Lin, Scientific American,
vol. 253, no. 6 (Dec. 1985), p. 49.
- Five nuclear reactors were shut down temporarily
because a program testing their resistance to earthquakes used an
arithmetic sum of variables instead of the square root of the sum of
the squares of the variables. From "The development of software for
ballistic-missile defense," by H. Lin, Scientific American,
vol. 253, no. 6 (Dec. 1985), p. 49.
- In a 1977 exercise, when it was connected to the
command-and-control systems of several regional commands, the WWMCCS
had an average success rate for message transmission of only 38
percent. From "The development of software for ballistic-missile
defense," by H. Lin, Scientific American, vol. 253, no. 6 (Dec.
1985), p. 51.
- Aegis was installed on the U.S.S. Ticonderoga, a
Navy cruiser. After the Ticonderoga was commissioned the weapon system
underwent its first operational test. In this test it failed to shoot
down six out of 16 targets because of faulty software; earlier
small-scale and simulation tests had not uncovered certain system
errors. In addition, because of test-range limitations, at no time were
more than three targets presented to the system simultaneously. For a
sizable attack approaching Aegis' design limits the results would most
likely have been worse. From "The development of software for
ballistic-missile defense," by H. Lin, Scientific American,
vol. 253, no. 6 (Dec. 1985), p. 51.
- On June 19, 1985 the Strategic Defense
Initiative Organization performed a simple experiment: The crew of the
space shuttle was to position the shuttle so that a mirror mounted on
its side could reflect a laser beamed from the top of a mountain 10,023
feet above sea level. The experiment failed because the computer
program controlling the shuttle's movements interpreted the information
it received on the laser's location as indicating the elevation in
nautical miles instead of feet. As a result the program positioned the
shuttle to receive a beam from a nonexistent mountain 10,023 nautical
miles above sea level. From "The development of software for
ballistic-missile defense," by H. Lin, Scientific American,
vol. 253, no. 6 (Dec. 1985), p. 51.
- The first operational launch attempt of the
space shuttle, whose real-time operating software consists of about
500,000 lines of code, failed because of a synchronization problem
among its flight-control computers. The software error responsible for
the failure, which was itself introduced when another error was fixed
two years earlier, would have revealed itself, on the average, once in
67 times. From "The development of software for ballistic-missile
defense," by H. Lin, Scientific American, vol. 253, no. 6 (Dec.
1985), p. 52.
- "The change was so simple he didn't feel he had
to inform anyone that it took place and the mistake he made was so
stupid. He had no idea of the damage it would cause." The day after
the product shipped 50 beta testers called and reported that all the
paychecks were being printed at zero dollars. Story
- The Sendmail security bug. Story
- Intel processor bugs galore. List Pentium discussion
- A computer-monitored house arrest inmate escaped
and subsequently committed murder. This was caused by the reporting
software not re-trying when it received a busy signal at the main
computer number. Story
- The clock in the video camera indicated a
customer had withdrawn his money at the same time as a fraud occurred,
so the bank forwarded his photo to the authorities. The clock had been
off by about one hour. Story
- The nine-hour breakdown of AT&T's
long-distance telephone network in Jan. 1990, caused by an untested
code patch, dramatized the vulnerability of complex computer systems
everywhere. "Ghost in the Machine," Time Magazine, Jan. 29,
1990. p. 58. Story
- On July 1-2, 1991, computer-software collapses
in telephone switching stations disrupted service in Washington DC,
Pittsburgh, Los Angeles and San Francisco. Once again, seemingly minor
maintenance problems had crippled the digital System 7. About twelve
million people were affected in the crash of July 1, 1991. Said the New
York Times Service: "Telephone company executives and federal
regulators said they were not ruling out the possibility of sabotage by
computer hackers, but most seemed to think the problems stemmed from
some unknown defect in the software running the networks." Within the
week, a red-faced software company, DSC Communications Corporation of
Plano, Texas, owned up to glitches in the signal transfer point
software that DSC had designed for Bell Atlantic and Pacific Bell. The
immediate cause of the July 1 crash was a single mistyped character:
one tiny typographical flaw in one single line of the software. One
mistyped letter, in one single line, had deprived the nation's capital
of phone service. It was not particularly surprising that this tiny
flaw had escaped attention: a typical System 7 station requires ten
million lines of code. From The Hacker Crackdown, by Bruce
Sterling, 1992. Story
- During a payday rush in 1989, a faulty program
shut down 1,800 automated-teller machines at Tokyo's Dai-Ichi Kangyo
Bank. "Ghost in the Machine," Time Magazine, Jan. 29, 1990. p.
58. Story
- When an airline's reservation system went down
in 1989, 14,000 travel agents had to book flights manually. "Ghost in
the Machine," Time Magazine, Jan. 29, 1990. p. 58. Story
- In the early 1980s, Buick had to give 80,000 V6
cars a chip transplant to fix flaws in their microprocessors. "Ghost in
the Machine," Time Magazine, Jan. 29, 1990. p. 58. Story
- The New York Stock Exchange opened one hour late
on Dec. 18, 1995 due to a communications problem in the software. Story
- Chemical Bank went down for 5 hours on July 20,
1994 due to a file update overloading the computer system. Story
- There was a San Francisco 911 system crash of
over 30 minutes on Oct. 12, 1995. Patched but not fixed, it still
misses between 100 and 200 calls per day. Story
- The hole in the ozone layer over Antarctica went
undetected for an extended period because the software treated the data
as anomalous: it fell outside the specified range. Story
- The Denver airport stayed closed for over a year
due to software glitches in the automated baggage handling system. Story More
- Bell Atlantic Corp. failed to bill approximately
400,000 AT&T customers in parts of Virginia, Maryland, Washington
D.C., and West Virginia for their long-distance calls on their January
1998 bill. AT&T stated that their Operations Support Systems
provided Bell Atlantic with the correct billing data for three of the
twenty billing cycles, customers billed on the 2nd, 4-5th, and 7th of
the month, and that a Bell Atlantic computer error failed to produce
the AT&T portion of the bill. Bell Atlantic has stated that the
problem was a "systems glitch", "processing error", and/or "data
processing error". [Supposedly, computer tapes were used to transfer
the billing details between AT&T and Bell Atlantic.] From an
AT&T press release, dated 16-Jan-1998, reprinted in the Richmond
Times-Dispatch, 17 Jan 1998, p. C10.
- Oodles of software will fail in the year 2000. Story More
- The IRS uncovered an unintended side effect of
its effort to eliminate the Year 2000 computer bug: About 1,000
taxpayers who were current in their tax installment agreements were
suddenly declared in default due to a programming error. [There are 62
million lines of source code to check; the error was caused by an
attempted Y2K fix.] From the Associated Press newswire (AP US
& World, 23 Jan 1998, by Rob Wells).
- An alert to all National Association of
Miniature Enthusiasts (NAME) members: A member recently called the
office to find out why she hasn't received her Houseparty Gazette.
She discovered that the computer has deactivated ALL members whose
memberships expire in the year 2000 and beyond. Kim ... said she had no
way of knowing who those folks are unless they call her and let her
know. From the rec.arts.dollhouses newsgroup.
- One production line shut down when the
laser-driven printer putting "sell-by" dates on products couldn't
handle the 2000 date. Industry Week, Jan. 5, 1998, p. 26.
- Many programs err in, or simply ignore, the
century rule for leap years on the Gregorian calendar (every 4th year
is a leap year, except every 100th year which is not, except every
400th year which is). For example, early releases of the popular
spreadsheet program Lotus 1-2-3 treated 2000 as a non-leap year, a
problem eventually fixed. But all releases of Lotus 1-2-3 take 1900 as
a leap year; by the time this error was recognized, the company deemed
it too late to correct: ``The decision was made at some point that a
change now would disrupt formulas which were written to accommodate
this anomaly''. Excel, part of Microsoft Office, has the same flaw.
From Calendrical Calculations,
N. Dershowitz and E. M. Reingold, p. xviii.
- The New York City Taxi and Limousine Commission
chose March 1, 1996 as the start date for a new, higher fare structure
for cabs. Meters programmed by one company in Queens forgot about the
leap day and charged customers the higher rate on February 29. The
New York Times, March 1, 1997.
- A computer software error at the Tiwai Point
aluminum smelter in Southland, New Zealand at midnight on New Year's
Eve 1996 caused more than $AU 1 million of damage. The software error
was the failure to account for leap years (and considering a 366th day
in the year to be invalid), causing 660 process control computers to
shut down and the smelting pots to cool. The same problem occurred two
hours later at Comalco's Bell Bay smelter in Tasmania (which is two
hours behind New Zealand). The general manager of operations for New
Zealand Aluminum Smelters, David Brewer, said ``It was a complicated
problem and it took quite some time [until midafternoon] to find the
cause.'' The New Zealand Herald, January 8, 1997, and The
Dominion, in Wellington, New Zealand.
- A "computer error" is blamed for a false report
of three deaths from an incurable disease when a woman killed her daughter
and tried to kill her son and herself. From ACM SIGSOFT Software
Engineering Notes, vol. 10, no. 3.
- A Norwegian class gets a pornographic image
because of a cache problem, when a recycled link leads to a pornographic
site. From Internet Risks Forum NewsGroup (RISKS), vol. 19,
issue 47.
- Computers were blamed when, in three separate
incidents, 3 million, 5.4 million, and 1.5 million gallons of raw
sewage were dumped into the Willamette River. From ACM SIGSOFT Software
Engineering Notes, vol. 13, no. 3.
- The U.S. national EFTPOS system crashed on 2 Jun
1997 for two hours and 100K transactions were "lost". One central
processor failed and backup procedures to redistribute the load also
failed. From Internet Risks Forum NewsGroup (RISKS), vol. 19,
issue 21.
- Computer blunders were blamed for $650M student
loan losses. From ACM SIGSOFT Software Engineering Notes, vol.
20, no. 3.
- An Internet routing "black hole" cuts off ISPs;
MAI Network Services routing table errors directed 50,000 routing
addresses to MAI; InterNIC goofed, as well, 23 Apr 1997. From ACM
SIGSOFT Software Engineering Notes, vol. 22, no. 4.
- Votes were lost by a computer in Toronto. The
Toronto district finally abandoned computerized voting, leaving a
year-old race unresolved. From ACM SIGSOFT Software Engineering
Notes, vol. 15, no. 2.
- A cat was registered as a voter to demonstrate
risks (no pawtograph required). From ACM SIGSOFT Software
Engineering Notes, vol. 20, no. 1.
- A "read-ahead" synchronization glitch and/or an
eager operator caused a large data entry error, and the wrong winner
was announced in a Rome, Italy city election. From ACM SIGSOFT
Software Engineering Notes, vol. 15, no. 1.
- In a German parliamentary election, the program
rounded the Greens' 4.97% up, although it was below the 5% cutoff; when
corrected, the Social Democrats attained a one-seat majority. From ACM
SIGSOFT Software Engineering Notes, vol. 17, no. 3.
- An Oregon computer error reversed election
results. From ACM SIGSOFT Software Engineering Notes, vol. 18,
no. 1.
- A (CTSS) raw password file was distributed as
message-of-the-day, due to an editor temporary file name confusion. See
Morris and Thompson, CACM 22, 11, Nov 1979.
- The U.S. Social Security Administration systems
could not handle non-Anglo names, affecting $234 billion for 100,000
people, some going back to 1937. From Internet Risks Forum
NewsGroup (RISKS), vol. 18, issue 80.
- Software prevented the correction of a
recognized Olympic skating scoring error. From ACM SIGSOFT Software
Engineering Notes, vol. 17, no. 2.
- A computer scoring glitch at an Olympic boxing
match causes the evident winner to lose. From ACM SIGSOFT Software
Engineering Notes, vol. 17, no. 4.
- A man's auto insurance rate triples when he
turns 101 (= 1 mod 100). From ACM SIGSOFT Software Engineering Notes,
vol. 12, no. 1.
- A Montreal life insurance company dies due to
software bugs in its integrated system. From ACM SIGSOFT Software
Engineering Notes, vol. 17, no. 2.
- A computer test residue generates a false
tsunami warning in Japan. From ACM SIGSOFT Software Engineering
Notes, vol. 19, no. 3.
- Chicago cat owners were billed $5 for unlicensed
dachshunds. A database search on "DHC" (for dachshunds) found "domestic
house cats" with shots but no license. From ACM SIGSOFT Software
Engineering Notes, vol. 12, no. 3.
- The Korean Airlines KAL 801 accident in Guam
killed 225 out of 254 aboard. A design problem was discovered in the
barometric altimetry of the Ground Proximity Warning System (GPWS). From ACM
SIGSOFT Software Engineering Notes, vol. 23, no. 1.
NTSB final report.
- A "computer error" affected hundreds of U.K.
A-level exam results. From Internet Risks Forum NewsGroup (RISKS),
vol. 19, issue 40.
- The Paris police computer mismatched a Corsican
city code with postal code, and was unable to collect motorists' fines.
From Internet Risks Forum
NewsGroup (RISKS), vol. 19, issue 41.
- Netscape Communicator 4.02 and 4.01a allowed
disclosure of passwords. From Internet Risks Forum NewsGroup
(RISKS), vol. 19, issue 34.
- A bank robbery "wanted" poster of the wrong
person was due to an unchecked match. From Internet Risks Forum
NewsGroup (RISKS), vol. 19, issue 29.
- The Soviet Phobos I Mars probe was lost, due to
a faulty software update, at a cost of 300 million rubles. Its
disorientation broke the radio link and the solar batteries discharged
before reacquisition. From Aviation Week, 13 Feb 1989.
- An F-18 fighter plane crashed due to a missing
exception condition. From ACM SIGSOFT Software Engineering Notes,
vol. 6, no. 2.
- An F-14 fighter plane was lost to uncontrollable
spin, traced to tactical software. From ACM SIGSOFT Software
Engineering Notes, vol. 9, no. 5.
- A Parisian computer transforms traffic charges
into big crimes. From ACM SIGSOFT Software Engineering Notes,
vol. 14, no. 6.
- CyberSitter censors "menu */ #define" because of
the string "nu...de". From Internet Risks Forum NewsGroup (RISKS),
vol. 19, issue 56.
- In a heavily loaded computer system, a steady
stream of high-priority processes can prevent a low-priority process
from ever getting resources. Generally, one of two things will happen.
Either the process will eventually be run (at 2 A.M. Sunday, when the
system is finally lightly loaded), or the computer system will
eventually crash and lose all unfinished low-priority processes....
Rumor has it that, when they shut down the IBM 7094 at MIT in 1973,
they found a low-priority process that had been submitted in 1967 and
had not yet been run. From Silberschatz and Galvin, pp. 142-143.
- GTE Corp. mistakenly printed 50,000 unlisted
residential phone numbers and addresses in 19 directories that were
leased to telemarketers in communities between Santa Barbara and
Huntington Beach. GTE blames the problem on a software snafu. The
company faces fines of up to 1.5 billion dollars, if found guilty of
gross negligence. From comp.dcom.telecom newsgroup (27 Apr 1998);
Telecom Digest, Volume 18, Issue 60, Message 4 of 7.
- On Sept. 19, 1989 an overflow (of a 2-byte
integer) at a Washington, DC hospital caused a computer to collapse and
forced them to do things manually.
- On Nov. 16, 1989 an overflow (of a 2-byte
integer) in the Michigan Terminal System caused a computer crash in
Newcastle, followed by crashes all over the U.S.
- Midwest Telephone Company had a program to
assign telephone numbers with a $5 million annual maintenance budget.
In 1981, they reported: "No more than 15 known errors remain unsolved
at the end of each month." In fact, people had stopped using the
program and were entering numbers manually, leaving the database
hopelessly outdated.
- Bank of America was forced to write off a $60
million investment in a new software system and reverted to its
15-year-old predecessor.
- Due to a software error, Continental Airlines
consistently undercharged for plane rentals by one day.
- SRI International's computer reset the time by
averaging 11 clocks, though one was 12 hours off.
- In 1980, the ARPAnet shut down on account of a
self-propagating error.
- Rumor has it that a military plane flipped over
when crossing the equator.
- Rumor has it that an Airbus plane crashed into
its hangar, since its onboard computer interpreted a bump as turbulence
in the air.
- Software reboot during the Apollo 11 landing
forced Armstrong to manually land the lunar lander. Story
- In 1989, a Swedish Gripen prototype crashed due to
new software in the fly-by-wire system. Story
- In 1995, a Swedish Gripen fighter plane crashed
during an air show. Story
- Soldiers killed. Story
- Roundup of US
government Y2K bugs.
- French ticket reservation software took 4 months
to get working. Story
- In October 1995, 200,000 French civil servants
were paid twice.
- On May 3, 2000, Paris area telephone service
collapsed. Story
- Software error causes patients to be declared dead. Story
- Shuttle simulator bug. Story
- Software suspected in 1994 Chinook helicopter crash, killing
29. Story
Report
- For two days during the summer holidays in 2004, the French national railroad company's
reservation system was disorganized, due to a faulty patch.
Report
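
Several entries above turn on the same calendar rule: the Lotus 1-2-3 item quotes the Gregorian century rule, and the Tiwai Point smelter item describes software that treated a 366th day of the year as invalid. A minimal Python sketch of that rule, exactly as quoted in the list, might look like the following. The function names are hypothetical, chosen only for illustration; they are not taken from any of the systems described.

```python
def is_leap_year(year: int) -> bool:
    """Gregorian rule as quoted above: every 4th year is a leap year,
    except every 100th year, except every 400th year."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def days_in_year(year: int) -> int:
    """Number of days in the given Gregorian year."""
    return 366 if is_leap_year(year) else 365


def is_valid_day_of_year(year: int, day: int) -> bool:
    """True if `day` (1-based day number) exists in `year`.
    A check that hard-codes 365 would wrongly reject day 366 of a leap year."""
    return 1 <= day <= days_in_year(year)


if __name__ == "__main__":
    for y in (1900, 1996, 1997, 2000):
        print(y, is_leap_year(y), days_in_year(y))
```

Under this rule 1900 is not a leap year and 2000 is, which are the two cases the Lotus 1-2-3 entry says were mishandled, and day 366 is valid for 1996 but not for 1997.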