When I joined the Association for Computing Machinery (ACM), I could use their new UNIX computer in the school's computer lab. That was important, because using UNIX I could do all my assignments without having to wait in line to use a keypunch machine and then wait to get my printout from the school's massive IBM 370 computer.
Aside from getting my homework done more easily, and meeting a great bunch of people, I learned some things about computer science. First off, I learned that the ACM had a code of ethics. Up to that point I had never considered that a code of ethics could be part of a particular discipline, rather than just a life code. I figured if you're a Christian, that's your code of ethics. But that's really a subject for another post.
One of the other things that happened was I started receiving Communications of the ACM, which is their member publication. It's full of papers and abstract stuff, but one column fascinated me - Inside Risks by Peter Neumann. Go ahead and click the link and read an article or two. Good stuff. Don't forget to come back here when you're done.
OK, welcome back. One of the things which I found interesting was the sheer number of different types of risks associated with computing. We can all understand the risks of losing data, or being hacked, or banking errors, but there are "real world" dangers involved as well. Toyota is finding this out with their sticky gas pedal recall. The flaw is a mechanical one, but since the gas pedal is really just an input device to a computer that controls the car's acceleration, Toyota is (rightly) being called to task for not having safeties in the software. For instance, brake pedal pressure is also monitored by computer (for antilock brake operation). Why not cut out the accelerator when the brake pedal is being frantically pushed?
It seems like an obvious thing in hindsight, but should Toyota have had the foresight to do it? As a software architect, and a reader of "Inside Risks," I have to say that they should. The "state of the art" in software today has been "dumbed down" by short-sighted management and low customer expectations. Developers, especially those who work on systems that control "real world" devices, need to consider failure modes in the design of their software.
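To make the brake-override idea concrete, here is a minimal sketch of that kind of fail-safe. To be clear, this is not Toyota's actual control logic; the function name, sensor inputs, and threshold are all hypothetical, and real engine control units are far more involved.

```python
# Hypothetical brake-override fail-safe. Not Toyota's code; the names
# and the threshold value are assumptions made for illustration.

BRAKE_OVERRIDE_THRESHOLD = 0.2  # assumed calibration: "firmly braking"

def throttle_command(accel_pedal: float, brake_pedal: float) -> float:
    """Return the throttle level (0.0-1.0) to send to the engine.

    accel_pedal and brake_pedal are pedal positions in [0.0, 1.0].
    If the driver is braking firmly, ignore the accelerator entirely --
    even if a stuck pedal (or a faulty sensor) reports full throttle.
    """
    if brake_pedal >= BRAKE_OVERRIDE_THRESHOLD:
        return 0.0  # the brake wins: cut the accelerator
    return accel_pedal
```

The point of the sketch is the priority rule: when two inputs disagree about the driver's intent, the software should resolve the conflict toward the safe state, not simply pass the throttle input through.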
Of course, although the Toyota blunder is making big headlines, it is not that serious a problem. Far more serious is the death of Scott Jerome-Parks, as described in this NY Times article. Mr. Jerome-Parks died in 2007, at the age of 43, from a radiation overdose received during cancer treatment.
His treatments were performed using a new state-of-the-art linear accelerator. The device works by accelerating a beam of electrons and focusing them onto a tungsten target. That converts the energy of the electrons into X-rays, which then pass through a "multileaf collimator". The collimator is essentially a set of metal "windows" that can open or close by varying amounts to control the size and shape of the X-rays that pass through them. Everything, from the strength of the electron beam to the shape of the collimator, is controlled by software.
According to the article:
The investigation into what happened to Mr. Jerome-Parks quickly turned to the Varian software that powered the linear accelerator.
The software required that three essential programming instructions be saved in sequence: first, the quantity or dose of radiation in the beam; then a digital image of the treatment area; and finally, instructions that guide the multileaf collimator.
When the computer kept crashing, Ms. Kalach, the medical physicist, did not realize that her instructions for the collimator had not been saved, state records show. She proceeded as though the problem had been fixed.
“We were just stunned that a company could make technology that could administer that amount of radiation — that extreme amount of radiation — without some fail-safe mechanism,” said Ms. Weir-Bryan, Ms. Jerome-Parks’s friend from Toronto. “It’s always something we keep harkening back to: How could this happen? What accountability do these companies have to create something safe?”

It seems that the machine ignored the fact that the instructions for the collimator were missing, and left the collimator wide open, exposing Mr. Jerome-Parks to the highest possible dose of radiation. I don't mean to say that people should avoid radiation therapy, which does save many lives, or to demonize that particular machine, which no doubt has also saved lives. Also, since then, Varian has released an update to add fail-safes to the system.
However, it came too late to save Mr. Jerome-Parks. Nor is his case unique. According to the Times article, errors in radiation therapy are more common than realized, and some of them are due to software errors. Not mentioned in the Times article is the Therac-25, whose software problems Peter Neumann wrote about in his 1990 column "Inside Risks in Medical Electronics". When a particular sequence of commands was entered, the device malfunctioned, emitting high doses of radiation into the patient, resulting in at least 4 deaths between 1985 and 1987.
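The fail-safe that Ms. Weir-Bryan asked for could be as simple as refusing to fire when any of the three required instruction sets is missing. Here is a minimal sketch of that check; it is emphatically not Varian's software, and every name in it is hypothetical.

```python
# Hypothetical pre-beam validation. Not Varian's actual software; the
# class and field names are assumptions made purely for illustration.

class TreatmentPlan:
    """The three instruction sets the article says must be saved in sequence."""
    def __init__(self, dose=None, image=None, collimator_leaves=None):
        self.dose = dose                            # radiation dose for the beam
        self.image = image                          # digital image of the treatment area
        self.collimator_leaves = collimator_leaves  # multileaf collimator settings

def fire_beam(plan: TreatmentPlan) -> str:
    """Refuse to operate unless every required instruction was saved.

    A crash that silently drops the collimator instructions should leave
    the machine unable to fire -- not firing with the collimator wide open.
    """
    missing = [name for name, value in [
        ("dose", plan.dose),
        ("image", plan.image),
        ("collimator", plan.collimator_leaves),
    ] if value is None]
    if missing:
        raise RuntimeError(f"treatment plan incomplete, missing: {missing}")
    return "beam delivered with collimator shaped as instructed"
```

The design choice worth noticing is that the default is refusal: absent data stops the machine, rather than being silently interpreted as "no constraint" the way the wide-open collimator was.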
As we make machines more and more complex, we tend to rely on software more, without realizing that while software simplifies the need for specialized mechanisms, it does not of itself simplify the control problems it intends to solve. With poorly architected, designed, and implemented software accepted as the norm in our homes and offices, it is no wonder that failures happen in critical equipment.
If we look at the Toyota problem, it is actually blown way out of proportion. I don't mean that the people who died or were injured are not important, I mean that we are focusing a lot of attention on this issue, while overlooking similar types of problems that cause even more deaths and injuries. I hope that when the Toyota problem is fixed we won't forget that other people are dying from shoddy software.