Our software products Qedit and Suprtool have high reputations for quality in the HP 3000 marketplace, but I suspect that our initial design and code are not much better than other people's. What may be different is the way we evaluate and revise our efforts before official release, and the way we react to error reports after the release. The result is that few users ever experience our programming or design mistakes. When one does, we have systems in place that can correct most of them quickly. That is the topic of this paper.
I have good news and bad news regarding software quality. The bad news is: compared to hardware, software quality is terrible. The good news is: I don't think it is the programmers' fault. The techniques for writing good code are well-known and I won't rehash them here. No doubt there are still more techniques and tools to come that will assist in this difficult task. However, most of the quality problems occur because of mismatches between the programs and the needs of clients, not because of bugs.
The market value of a product is not an intrinsic value, not a "value in itself", hanging in a vacuum. A free market never loses sight of the question: of value to whom? [Ayn Rand]
Your software has quality to the extent that it provides Value to some living, breathing people with choices and options. If another program solves a similar problem in a way that the person values more, it has higher quality.
The 150 "worked", but was incompatible with other PC software, used an odd-size diskette, and customers did not want the touch screen. The LaserJet, on the other hand, satisfied customer needs perfectly. [Kathy McKittrick]
Clients have their own context from which they look at our software:
Why do some subsystems require "EXIT", some "E", and some "EX" to exit? Even in MPE XL, in a single subsystem, SYSDIAG, part of it requires "EXI" and the next level up requires "EX". I know that there is an explanation and it is because parts of the software were written by different teams in vastly different geographical locations, but it seems to me that an elementary standard such as exiting the program should be uniform. [John Dunlop, Interrupt magazine]
After the state spent $20 million and nearly seven years trying to computerize its public-assistance program, the first caseworkers to use the system made their own discovery: They could figure out a client's benefits faster by hand than with the computer. [Seattle Times, May 19, 1989]
This software disaster includes most of the things that can be done wrong. The state contracted with an outside firm to design and implement the system. What started as a Big Project grew into a Giant one. State officials bragged that COSMOS would use artificial intelligence. It was seven years before the first pilot installation, when workers found it took up to twice as long to figure out a client's eligibility with COSMOS as it did manually.
The traditional method of developing a DP system includes endless user interviews, voluminous specifications, official approval by confused users, programming phase, integration phase, testing phase, user training, and endless bureaucracy. Notice that this method does not deliver any working programs to the clients until the very end. Creating a DP system without delivering anything to the clients until everything is done is like constructing a complete office building on its side, then trying to lift it into position.
For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled. [Feynman]
Fantasy by top management has a devastating effect on employees. If your boss commits you to producing a new accounting system in six months that will actually take at least two years, there is no honest way to do your job. Such projects usually appear to be on schedule until the last second, then are delayed, and delayed again. Management's concern often switches from the project itself to covering up the bad publicity about the delays.
Information from the bottom which is disagreeable is suppressed by big cheeses and middle managers ... Maybe they don't say explicitly, "Don't tell me," but they discourage communication ... it's a question of whether, when you do tell somebody about some problem, they're delighted to hear about it. If you try once or twice to communicate and get pushed back, pretty soon you decide, "To hell with it." [Feynman]
An objective project goal unleashes people's minds to discover solutions and attain the goal. An irrational goal just short-circuits the best within them.
Moon's First Maxim: The process of developing a system uncovers information about the system that no one could have known at the outset. [Richard Moon]
Users are not, and cannot be expected to be, systems analysts. And systems designers cannot think like users. The client often cannot describe what he wants -- he does not realize how important exceptions are. Even when the analyst extracts all his wants from him and defines them in an enormous specification, he has no idea what is critical and what is frosting. In an attempt to wrench precise specifications from the client, some shops spend so long on the design that by the time they are done, the client's needs have changed.
The clients are often shut out after the general design phase. They are asked to approve the specifications so that the programmers can get to work. One thing you can be sure of: the clients may not be able to tell you what they want, but they can tell you what they don't like when you finally deliver the code.
The Hewlett-Packard Company has more success producing quality electronic instruments than quality software products. I think the reason is that the engineers who design instruments are basically their own clients -- they can visualize what would make a better product because both they and the client are engineers. The programmers who create financial accounting packages are not accountants, they are programmers -- and they can't afford to trust their personal judgment of an accounting program's quality.
A program is not static. The actions it performs vary dynamically, depending on the information that is entered. It is a moving body and is unlikely to be adequately described without using jargon. The same applies to mathematics or astronomy, or films. How can we visualize a film from a script? This is why the sooner you show the program to the user, the better it will be for his understanding. [Michel Kohon]
You must get the program into the client's hands in order to find out what you don't know! Once you get a reaction from the client, you can revise the program to meet his exact needs.
Development methodologies that do not support iterative development are doomed to failure. [Richard Moon]
This is a key insight. You can never get a software design correct by just studying and interviewing. You will always have to go back and revise the design as soon as you start implementing. We use a form of iterative development called the Step by Step method that was proposed in a 1980 paper written by the above-mentioned Michel Kohon.
American companies also have often lagged behind their overseas competitors in exploiting the potential for continual improvement in the quality and reliability of products and processes. The cumulative effect of successive incremental improvements in and modifications of established products and processes can be very large; it may even outpace efforts to achieve technological breakthroughs. [Scientific American]
At Pacific Bell, a system was required for automating a million transactions. Two estimates were received, one from a big, outside firm (three years, $10 million) and one from a major Pacific Bell unit (two years, $5 million). Meanwhile, three South California employees took a crack at the task--and did it in sixty days for $40,000. [Tom Peters]
Small projects have the advantage that they can be cut off or modified quickly. Big projects are hard to cancel, because of the political flak over all the money already spent, and are hard to modify, because of the rigid planning that goes into them. To be successful in the free market, you need to respond instantly to new facts and opportunities.
Small projects, especially pilot projects, are perfect for testing new ideas in the real world. Even large goals, such as a new aircraft design at Boeing, can be done as a series of small projects. Parts of new aircraft are tried out as redundant systems on current aircraft. According to Tom Peters, the key to successful innovation is to "test it now, at least some piece of it, in the real world..."
Is this approach `soft'? NO! It is hard--the very hardest. It is rational and it is `scientific'. In fact, it amounts to the organizations embracing the essence of the scientific method--empiricism and the experimental method. Piloting is the approach based on data. By contrast, decision making by proposal churning is whistling in the wind; it is the truly soft and ultimately less rational route. [Peters]
Adding manpower to a late software project makes it later. [Brooks]
Why is it that increasing the resources never seems to get the work done faster? One reason is economics. To produce programs, you will assign programmers, but there are never enough. Why? Because the client's demands will always increase to match your supply of programmers.
This is a common result in all human interactions. When they opened a new freeway in Vancouver recently, a highway expert said not to expect any lessening of traffic on other routes. The reason: by making it easier to travel downtown, the new freeway would entice more suburban motorists to take trips. The traffic expands to fill the roads available.
The only logical way to escape this dilemma is to limit the offer [i.e., supply]. How can we do that? One way is to limit resolutely the number of programmers working on a project. A second way is to limit explicitly the amount of time allocated to a program or system.
Let's imagine for a moment that we've said we have two weeks to program our system with the existing manpower. No more than two weeks. How can we best solve the problem in the amount of time given? The natural way will be to put on paper what the MUSTS and the WANTS are. If both can be produced in two weeks, we will program both, but that is unlikely... The most important objective is to find the absolute MUSTS which can be produced with the current staff in a limited period of two weeks.... Never go back on the two weeks allowed. It MUST be done in two weeks. Try to imagine that in two weeks' time, it will be the End of the World. Users will laugh, but they will, as well, appreciate your concern. [Michel Kohon]
The Step by Step method suggests dividing projects into two-week chunks, then delivering each chunk to the client for actual use. This has a number of useful results. It involves the client directly and enthusiastically in the design of the system, it means you never have to write off more than two weeks' work if your design is wrong, it means you can make constant adjustments in your goals as you get realistic feedback, and it eliminates the difference between the development and maintenance programmer. Everyone becomes a maintenance programmer, charged with delivering increasing value to the client in each step.
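The MUSTS-and-WANTS triage for one two-week step can be sketched in a few lines of Python. This is only an illustration, not part of Kohon's paper: the `Feature` record, the `plan_step` function, the effort estimates, and the assumption that two weeks means ten working days for one programmer are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    name: str
    days: int   # estimated effort in working days (hypothetical figures)
    must: bool  # True for a MUST, False for a WANT


def plan_step(features, budget_days=10):
    """Fill one fixed-length step: MUSTS first, then WANTS, never over budget."""
    chosen, remaining = [], budget_days
    # sorted() is stable, so features keep their listed order within each group.
    for feature in sorted(features, key=lambda f: not f.must):
        if feature.days <= remaining:
            chosen.append(feature.name)
            remaining -= feature.days
    return chosen, remaining


backlog = [
    Feature("print invoices", 4, True),
    Feature("colour screens", 3, False),
    Feature("enter orders", 5, True),
    Feature("custom reports", 6, False),
]
step, slack = plan_step(backlog)
print(step, slack)  # ['print invoices', 'enter orders'] 1
```

The budget is a hard limit: anything that does not fit waits for a later step, and the two weeks are never stretched.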
The software products that have not had attention after first release are now forgotten. The most successful HP program, MPE, has been continually enhanced for 15 years. But consider the Spooler. The new Native-Mode Spooler is the first enhancement since the SPOOK program, over ten years ago, and the 2680 printer support, over five years ago. No wonder four vendors can make a good living selling enhancements to the Spooler.
If you want to see the tremendous power of continuing, unrelenting, tiny improvements, you just have to look at the Japanese success in manufacturing:
The Japanese treat every product as an ongoing experiment and are constantly engaged in improving it. [Peters]
Find out what was wrong, try to understand why it had gone wrong, and then break down the corrective process into modest steps. [David Halberstam, The Reckoning]
I feel strongly that a programmer should stay on a program through its life. In many shops, programmers are treated like commodities and shifted from project to project frequently. This ignores the benefits that come from continuing to work on a single project: experience in client needs, experience with the code and data structures, working relationships with the other people involved on the project and in the industry. These are lost if a programmer is shuffled off to another project as soon as he completes part of one project.
Identifying the long-term and short-term objectives will permit you, with the users, to draw a line of actions within an overall strategy. You will move from point A to point Z through points B, C, D, ..., with each point being an objective. But how to order these points? To provide a solution to the top problem [first] means that you will give the maximum result in a minimum of time, and you will repeat this with each successive point. Order the objectives from the maximum payoff to the minimum. These will be your Steps. [Michel Kohon]
Programmers have a tendency to want to work on the technical challenges first, since that is what they know best. But a beautiful screen doesn't help the client unless it has data on it that are important to him. This is not Step by Step.
Step by Step aims to discover the client's actual requirements and program them all, eventually. Suppose a client is having cash flow problems. He asks you to provide an order processing system, expecting that the more efficient invoicing will bring in cash more quickly.
The typical response is to give him an order processing system. If you could provide a complete working order processing system in two weeks, including invoicing, you would indeed solve his cash flow problem. But you can't, so you conduct a long study and install order entry as phase one. This is more work for him and does not solve his most pressing problem. This is not Step by Step.
Step by Step challenges you to deliver something in the first step that will make a big contribution toward solving the client's most pressing problem. This is not easy to do -- it takes creative thought. You might automate just the invoices with the largest dollar amount. Or just the simplest ones, leaving the staff free to deal with the ugly invoices manually. Think of solving the 20% of the cases that generate 80% of the benefit.
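One way to look for that first step is to rank the kinds of cases by payoff and stop as soon as most of the benefit is covered. The Python sketch below is an assumption-laden illustration: the invoice categories, the dollar figures, and the `first_step_cases` function are invented here, and the 80% target is just the rule of thumb mentioned above.

```python
def first_step_cases(benefit_by_case, target=0.8):
    """Return the highest-payoff cases whose combined benefit first reaches the target share."""
    total = sum(benefit_by_case.values())
    if total == 0:
        return [], 0.0
    picked, covered = [], 0.0
    ranked = sorted(benefit_by_case.items(), key=lambda kv: kv[1], reverse=True)
    for case, benefit in ranked:
        if covered >= target * total:
            break
        picked.append(case)
        covered += benefit
    return picked, covered / total


# Hypothetical monthly cash tied up in each kind of invoice.
invoices = {
    "large standard orders": 60_000,
    "small standard orders": 25_000,
    "partial shipments": 8_000,
    "foreign currency": 4_000,
    "disputed credits": 3_000,
}
cases, share = first_step_cases(invoices)
print(cases, f"{share:.0%}")  # ['large standard orders', 'small standard orders'] 85%
```

With these made-up numbers, two of the five categories cover 85% of the cash, so the first step would automate those two and leave the rest to be handled manually.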
This is the hardest part of the Step by Step method, the part that requires the most demanding thought on your part. You will need to analyze the client's problems sufficiently to make an objective hypothesis identifying the most critical problem. For complex clients, this could be a major study. But you must always remember that the goal is the programs, not the investigation.
Since you're not the customer you have no way of knowing what's important and what's not important about the product. [anonymous HP client, quoted in "HP Corporate Quality 1989"]
For example, disposable contact lenses come in a plastic package that keeps them sterile and moist until use. However, peeling off the foil seal leaves a sharp edge that can cut your hand. The contact lens may give perfect sight, the marketing may be superb, the sales team helpful, and the product distribution speedy, but if the client cuts his hand opening the package, that undercuts the quality of the entire product.
It is frustrating to dedicate two years to a computer program, as happened to a former HP employee of my acquaintance, then find that the company bureaucracy cannot deliver the program to the users for another three years. The best people quit and form their own companies -- the ones who stay often shrug and say "that's not my job." At Robelle, we hold staff meetings every two weeks in which sales, support, finance, R&D, marketing, and administration provide each other with a global picture of product concerns and customer priorities.
No Matter How Much It Hurts!
The most important lesson to be learned from this incident is that even highly talented programmers make disastrous mistakes. Morris's program was comparatively small and simple, and had a limited and well-defined purpose. It faced no serious obstacles in the form of security barriers that attempted to foil it. Morris is known to be extremely skillful, was highly motivated to write an error-free program, and was not working under the pressure of any deadline. Despite all these facts, his program contained a catastrophic error [causing it to replicate and propagate itself far more rapidly than he apparently intended]. [Ornstein]
I found this example in the ACM's comprehensive review of Robert Morris's famous Internet Virus that invaded 6,000 Unix systems on November 2, 1988. Producing quality software is difficult and mistakes are inevitable -- not just mistakes in programming, but more disastrously, mistakes even in identifying what the client needs.
I now suggest that we confine ourselves to the design and implementation of intellectually manageable programs... We shall do a much better programming job, provided that we approach the task with a full appreciation of its tremendous difficulty, provided that we stick to modest and elegant programming languages, provided that we respect the intrinsic limitations of the human mind and approach the task as Very Humble Programmers. [Dijkstra]
Now we need to extend those insights to the equally difficult task of managing "what" programmers program. Humble management is like defensive driving. You must assume that something will go wrong at the worst possible moment and be prepared to switch gears quickly.
Start with concrete, mundane measurements that are easy to collect. They should be simple, they should relate to your long-range goals, and they should be understood by the people who make them. Just the fact that you are measuring will make a difference. If you show you care, people will refine the categories and numbers over time. Here are some you can start with:
Dan Warmenhoven has said that HP has two new metrics it will use to measure software quality: "... the number of post-release defects in the first year ... and the number of critical and serious open problem reports." [Interrupt Oct. 89].
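Both of these counts are easy to collect from a list of problem reports. The Python sketch below shows one possible way to compute them. The source does not say how HP defines or gathers the numbers, so the `ProblemReport` record, the severity names, the 365-day window, and the sample reports are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class ProblemReport:
    opened: date
    severity: str                  # "critical", "serious", "medium" or "low" (assumed scale)
    closed: Optional[date] = None  # None while the report is still open


def first_year_defects(reports, release_date):
    """Count reports opened within 365 days of the release date."""
    cutoff = release_date + timedelta(days=365)
    return sum(1 for r in reports if release_date <= r.opened < cutoff)


def open_critical_and_serious(reports, as_of):
    """Count critical and serious reports that were open on the given day."""
    return sum(
        1
        for r in reports
        if r.severity in ("critical", "serious")
        and r.opened <= as_of
        and (r.closed is None or r.closed > as_of)
    )


# Invented sample data for a product released on January 15, 1989.
release = date(1989, 1, 15)
reports = [
    ProblemReport(date(1989, 2, 1), "critical", closed=date(1989, 2, 3)),
    ProblemReport(date(1989, 6, 10), "serious"),
    ProblemReport(date(1989, 9, 5), "low"),
    ProblemReport(date(1990, 3, 20), "critical"),
]
print(first_year_defects(reports, release))                  # 3
print(open_critical_and_serious(reports, date(1989, 10, 1)))  # 1
```

Tracking the second number on a fixed day each month is enough to show whether the backlog of serious problems is growing or shrinking.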
Management communicates priorities every time it does anything -- whatever management spends time on and rewards is what the company will emphasize.
I think that the primary responsibility of managers, the fundamental one around which all others will revolve, is to break projects into manageable steps, then deliver the new software produced by each step into the client's hands, so they can give you objective feedback for the next step.
Brooks, Frederick P., The Mythical Man-Month, Reading: Addison-Wesley, 1975.
Dijkstra, E., "The Humble Programmer", 1972 Turing Award Lecture at the ACM Annual Conference, Boston, on August 14, 1972.
Feynman, Richard P., "Personal Observations on the Reliability of the Shuttle," What Do You Care What Other People Think?, New York: W. W. Norton, 1988.
Halberstam, David, The Reckoning, New York: Avon, 1986.
Kohon, Michel, "Introduction to Step by Step", SMUG II Proceedings, Langley: Robelle, 1982.
Moon, Richard, "Managing 4GL System Development in the 1990's", Conference Proceedings of the HP Computer Users Association, Brighton, England, July 1989.
Ornstein, Severo M., Communications of the ACM, June 1989, Volume 32, No. 6.
Peters, Tom, Thriving on Chaos, New York: Harper and Row, 1987.
Rand, Ayn, Capitalism: The Unknown Ideal, New York: New American Library, 1966.
Schlender, Brenton, "How to Break the Software Logjam", Fortune, September 25, 1989.