Re: Bits that go bump in the night




William Petrick (caprit(at)ix.netcom.com)
Fri, 29 Dec 1995 17:08:01 -0800


This is a response to Charles Waite's request for comments on his search for the "good system".

First, I need to provide some background information so readers can better understand the context of my comments. I have been a "software engineer" in the commercial nuclear power industry for 30 years. During the last 10 years, I have developed a number of computer-based systems that are now operating in 8 nuclear power plants in the US. Over the last 18+ years, I have provided consulting services to the nuclear utilities in the area of plant computing systems.

From this broad experience base, I believe it is possible to predict the outcome of a computer project based on a review of the few key elements of the system development process that are the riskiest (I subscribe to Boehm's idea of a risk-based approach to software development, but I have extended it to the entire system development). The elements that my "good system" must have are:

(1) a complete, consistent, unambiguous requirements document for the system, including software
(2) a system design document, including software
(3) an executable acceptance test plan and procedures (or V&V in our industry jargon)
(4) a user's guide
(5) configuration controls
(6) a project plan with milestones

From a project risk view, I can look at a project involving software and make a quick (and usually accurate) assessment of the probability of success (and the presentation of Charles' "good system" award).

To begin, if the requirements document is poor or incomplete, then how will anyone know the system is complete and correct? I have seen a system being developed in which the requirements document was still not complete a few weeks before the "acceptance tests". That system did not get a "good system" award.

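One way to make the "complete and correct" question concrete is a simple traceability check between the requirements document and the acceptance tests. The following is only a minimal sketch in Python: the names `Requirement`, `AcceptanceTest`, `untested_requirements`, and `orphan_tests`, and the sample IDs and requirement texts, are all hypothetical and do not come from any particular project or tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Requirement:
    req_id: str   # hypothetical identifier, e.g. "R1"
    text: str


@dataclass
class AcceptanceTest:
    test_id: str       # hypothetical identifier, e.g. "T1"
    covers: List[str]  # IDs of the requirements this test claims to verify


def untested_requirements(requirements, tests):
    """Return IDs of requirements that no acceptance test covers."""
    covered = {rid for t in tests for rid in t.covers}
    return [r.req_id for r in requirements if r.req_id not in covered]


def orphan_tests(requirements, tests):
    """Return IDs of tests that cover nothing, or cite an unknown requirement."""
    known = {r.req_id for r in requirements}
    return [t.test_id for t in tests
            if not t.covers or not set(t.covers) <= known]


# Illustrative data only (assumed, not from a real system).
requirements = [
    Requirement("R1", "Alarm when coolant temperature exceeds the limit"),
    Requirement("R2", "Report a fault when the sensor signal is lost"),
    Requirement("R3", "Log every alarm with a timestamp"),
]
tests = [
    AcceptanceTest("T1", ["R1"]),
    AcceptanceTest("T2", ["R2", "R9"]),  # R9 is not in the requirements list
]

missing = untested_requirements(requirements, tests)
orphans = orphan_tests(requirements, tests)
print("Requirements without a test:", missing)
print("Tests citing unknown requirements:", orphans)
```

A check like this says nothing about whether the requirements themselves are good; it only flags the gaps between the two documents before the acceptance tests begin.
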
Next, if the software is being developed by someone (person, group, or company) who has no experience in the appropriate field, then they will not meet the technical requirements, much less the project budget and schedules. Software "engineers" who have no idea what interrupts are, or why you may need them for real-time diagnostics and control systems, will almost always fail until they learn from experience. Also, the results of a good, experienced design team will be reflected in good documentation so the system can be maintained in the future. I believe the competence of the software team can be determined from the quality of the design documentation; therefore, any software with poor design documentation will never get a "good system" award from me.

Testing is the next most critical item in my list. If the developer does not believe testing is important and does as little as possible to meet some contract language, then that system will have problems, I GUARANTEE IT. In our industry, the utility expects to install a system, turn it on, run a few confirmatory tests, and start using the system 24 hours/day, 7 days/week. No upgrades, no bug fixes, no changes. Maybe once every 5 years or so, a change will be needed for added capability. How can this be done when most software developers expect to debug on the target system?

At my company, we develop simulations of the plant environment to test our products under both static and dynamic conditions. We include malfunction tests, failure modes, etc. to which the system needs to respond, including modes that are not testable (or not practical to test) in the real plant. These test systems can be as complex as the system under test, but they help us understand the requirements and demonstrate compliance with those requirements. If I see a system that has not undergone some level of controlled dynamic testing with an operator in the loop, then there is a high probability that the system is not a "good system".

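The idea of testing against a simulated plant, including injected malfunctions, can be illustrated with a very small sketch in Python. Everything here is an assumption made for illustration: `SimulatedSensor`, `HighTempMonitor`, `run_scenario`, the temperature values, and the alarm limit are hypothetical and far simpler than any real plant simulation, and the sketch leaves out the operator in the loop.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SimulatedSensor:
    """Hypothetical plant signal: a linear ramp with an optional injected failure."""
    start: float
    rate: float                    # change in reading per time step
    fail_at: Optional[int] = None  # step at which the sensor fails, if any

    def read(self, step: int) -> Optional[float]:
        if self.fail_at is not None and step >= self.fail_at:
            return None            # a failed sensor delivers no reading
        return self.start + self.rate * step


class HighTempMonitor:
    """Hypothetical system under test: alarms on high reading or lost signal."""

    def __init__(self, limit: float):
        self.limit = limit

    def evaluate(self, reading: Optional[float]) -> str:
        if reading is None:
            return "SENSOR_FAULT"
        if reading > self.limit:
            return "HIGH_ALARM"
        return "NORMAL"


def run_scenario(sensor: SimulatedSensor, monitor: HighTempMonitor,
                 steps: int) -> List[str]:
    """Drive the monitor with simulated readings and record its responses."""
    return [monitor.evaluate(sensor.read(s)) for s in range(steps)]


# Static condition: steady reading below the limit.
static = run_scenario(SimulatedSensor(300.0, 0.0), HighTempMonitor(350.0), 3)
# Dynamic condition: reading ramps up through the limit.
dynamic = run_scenario(SimulatedSensor(340.0, 5.0), HighTempMonitor(350.0), 5)
# Malfunction: sensor fails at step 2, which may be impractical to stage in a real plant.
faulted = run_scenario(SimulatedSensor(300.0, 0.0, fail_at=2), HighTempMonitor(350.0), 4)

print(static)
print(dynamic)
print(faulted)
```

The point of the design is that the failure case is just another scenario: once the simulated signal exists, a malfunction costs one extra parameter rather than a trip to the plant.
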
Many systems work well until the first user touches them. If the system does not have a user's guide that has been developed and tested by actual users, then I predict problems will occur when the first users are unleashed, and any "good system" award must be returned.

As a system nears completion, there must be some form of configuration control or the system will deteriorate with age. I have seen systems that were 1 week from factory acceptance test (FAT) without configuration controls. The confusion that arose when everyone tried to get their "final" changes in for the acceptance tests was something to behold. (Amazingly, the FAT was delayed one year.)

Finally, I really enjoy reviewing project management plans involving software because every experienced software engineer knows how to beat the system. How else can a multimillion dollar project, with all its oversight and reviews, be on schedule up to a few weeks before final tests, only to find out that there is a delay as long as the project itself? How can this be? (Only with software!) If I review a project that has no milestones that show the progress of the software, then I can predict that the project will not meet its schedule and budget. Once it is clear the schedule or budget will be missed (usually too late), then corners are cut, quality is compromised, and the "good system" award is not possible. To make matters worse, the procurement people get involved, then upper management, and then the lawyers. Cut bait NOW!

I read a recent e-mail posting that showed 40+% of the problems in systems were traced back to poor or inadequate requirements. We are currently involved in a project with the Electric Power Research Institute to develop a methodology that creates a good requirements document for large, small, safety, and non-safety system upgrades at nuclear power plants. Most of the information is applicable to any high integrity system, but our emphasis is on the requirements specific to the nuclear power industry.

My hope is that this effort will result in a systematic approach to system upgrades involving software that will eventually result in a system that qualifies for Charles' "good system" award. Maybe others out there can add to my list of what is NOT a good system, which may allow us to grope the much smaller elephant that remains.

Happy New Year to everyone...

Bill Petrick
Capri Technology
50 Curtner Ave. Suite 7
Campbell, CA 95008
408.559.5996
408.559.5998 (FAX)
caprit(at)ix.netcom.com
http://www.valuserve.com/ctihome.htm

