RE: [sc] probabilistic values for software unclassified mail




From: Paul Musson (paul.musson@xxxxxx)
Date: Fri 06 May 2005 - 09:46:41 BST


Adding my thoughts to the discussion
(I don't often stick my head above the parapet, but thought I'd give it a go - please aim carefully when shooting back)
 
I personally have misgivings about applying probabilistic values to software failures.
Software does not fail - at least not in the sense of a failure usable within an FTA.
When presented with a set of input conditions, the software does exactly what it was designed to do.
It is simply that this behaviour is not what the designer originally intended.
In fact, this is a fault, rather than a failure.  
 
One of the problems with "modelling" these faults is that the assessor has identified a mechanism that, given the structure of the FTA, causes an unacceptable risk.  This is no longer the random chance that a given hardware component will fail.  With no hardware failure, and even with no other systematic faults present in the system, this software fault may manifest itself - and it will do so at every instance of those input conditions.  
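
The difference between a random failure and a systematic fault can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: the function names, the 1% hardware failure probability and the "forbidden" combination of flap angle and speed are all hypothetical and do not come from any real system.

```python
import random

def hardware_component(rng, p_fail=0.01):
    """Random failure: fails by chance with an assumed probability p_fail."""
    return rng.random() >= p_fail  # True means it worked on this demand

def software_function(flap_angle, speed):
    """Systematic fault: the code does exactly what was written, every time.

    Hypothetical fault: the designer never considered the combination
    flap_angle > 30 together with speed > 250.
    """
    if flap_angle > 30 and speed > 250:
        return "unsafe output"
    return "safe output"

rng = random.Random(1)

# The hardware component fails on some demands and not on others.
hardware_results = [hardware_component(rng) for _ in range(1000)]
hardware_failures = hardware_results.count(False)

# The software gives the same answer on every one of 1000 identical demands.
software_results = {software_function(35, 260) for _ in range(1000)}

print("hardware failures in 1000 demands:", hardware_failures)
print("distinct software outputs for the same inputs:", software_results)
```

The hardware count changes with the seed; the software set always contains exactly one value, because the outcome depends only on the inputs and not on chance.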
 
So, once you have identified that a particular element in your system is critical, what do you do about it?  Classically, one would attempt to change the design - to remove this criticality.  But what happens if you have a fixed design, and you're now uncovering this criticality through retrospective safety analysis?  Then you have a problem.  One can try to change the other parts of the system - the environment that it will be used in, the operators, or the procedures - all of which will aim to deal with the fault if it occurs (as predicted by the "model").  Similarly, one may be able to control the inputs to the software, such that they won't be able to reach the required combination - for example, limiting the control movement of an aircraft, such that a particular combination of control surface positions can't be achieved.  However, given that one will have to justify these choices (at the subsequent public enquiry), changing the design would be my favourite control method.
 
It's rather analogous to a cutter in a vegetable preparation plant chopping off a finger if the finger is stuffed into a particular hole when the plant is running.  Every time a finger is stuffed in, it will be chopped off (no matter how many times this is tried).  This is not a failure of the finger, or a failure of the chopper (both are functioning correctly and are fault free).  It is a fault in the design of the whole system that allows a particular set of circumstances to arise.  The solution to this is to change the design, to remove the set of circumstances.  It is not to calculate the probability that a finger will be poked into the hole - and then accept the risk because the finger-poke probability is acceptably low.
 
Getting back to Duncan's problem...
 
My initial thoughts are, what are you trying to demonstrate with the FTA?
Are you looking to show that the code cannot influence the final risk level - i.e. demonstrate that the code is not critical?  This can be done with a simple 1 or 0.
Are you trying to show which parts of the code are critical?  Again, 1 and 0 will do this for you - but you'll know the answer before running the tool.
Are you using the FTA to "animate" your safety argument?  In which case, the numbers entered are almost immaterial.  The logical structure offered by FTA lets you demonstrate that the identified faults, combined in a particular way, will not lead to the top event.  The structure and logic are important - not the final probability.
(In fact, in my opinion, this is the main reason for using an FTA at anything other than component failure level)
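
A minimal sketch of the "1 or 0" approach, treating the tree purely as logic.  The tree shape and the event names (`sensor_failed`, `interlock_failed`) are hypothetical, chosen only to show the idea; a real tree would be far larger.

```python
from itertools import product

def top_event(software_fault, sensor_failed, interlock_failed):
    """Hypothetical tree: TOP = software_fault AND (sensor_failed OR interlock_failed)."""
    return software_fault and (sensor_failed or interlock_failed)

def software_is_critical():
    """True if switching the software event between 0 and 1 can change the top event."""
    for sensor_failed, interlock_failed in product([False, True], repeat=2):
        with_fault = top_event(True, sensor_failed, interlock_failed)
        without_fault = top_event(False, sensor_failed, interlock_failed)
        if with_fault != without_fault:
            return True
    return False

def software_alone_causes_top():
    """True if the software fault reaches the top event with everything else healthy."""
    return top_event(True, False, False)

print("software can influence the top event:", software_is_critical())
print("software fault alone causes the top event:", software_alone_causes_top())
```

In this assumed tree the software event is critical (it can change the outcome) but cannot reach the top event on its own.  No probabilities are needed to show either result; only the gate structure matters.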
Slavish reliance on the failure rates produced by FTAs will always catch you out eventually.  I have seen an FTA that showed a final failure rate for a complex system in the order of 1x10^-50 - calculated to 8 significant figures.  
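
To see why 8 significant figures on a number of that size mean very little, consider a purely illustrative calculation.  The assumptions here are mine and not taken from any real tree: ten independent basic events under one AND gate, each with a probability of 1e-5, and then the same events with every input estimate off by a factor of 2.

```python
import math

# Assumed inputs: ten independent basic events combined by a single AND gate.
basic_event_probabilities = [1e-5] * 10

# Nominal result: the product of the basic event probabilities.
nominal = math.prod(basic_event_probabilities)

# The same calculation if every input estimate is too low by a factor of 2.
pessimistic = math.prod(2 * p for p in basic_event_probabilities)

ratio = pessimistic / nominal

print(f"nominal result     = {nominal:.7e}")
print(f"pessimistic result = {pessimistic:.7e}")
print(f"ratio              = {ratio:.0f}")
```

A factor of 2 on each input, which is a modest error for a failure rate estimate, moves the answer by a factor of about a thousand.  The leading digit is already in doubt, never mind the eighth.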
 
Paul Musson
________________________________

From: safety-critical-request@xxxxxx on behalf of DOSGST3B, Duncan Williams
Sent: Wed 04/05/2005 11:58
To: 'safety-critical@xxxxxx'
Subject: Re: [sc] probabilistic values for software unclassified mail




Thank you everyone who replied to my question on the use of FTA with S/W and
probabilistic values for S/W.

Just to add a bit more detail and possibly defend my corner...

My fault tree did have a few basic events which were labelled "software
fails".  I now understand that this is wrong, but I think this was down to
poor labelling on my part.  Each "software fails" event was referring to a
specific failure which impacts on the intermediate event above either
through an AND gate or OR gate.  An example of what I may re-word one such
event to would be "Failure to provide ignite signal" but ultimately this
would still arise from an unknown fault within the software which could be
exercised.  Am I missing the point or am I starting to head in the right
direction?

  
Martyn Thomas's point, on the distinction between random and systematic
failure analysis, was interesting.  To check that I understand it correctly:
an FTA could be used, with the S/W events either deleted or set to a
probability of 1, to assess the probability of the system failing through
random failure, but an alternative process should then be used to assess
the systematic failure of the system (both hardware and software)?
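
As a rough sketch of what setting a S/W event to 1 does numerically, the fragment below evaluates a small tree assuming independent basic events.  The tree shape, the event names and the hardware probabilities are all hypothetical, picked only for illustration.

```python
def p_and(*probabilities):
    """AND gate for independent events: product of the input probabilities."""
    result = 1.0
    for p in probabilities:
        result *= p
    return result

def p_or(*probabilities):
    """OR gate for independent events: 1 minus the chance that none occurs."""
    none_occur = 1.0
    for p in probabilities:
        none_occur *= (1.0 - p)
    return 1.0 - none_occur

def p_top(p_software, p_monitor_hw=1e-4, p_igniter_hw=1e-6):
    """Hypothetical tree: TOP = (software event AND monitor hardware) OR igniter hardware."""
    return p_or(p_and(p_software, p_monitor_hw), p_igniter_hw)

# S/W event set to 1: the fault is assumed present and exercised, so the
# result is the random hardware contribution given that assumption.
print("S/W event set to 1:", p_top(1.0))

# S/W event set to 0: equivalent to deleting it from under an AND gate's
# branch here, which removes that whole branch from the result.
print("S/W event set to 0:", p_top(0.0))
```

Note that "deleted" and "set to 1" are only interchangeable for an event sitting under an AND gate.  Under an OR gate, setting the event to 1 forces the gate output to 1, whereas deleting it is the same as setting it to 0.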

Regards
Duncan Williams
MoD DOSG





Echelon Consulting Limited
Echelon House
93 Fleet Road
Fleet
Hampshire
GU51 3PJ

For more information visit: www.echelonltd.com 

Tel: +44 (0) 1252 627799 
Fax: +44 (0) 1252 626904


