diff --git a/fmmd_concept/fmmd_concept.tex b/fmmd_concept/fmmd_concept.tex
index 3650246..b4cc7fe 100644
--- a/fmmd_concept/fmmd_concept.tex
+++ b/fmmd_concept/fmmd_concept.tex
@@ -9,15 +9,116 @@
 creating failure mode models of safety critical systems, which have a common
 and integrateable notation for mechanical, electronic and software domains.
 The proposed methodology is bottom-up and
-modular.
-}
+modular.}
 }
 {}
 \section{Introduction}
-\section{Some requirements for a failure mode methodolgy}
+There are three methodologies in common use for failure mode modelling.
+These are Fault Tree Analysis (FTA), various forms of Failure Mode and Effects Analysis (FMEA)
+and statistical analysis.
+
+These methodologies have several drawbacks.
+FTA can overlook error conditions, and FMEA and the statistical methods
+lack precision in predicting failure modes at the SYSTEM level.
+
+The Failure Mode Modular De-composition
+(FMMD) methodology presented here provides a more detailed and analytical
+modelling system from which
+the data models from FTA, FMEA and the statistical approach can be
+derived.
+It also applies analysis stages to the failure mode analysis process,
+ensuring that all component failure modes are considered in the model.
+
+This
+\ifthenelse {\boolean{paper}}
+{
+paper
+}
+{
+chapter
+}
+presents a bottom-up modular technique, a variant of FMEA, which, instead of looking
+at individual component failure modes and deciding on their impact on the SYSTEM,
+uses the component failure modes to build modules or derived components.
+This methodology has been named Failure Mode Modular De-composition (FMMD)
+because it de-composes a SYSTEM into a hierarchy of modules or {\dc}s.
+It does this by working from the bottom up, taking small groups
+of components, {\fgs}, and then analysing how they can fail.
+This analysis is performed using FMEA from a micro rather than a macro perspective.
+Thus, instead of looking at a component failure mode and determining how
+it {\em might} cause a failure at SYSTEM level, we look at how
+it will affect the {\fg}.
+When we know the failure modes of a {\fg} we can treat it as a `black box'
+or {\dc}. With {\dc}s we can build {\fgs}
+at higher levels of analysis, until we have a complete
+hierarchy representing the failure behaviour of the SYSTEM.
+Because all the failure modes of all the components
+are held in a computer program, we can determine if the model is complete
+(i.e. all component failure modes have been included in the model).
+
+
+%OK need to describe the need for it
+\section{The need for a new failure mode modelling methodology}
+
+In summary.
+
+\subsection { FTA }
+
+This, like all top-down methodologies, introduces the very serious problem
+of missing component failure modes, or modelling at
+too high a level of failure mode abstraction.
+
+\subsection { FMEA }
+
+This imposes the burden of taking individual component failure modes
+and trying to determine what effects these will have at SYSTEM level.
+Justifications for this methodology are often statistical, and Bayes' theorem \cite{probstat}
+is often cited.
+This lacks precision, or in other words, determinability prediction accuracy,
+as often the component failure mode cannot be proven to cause a SYSTEM level failure, only to make it more likely.
+Also, it can miss combinations of failure modes that will cause SYSTEM level errors.
+
+\subsection { Statistical Analysis }
+
+
+This uses MTTF and other statistical models to determine the probability of
+failures occurring. A component failure mode, given its MTTF,
+the probability of detecting the fault and its safety relevant validation time $\tau$,
+contributes a simple risk factor that is summed
+in to give a final risk result.
+Thus a statistical
+model can be implemented on a spreadsheet, where each component
+has a calculated risk and an estimated risk importance,
+and these are all summed to give the final assessment figure.
+
+The Statistical Analysis method is used from two perspectives:
+Probability of Failure on Demand (PFD), and probability of failure
+in continuous operation, expressed as Failure in Time (FIT) and measured in failures per billion ($10^9$) hours of operation.
+For instance, with the anti-lock system on an automobile braking
+system, we would be interested in PFD.
+For a continuously running nuclear power station
+we would be interested in its FIT values.
+
+This suffers from the same lack of
+determinability prediction accuracy as FMEA above.
+
+That is, we may have the MTTF of some critical component failure
+modes, but in most cases we can only guess what the safety case outcome
+will be if one occurs.
+
+This leads to having components within a SYSTEM partitioned into different
+safety level zones \cite{en61508}. This is a vague way of determining
+safety.
+
+The Statistical Analysis methodology is the core philosophy
+of the Safety Integrity Levels (SIL) of EN61508 \cite{en61508}.
+
+
+%AND then how we can solve all these problems
+
+\section{A wish list for a failure mode methodology}
 \begin{itemize}
 \item All component failure modes must be considered in the model.
 \item It should be easy to integrate mechanical, electronic and software models.
@@ -29,17 +130,116 @@ for its results.
 \end{itemize}
-OK need to describe the need for it
+\section{Building blocks of a safety critical system}
+
+This section looks at common features in a safety critical system and
+then looks at the building blocks of these systems
+and their characteristics.
+
+\subsection{What is a safety critical system?}
+
+DEFINITIONS GET REFS
+
+
+TYPICALLY HAS MECHANICAL, ELECTRONIC and SOFTWARE
+ actuators control intelligence
+
+\subsection{An example: industrial burner}
+
+An industrial burner is a good example of a safety critical system.
+It carries some lethal risks and some environmental ones.
+It could, by igniting an explosive mixture, cause an explosion.
+By burning incorrect proportions of fuel and air, it could be inefficient and waste
+resources, or worse, could produce poisonous combustion products (typically carbon monoxide, but also,
+where the flame temperature is very high, NOX emissions).
+
+To prevent igniting an explosive mixture, air is pumped through the furnace
+chamber on start-up, and this is verified with an air pressure switch.
+
+
+NEED A DIAGRAM HERE
+
+
+NEED A STATE CHART TOO
+
+It is interesting here to compare how the different methodologies
+would deal with a particular sub-system in the burner controller,
+and how they would analyse it.
+The flame scanner is a good example for this.
+We shall consider a simple infra-red (IR) flame scanner.
+This takes the form of an IR sensitive resistor.
+The flame type we will be looking for will have a characteristic
+flicker frequency of around 13\,Hz.
+The circuit is then simply a resistor voltage divider connected to
+a micro-controller reading the voltage.
+The flame scanner is thus a two resistor voltage divider.
+
+\subsection{The Flame Scanner}
+\subsubsection{Macro FTA perspective}
+
+SHOW ALL TOP LEVEL FAULTS. EXPLOSION, POISONOUS BURNING CO, POISONOUS BURNING NOX, FAILS TO LIGHT etc
+
+Follow the explosion tree down to flame scanner fails ON, and OFF
+
+etc
+\subsubsection{Macro FMEA/Statistical perspective}
+
+In the statistical case, each of the resistors is considered critical, and so its MTTF
+is added into the DANGEROUS section.
+
+For FMEA the resistor failures add up to the SYSTEM level; show this is inappropriate
+and makes several jumps in applied knowledge, thus Bayes' theorem etc
+
+\subsubsection{Micro FMMD perspective}
+
+
+Here show how the flame scanner becomes a black box, or component in itself.
+How it is now available to be integrated into higher level designs.
+
+%and then an ignition position is checked.
+%Initially a pilot flame is started and when this is stable, the main
+%flame is fired.
+%To check the stability of the flame, a flame scanner is required.
+%To mix the fuel and air, motors to position valves are generally used.
+%To prevent fuel leakage into the furnace, safety shut-off valves are used \footnote{These generally open slowly under power, and when power is removed `slam shut'. Thus
+%in the event of a general power failure, they default to safe behaviour.}
-AND then how we can solve all there problems
+Motors controlling air and fuel flow
+safety chain to power for shutdown valves
+safety shutdown valves on fuel
+flame sensor
+air pressure sensor
-AND then a rough outline of what is needed
+\section{Base Level Components}
+
+A common factor in all safety critical systems is
+the bought-in components. Be these
+electrical, mechanical or firmware, they all
+have known failure modes.
+
+\subsection { Failure modes defining the component}
+We can consider each bought-in component as a base level component,
+and it should have an associated set of failure modes.
-AND then a general description of symptom extraction
+
+\subsection { Complication of multiple failure modes }
+ A very complicated component, like an integrated circuit or perhaps a servo motor, has
+a set of failure modes, where several things could go wrong with it within the $\tau$ period.
+This is a simultaneous failure, or more than one failure mode being active during the same time period.
+
+
+\section{FMMD Proposed Methodology Outline}
+
+fire away, essentially the elevator pitch
+
+\subsection{Treating a functional group as a component}
+\subsection{Using a derived component in designs}
+\section{Building a failure mode model hierarchy}
 AND the hierarchy...
diff --git a/fmmd_concept/paper.tex b/fmmd_concept/paper.tex
index a4cde40..2d9d85c 100644
--- a/fmmd_concept/paper.tex
+++ b/fmmd_concept/paper.tex
@@ -4,7 +4,7 @@
 \usepackage{fancyhdr}
 \usepackage{tikz}
 \usepackage{amsfonts,amsmath,amsthm}
-\input{style}
+\input{../style}
 \usepackage{ifthen}
 \newboolean{paper}
 \setboolean{paper}{true} % boolvar=true or false
diff --git a/introduction/introduction.tex b/introduction/introduction.tex
index 65580da..f5a2e5a 100644
--- a/introduction/introduction.tex
+++ b/introduction/introduction.tex
@@ -375,8 +375,8 @@ Should the gain resistors $R30$ or $R26$ go OPEN or SHORT a fault condition will
 % \paragraph{Not rigorous, but tested by time}
 This is a typical example of an industry standard circuit that has been
-thought through, and in practise works and detects most failure modes.
-But it is not rogorous. It does not take into account every failure
+thought through, and in practice works and detects most commonly encountered failure modes.
+But it is not rigorous: it does not take into account every failure
 mode of every component in it. However it does lead on to an important
 concept of three main states of a safety critical system.
@@ -471,7 +471,9 @@ are commonly used for the gas certification process.
 Thus to manually check this number of combinations of faults is in practise impossible.
 A technique of modularising, or breaking down the problem is clearly necessary.
-\section{Challenger Disaster}
+\section{Examples of disasters caused by designs \\ missing component errors}
+
+\subsection{Challenger Disaster}
 One question that anyone developing a safety critical analysis design tool
 could do well to answer, is how the methodology would cope with known previous disasters.
@@ -487,7 +489,7 @@ But because of the top down nature of the FTA technique, the safety designer mus
 the environemtnal constraints of all component parts in order to use this correctly.
 This element of FTA is discussed in \ref{surveysc}
-\section{Therac 25}
+\subsection{Therac 25}
 The therac-25 was a computer controlled radiation therapy machine, which
 overdosed 6 people between 1985 and 1987.
@@ -499,12 +501,12 @@ carried out in 1983 excluded the software \cite{safeware}[App. A].
-
+\section{Practical problems in using formal methods}
 %% Here need more detail of what therac 25 was and roughly how it failed
 %% with refs to nancy
 %% and then highlight the fact that the safety analysis did not integrate software and hardware domains.
-\section{Problems with Natural Language}
+\subsection{Problems with Natural Language}
 Written natural language desciptions can not only be ambiguous or easy to misinterpret, it
 is also not possible to apply mathematical checking to them.
diff --git a/thesis.tex b/thesis.tex
index 6e9bd3e..1442033 100644
--- a/thesis.tex
+++ b/thesis.tex
@@ -66,6 +66,9 @@
 \chapter{An overview of European and North Americans Standards}
 \input{standards/standards}
+\chapter{Failure Mode Modular De-Composition}
+\input{fmmd_concept/fmmd_concept}
+
 \typeout{ ---------------- Component Failure Modes Definition }
 \chapter { Component Failure \\ Modes Definition}
 \input{component_failure_modes_definition/component_failure_modes_definition}
@@ -98,8 +101,6 @@
 \input{symptom_abstraction/symptom_abstraction}
-\chapter{Failure Mode Modular De-Composition}
-%\input{fmmd/fmmd}
 \chapter{A Formal Description of FMMD}
 \input{fmmdset/fmmdset}