% $Id: fmmdset.tex,v 1.7 2009/06/06 11:52:09 robin Exp $
%
\ifthenelse {\boolean{paper}}
{
\begin{abstract}
This paper describes an incremental and modular approach to traditional FMEA
design analysis.
%a methodology to analyse
%safety critical designs from a failure mode perspective.
%This paper concentrates on the hierarchical model: the analysis
%phases (symtom abstraction) and {\fgs} are dealt with
%in \cite{symptom_ex}.
This methodology, Failure Mode Modular De-Composition (FMMD), provides
a rigorous method for creating a failure mode model of
a SYSTEM from the bottom up, starting with {\bc} level failure modes.
In outline, the FMMD process collects
components into functional groups, which are analysed from a failure mode perspective,
and a failure mode behaviour for each {\fg} is then determined.
From this failure mode behaviour we can now treat the {\fg}
as a component or `black~box', with a known set of failure symptoms.
%
%
The failure symptoms of the {\fg} may be considered the failure modes of the
{\fg}, when viewed as a `black~box' or as a higher level `component'/sub-system.
We can thus create a new component, a {\dc}, that we can use in place
of the functional group in our design.
%
By collecting {\dcs} into {\fgs} and analysing these into higher level {\dcs} a
hierarchy is naturally formed. This hierarchy is termed an `FMMD~failure~mode~tree'.
From the FMMD failure mode trees,
modular re-usable sections of safety critical systems,
%and accurate, statistical estimation for fault frequency can be derived/
can be extracted automatically.
Thus FMMD supports re-use of analysed design sections.
The failure mode relationships, when traced, are of the form of
a directed acyclic graph. SYSTEM or top level failure modes
can be traced back to the base components that can cause them.
This means that components that may cause more than one SYSTEM failure
are handled naturally by the FMMD methodology.
FMMD provides the means to trace the causes of dangerous detected and dangerous undetected faults.
FMMD provides the means to produce cut-sets, minimal cut-sets, FTA diagrams, FMECA and FMEDA models, from
a data model built by the FMMD methodology.
It has been designed for small safety critical embedded
systems, but because of its modular and hierarchical nature, can be used to model larger systems.
FMMD was originally designed to aid formal proof for industrial burner systems, to meet EN and UL standards, including but not limited to
EN298, EN61508, EN12067, EN230, UL1998.
FMMD has a common notation spanning mechanical, electrical and software domains.
Thus complete failure mode models can be produced for electro-mechanical systems controlled
by a micro-processor.
\end{abstract}
}
{
This \chappap describes the Failure Mode Modular De-Composition (FMMD)
methodology to analyse
safety critical designs from a failure mode perspective, with emphasis on building a hierarchical model, in an incremental and modular fashion.
%Failure Mode Modular De-Composition (FMMD)
FMMD provides
a rigorous method for creating a fault effects model of a system from the bottom up starting with {\bc} level fault modes.
Using symptom extraction, and taking {\fgs} of components, a fault behaviour
hierarchy is built, forming a fault model tree.
From the fault model trees,
modular re-usable sections of safety critical systems,
and accurate, statistical estimation for fault frequency can be derived automatically.
It provides the means to trace the causes of dangerous detected and dangerous undetected faults.
It provides the means to produce minimal cut-sets, FTA diagrams and FMEDA models, from
a data model built by the FMMD methodology.
It has a common notation spanning mechanical, electrical and software failures,
and can integrate all three into the same system models. It has been designed for small safety critical embedded
systems~\cite{Clark200519}, but because of its modular and hierarchical nature, can be used to model larger systems.
It is intended to be used to formally prove that systems meet EN and UL standards, including but not limited to
EN298, EN61508, EN12067, EN230, UL1998.
}
\section{Introduction}
%This paper describes the Failure Mode Modular de-Composition (FMMD) method.
% described here, models a safety critical system from the bottom up.
The purpose of the FMMD methodology is to apply formal techniques to
the assessment of safety critical designs, aiding in identifying detected and undetectable faults
\footnote{Undetectable faults are faults which may occur but are not self~detected, or are impossible to detect by the system.}.
Formal methods are beginning to be specified in some safety standards\footnote{Formal methods
such as the Z notation appear as `highly recommended' techniques in the EN61508 standard~\cite{en61508}, but
currently apply only to software. Semi-formal methods such as FMEDA are recommended for electronics.}. However, some standards now imply the handling of
simultaneous faults, which complicates the scenario based approvals that are
currently used\footnote{Standard EN298:2003 demands that double simultaneous failures must be handled, in the case of
a single dangerous fault being detected.}.
% Some safety critical system assemesment criteria
%are statistical, and require a target failure rate per hour of operation be met \cite{EN61508}.
%Specific safety standards may apply criteria such as no single part failure in a system may lead to
%a dangerous fault.
There are two main philosophies in assessing safety critical systems.
One is to specify an acceptable level of dangerous faults per hour of operation\footnote{Failure rates used to calculate the probability of failure per hour (PFH)
are commonly quoted in failures per $10^{9}$ hours of operation (FIT).}.
This is a statistical approach, and is the one taken by the European safety reliability
standard EN61508~\cite{en61508}, commonly referred to as the Safety Integrity Level (SIL)
standard.
The second is to specify
that no single or double part fault can lead to a dangerous fault in the system under consideration.
This entails tracing the effects of all part failure modes
and working out if they can lead to any dangerous faults in the system under consideration.
This is the deterministic approach, and looks for causal links from the {\bc} failure modes
to SYSTEM failures.
%For instance, during WWII after operational research teams had analysed data it was determined that
% an aircraft engine that can, through one part failure cause a catastrophic failure is an unacceptable design.\cite{boffin} .
Both of these methods require a complete fault behaviour model.
The statistical method
requires additional Mean Time To Failure (MTTF) data for all part failure modes.
The FMMD methodology applies defined stages and processes that will
create a modular fault mode hierarchy. From this,
complete fault analysis trees can be determined. It uses a modular approach, so that repeated sections
of system design can be modelled once, and re-used.
%formally prove safety critical
%hardware designs.
The FMMD method creates a hierarchy from
base~component fault~mode level up to system level.
%It does this using
%well defined stages, and processes.
%It allows re-use of analysed modules DOH DOH DOH
%, and to create a framework where
%fault causation trees, and statistical likelihood
%of faults occurring are
When a design has been analysed using this method, its fault~trees may be traversed down to the MTTFs of individual parts,
and from these the statistical likelihoods of failures and dangerous~faults can be determined.
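As a rough worked illustration (a sketch under stated assumptions, not a result of the analysis): if part failure modes are treated as independent with constant failure rates, the rate for a failure mode is $\lambda = 1/\mathrm{MTTF}$, and the rate of a SYSTEM failure mode that any one of several part failure modes can cause is approximately the sum of their rates. With hypothetical MTTFs of $10^{7}$ and $5 \times 10^{6}$ hours for two such part failure modes,
$$ \lambda_{sys} \approx \frac{1}{10^{7}} + \frac{1}{5 \times 10^{6}} = 3 \times 10^{-7} \; \mbox{failures per hour}, $$
that is, 300 failures per $10^{9}$ hours.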
FMMD has a common notation for modelling mechanical, electronic and software designs and supports their integration.
%Starting with individual part failure modes, to collections of %parts (modules)
%and then to module level fault modes.
\subsection{Basic Concepts Of FMMD}
\paragraph{ Creating a fault hierarchy}
The main idea of the methodology is to build a hierarchy of fault modes from the {\bc}
level up to the top, or SYSTEM.
The first stage is to choose
{\bcs} that interact and naturally form {\fgs}. The initial {\fgs} are thus collections of base parts.
%These parts all have associated fault modes. A module is a set fault~modes.
From the point of view of fault analysis, we are not interested in the components themselves, but in the ways in which they can fail.
For this study a {\fg} will mean a collection of components.
In order to determine the symptoms or failure modes of a {\fg},
we need to consider all failure modes of its components.
By analysing the fault behaviour of a `{\fg}' with respect to these component failure modes,
we can derive a new set of possible failure modes. In fact we can call these
the symptoms of failure for the {\fg}.
We can stipulate that the symptom collection process is surjective.
% i.e. $ \forall f in F $
By stipulating surjection for symptom collection, we ensure
that each component failure mode maps to at least one symptom.
We also ensure that all symptoms have at least one component failure
mode.
%
This new set of faults is the set of derived faults from the perspective of the {\fg}, and is thus at a higher level of
fault~mode abstraction. Thus we can say that the {\fg} as an entity, can fail in a number of well defined ways.
In other words we have taken a {\fg}, and analysed how it can fail according to the failure modes of its parts.
These new failure~modes are derived failure modes.
%The ways in which the module can fail now becomes a new set of fault modes, the fault~modes
%being derived from the {\fg}.
We can now create a new `{\dc}' which has
the failure symptoms of the {\fg} as its set of failure modes.
This new {\dc} is at a higher `failure~mode~abstraction~level' than {\bcs}.
%What this means is the `fault~symptoms' of the module have been derived.
%
%When we have determined the fault~modes at the module level these can become a set of derived faults.
%By taking sets of derived faults (module level faults) we can combine these to form modules
%at a higher level of fault abstraction. An entire hierarchy of fault modes can now be built in this way,
%to represent the fault behaviour of the entire system. This can be seen as using the modules we have analysed
%as parts, parts which may now be combined to create new functional groups,
%but as parts at a higher level of fault abstraction.
Applying the same process with {\dcs} we can bring {\dcs}
together to form functional groups and create new {\dcs}
at a higher abstraction level.
\ifthenelse {\boolean{paper}}
{
%Reference the symptom abstraction paper here
}
{
This analysis and symptom collection process is described in detail in the Symptom Extraction chapter (see section \ref{symptomex}).
}
\subsubsection { Definitions }
\begin{itemize}
\item {\bc} - a component with a known set of unitary state failure modes. Base here means a starting or `bought~in' component.
\item {\fg} - a collection of components chosen to perform a particular task
\item {\em derived failure mode} - a failure symptom of a functional group
\item {\dc} - a new component derived from an analysed {\fg}
\end{itemize}
\subsubsection{An algebraic notation for identifying FMMD entities}
Consider all components used in a given system to exist as
members of a set $\mathcal{C}$.
%
Each component $c$ has an associated set of failure modes.
Let the set of all possible failure modes be $\mathcal{F}$.
We define a function $fm$ that returns the
set of failure modes $F$ for a component $c$
as
\begin{equation}
fm : \mathcal{C} \rightarrow \mathcal{P}\mathcal{F}.
\end{equation}
That is, where $c$ is a component and $F$ is a set of failure modes,
$ fm ( c ) = F. $
We can use the variable name $FG$ to represent a {\fg}. A {\fg} is a collection
of components. We thus define $FG$ as a set of chosen components defining
a {\fg}; for every functional group we can say that
$FG$ is a member of the power set of all components, $ FG \in \mathcal{P} \mathcal{C}. $
We can overload the $fm$ function for a functional group $FG$
where it will return all the failure modes of the components in $FG$
given by
$$ fm (FG) = F. $$
And formally, where $\mathcal{FG}$ is the set of all functional groups,
\begin{equation}
fm : \mathcal{FG} \rightarrow \mathcal{P}\mathcal{F}.
\end{equation}
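One way to write the overloaded function explicitly, following the description above, is as the union of the failure modes of the components in the group:
$$ fm(FG) = \bigcup_{c \in FG} fm(c). $$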
%$$ \mathcal{fm}(C) \rightarrow S $$
%$$ {fm}(C) \rightarrow S $$
We can indicate the abstraction level of a component by using a superscript.
Thus for the component $c$, where it is a `base component' we can assign it
the abstraction level zero thus $c^0$. Should we wish to index the components
(for example as in a product parts~list) we can use a sub-script.
Our base component (if first in the parts~list) could now be uniquely identified as
$c^0_1$.
We can further define the abstraction level of a {\fg}.
We can say that it is the maximum abstraction level of any of its
components. Thus a functional group containing only base components
would have an abstraction level zero and could be represented with a superscript of zero thus
`$FG^0$'. The functional group set may also be indexed.
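Written out, for a functional group whose components have abstraction levels $a_1, \ldots, a_n$, the rule above gives the group the level $\max(a_1, \ldots, a_n)$. For instance (the indices here are purely illustrative), a group formed from a base component $c^0_3$ and a {\dc} $c^1_1$ is a level one group:
$$ FG = \{ c^0_3, c^1_1 \}, \quad \max(0,1) = 1, \quad \mbox{so we write } FG^1. $$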
We can apply symptom abstraction to a {\fg} to find
a set of derived failure modes. We are interested in the failure modes
of all the components in the {\fg}. An analysis process
defined by the symbol `$\bowtie$' is applied to the {\fg}.
The $\bowtie$ function takes a {\fg}
as an argument and returns a newly created {\dc}.
The $\bowtie$ analysis, a symptom extraction process, is described in chapter \ref{chap:sympex}.
Using $\abslevel$ to symbolise the fault abstraction level, we can now state:
$$ \bowtie(FG^{\abslevel}) \rightarrow c^{{\abslevel}+1}. $$
\paragraph{The symptom abstraction process in outline.} The $\bowtie$ function processes each member (component) of the set $FG$ and
extracts all the component failure modes, which are used by the analyst to
determine the derived failure modes. A new {\dc} is created
where its failure modes are the symptoms from $FG$.
Note that the component will have a higher abstraction level than the {\fg}
it was derived from.
\subsubsection{FMMD Hierarchy}
By applying stages of analysis to higher and higher abstraction
levels, we can converge to a complete failure mode model of the system under analysis.
Because the symptom abstraction process is defined as surjective (from component failure modes to symptoms)
the number of symptoms is guaranteed to be less than or equal to
the number of component failure modes.
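To spell this out (the names $S$ and $sym$ are introduced here only for illustration), let $S$ be the set of symptoms of a {\fg} $FG$ and let $sym : fm(FG) \rightarrow S$ map each component failure mode to the symptom it causes. Surjectivity means
$$ \forall s \in S \;\; \exists f \in fm(FG) \; : \; sym(f) = s , $$
so every symptom has at least one component failure mode as its cause, and therefore $|S| \leq |fm(FG)|$.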
In practice however, the number of symptoms greatly reduces as we traverse
up the hierarchy.
This is a natural process. Complicated systems
typically have only a small number of system failure modes.
An example of a simple system will illustrate this.
\subsection {Theoretical Example Symptom Abstraction Process }
Consider a simple {\fg} $ FG^0_1 $ comprising two base components $c^0_1,c^0_2$.
We can apply $\bowtie$ to the {\fg} $FG^0_1$
and it will return a {\dc} at abstraction level 1 (with an index of 1 represented as a sub-script)
$$ \bowtie ( FG^0_1 ) = c^1_1 .$$
%to look at this analysis process in more detail.
By way of example, applying ${fm}$ to obtain the failure modes $f_N$
$$ {fm}(c^0_1) = \{ f_1, f_2 \} $$
$$ {fm}(c^0_2) = \{ f_3, f_4, f_5 \} $$
And overloading $fm$ to find the flat set of failure modes from the {\fg} $FG^0_1$
$$ {fm}({FG^0_1}) = \{ f_1, f_2, f_3, f_4, f_5 \} $$
The symptom extraction process is now applied,
i.e. the analyst now considers failure modes $f_{1..5}$ in the context of the {\fg}
and determines the `failure symptoms' of the {\fg}.
The result of this process will be a set of derived failure modes.
For this example, let these be $ \{ s_6, s_7, s_8 \} $.
We can now create a {\dc} $c^1_1$ with this set of failure modes.
Thus:
$$ \bowtie ( FG^0_1 ) = c^1_1 $$
and applying $fm$ to the newly derived component
$$ fm(c^1_1) = \{ s_6, s_7, s_8 \} $$
By representing this analysis process in a diagram, the hierarchical nature
of the process is apparent, see figure \ref{fig:onestage}.
Each $\bowtie$ analysis phase raises the level of failure mode abstraction.
As the failure modes rise in abstraction level, they become less about
specific part failures (for instance a resistor going open)
and more about the effect on a functional system (for instance `amplifier one' failing).
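The worked example above can also be sketched in code. The following Python fragment is a minimal illustration only, not part of any FMMD tooling: the class and function names are invented here, and the assignment of $f_1 \ldots f_5$ to $s_6 \ldots s_8$ is hypothetical, since that assignment is the analyst's decision and the example does not specify it.
\begin{verbatim}
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """A base or derived component: name, abstraction level, failure modes."""
    name: str
    level: int
    failure_modes: frozenset


def fm(group):
    """Overloaded fm: the flat set of failure modes of all components in a group."""
    return frozenset().union(*(c.failure_modes for c in group))


def bowtie(group, symptom_of, name):
    """Symptom abstraction: turn a functional group into a derived component.

    symptom_of records the analyst's decisions, mapping each component
    failure mode to the symptom it causes in the functional group.
    """
    missing = fm(group).difference(symptom_of)
    if missing:
        raise ValueError(f"unanalysed failure modes: {sorted(missing)}")
    # The group's level is the maximum level of its components;
    # the derived component sits one level above the group.
    level = max(c.level for c in group) + 1
    symptoms = frozenset(symptom_of[f] for f in fm(group))
    return Component(name, level, symptoms)


c01 = Component("c^0_1", 0, frozenset({"f1", "f2"}))
c02 = Component("c^0_2", 0, frozenset({"f3", "f4", "f5"}))
fg01 = [c01, c02]

# Hypothetical analyst mapping: the text does not say which failure mode
# causes which symptom.
analysis = {"f1": "s6", "f2": "s6", "f3": "s7", "f4": "s8", "f5": "s8"}
c11 = bowtie(fg01, analysis, "c^1_1")
```
\end{verbatim}
Here \texttt{c11} has abstraction level 1 and the three symptoms as its failure modes. The check for unanalysed failure modes reflects the requirement that every component failure mode of the {\fg} be considered.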
\begin{figure}[h]
\centering
\includegraphics[width=200pt,bb=0 0 268 270]{fmmdset/onestage.jpg}
% onestage.jpg: 268x270 pixel, 72dpi, 9.45x9.52 cm, bb=0 0 268 270
%\caption{FMMD analysis of functional group}
\caption{FMMD Analysis of one functional Group: Two components form a functional group, which forms a derived component}
\label{fig:onestage}
\end{figure}
% \begin{figure}
% \centering
% \input{fmmdset/fmmdh.tex}
% \caption{FMMD example Hierarchy}
% \label{fig:sdfmea}
% \end{figure}
\begin{figure}[h]
\centering
\includegraphics[width=400pt,bb=0 0 555 520,keepaspectratio=true]{fmmdset/fmmdh.jpg}
% fmmdh.png: 555x520 pixel, 72dpi, 19.58x18.34 cm, bb=0 0 555 520
\caption{FMMD Example Hierarchy}
\label{fig:fmmdh}
\end{figure}
\section {Building the Hierarchy - Higher levels of Fault Mode Analysis}
Figure \ref{fig:fmmdh} shows a hierarchy of failure mode de-composition.
It can be seen that the derived fault~mode sets are higher level abstractions of the fault behaviour of the modules.
We can take this hierarchy one stage further by combining the abstraction level 1 components (i.e. those of the form $c^{1}_{{N}}$) to form {\fgs}. These
$FG^1_{N}$ {\fgs} can be used to create $c^2_{{N}}$ {\dcs} and so on.
At the top of the hierarchy, there will be one final component $c^{t}_{{N}}$ (where $t$ is the
top level), and {\em its fault modes are the failure modes of the SYSTEM}. The causes for these
system level fault~modes will be traceable down to part fault modes, traversing the tree
through the lower level {\fgs} and components.
Each SYSTEM level fault may have a number of paths through the
tree to different low level or base component failure modes.
In FTA~\cite{nucfta}~\cite{nasafta} terminology, these paths through the tree are called `cut sets'.
%A hierarchy of levels of faults becoming more abstract (at each level) should
%converge to a small sub-set of system level errors.
In any system there are a number of general failure mode conditions.
This number will typically be far smaller than the total number of
failure modes of all its components.
This is because many component level failure modes
result in the same SYSTEM level failure modes.
%%-\subsection{ Proof of number of component~failure \\ modes preserved in hierarchy build}
%%-
%%-Here we need to prove that if there is an abstract fault, then as it goes higher in the tree, it can only collect MORE not less
%%-actual {\bc} failure modes.
As we go up through a fault hierarchy, the
number of failure modes to handle should decrease
with each level of abstraction.
This thinning out of the number of system level errors is borne out in practice;
real time control systems often have a small number of major reportable faults (typically $ < 50$),
even though they may have accompanying diagnostic data.
The FMMD hierarchy can be drawn in the form of a directed acyclic graph~\cite{alg}; this is
explored in chapter \ref{chap:dag}. % XXX better citation really required
Each stage of analysis builds the acyclic graph by adding to the top of the tree, leaving the previous work
unchanged, very much in the way that the source code control system git~\cite{git}
appends changes to source code trees.
Because of this, it is permissible, for instance, to
create a functional group from components at different levels of failure mode abstraction.
%\cite{sem}
%\begin{figure}
%\subfigure[Euler Diagram]{\epsfig{file=fmmd_hierarchy_cimg5040.eps,width=4.2cm}\label{fig:exa}}
%\subfigure[Intersection A B ]{\epsfig{file=exampleareasubtraction2.eps,width=4.2cm}\label{fig:exb}}
%\subfigure[area to subtract]{\epsfig{file=exampleareasubtraction3.eps,width=4.2cm}\label{fig:exc}}
%\subfigure[A second graphic]{\epsfig{file=exampleareasubtraction3.eps,width=2cm}}
%{\epsfig{file=fmmd_hierarchy_cimg5040.eps,width=12cm}
%\label{fig:ex}
%\caption{Simple Euler Diagram}
%\end{figure}
%\cite{sem}
\section {Modelling considerations}
%% This is obvious but needs a proof.
%% Also this means that we may need dummy modules so as not to violate jumping up the tree structure
%Complete coverage for all derived hierarch levels can be generalised thus:
%$$ CompleteCoverage = \forall \; h \; \forall \; x \exists \; y \; ( \; x \; \in \; \cup \; {\cal F} \; D^{h}
% \; \Rightarrow \; x \; \in \; \cup \; M^{h}_{y} ) $$
%% CASE STUDY BEGIN
\subsection{Case Study FMMD Hierarchy:\\ Simple RS-232 voltage reader}
\begin{figure}[h]
\centering
\includegraphics[width=340pt,bb=0 0 532 192,keepaspectratio=true]{./fmmdset/mvsblock.jpg}
% mvsblock.png: 532x192 pixel, 72dpi, 18.77x6.77 cm, bb=0 0 532 192
\caption{Milli-Volt Sensor Block Diagram}
\label{fig:mvsblock}
\end{figure}
%%% This is the tikz picture ??/
%
%\begin{figure}[h+]
%\centering
%\input{fmmdset/mvsblock.tex}
%\caption{Block Diagram : Example Milli-Volt Sensor : Block Diagram}
%%\includegraphics[scale=0.20]{ptop.eps}
%\label{fig:mvsblock}
%\end{figure}
%
Consider a simple electronic system with, say, two milli-volt amplifiers,
which passes the readings onward via a serial (RS232) link (see figure \ref{fig:mvsblock}). This is simple in concept: plug in a
computer, run a terminal program, and the instrument will report the milli-volt readings in ASCII
with any error messages.
% in CRC checksum protected packets.
It is interesting to look at a particular {\fg}. The milli-volt amplifiers are a good example.
These can be analysed by taking a {\fg}, the components surrounding the op-amp,
a few resistors to determine offset and gain,
a safety resistor, and perhaps some smoothing capacitors.
These components form a {\fg}. This circuit is then analysed for all the fault combinations
of its parts. This produces a collection of possible symptoms/fault~modes for the milli-volt amplifier.
The two amplifiers are now connected to an ADC which converts the voltages to binary words for the micro-processor.
The micro-processor then uses the values to determine if the readings are valid and then formats text to send
via the RS232 serial line.
%
% \begin{figure}[h+]
% %\centering
% %\input{millivolt_sensor.tex}
% \includegraphics[scale=0.4]{fmmdset/millivolt_sensor.eps}
% \caption{Hierarchical Module Diagram : Milli-Volt Sensor Example}
% \label{fig:mvs}
% \end{figure}
\begin{figure}[h]
\centering
\includegraphics[width=400pt,bb=0 0 783 638,keepaspectratio=true]{./fmmdset/millivolt_sensor.jpg}
% millivolt_sensor.jpg: 783x638 pixel, 72dpi, 27.62x22.51 cm, bb=0 0 783 638
\caption{FMMD Hierarchy: Milli-volt sensor Example}
\label{fig:vs}
\end{figure}
%
% \begin{figure}[h]
% \centering
% \includegraphics[width=400pt,bb=0 0 749 507,keepaspectratio=true]{fmmdset/millivolt_sensor.png}
% % millivolt_sensor.png: 749x507 pixel, 72dpi, 26.42x17.89 cm, bb=0 0 749 507
% \caption{Hierarchial Module Diagram : Millivolt Sensor Example}
% \label{fig:mvs}
% \end{figure}
This has a number of obvious modules (or {\fgs}), the PCB power supply, the milli-volt amplifiers,
the analog to digital conversion circuitry, the micro processor and the UART (serial link - RS232 transceiver).
It would make sense when analysing this system to take each one of these {\fgs} in turn and examine them closely.
It would be sensible if the system could detect the most likely fault~modes by self testing.
When these have been examined and diagnostic safeguard strategies have been thought up,
we might look at reporting any fault via the RS232 link.
% (if it still works !).
By doing this we have already used a modular approach.
We have analysed each section of the circuitry
and then, considering the possible failures of each module,
fitted these into a picture of the
fault~modes of the milli-volt monitor as a whole.
However this type of analysis is not guaranteed
to rigorously take into account all fault~modes.
It is useful to follow an example fault through levels of abstraction hierarchy to make this point clear.
%The FMMD technique,
%goes further than this by considering all part fault~modes and
%places the analysis phases into a rigid structure.
%Each analysis phase is
%described using set theory in later sections.
%By creating a rigid hierarchy, not only can we traverse back
%down it to find possible causes for system errors, we can also determine
%combinations of fault modes that cause certain high level fault modes.
%For instance, it may be a criteria that no single part failure may cause a fatal error.
%If a fault tree can trace down to a single part fault for a potentially fatal
%fault mode, then a re-design must be undertaken.
%Some standards for automated burner controllers demand that two part failure modes cannot cause
%a dangerous/potentially fatal error. Again having a complete fault analysis tree will reveal these conditions.
\subsection{An example part Fault and \\ its representation at different abstraction levels}
An example of a part fault effect on the example system is given below, showing how this fault
manifests itself at each abstraction level.
%\begin{example}
As an example let us consider a resistor failure in the first milli-volt sensor.
Let us say that this resistor, R48 say, with the particular fault mode `shorted'
causes the amplifier to output 5V.
At the part level, we have one fault mode in one part.
%This is the lowest or zero level of fault abstraction.
Let us say that this amplifier has been designed to amplify the milli-volt input
to between 1 and 4 volts, a convenient voltage for the ADC/microcontroller to read.
Any voltage outside this range will be considered erroneous.
As the resistor short causes the amplifier to output 5V we can detect the error condition.
This resistor is a part in the `millivolt amplifier 1' module.
% (see figure \ref{fig:mvs}).
The fault mode at the derived fault level (abstraction level 1) is OUTPUT\_HIGH.
Looking higher in the hierarchy, the next abstraction level higher, level 2, will see this as
a `CHANNEL\_1' input fault.
%The system as a whole (abstraction level 3) will see this as
%a `MILLI\_VOLT\_SENSOR' fault~mode.
%\end{example}/
\subsubsection{Abstraction Layer Summary \\ for example fault.}
\begin{description}
%\begin{list}
\item[Abstraction Level 0 :] Resistor has fault mode `R48\_SHORT' in amplifier 1.
\item[Abstraction Level 1 :] Amplifier 1 has fault mode `OUTPUT\_HIGH'.
\item[Abstraction Level 2 :] Milli-volt sensor has `CHANNEL\_1' fault.
%\item[Abstraction Level 3 :] System has `MILLI\_VOLT\_SENSOR' fault.
%\end{itemize}
%\end{list}
\end{description}
Thus we have looked at a single part fault and analysed its effect from the
bottom up on the system as a whole, going up through the abstraction layers.
\subsection{Natural Fault finding}
Suppose that we were handed one of these `dual milli-volt' sensors and told that it had a ``Channel 1''
fault and asked to troubleshoot and hopefully fix it.
The natural process would be to work from the top down.
First of all we would look at perhaps a circuit schematic.
We might, not believing the operator that the equipment is actually faulty, feed a known and valid milli-volt signal into the input.
On verifying that it was actually faulty,
we could then find the ADC port pins used to make the reading, and measure a voltage on them.
We would find that the voltage was indeed out of range and our attention would turn to
the circuitry between the input milli-volt signal and the ADC/Microcontroller.
On examining this we would probably measure the in circuit resistances
and discover the faulty resistor.
With the natural fault finding process, we have narrowed down until we came to
the faulty component.
Because FMMD analysis works from the bottom~up,
it is possible to check that all component failure modes have been considered in the model.
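The top-down trace described above, and the bottom-up completeness check, can both be sketched as a walk over the failure mode graph. This Python fragment is an illustration under stated assumptions: only \texttt{R48\_SHORT}, \texttt{OUTPUT\_HIGH} and \texttt{CHANNEL\_1} come from the example, while the other failure mode names and the function names are hypothetical.
\begin{verbatim}
```python
# Each derived failure mode is mapped to the lower-level failure modes that
# cause it. R48_SHORT, OUTPUT_HIGH and CHANNEL_1 are from the text; all other
# names are hypothetical.
CAUSES = {
    "CHANNEL_1": {"OUTPUT_HIGH", "OUTPUT_LOW"},
    "OUTPUT_HIGH": {"R48_SHORT"},
    "OUTPUT_LOW": {"R48_OPEN", "R49_SHORT"},
}

# Hypothetical base component failure modes of the amplifier.
BASE_MODES = {"R48_SHORT", "R48_OPEN", "R49_SHORT", "R49_OPEN"}


def base_causes(mode, causes):
    """Return the base component failure modes that can lead to `mode`."""
    if mode not in causes:  # no recorded causes: treat as a base failure mode
        return {mode}
    found = set()
    for lower in causes[mode]:
        found |= base_causes(lower, causes)
    return found


def unhandled(base_modes, top_modes, causes):
    """Base failure modes that no top-level failure mode traces back to."""
    covered = set()
    for top in top_modes:
        covered |= base_causes(top, causes)
    return set(base_modes) - covered


channel_1_causes = base_causes("CHANNEL_1", CAUSES)
not_considered = unhandled(BASE_MODES, ["CHANNEL_1"], CAUSES)
```
\end{verbatim}
Tracing \texttt{CHANNEL\_1} downwards yields the three resistor failure modes that can cause it, while the coverage check reports \texttt{R49\_OPEN} as a base failure mode that this deliberately incomplete model has not yet considered.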
%%
%% END CASE STUDY
%%
\section{Future Ideas}
\subsection{ Production Quality Control }
A fault causation tree could be used for PCB fault finding (from the fault codes that are reported
by the equipment). This could be used in conjunction with a database to provide
production oriented FMEA\footnote{The term FMEA applied to production~\cite{bfmea} is a statistical process of
determining the probability of the fault occurring and multiplying that by the costs incurred from the fault.
This quickly becomes a priority to-do list with the most costly faults at the top.}.
\subsection { Test Rigs }
Test rigs apply a rigorous checking process to safety critical equipment before
it can be sold, and this is usually a legal or contractual requirement, backed up by inspections
and an approval process.
They are usually a clamp arrangement where the PCB under test is placed over
connection points applied by gold plated sprung pins: these rigs are commonly known
as `beds of nails' \cite{garret} \cite{maikowski}.
Precision and calibrated test signals are then applied to the board under test. For PCBs containing
microprocessors, custom test~rig software may be run on them to exercise
active sections of the PCB (for instance to drive outputs, relays etc).
The main purpose of a test rig is to prevent faulty equipment from being shipped.
However, often a test rig will reveal an easy to fix fault on a board (such as a part not soldered down completely
or missing parts). These boards can be mended and re-submitted to the test rig.
It is often a problem, when a unit fails in a test rig, to quickly determine why it has failed.
Having a fault causation tree would be useful for identifying which parts may be missing, not soldered down
or simply incorrect. The test rig armed with the fault analysis tree could point to parts or combinations of parts that could be checked
to correct the product.
\subsection {{\dcs} - Modules - re-usability}
In the example milli-volt sensor system, the milli-volt amplifiers
use the same electronic circuit. The derived failure mode model for them is therefore
the same.
Thus the derived component for the amplifiers may be re-used, with a different index number in the model.
\subsection{ Multi Channel Safety Critical Systems }
It is common in safety critical systems to use redundancy.
Two or sometimes three control systems will be assigned to the same process.
An arbitration system, the arbiter, will decide which channel may control
the equipment.
Where a system has several independent parallel control channels, each one can be a separate FMMD hierarchy.
The FMMD trees for the channels can converge
up to a top hierarchy representing the arbiter (which is the sub-system that decides which control channels are valid).
This is commonly referred to as a multi-channel safety critical system.
Where there are 2 channels and one arbiter, the term 1oo2 is used (one out of two).
The Ericsson AXE telephone exchange hardware is a 1oo2 system, and the arbiter (the AMD)
can detect and switch control within one processor instruction. Should a hardware error
be detected,%\footnote{Or in a test plant environment, more likely someone coming along and `borrowing' a cpu board from
%your working exchange}
the processor will switch to the redundant side without breaking any telephone calls
or any being set up. An alarm will be raised to report that this has happened, but the performance impact on
the 1oo2 system is a delay of one micro-processor instruction to the entire process.
The premise here is that the arbiter should be able to determine which
of the two control channels is faulty and use the data/allow control from the non-faulty one.
1oo3 systems are common in highly critical systems.
\paragraph{Fault mode of interfaces}
An advantage with FMMD in this case is that the interface between the channels and the
safety arbiter is not only defined functionally but as a failure model as well.
Thus failures in the interfacing between the safety arbiter and
each channel are modelled.
\paragraph{Re-use of FMMD analysis}
Note that we can reuse the results from analysing one channel to model them all.
Identical channels will have the same high level failure modes.
% \small
% \bibliography{vmgbibliography,mybib}
% \normalsize
% Typeset in \ \ {\huge \LaTeX} \ \ on \ \ \today
% \begin{verbatim}
% CVS Revision Identity $Id: fmmdset.tex,v 1.7 2009/06/06 11:52:09 robin Exp $
% \end{verbatim}
%\end{document}
%\theend