

\abstract{ This chapter defines what is meant by the terms
components, component fault modes and `unitary~state' component fault modes.
The application of Bayes' theorem in current methodologies, and
the suitability of the `null hypothesis' or `P' value statistical approach,
are discussed.
Mathematical constraints and definitions are made using set theory.
}


\section{Introduction}
When building a system from components,
we should be able to find all known failure modes for each component.
For most common electrical and mechanical components, the failure modes
for a given type of part can be obtained from standard literature\cite{mil1991}
\cite{mech}. %The failure modes for a given component $K$ form a set $F$.


Using these failure modes we can build a `failure model' from the bottom-up.
By contrast, traditional static fault analysis methods work from the top down.
They identify faults that can occur in a system, and then work down
to see how they could be caused. Some apply statistical techniques to
determine the likelihood of component failures causing specific system level errors (see Bayes' theorem, section~\ref{bayes}).
Another top down technique is to apply cost-benefit analysis
to determine which faults are the highest priority to fix\cite{FMEA}.

The aim of this study is to produce complete failure
models of safety critical systems from the bottom-up,
starting, where possible, with known component failure modes.


\subsection{Systems, functional groups, sub-systems and failure modes}

It is helpful here to define some terms: `system', `functional~group', `component', `base~component' and `sub-system'.

A system is any coherent entity that would be sold as a safety critical product.
A sub-system is a part of some larger system.
For instance, a stereo amplifier separate is a sub-system. The
whole sound system consists perhaps of the following `sub-systems':
CD-player, tuner, amplifier~separate, loudspeakers and iPod~interface.

%Thinking like this is a top~down analysis approach
%and is the way in which FTA\cite{nucfta} analyses a System
%and breaks it down.

A sub-system will be composed of component parts, which
may themselves be sub-systems.

Eventually by a recursive downwards process we would be able to identify
sub-systems built from base component parts.
Each `component part'
will have a known fault/failure behaviour.
That is to say, each base component has a set of known
ways in which it can fail.

If we look at the sound system again as an
example, the CD~player could fail in several distinct ways, regardless of
what has happened to it or has gone wrong inside it.

A top down approach has an intrinsic problem in that we cannot guess
every possible failure mode at the SYSTEM level.
Working from the bottom up forces the consideration of all possible
component failures (which could be missed in a top~down approach),
but presents us with a different problem: which initial collections of base components should we choose?

For instance, in the CD~player example, starting at the bottom we are presented with
a massive list of base~components: resistors, motors, user~switches, laser~diodes and so on.
Clearly, working from the bottom~up, we need to pick small
collections of components that work together in some way.
These are termed `functional~groups'. For instance, the circuitry that powers the laser diode
to illuminate the CD might contain a handful of components, and as such would make a good candidate
to be one of the base level functional~groups.


In choosing the lowest level (base component) sub-systems we would look
for the smallest `functional~groups' of components within a system. A functional~group is a set of components that interact
to perform a specific function.

When we have analysed the fault behaviour of a functional group, we can treat it as a `black box'.
We can now call our functional~group a sub-system. The goal here is to know how it will behave under fault conditions.
%Imagine buying one such `sub~system'  from a very honest vendor.
%One of those sir, yes but be warned it may fail in these distinct ways, here
%in the honest data sheet the set of failure modes is listed!
This type of thinking is starting to become more commonplace in product literature, with the emergence
of reliability and safety standards such as IEC~61508\cite{sccs} and EN~61508\cite{en61508}.
FIT (Failure in Time, the expected number of failures per billion hours of operation) values
are published for some micro-controllers. A micro~controller
is a complex sub-system in itself and could be considered a `black~box' with a given reliability.
\footnote{Microchip sources give a FIT of 4 for their PIC18 series micro~controllers\cite{microchip}. The DOD
1991 reliability manual\cite{mil1991} applies a FIT of 100 for this generic type of component.}
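
As a rough worked illustration of the FIT figures quoted above, a FIT value can be converted to a failure rate and a mean time to failure. This sketch assumes a constant failure rate (an exponential failure model), which is an assumption made here for illustration and not a claim taken from the cited sources; the symbol $\lambda$ is likewise introduced only for this example.
$$ \lambda = \frac{FIT}{10^{9}\ \mathrm{hours}}, \qquad MTTF = \frac{1}{\lambda} $$
For a FIT of 4 this gives
$$ \lambda = 4 \times 10^{-9}\ \mathrm{per\ hour}, \qquad MTTF = 2.5 \times 10^{8}\ \mathrm{hours} $$
and for a FIT of 100
$$ \lambda = 1 \times 10^{-7}\ \mathrm{per\ hour}, \qquad MTTF = 1 \times 10^{7}\ \mathrm{hours} $$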

As electrical components have detailed datasheets, a useful extension would be
for these to list the failure modes of the component, together with environmental factors and MTTF statistics.

Currently this sort of information is generally only available for generic component types\cite{mil1991}.


%At higher levels of analysis, functional~groups are pre-analysed sub-systems that interact to
%erform a given function.

%\vspace{0.3cm}
\begin{table}[h]
\begin{tabular}{||l|p{8cm}||} \hline \hline
  {\em Term} & {\em Description} \\ \hline
System & A product designed to work as a coherent entity \\ \hline
Sub-system & A part of a system; sub-systems may contain sub-systems \\ \hline
Failure mode & A way in which a system, sub-system or component can fail \\ \hline
Functional Group & A collection of sub-systems and/or components that interact to perform a specific function \\ \hline
Failure Mode Group & The collection of all failure modes from all the members of a functional group \\ \hline
Derived Failure Mode & A failure mode determined from the analysis of a `Failure Mode Group' \\ \hline
Base Component & Any bought-in component, which hopefully has a known set of failure modes \\ \hline \hline
\end{tabular}
\caption{Table of FMMD definitions}
\label{tab:def}
\end{table}
%\vspace{0.3cm}

\section{A UML Model of terms introduced}


\begin{figure}[h]
 \centering
 \includegraphics[width=350pt,bb=0 0 680 500,keepaspectratio=true]{component_failure_modes_definition/fmmd_uml.jpg}
 % fmmd_uml.jpg: 680x500 pixel, 72dpi, 23.99x17.64 cm, bb=0 0 680 500
 \caption{UML representation of Failure Mode Data types}
 \label{fig:fmmd_uml}
\end{figure}

The diagram in figure~\ref{fig:fmmd_uml}
shows the relationships between the terms defined in table~\ref{tab:def} as classes in a UML model.
We can start with the functional group. This is a minimal collection
of components that perform a simple given function.
For our audio separates rig, this could be
the components that supply power to the laser diode.
From the `Functional~Group' we can now collect
all the `failure modes' of the `components', and
produce a `Failure~Mode~Group'. This
has a reference to the `Functional~Group', and is a collection
of `failure modes'.
By analysing the effects of the failure modes in the `Failure~Mode~Group'
we can determine the failure mode behaviour of the functional group.
This failure mode behaviour is a collection of derived failure modes.
We can now consider the functional group as a component, because
we have a set of failure modes for it.

\subsection{Sub-System Class Definition}
A sub-system can be defined by the classes used to create it, and
its set of derived failure modes.
In this way sub-systems naturally form trees, with the lowermost leaf nodes being
base components.
Note that the UML model is recursive: we can build functional groups using sub-systems
as components. The UML model therefore naturally forms a hierarchy
of failure mode analysis, which has a single top level entry, the SYSTEM.
The top level entry will determine the failure modes
for the product/system under analysis.

\subsection{Refining the UML model to use inheritance}
We can refine this model a little by noticing that a system is merely the
top level sub-system. We can thus have System inherit from sub-system.
A derived failure mode is simply a failure mode at a higher level of analysis;
it can therefore inherit from `failure\_mode'.

The modified UML diagram using inheritance is shown in figure~\ref{fig:fmmd_uml2}.
\begin{figure}[h]
 \centering
 \includegraphics[width=350pt,bb=0 0 877 675,keepaspectratio=true]{./fmmd_uml2.jpg}
 % fmmd_uml2.jpg: 877x675 pixel, 72dpi, 30.94x23.81 cm, bb=0 0 877 675
 \caption{UML Representation of Failure Mode Data Types}
 \label{fig:fmmd_uml2}
\end{figure}
%
% \begin{figure}[h]
%  \centering
%  \includegraphics[width=350pt,bb=0 0 680 500,keepaspectratio=true]{component_failure_modes_definition/fmmd_uml2.jpg}
%  % fmmd_uml.jpg: 680x500 pixel, 72dpi, 23.99x17.64 cm, bb=0 0 680 500
%  \caption{UML respresentation of Failure Mode Data types}
%  \label{fig:fmmd_uml2}
% \end{figure}


\subsection{Unitary State Component Failure Mode sets}

An important factor in defining a set of failure modes is that they
should be as clearly defined as possible.
%
It should not be possible, for instance, for
a component to have two or more failure modes active at once.

Having a set of failure modes where $N$ modes could be active simultaneously
would mean having to consider up to $2^N$ failure mode scenarios.
%
Should a component be analysed and simultaneous failure mode cases exist,
the combinations could be represented by new failure modes, or
the component should be considered from a fresh perspective,
perhaps considering it as several smaller components
within one package.
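
To illustrate the growth in scenarios, consider a hypothetical component with $N=3$ failure modes $a$, $b$ and $c$ (the names are illustrative only). If the modes may be active simultaneously, the combinations to consider are all the subsets of $\{a,b,c\}$:
$$ \emptyset,\ \{a\},\ \{b\},\ \{c\},\ \{a,b\},\ \{a,c\},\ \{b,c\},\ \{a,b,c\} $$
that is $2^3 = 8$ scenarios, counting the empty set as the fault-free case. If only one mode can be active at a time, only the three single-mode scenarios $\{a\}$, $\{b\}$ and $\{c\}$ remain alongside the fault-free case.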


\begin{definition}
A set of failure modes where only one fault mode
can be active at a time is termed a `unitary~state' failure mode set.
The family of all such sets is denoted $U$ throughout this study.
This corresponds to the `mutually exclusive' definition in
probability theory\cite{probandstat}.
\end{definition}

We can define a function $FM()$ to
take a given component $K$ and return its set of failure modes $F$.

$$  FM : K \mapsto F $$

We can further define a set $U$ which is a set of sets of failure modes, where
the component failure modes in each of its members are unitary~state.
Thus if the failure modes of $F$ are unitary~state, we can say $F \in U$.
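
As a sketch of how $FM$ relates to the `Failure Mode Group' defined earlier, the failure mode group of a functional group $G$ can be written as the union of the failure mode sets of its members. The notation $FMG$ and the component names $R_1$ and $R_2$ are illustrative assumptions introduced only for this example.
$$ FMG(G) = \bigcup_{K \in G} FM(K) $$
For a hypothetical functional group of two resistors, each assumed to fail either OPEN or SHORTED, this would give
$$ FMG(\{R_1, R_2\}) = \{R_{1,OPEN}, R_{1,SHORTED}, R_{2,OPEN}, R_{2,SHORTED}\} $$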


\subsection{Component failure modes : Unitary State example}

A component with simple `unitary~state' failure modes is the electrical resistor.

Electrical resistors can fail by going OPEN or SHORTED.
However, they cannot fail with both conditions active at once. The conditions
OPEN and SHORTED are mutually exclusive.
Because of this the failure mode set $F=FM(R)$ is `unitary~state'.


Thus

$$ R_{SHORTED} \cap R_{OPEN} = \emptyset $$


We can make this a general case by taking a set $C$ (where $c_1, c_2 \in C$) representing a collection
of component failure modes.
We can now state that


$$ \exists\, c_1, c_2 \in C : c_1 \neq c_2 \wedge c_1 \cap c_2 \neq \emptyset \Rightarrow C \not\in U $$

That is to say, if it is possible for any pair of distinct failure modes to be active at the same time,
the failure mode set is not unitary~state and does not exist in the family of sets $U$.

 Note that where there are more than two failure~modes, by banning pairs from being active at the same time
 we have banned larger combinations as well.
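
A short sketch of why the pairwise condition is sufficient, using three hypothetical failure modes $f_1$, $f_2$ and $f_3$ (names chosen purely for illustration). The pairwise conditions are
$$ f_1 \cap f_2 = \emptyset, \quad f_1 \cap f_3 = \emptyset, \quad f_2 \cap f_3 = \emptyset $$
and because $f_1 \cap f_2 \cap f_3 \subseteq f_1 \cap f_2$, it follows that
$$ f_1 \cap f_2 \cap f_3 = \emptyset $$
In general a set of $N$ failure modes requires $N(N-1)/2$ pairwise checks, so three modes require three checks and four modes require six.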


\subsection{Component Failure Modes and Statistical Sample Space}
%\paragraph{NOT WRITTEN YET PLEASE IGNORE}
A sample space is defined as the set of all possible outcomes.
When dealing with failure modes, we are not interested in
the state where the component is working perfectly or `OK' (i.e. operating with no error).
We are interested only in ways in which it can fail.
By definition, while all components in a system are `working perfectly'
that system will not exhibit faulty behaviour.
The statistical sample space $\Omega$ for a component or sub-system $K$ nevertheless includes the `OK' outcome:
%$$ \Omega = {OK, failure\_mode_{1},failure\_mode_{2},failure\_mode_{3} ... failure\_mode_{N} $$
$$ \Omega(K) = \{OK, failure\_mode_{1}, failure\_mode_{2}, failure\_mode_{3}, \ldots, failure\_mode_{N}\} $$
The failure mode set $F$ for a given component or sub-system
is therefore
$$ F = \Omega(K) \setminus \{OK\} $$
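
As an illustration using the resistor $R$ from the unitary state example, and assuming OPEN and SHORTED are its only failure modes as described there, the sample space and failure mode set would be
$$ \Omega(R) = \{OK, R_{OPEN}, R_{SHORTED}\} $$
$$ F = \Omega(R) \setminus \{OK\} = \{R_{OPEN}, R_{SHORTED}\} = FM(R) $$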

\subsection{Bayes Theorem}
 \paragraph{NOT WRITTEN YET PLEASE IGNORE}
\label{bayes}
Describe application: the likelihood of faults being the cause of symptoms;
a probabilistic approach, with no direct causation paths to the higher~abstraction fault mode.
Often, for instance, it is a component in a module within a module within a module, etc.,
that has a probability of causing a SYSTEM level fault.

Used in FTA\cite{NASA}\cite{NUK}. Problems: it is difficult to get reliable statistics
for the probability of causing a fault, because of small sample numbers.

FMMD approach can by traversing down the tree  use known component failure figures
to
%$$ c1 \cap c2 \eq \emptyset  | c1 \neq c2 \wedge c1,c2 \in C \wedge C \in U  $$

%Thus if the failure~modes are pairwaise mutually exclusive they qualify for inclusion into the
%unitary~state set family.

\subsection{Tests of Hypotheses and Significance}
\paragraph{NOT WRITTEN YET PLEASE IGNORE}
Linked in with Bayes' theorem.
Accident analysis:
plane crashes, faults, etc.
In high reliability systems the faults are often logged (strange occurrences,
processors resetting). What are the common factors? P values.
For instance, very high voltage spikes can reset micro~controllers,
but how do you correlate that with unshielded suppressed contactors?

Maybe looking at the equipment and seeing if there is a 5\%
level of the error being caused?
i.e. using it to search for these conditions?