\abstract{ This chapter defines what is meant by the terms
components, component fault modes and `unitary~state' component fault modes.
The application of Bayes' theorem in current methodologies, and
the suitability of the `null hypothesis' or `P' value statistical approach,
are discussed.
Mathematical constraints and definitions are made using set theory.
}

\section{Introduction}
When building a system from components,
we should be able to find all known failure modes for each component.
For most common electrical and mechanical components, the failure modes
for a given type of part can be obtained from standard literature\cite{mil1991}
\cite{mech}. %The failure modes for a given component $K$ form a set $F$.

Using these failure modes we can build a `failure model' from the bottom up.
Traditional static fault analysis methods work from the top down.
They identify faults that can occur in a system, and then work down
to see how they could be caused. Some apply statistical techniques to
determine the likelihood of component failures causing specific system level errors (see Bayes' theorem, section \ref{bayes}).
Another top down technique is to apply cost benefit analysis
to determine which faults are the highest priority to fix\cite{FMEA}.

The aim of this study is to produce complete failure
models of safety critical systems from the bottom up,
starting, where possible, with known component failure modes.

\subsection{Systems, functional groups, sub-systems and failure modes}

It is helpful here to define some terms: `system', `functional~group', `component', `base~component' and `sub-system'.

A system is any coherent entity that would be sold as a safety critical product.
A sub-system is a part of some larger system.
For instance, a stereo amplifier separate is a sub-system. The
whole sound system consists perhaps of the following `sub-systems':
CD-player, tuner, amplifier~separate, loudspeakers and iPod~interface.

%Thinking like this is a top~down analysis approach
%and is the way in which FTA\cite{nucfta} analyses a System
%and breaks it down.

A sub-system will be composed of component parts, which
may themselves be sub-systems.

Eventually, by a recursive downwards process, we would be able to identify
sub-systems built from base component parts.
Each `component part'
will have a known fault/failure behaviour.
That is to say, each base component has a set of known
ways in which it can fail.

If we look at the sound system again as an
example, the CD~player could fail in several distinct ways, no matter
what has happened to it or has gone wrong inside it.

A top down approach has an intrinsic problem in that we cannot guess
every possible failure mode at the SYSTEM level.
Working from the bottom up forces the consideration of all possible
component failures (which could be missed in a top~down approach),
but it presents a problem of its own: which initial collections of base components should we choose?

For instance, in the CD~player example, starting at the bottom presents us with
a massive list of base~components: resistors, motors, user~switches, laser~diodes and so on.
Clearly, working from the bottom~up we need to pick small
collections of components that work together in some way.
These are termed `functional~groups'. For instance, the circuitry that powers the laser diode
to illuminate the CD might contain a handful of components, and as such would make a good candidate
to be one of the base level functional~groups.

In choosing the lowest level (base component) sub-systems we would look
for the smallest `functional~groups' of components within a system. A functional~group is a set of components that interact
to perform a specific function.

When we have analysed the fault behaviour of a functional group, we can treat it as a `black box'.
We can now call our functional~group a sub-system. The goal here is to know how it will behave under fault conditions.
%Imagine buying one such `sub~system' from a very honest vendor.
%One of those sir, yes but be warned it may fail in these distinct ways, here
%in the honest data sheet the set of failure modes is listed!
This type of thinking is starting to become more commonplace in product literature, with the emergence
of reliability safety standards such as IEC61508\cite{sccs} and EN61508\cite{en61508}.
FIT (Failures in Time, the expected number of failures per billion hours of operation) values
are published for some micro-controllers. A micro~controller
is a complex sub-system in itself and could be considered a `black~box' with a given reliability.\footnote{Microchip sources give a FIT of 4 for their PIC18 series micro~controllers\cite{microchip}. The DOD
1991 reliability manual\cite{mil1991} applies a FIT of 100 for this generic type of component.}
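
To give a feel for these figures, a FIT value can be converted to a mean time to failure (MTTF).
This conversion is only a sketch: it assumes a constant failure rate, which is an assumption made here
and is not stated by the sources cited above.

$$ MTTF = \frac{10^{9}}{FIT} \; \mbox{hours} $$

For a FIT of 4 this gives $2.5 \times 10^{8}$ hours, and for a FIT of 100 it gives $10^{7}$ hours
(roughly 28\,500 and 1\,140 years of continuous operation respectively, taking 8760 hours per year).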

As electrical components have detailed datasheets, a useful extension of these would
be the failure modes of the component, with environmental factors and MTTF statistics.

Currently this sort of information is generally only available for generic component types\cite{mil1991}.

%At higher levels of analysis, functional~groups are pre-analysed sub-systems that interact to
%perform a given function.

%\vspace{0.3cm}
\begin{table}[h]
\begin{tabular}{||l|l||} \hline \hline
 {\em Definition } & {\em Description} \\ \hline
System & A product designed to \\
 & work as a coherent entity \\ \hline
Sub-system & A part of a system; \\
 & sub-systems may contain sub-systems \\ \hline
Failure mode & A way in which a system, \\
 & sub-system or component can fail \\ \hline
Functional Group & A collection of sub-systems and/or \\
 & components that interact to \\
 & perform a specific function \\ \hline
Failure Mode & The collection of all failure \\
Group & modes from all the members of a \\
 & functional group \\ \hline
Derived & A failure mode determined from the analysis \\
Failure mode & of a `Failure Mode Group' \\ \hline
Base Component & Any bought in component, which \\
 & hopefully has a known set of failure modes \\ \hline
\hline
\end{tabular}
\caption{Table of FMMD definitions}
\label{tab:def}
\end{table}
%\vspace{0.3cm}

\section{A UML Model of the terms introduced}

\begin{figure}[h]
 \centering
 \includegraphics[width=350pt,bb=0 0 680 500,keepaspectratio=true]{component_failure_modes_definition/fmmd_uml.jpg}
 % fmmd_uml.jpg: 680x500 pixel, 72dpi, 23.99x17.64 cm, bb=0 0 680 500
 \caption{UML representation of Failure Mode Data types}
 \label{fig:fmmd_uml}
\end{figure}

The diagram in figure \ref{fig:fmmd_uml}
shows the relationships between the terms defined in table \ref{tab:def} as classes in a UML model.
We can start with the functional group. This is a minimal collection
of components that perform a simple given function.
For our audio separates rig, this could be
the components that supply power to the laser diode.
From the `Functional~Group' we can now collect
all the `failure modes' of the `components', and
produce a `Failure~Mode~Group'. This
has a reference to the `Functional~Group', and is a collection
of `failure modes'.
By analysing the effects of the failure modes in the `Failure~Mode~Group'
we can determine the failure mode behaviour of the functional group.
This failure mode behaviour is a collection of derived failure modes.
We can now consider the functional group as a component, because
we have a set of failure modes for it.
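
The collection step described above can be sketched in set notation.
This is only an illustrative sketch: the symbols $G$, $K_i$ and $FMG$ are assumptions introduced here
and do not appear in the UML model, and $FM$ is the failure mode function defined later in this chapter.
If a functional group is written as a set of $n$ components

$$ G = \{ K_1, K_2, \ldots, K_n \} $$

then its failure mode group is the union of the failure mode sets of its members:

$$ FMG(G) = \bigcup_{i=1}^{n} FM(K_i) $$

The union assumes that the failure modes of different components are kept distinct,
for instance by labelling each mode with the component it belongs to.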

\subsection{Sub-System Class Definition}
A sub-system can be defined by the classes used to create it, and
its set of derived failure modes.
In this way sub-systems naturally form trees, with the lowermost leaf nodes being
base components.
Note that the UML model is recursive. We can build functional groups using sub-systems
as components. This UML model therefore naturally forms a hierarchy
of failure mode analysis, which has a single top level entry, that being the SYSTEM.
The top level entry will determine the failure modes
for the product/system under analysis.

\subsection{Refining the UML model to use inheritance}
We can refine this model a little by noticing that a system is merely the
top level sub-system. We can thus have System inherit sub-system.
A derived failure mode is simply a failure mode at a higher level of analysis;
it can therefore inherit `failure\_mode'.

The modified UML diagram using inheritance is shown in figure \ref{fig:fmmd_uml2}.
\begin{figure}[h]
 \centering
 \includegraphics[width=350pt,bb=0 0 877 675,keepaspectratio=true]{./fmmd_uml2.jpg}
 % fmmd_uml2.jpg: 877x675 pixel, 72dpi, 30.94x23.81 cm, bb=0 0 877 675
 \caption{UML representation of Failure Mode Data types using inheritance}
 \label{fig:fmmd_uml2}
\end{figure}
%
% \begin{figure}[h]
% \centering
% \includegraphics[width=350pt,bb=0 0 680 500,keepaspectratio=true]{component_failure_modes_definition/fmmd_uml2.jpg}
% % fmmd_uml.jpg: 680x500 pixel, 72dpi, 23.99x17.64 cm, bb=0 0 680 500
% \caption{UML respresentation of Failure Mode Data types}
% \label{fig:fmmd_uml2}
% \end{figure}

\subsection{Unitary State Component Failure Mode sets}

An important factor in defining a set of failure modes is that they
should be as clearly defined as possible.
%
It should not be possible, for instance, for
a component to have two or more failure modes active at once.

Having a set of failure modes where $N$ modes could be active simultaneously
would mean having to consider $2^N$ failure mode scenarios.
%
Should a component be analysed and simultaneous failure mode cases exist,
the combinations could be represented by new failure modes, or
the component should be considered from a fresh perspective,
perhaps considering it as several smaller components
within one package.
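
As a small worked illustration of this count (the mode names $a$, $b$ and $c$ are hypothetical
and are not taken from any real component), consider a component with $N = 3$ failure modes
that may be active in any combination. The scenarios to consider are the subsets of $\{a, b, c\}$:

$$ \emptyset,\ \{a\},\ \{b\},\ \{c\},\ \{a,b\},\ \{a,c\},\ \{b,c\},\ \{a,b,c\} $$

giving $2^3 = 8$ scenarios, where $\emptyset$ is the case in which no failure mode is active.
If the same three modes can only occur one at a time, only $\emptyset$, $\{a\}$, $\{b\}$ and $\{c\}$ remain,
that is $N + 1 = 4$ scenarios.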

\begin{definition}
A set of failure modes where only one fault mode
can be active at a time is termed a `unitary~state' failure mode set.
Such a set is said to belong to the family of sets $U$ throughout this study.
This corresponds to the `mutually exclusive' definition in
probability theory\cite{probandstat}.
\end{definition}

We can define a function $FM()$ to
take a given component $K$ and return its set of failure modes $F$.

$$ FM : K \mapsto F $$

We can further define a set $U$ which is a set of sets of failure modes, where
the component failure modes in each of its members are unitary~state.
Thus if the failure modes of $F$ are unitary~state, we can say $F \in U$.

\subsection{Component failure modes: Unitary State example}

A component with simple `unitary~state' failure modes is the electrical resistor.

Electrical resistors can fail by going OPEN or SHORTED.
However, they cannot fail with both conditions active. The conditions
OPEN and SHORTED are mutually exclusive.
Because of this the failure mode set $F=FM(R)$ is `unitary~state'.

Thus

$$ R_{SHORTED} \cap R_{OPEN} = \emptyset $$

We can make this a general case by taking a set $C$ (where $c_1, c_2 \in C$) representing a collection
of component failure modes.
We can now state that

$$ \left( \exists\, c_1, c_2 \in C \; : \; c_1 \neq c_2 \wedge c_1 \cap c_2 \neq \emptyset \right) \Rightarrow C \not\in U $$

That is to say, if it is possible for any pair of distinct failure modes to be active at the same time,
the failure mode set is not unitary~state and does not exist in the family of sets $U$.

Note that where there are more than two failure~modes, by banning pairs from happening at the same time
we have banned larger combinations as well.
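
The reasoning behind this note can be written out for three modes.
As a sketch, take three distinct failure modes $c_1, c_2, c_3 \in C$
(the third symbol is introduced here purely for illustration).
Any state in which all three are active is also a state in which $c_1$ and $c_2$ are both active, so

$$ c_1 \cap c_2 \cap c_3 \subseteq c_1 \cap c_2 = \emptyset $$

and hence $c_1 \cap c_2 \cap c_3 = \emptyset$.
The same argument applies to any combination of more than two modes,
since every such combination contains at least one banned pair.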

\subsection{Component Failure Modes and Statistical Sample Space}
%\paragraph{NOT WRITTEN YET PLEASE IGNORE}
A sample space is defined as the set of all possible outcomes.
When dealing with failure modes, we are not interested in
the state where the component is working perfectly or `OK' (i.e. operating with no error).
We are interested only in ways in which it can fail.
By definition, while all components in a system are `working perfectly'
that system will not exhibit faulty behaviour.
Thus the statistical sample space $\Omega$ for a component/sub-system $K$ is
%$$ \Omega = {OK, failure\_mode_{1},failure\_mode_{2},failure\_mode_{3} ... failure\_mode_{N} $$
$$ \Omega(K) = \{OK, failure\_mode_{1},failure\_mode_{2},failure\_mode_{3}, \ldots, failure\_mode_{N}\} $$
The failure mode set $F$ for a given component or sub-system
is therefore
$$ F = \Omega(K) \setminus \{OK\} $$
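
As a worked instance of these two definitions, we can reuse the resistor $R$ from the unitary~state example.
The subscripted names follow that example; the probability statement at the end is an additional
assumption made here, and holds only if the outcomes are mutually exclusive and exhaustive.

$$ \Omega(R) = \{OK, R_{OPEN}, R_{SHORTED}\} $$

$$ F = \Omega(R) \setminus \{OK\} = \{R_{OPEN}, R_{SHORTED}\} $$

Under that assumption the probabilities of the outcomes sum to one:

$$ P(OK) + P(R_{OPEN}) + P(R_{SHORTED}) = 1 $$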

\subsection{Bayes' Theorem}
\paragraph{NOT WRITTEN YET PLEASE IGNORE}
\label{bayes}
Describe application - likelihood of faults being the cause of symptoms -
probabilistic approach - no direct causation paths to the higher~abstraction fault mode.
Often for instance a component in a module within a module within a module etc
that has a probability of causing a SYSTEM level fault.

Used in FTA\cite{NASA}\cite{NUK}. Problems, difficult to get reliable stats
for probability to cause because of small sample numbers...

FMMD approach can by traversing down the tree use known component failure figures
to
%$$ c1 \cap c2 \eq \emptyset | c1 \neq c2 \wedge c1,c2 \in C \wedge C \in U $$

%Thus if the failure~modes are pairwise mutually exclusive they qualify for inclusion into the
%unitary~state set family.

\subsection{Tests of Hypotheses and Significance}
\paragraph{NOT WRITTEN YET PLEASE IGNORE}
Linked in with Bayes' theorem
Accident analysis
plane crashes and faults etc
In high reliability systems the faults are often logged - strange occurrences -
processors resetting - what are the common factors - P values -
for instance very high voltage spikes can reset micro controllers -
but how do you correlate that with unshielded suppressed contactors...

Maybe looking at the equipment and seeing if there is a 5\%
level of the error being caused?
i.e. using it to search for these conditions?