\abstract{ This chapter defines what is meant by the terms
component, component fault mode and `unitary~state' component fault mode.
%The application of Bayes theorem in current methodologies, and
%the suitability of the `null hypothesis' or `P' value statistical approach
%are discussed.
Data types and their relationships are described using UML.
Mathematical constraints and definitions are expressed using set theory.
}

\section{Introduction}

When analysing a safety critical system using the
FMMD technique, we need clearly defined failure modes for
all the components that are used to model the system.
These failure modes are subject to one constraint:
the failure modes of a component must be mutually exclusive.
This constraint, and the definition of a component, are
described in this chapter.
%When building a system from components,
%we should be able to find all known failure modes for each component.
%For most common electrical and mechanical components, the failure modes
%for a given type of part can be obtained from standard literature\cite{mil1991}
%\cite{mech}. %The failure modes for a given component $K$ form a set $F$.

%%
%% Paragraph component and its relationship to its failure modes
%%

\section{What is a Component?}

Let us first define a component. This is anything we use to build a
product or system. This could be something quite complicated
like an integrated microcontroller, or quite simple like the humble resistor.
We can define a
component by its name, a manufacturer's part number and perhaps
a vendor's reference number.
What these components all have in common is that they can fail, and fail in
a number of well defined ways. For common components
there is established literature listing the failure modes the system designer should consider (with accompanying statistical
failure rates)\cite{mil1991}. For instance, a simple resistor is generally considered
to fail in two ways: it can go open circuit or it can short. Thus, in addition to its identifying details, we can
associate a component with a set of known failure modes. The UML diagram in figure
\ref{fig:component} shows a component as a simple data
structure with its failure modes.

\begin{figure}[h]
\centering
\includegraphics[width=400pt,bb=0 0 437 141,keepaspectratio=true]{component_failure_modes_definition/component.jpg}
% component.jpg: 437x141 pixel, 72dpi, 15.42x4.97 cm, bb=0 0 437 141
\caption{A Component and its Failure Modes}
\label{fig:component}
\end{figure}

% \begin{figure}[h+]
% \centering
% \includegraphics[width=400pt,bb=0 0 433 68,keepaspectratio=true]{component_failure_modes_definition/component.jpg}
% % component.jpg: 433x68 pixel, 72dpi, 15.28x2.40 cm, bb=0 0 433 68
% \caption{A Component and its failure modes}
% \label{fig:component}
% \end{figure}

A product naturally consists of many components and these are traditionally
kept in a `parts list'. For a safety critical product this is usually a formal document
and is used by quality inspectors to ensure the correct parts are being fitted.
For our UML diagram the parts list is simply a collection of components
as shown in figure \ref{fig:componentpl}.
\begin{figure}[h]
\centering
\includegraphics[width=400pt,bb=0 0 712 68,keepaspectratio=true]{component_failure_modes_definition/componentpl.jpg}
% componentpl.jpg: 712x68 pixel, 72dpi, 25.12x2.40 cm, bb=0 0 712 68
\caption{Parts List of Components}
\label{fig:componentpl}
\end{figure}

%%
%% Paragraph using failure modes to build from bottom up
%%

\section{Fault Mode Analysis, top down or bottom up?}

Traditional static fault analysis methods work from the top down.
They identify faults that can occur in a system, and then work down
to see how they could be caused. Some apply statistical techniques to
determine the likelihood of component failures
causing specific system level errors (see Bayes' theorem, section~\ref{bayes}).
Another top down technique is to apply cost benefit analysis
to determine which faults are the highest priority to fix\cite{FMEA}.
The aim of this study is to produce complete failure
models of safety critical systems from the bottom-up,
starting, where possible, with known component failure modes.

In order to analyse from the bottom-up, we need to take
small groups of components from the parts~list that naturally
work together to perform a simple function.
We can term this a `Functional~Group'. When we have a
`Functional~Group' we can look at the failure modes of all the components
in it and decide how these will affect the group.
In other words, we can determine the failure modes of the functional
group. These failure modes are derived from the functional group, and so we can call
them `derived failure modes'.
We now have something very useful, because
we can treat this functional group as a component with a known set of failure modes.
This newly derived component can be used as a higher level
building block for the system we are analysing.
Derived components can be used
to form higher level functional groups.
This process can continue until we have built a hierarchy that converges to a failure model of the entire system.
To differentiate the components derived from functional groups, we can
add a new attribute to the class `Component', that of analysis
level.
We can represent this in a UML diagram, see figure \ref{fig:cfg}.

\begin{figure}[h]
\centering
\includegraphics[width=400pt,bb=0 0 712 235,keepaspectratio=true]{component_failure_modes_definition/cfg.jpg}
% cfg.jpg: 712x205 pixel, 72dpi, 25.12x7.23 cm, bb=0 0 712 205
\caption{Components Derived from Functional Groups}
\label{fig:cfg}
\end{figure}

\section{Set theory description}

$$ System \stackrel{has}{\longrightarrow} PartsList $$

$$ PartsList \stackrel{has}{\longrightarrow} Components $$

$$ Component \stackrel{has}{\longrightarrow} FailureModes $$

$$ FunctionalGroup \stackrel{has}{\longrightarrow} Components $$

We use the symbol $\bowtie$ to denote the analysis process that takes a
functional group $FG$ and converts it into a new component:

$$ \bowtie ( FG ) \mapsto Component $$
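
As an illustrative sketch of this notation (the names $FG_1$, $FG_2$, $R_1$, $R_2$, $R_3$ and $C_1$ are hypothetical and are not taken from a worked analysis), suppose two resistors from the parts~list are placed in a functional group:

$$ FG_1 = \{ R_1, R_2 \} $$

Applying the analysis process would then give a derived component, one analysis level higher:

$$ \bowtie ( FG_1 ) \mapsto C_1 $$

The derived component $C_1$ has its own set of derived failure modes, and could in turn be a member of a higher level functional group, for example $FG_2 = \{ C_1, R_3 \}$.
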

%
% \subsection{Systems, functional groups, sub-systems and failure modes}
%
% It is helpful here to define some terms, `system', `functional~group', `component', `base~component' and `sub-system'.
%
% A System, is really any coherent entity that would be sold as a safety critical product.
% A sub-system is a part of some larger system.
% For instance a stereo amplifier separate is a sub-system. The
% whole Sound System, consists perhaps of the following `sub-systems':
% CD-player, tuner, amplifier~separate, loudspeakers and ipod~interface.
%
% %Thinking like this is a top~down analysis approach
% %and is the way in which FTA\cite{nucfta} analyses a System
% %and breaks it down.
%
% A sub-system will be composed of component parts, which
% may themselves be sub-systems.
%
% Eventually by a recursive downwards process we would be able to identify
% sub-systems built from base component parts.
% Each `component part'
% will have a known fault/failure behaviour.
% That is to say, each base component has a set of known
% ways in which it can fail.
%
% If we look at the sound system again as an
% example; the CD~player could fail in several distinct ways, no matter
% what has happened to it or has gone wrong inside it.
%
% A top down approach has an intrinsic problem in that we cannot guess
% every possible failure mode at the SYSTEM level.
% Using the reasoning that working from the bottom up forces the consideration of all possible
% component failures (which could be missed in a top~down approach)
% we are presented with a problem. Which initial collections of base components should we choose ?
%
% For instance in the CD~player example; to start at the bottom; we are presented with
% a massive list of base~components, resistors, motors, user~switches, laser~diodes all sorts !
% Clearly, working from the bottom~up we need to pick small
% collections of components that work together in some way.
% These are termed `functional~groups'. For instance the circuitry that powers the laser diode
% to illuminate the CD might contain a handful of components, and as such would make a good candidate
% to be one of the base level functional~groups.
%
%
% In choosing the lowest level (base component) sub-systems we would look
% for the smallest `functional~groups' of components within a system. A functional~group is a set of components that interact
% to perform a specific function.
%
% When we have analysed the fault behaviour of a functional group, we can treat it as a `black box'.
% We can now call our functional~group a sub-system. The goal here is to know how it will behave under fault conditions !
% %Imagine buying one such `sub~system' from a very honest vendor.
% %One of those sir, yes but be warned it may fail in these distinct ways, here
% %in the honest data sheet the set of failure modes is listed!
% This type of thinking is starting to become more commonplace in product literature, with the emergence
% of reliability safety standards such as IOC1508\cite{sccs},EN61508\cite{en61508}.
% FIT (Failure in Time - expected number of failures per billion hours of operation) values
% are published for some micro-controllers. A micro~controller
% is a complex sub-system in itself and could be considered a `black~box' with a given reliability.
% \footnote{Microchip sources give an FIT of 4 for their PIC18 series micro~controllers\cite{microchip}, The DOD
% 1991 reliability manual\cite{mil1991} applies a FIT of 100 for this generic type of component}
%
% As electrical components have detailed datasheets a useful extension of this would
% be failure modes of the component, with environmental factors and MTTF statistics.
%
% Currently this sort of information is generally only available for generic component types\cite{mil1991}.
%
%
% %At higher levels of analysis, functional~groups are pre-analysed sub-systems that interact to
% %erform a given function.
%
% %\vspace{0.3cm}
% \begin{table}[h]
% \begin{tabular}{||l|l||} \hline \hline
% {\em Definition } & {\em Description} \\ \hline
% System & A product designed to \\
% & work as a coherent entity \\ \hline
% Sub-system & A part of a system, \\
% & sub-systems may contain sub-systems \\ \hline
% Failure mode & A way in which a System, \\
% & Sub-system or component can fail \\ \hline
% Functional Group & A collection of sub-systems and/or \\
% & components that interact to \\
% & perform a specific function \\ \hline
% Failure Mode & The collection of all failure \\
% Group & modes from all the members of a \\
% & functional group \\ \hline
% Derived & A failure mode determined from the analysis \\
% Failure mode & of a `Failure Mode Group' \\ \hline
% Base Component & Any bought in component, which \\
% & hopefully has a known set of failure modes \\ \hline
% \hline
% component_failure_modes_definition/
% \end{tabular}
% \label{tab:def}
% \caption{Table of FMMD definitions}
% \end{table}
% %\vspace{0.3cm}
%
% \section{A UML Model of terms introduced}
%
%
% \begin{figure}[h]
% \centering
% \includegraphics[width=350pt,bb=0 0 680 500,keepaspectratio=true]{component_failure_modes_definition/fmmd_uml.jpg}
% % fmmd_uml.jpg: 680x500 pixel, 72dpi, 23.99x17.64 cm, bb=0 0 680 500
% \caption{UML representation of Failure Mode Data types}
% \label{fig:fmmd_uml}
% \end{figure}
%
% The diagram in figure \ref{fig:fmmd_uml}
% shows the relationships between the terms defined in table \ref{tab:def} as classes in a UML model.
% We can start with the functional group. This is a minimal collection
% of components that perform a simple given function.
% For our audio separates rig, this could be
% the components that supply power to the laser diode.
% From the `Functional~Group' we can now collect
% all the `failure modes of the `components', and
% produce a `Failure~Mode~Group'. This
% has a reference to the `Functional~Group', and is a collection
% of `failure modes.
% By analysing the effects of the failure modes in the `Failure~Mode~Group'
% we can determine the failure mode behaviour of the functional group.
% This failure mode behaviour is a collection of derived failure modes.
% We can now consider the Functional group as a component now, because
% we have a set of failure modes for it.
%
% \subsection{Sub-System Class Definition}
% A sub-system can be defined by the classes used to create it, and
% its set of derived failure modes.
% In this way sub-systems naturally form trees, with the lower most leaf nodes being
% base components.
% Note that the UML model is recursive. We can build functional groups using sub-systems
% as components. This UML model naturally therefore, forms a hierarchy
% of failure mode analysis, which has a one top level entry, that being the SYSTEM.
% The TOP level entry will determine the failure modes
% for the product/system under analysis.
%
% \subsection{Refining the UML model to use inheritance}
% We can refine this model a little by noticing that a system is merely the
% top level sub-system. We can thus have System inherit sub-system.
% A derived failure mode, is simply a failure mode at a higher level of analysis
% it can therefore inherit `failure\_mode'.
%
% The modified UML diagram using inheritance is figure \ref{fig:fmmd_uml2}.
% \begin{figure}[h]
% \centering
% \includegraphics[width=350pt,bb=0 0 877 675,keepaspectratio=true]{./fmmd_uml2.jpg}
% % fmmd_uml2.jpg: 877x675 pixel, 72dpi, 30.94x23.81 cm, bb=0 0 877 675
% \caption{UML Representation of Failure Mode Data Types}
% \label{fig:fmmd_uml2}
% \end{figure}
% %
% % \begin{figure}[h]
% % \centering
% % \includegraphics[width=350pt,bb=0 0 680 500,keepaspectratio=true]{component_failure_modes_definition/fmmd_uml2.jpg}
% % % fmmd_uml.jpg: 680x500 pixel, 72dpi, 23.99x17.64 cm, bb=0 0 680 500
% % \caption{UML representation of Failure Mode Data types}
% % \label{fig:fmmd_uml2}
% % \end{figure}

\section{Unitary State Component Failure Mode Sets}

An important factor in defining a set of failure modes is that they
should be as clearly defined as possible.
%
It should not be possible, for instance, for
a component to have two or more failure modes active at once.

Having a set of failure modes where $N$ modes could be active simultaneously
would mean having to consider $2^N$ failure mode scenarios.
%
Should a component be analysed and simultaneous failure mode cases exist,
the combinations could be represented by new failure modes, or
the component should be considered from a fresh perspective,
perhaps considering it as several smaller components
within one package.
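
To make the count concrete, consider a hypothetical component with $N = 3$ failure modes $a$, $b$ and $c$ (these names are illustrative only). If any of them may be active together, every subset of $\{a, b, c\}$ is a scenario, including the empty set in which no failure mode is active:

$$ \emptyset, \{a\}, \{b\}, \{c\}, \{a,b\}, \{a,c\}, \{b,c\}, \{a,b,c\} $$

This gives $2^3 = 8$ scenarios. If at most one failure mode can be active at a time, only $\emptyset$ and the three single mode cases remain, that is $N + 1 = 4$ scenarios.
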

\begin{definition}
A set of failure modes where only one failure mode
can be active at a time is termed a `unitary~state' failure mode set.
The family of all such sets is denoted $U$ throughout this study.
This corresponds to the `mutually exclusive' definition in
probability theory\cite{probandstat}.
\end{definition}

We can define a function $FM()$ that
takes a given component $K$ and returns its set of failure modes $F$.

$$ FM : K \mapsto F $$

The set $U$ is thus a set of sets of failure modes, where
the component failure modes in each of its members are unitary~state.
Thus if the failure modes of $F$ are unitary~state, we can say $F \in U$.

\section{Component Failure Modes: Unitary State Example}

A component with simple ``unitary~state'' failure modes is the electrical resistor.

Electrical resistors can fail by going OPEN or SHORTED.

For a given resistor $R$ we can obtain its failure modes by applying
the function $FM$, thus $ FM(R) = \{R_{SHORTED},R_{OPEN}\} $.
A resistor cannot fail with both conditions, open and shorted, active at the same time. The conditions
OPEN and SHORTED are mutually exclusive.
Because of this the failure mode set $F=FM(R)$ is `unitary~state'.

Thus

$$ R_{SHORTED} \cap R_{OPEN} = \emptyset $$

We can make this a general case by taking a set $C$ (where $c_1, c_2 \in C$) representing a collection
of component failure modes.
We can now state that

$$ C \in U \iff \forall c_1, c_2 \in C : ( c_1 \neq c_2 \Rightarrow c_1 \cap c_2 = \emptyset ) $$

That is to say, for the failure mode set $C$ to belong to the family of sets $U$,
it must be impossible for any pair of its failure modes to be active at the same time.

Note that where there are more than two failure~modes, banning every pair from being active at the same time
bans all larger combinations as well, since any larger combination contains at least one pair.
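
As a short worked check of this remark, take a hypothetical component with three failure modes $a$, $b$ and $c$ (the names are illustrative only), reading each failure mode, as in the resistor case, as the set of circumstances in which it is active. The pairwise conditions are

$$ a \cap b = \emptyset, \quad a \cap c = \emptyset, \quad b \cap c = \emptyset $$

and the triple combination is then excluded automatically, because

$$ a \cap b \cap c \subseteq a \cap b = \emptyset $$
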

\section{Component Failure Modes and Statistical Sample Space}
%\paragraph{NOT WRITTEN YET PLEASE IGNORE}
A sample space is defined as the set of all possible outcomes.
Here the outcomes we are interested in are the failure modes
of the component.
When dealing with failure modes, we are not interested in
the state where the component is working perfectly or `OK' (i.e. operating with no error).
We are interested only in the ways in which it can fail.
By definition, while all components in a system are `working perfectly'
that system will not exhibit faulty behaviour.
Thus the statistical sample space $\Omega$ for a component or sub-system $K$ is
%$$ \Omega = {OK, failure\_mode_{1},failure\_mode_{2},failure\_mode_{3} ... failure\_mode_{N} $$
$$ \Omega(K) = \{OK, failure\_mode_{1},failure\_mode_{2},failure\_mode_{3}, \ldots ,failure\_mode_{N}\} $$
The failure mode set $F$ for a given component or sub-system $K$
is therefore
$$ F = \Omega(K) \setminus \{OK\} $$
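
As a brief worked instance, using the resistor $R$ from the unitary~state example and assuming, as the definition of $\Omega$ implies, that `OK' is its only non-failure outcome:

$$ \Omega(R) = \{OK, R_{SHORTED}, R_{OPEN}\} $$

$$ F = \Omega(R) \setminus \{OK\} = \{R_{SHORTED}, R_{OPEN}\} $$

This is the same set as that returned by $FM(R)$.
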

\clearpage

THIS SHOULD BE IN A DIFFERENT CHAPTER

\section{Current Methods for Safety Critical Analysis}

\subsection{Deterministic Approach}
\paragraph{NOT WRITTEN YET PLEASE IGNORE}
No single component fault may lead to a dangerous condition.
EN298, EN230 etc.

\subsection{Bayes Theorem}
\paragraph{NOT WRITTEN YET PLEASE IGNORE}
\label{bayes}
Describe application - likelihood of faults being the cause of symptoms -
probabilistic approach - no direct causation paths to the higher~abstraction fault mode.
Often, for instance, a component in a module within a module within a module etc.
that has a probability of causing a SYSTEM level fault.

Used in FTA\cite{NASA}\cite{NUK}.
The idea being that probabilities can be assigned to components
failing, causing system level errors.

Problems: difficult to get reliable statistics
for probability to cause because of small sample numbers...

The FMMD approach can, by traversing down the tree, use known component failure figures
to get {\em accurate} probabilities and potential causes.
%$$ c1 \cap c2 \eq \emptyset | c1 \neq c2 \wedge c1,c2 \in C \wedge C \in U $$

%Thus if the failure~modes are pairwise mutually exclusive they qualify for inclusion into the
%unitary~state set family.

\subsection{Safety Integrity Level Analysis}
\paragraph{NOT WRITTEN YET PLEASE IGNORE}
\label{sil}
This technique looks at all components in the parts list
and asks what the effect of the component failing will be.
Note that particular failure modes of the component are not considered.
The component can fail in any of its failure modes from the perspective of this analysis.
The analyst has to make a choice between four conditions:

\begin{itemize}
\item sd - A safe fault that is detected by an automated system
\item su - A safe fault that is undetected by an automated system
\item dd - A potentially dangerous fault that is detected by an automated system
\item du - A potentially dangerous fault that is not detected by an automated system
\end{itemize}
This is almost how SIL analysis is done in practice:
the base components are listed
and the result of each one failing is classified as sd, su, dd or du.

A formula is then applied according to the system architecture (1oo1, 2oo3, 3oo3 etc.).

What is not determined is the probability of each of these conditions; the SIL
analyst simply has to decide which one applies.
Another weakness is that it is very difficult to
extract meaningful statistics
for how likely the detection systems are to pick the fault up, or even to introduce a fault of their own.

\subsection{Tests of Hypotheses and Significance}
\paragraph{NOT WRITTEN YET PLEASE IGNORE}
Linked in with Bayes theorem
Accident analysis
plane crashes and faults etc
In high reliability systems the faults are often logged - strange occurrences -
processors resetting - what are the common factors - P values -
for instance very high voltage spikes can reset micro controllers -
but how do you correlate that with unshielded suppressed contactors...

Maybe looking at the equipment and seeing if there is a 5\%
level of the error being caused ?
i.e. using it to search for these conditions ?

Actually this could be used to refine the SIL method (section~\ref{sil})
and give probabilities for the four conditions.