%%% CHAPTER 2
\label{sec:chap2}

The generic and statistical European Safety Standard, EN61508:6~\cite{en61508}[B.6.6],
describes Failure Mode Effect Analysis (FMEA) as:
\begin{quotation}
``To analyse a system design, by examining all possible sources of failure
of a system's components and determining the effects of these failures
on the behaviour and safety of the system.''
\end{quotation}
\fmeagloss
\section*{Introduction}
This chapter introduces Failure Mode Effect Analysis (FMEA).
It starts with a generic conceptual overview of the process.
It then looks at the stages of the FMEA process in greater detail, starting with
how to determine the failure modes associated with components.
Two common electrical components, the resistor and the operational amplifier,
are examined in the context of two sources of information that define failure modes.
To introduce the concept of FMEA, a simple example is given, using a hypothetical milli-volt
reader.
The four main current FMEA variants are described, along with
the concepts
that underlie the usage and philosophy of FMEA.
The overall process of FMEA is then reviewed and modelled using UML.
By using UML,
the entities needed to implement FMEA
are defined.
The act of defining relationships between the data objects in FMEA raises questions about the nature of the process
and allows analysis of its strengths and weaknesses.

\section{FMEA Basic concept.}
\label{basicfmea}

FMEA~\cite{safeware}[pp.341-344] is widely used, and proof of its use is a
legal requirement
for a large proportion of safety critical products sold in the European Union.
The acronym FMEA can be expanded as follows:
\begin{itemize}
\item \textbf{F - Failures of given component,} Consider a particular component in a system;
\item \textbf{M - Failure Mode,} Choose a particular failure mode of this component;
\item \textbf{E - Effects,} Determine the effects this failure mode will cause;
\item \textbf{A - Analysis,} Analyse how much impact this symptom will have on the environment/operators/the system itself.
\end{itemize}
\fmeagloss
FMEA is a broad term; it could mean anything from an informal check on
how failures could affect some equipment in
a brain-storming session
to formal submission as part of safety critical certification.
FMEA is a manual, time-intensive process. To reduce the amount of manual work performed,
software packages~\cite{931423, 1778436820050601} and analysis strategies have
been developed~\cite{incrementalfmea, automatingFMEA1281774}.
FMEA is always performed in context. That is, the equipment is always analysed for a particular purpose
and in a given environment. An `O' ring, for instance, can fail by leaking,
but if fitted to a water seal on a garden hose, the system level failure
would be a slight leak at the tap.
Applied to the rocket engine on a space shuttle, an `O' ring failure
could cause a catastrophic fire and destruction of the spacecraft and occupants~\cite{challenger}.
At a lower level, consider a resistor and capacitor forming a potential divider to ground.
This could be considered a low pass filter in some electrical environments~\cite{aoe},
but for fixed frequencies the same circuit could be used as a phase changer~\cite{electronicssysapproach}[p.114].
When used as a low pass filter, its failure modes could be `no~signal' and `all~pass',
but when used as a phase changer, they would be `no~signal' and `no~phase~change'.
The actual failure modes for a `group~of~components' are therefore defined by the
function that they perform.
\fmeagloss
\section{FMEA Process}

The FMEA process starts with the basic, or starting, components.
These components are the sort bought in or considered as pre-assembled modules.
These are termed `{\bcs}'; they are considered ``atomic'', i.e. they are not broken down further.
The first requirement for a {\bc} is to define the ways in which it can fail;
this relationship %between a {\bc} and its failure modes,
is shown, using UML, in figure~\ref{fig:component_fm_rel}.
\fmmdglossBC
%DIAGRAM of Base components and failure modes

\begin{figure}[h]
\centering
\includegraphics[width=300pt]{./CH2_FMEA/component_fm_rel.png}
% component_fm_rel.png: 368x71 pixel, 72dpi, 12.98x2.50 cm, bb=0 0 368 71
\caption{Base Component to Failure Modes relationship UML diagram}
\label{fig:component_fm_rel}
\end{figure}
The next stage is analysis, that is, reasoning applied to the system in the event of
a given failure mode.
The aim is to determine how a failure
mode, after considering its effect on other components in the system,
will translate to a system level symptom or failure.
The result of FMEA is a set of system level failures,
or symptoms, for each given component failure mode.
In practice, each entry of an FMEA analysis of a {\bc} {\fm}
would typically be one line in a spreadsheet.
The analysis to symptom relationship is generally
one-to-one; however, here (see figure~\ref{fig:component_fm_rel_ana}) allowance is made for the possibility
of more than one failure symptom.
%DIAGRAM of reasoning and Symptoms.

\begin{figure}[h]
\centering
\includegraphics[width=400pt]{./CH2_FMEA/component_fm_rel_ana.png}
% component_fm_rel_ana.png: 369x184 pixel, 72dpi, 13.02x6.49 cm, bb=0 0 369 184
\caption{FMEA analysis entry data relationships}
\label{fig:component_fm_rel_ana}
\end{figure}

Figure~\ref{fig:component_fm_rel_ana} defines the data relationships
for FMEA. This model is later extended in the conclusion
of this chapter.

\section{Determining the failure modes of {\bcs}}
\fmodegloss
\fmmdglossBC
\label{sec:determine_fms}
\fmodegloss
In order to apply any form of FMEA, the ways in which
the {\bcs}\footnote{A good introduction to hardware and software failure modes may be found in~\cite{sccs}[pp.114-124].}
can fail must be clearly defined.
In practice, this part of the process is guided by
the particular standard
to which conformance is sought.
Standards may differ in their definitions for the {\fms} of {\bcs}.
The reasons for these differences are examined below using two example components.
%%%%%%%%%% DATA SHEETS and FAILURE MODES %%%%%%%%%%
Typically, when choosing components for a design, engineers will look at manufacturers' data~sheets
which describe functionality, physical dimensions,
environmental ranges, tolerances, etc.
It is rare for a data~sheet to list failure modes.
Data~sheets, after all, are a sales tool as well as being a usage guide and technical description.
However, `reading~between~the~lines', or noting what is not~stated,
can in some cases indicate how a component could fail or misbehave.

How
components could fail internally is not of interest to an FMEA investigation.
The FMEA investigator needs to know what failure behaviour a component could exhibit.
A large body of literature exists giving guidance for the determination of component {\fms}.
An interesting discussion on semiconductor failure modes may be found in~\cite{ehb}[Ch.44].
For this study FMD-91~\cite{fmd91} and the gas burner standard EN298~\cite{en298} are examined.
In EN298, failure modes for most generic component types are listed or, if not listed,
are determined using a procedure,
typically one that examines scenarios such as
`all~pins~open' and then `all~adjacent~pins~shorted'~\cite{en298}[A.1 note e].

FMD-91~\cite{fmd91} is a reference document released into the public domain by the United States DOD
and describes `failures' of common electronic components, with percentage statistics for each failure.
FMD-91 entries include general descriptions of internal failures alongside {\fms} of use to an FMEA investigation.
FMD-91 entries need, in some cases, some interpretation to be mapped to a clear set of
component {\fms} suitable for use in FMEA.
A third document, MIL-1991~\cite{mil1991}, provides overall reliability statistics for
component types, but does not detail specific failure modes.
Using MIL-1991 in conjunction with FMD-91, statistics can be determined for the failure modes
of component types.
As these documents are now a little old, the results
from them can be on the conservative side.
\frategloss
\fmmdglossFIT
A FIT\footnote{Failure rates measured per $10^9$ hours of operation
are known as Failure in Time (FIT) values.} value for a micro-processor
may, for instance, be determined at around 100 using these documents, but
FIT claims for modern integrated micro-controllers are typically less than five~\cite{microchipreliability}.
The FMEA variant\footnote{EN61508 (and related standards) are based on the FMEA variant Failure Mode Effects and Diagnostic Analysis (FMEDA).}
used for European standard EN61508~\cite{en61508}
requires statistics for Mean Time to Failure (MTTF) for all {\bc} failure modes.
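
As an illustrative sketch (not taken from either reference document), the combination of an overall component failure rate with the FMD-91 failure mode percentages can be written as a simple product.
The symbols $\lambda_c$ (overall component failure rate in FIT), $p_m$ (fraction of failures attributed to failure mode $m$) and $\lambda_{c,m}$ are introduced here for illustration only, and the conversion to MTTF assumes a constant failure rate:
$$ \lambda_{c,m} = p_m \, \lambda_c , \qquad MTTF_{c,m} = \frac{10^9}{\lambda_{c,m}} \; \mbox{hours} . $$
For a hypothetical component with $\lambda_c = 100$ FIT and a failure mode accounting for half of its failures ($p_m = 0.5$), this would give $\lambda_{c,m} = 50$ FIT, i.e. an MTTF of $2 \times 10^7$ hours for that failure mode.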
\section{Determining the failure modes of Components.}
\fmodegloss
The starting points in the FMEA process are the failure modes of the {\bcs}.
In order to define FMEA, a discussion on how these failure modes are defined and
their relationship to particular standards is presented below.
Two common electrical components are used as examples,
and examined against two sources of {\fm} information.
Failure mode definitions for a given generic component may not always agree.
The reasons why some {\fms}
can be found in one source but not in the other, and vice versa, are discussed.
Finally, the failure modes determined
from the FMD-91~\cite{fmd91} reference source and from the guidelines of the
European burner standard EN298~\cite{en298} are compared and contrasted.

\clearpage

\subsection{Failure mode determination for generic resistor.}
\label{sec:resistorfm}
%- Failure modes. Prescribed failure modes EN298 - FMD91
\paragraph{Resistor failure modes according to FMD-91.}
\fmodegloss

FMD-91~\cite{fmd91}[3-178] lists many types of resistor
and many possible failure causes;
for instance, for {\textbf{Resistor,~Fixed,~Film}} the following failure causes are given:
\begin{itemize}
\item Opened 52\%,
\item Drift 31.8\%,
\item Film Imperfections 5.1\%,
\item Substrate defects 5.1\%,
\item Shorted 3.9\%,
\item Lead damage 1.9\%.
\end{itemize}
To make this useful for FMEA, each failure cause must be mapped to a
symptomatic failure mode descriptor\footnote{The symptomatic descriptors chosen are based on experience and are not unique.}
as listed below:

\begin{itemize}
\item Opened 52\% $\mapsto$ OPEN,
\item Drift 31.8\% $\mapsto$ DRIFT,
\item Film Imperfections 5.1\% $\mapsto$ OPEN,
\item Substrate defects 5.1\% $\mapsto$ OPEN,
\item Shorted 3.9\% $\mapsto$ SHORT,
\item Lead damage 1.9\% $\mapsto$ OPEN.
\end{itemize}
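
Summing the FMD-91 percentages by symptomatic {\fm} gives the relative weight of each mode under the mapping above (treating `Opened' as OPEN).
This is plain arithmetic on the quoted figures; the notation $P(\cdot)$ is introduced here for illustration only:
$$ P(OPEN) = 52 + 5.1 + 5.1 + 1.9 = 64.1\% , \quad P(DRIFT) = 31.8\% , \quad P(SHORT) = 3.9\% . $$
The three figures total 99.8\%, the same as the sum of the six FMD-91 entries.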

Note that the main cause of resistor value drift is overloading.
This is borne out in the FMD-91~\cite{fmd91} entry for a resistor network, where the failure
modes do not include drift.
If it is ensured that resistors will not be exposed to overload conditions, the
probability of drift (sometimes called parameter change)
is significantly reduced, enough for some standards to exclude it~\cite{en298,en230}.

\paragraph{Resistor failure modes according to EN298.}

EN298, the European gas burner safety standard,
tends to give failure modes that are more directly
usable for performing FMEA than those of FMD-91.
The certification process for EN298 requires that a full FMEA be undertaken, examining all failure modes
of all electronic components~\cite{en298}[11.2 5].
Annex A of EN298 prescribes failure modes for common components
and gives guidance on determining sets of failure modes for complex components (i.e. integrated circuits).
For most types of resistor, EN298~\cite{en298}[Annex A]
only requires that the failure mode OPEN be considered for FMEA.
For resistor types not specifically listed in EN298, the failure modes
are considered to be either OPEN or SHORT.
The reason that parameter change is not considered for resistors chosen for an EN298 compliant system is that they must be {\em downrated}
during the design process.
That is to say, the power and voltage ratings of components must be calculated
for maximum possible exposure, with a 40\% margin of error.
This drastically reduces the probability
that the resistors will be overloaded,
and thus subject to drift/parameter change.
Clearly, the assumed failure modes of base components represent a fundamental
limit of resolution in any failure analysis methodology.

\subsubsection{Resistor Failure Modes}
\label{sec:res_fms}
The difference in resistor failure modes between FMD-91 and EN298 is that FMD-91
includes the failure mode DRIFT.
EN298 does not include this, mainly because it imposes circuit design constraints
that effectively side-step that problem.
For this study, the conservative view from EN298 (both OPEN and SHORT), together with a restricted view of FMD-91 (i.e. no DRIFT), is taken, and the failure
modes for a generic resistor are taken to be OPEN and SHORT. The function $fm$ is used
to return a set of failure modes,
i.e.
\label{ros}
$$ fm(R) = \{ OPEN, SHORT \} . $$
% Mention tolerance here
\subsection{Failure modes determination for a generic operational amplifier}
The operational amplifier (op-amp)
is very widely used in nearly all fields of modern analogue electronics.
\fmmdglossOPAMP
Only one of the two sources of information on {\bc} {\fms} being compared, FMD-91,
has an entry specific to operational amplifiers.
EN298 does not specifically define the
{\fms} of op-amps but
instead has a procedure for determining the {\fms} of
component types not specifically listed.
Operational amplifiers are typically packaged in dual or quad configurations, meaning
that a chip will typically contain two or four amplifiers.
The failure modes determined from the FMD-91 entries are presented, and then
the failure mode determination procedure of EN298
is applied to a typical op-amp designed for instrumentation and measurement, the dual packaged version of the LM358~\cite{lm358}
(see figure~\ref{fig:lm258}).
The results from both sources of {\fm} definition are then compared.
\fmmdglossOPAMP

\paragraph{Failure Modes of an Op-Amp according to FMD-91.}
\fmodegloss
%Literature suggests, latch up, latch down and oscillation.
For op-amp failure modes, FMD-91~\cite{fmd91}[3-116] states:
\begin{itemize}
\item Degraded output (low slew rate, poor die attach) 50\%,
\item No operation (overstress) 31.3\%,
\item Shorted inputs, labelled $V_+$ to $V_-$ (overstress, resistive short in amplifier) 12.5\%,
\item Opened input, labelled $V_+$ (open) 6.3\%.
\end{itemize}

These are mostly internal causes of failure, more of interest to the component manufacturer
than to a test engineer
looking for symptoms of failure.
These failure causes within the op-amp need to be translated to symptomatic {\fms}.
Each failure cause is examined in turn, and mapped to potential {\fms} suitable for use in FMEA
investigations.

\paragraph{Op-Amp failure cause: Poor Die attach.}
\fmmdglossOPAMP
The symptom for this is given as a low slew rate.
Slew rate for a circuit/component is the maximum rate at which it can change an output voltage level (i.e. the maximum value of $\frac{\delta V}{\delta t}$).
A low slew rate will mean that the op-amp will not react quickly to changes on its input terminals.
This is a failure symptom that may not be of concern in a slow responding system like an
instrumentation amplifier. However, where higher frequencies are being processed,
a signal may be lost entirely.
This failure cause can be mapped to a symptomatic {\fm} called $LOW\_SLEW$.
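
The frequency at which a reduced slew rate starts to matter can be estimated with a standard result for sinusoidal signals.
The symbols $SR$ (slew rate), $V_p$ (peak output amplitude) and $f$ (signal frequency) are introduced here for illustration and do not appear in FMD-91.
For an output $v(t) = V_p \sin(2 \pi f t)$ the steepest slope is $2 \pi f V_p$, so the output can follow the signal only if
$$ SR \geq 2 \pi f V_p . $$
For a hypothetical signal with $V_p = 1$\,V and $f = 100$\,kHz this requires approximately $0.63$\,V/$\mu$s; an op-amp whose slew rate had degraded below this figure would distort or attenuate that signal, while a slowly varying instrumentation signal would be unaffected.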

\paragraph{No Operation - over stress.}
Here the op-amp has been damaged, and the output may be held HIGH or LOW, or may be
effectively tri-stated, i.e. not able to drive circuitry along the next stages of
the signal path: this {\fm} is termed NOOP (no operation).
This failure cause thus maps to three {\fms}: $LOW$, $HIGH$ and $NOOP$.

\paragraph{Shorted inputs: $V_+$ to $V_-$.}
Due to the high intrinsic gain of an op-amp, and the effect of offset currents,
this will force the output HIGH or LOW.
This failure cause maps to $HIGH$ or $LOW$.

\paragraph{Open input: $V_+$.}
This failure cause will mean that the minus input will have the very high gain
of the op-amp applied to it, and the output will be forced HIGH or LOW.
This failure cause maps to $HIGH$ or $LOW$\footnote{No failure mode for open input ${V}_{-}$ was listed in this FMD-91 entry~\cite{fmd91}[3-116].}.

\paragraph{Collecting Op-Amp failure modes from FMD-91.}
Under FMD-91 definitions, an op-amp will have the following {\fms}:
\begin{equation}
\label{eqn:opampfms}
fm(OpAmp) = \{ HIGH, LOW, NOOP, LOW\_SLEW \} .
\end{equation}
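
The collection step can be made explicit as a union of the {\fm} sets derived from each of the four failure causes above (poor die attach, overstress, shorted inputs and open input, in that order).
This is only an illustrative restatement of the mappings already given:
$$ \{ LOW\_SLEW \} \cup \{ LOW, HIGH, NOOP \} \cup \{ HIGH, LOW \} \cup \{ HIGH, LOW \} = \{ HIGH, LOW, NOOP, LOW\_SLEW \} . $$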

\paragraph{Failure Modes of an Op-Amp according to EN298.}

EN298 does not specifically define op-amp failure modes; these can be determined
by following a procedure for `integrated~circuits' outlined in
annex~A~\cite{en298}[A.1 note e].
This demands that all open connections, and shorts between adjacent pins, be considered as failure scenarios.
In table~\ref{tbl:lm358} these failure scenarios on the dual packaged $LM358$~\cite{lm358}
are examined and from this its {\fms} are determined.
Collating the op-amp failure modes from table~\ref{tbl:lm358}, the same {\fms}
as those from FMD-91 (listed in equation~\ref{eqn:opampfms}) are obtained, except for
$LOW\_SLEW$.
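
Expressed with the $fm$ notation, and collating the table without distinguishing amplifier A from amplifier B, the comparison can be sketched as follows.
The subscripts $EN298$ and $FMD91$ are labels added here for illustration only:
$$ fm_{EN298}(OpAmp) = \{ HIGH, LOW, NOOP \} , \qquad fm_{FMD91}(OpAmp) \setminus fm_{EN298}(OpAmp) = \{ LOW\_SLEW \} . $$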
\fmmdglossOPAMP

%\paragraph{EN298: Open and shorted pin failure symptom determination technique}

\begin{table}[h]
\caption{LM358: EN298 Open and shorted pin failure symptom determination technique}
\begin{tabular}{|| l | l | c | c | l ||} \hline
\textbf{Failure} & & \textbf{Amplifier Effect} & & \textbf{FMEA component} \\
\textbf{cause} & & \textbf{ } & & \textbf{Failure Mode} \\
\hline
 & & & & \\ \hline
FS1: PIN 1 OPEN & & A output open & & $NOOP_A$ \\ \hline
FS2: PIN 2 OPEN & & A-input disconnected, & & \\
 & & infinite gain on A+input & & $LOW_A$ or $HIGH_A$ \\ \hline
FS3: PIN 3 OPEN & & A+input disconnected, & & \\
 & & infinite gain on A-input & & $LOW_A$ or $HIGH_A$ \\ \hline
FS4: PIN 4 OPEN & & power to chip (ground) disconnected & & $NOOP_A$ and $NOOP_B$ \\ \hline
FS5: PIN 5 OPEN & & B+input disconnected, & & \\
 & & infinite gain on B-input & & $LOW_B$ or $HIGH_B$ \\ \hline
FS6: PIN 6 OPEN & & B-input disconnected, & & \\
 & & infinite gain on B+input & & $LOW_B$ or $HIGH_B$ \\ \hline
FS7: PIN 7 OPEN & & B output open & & $NOOP_B$ \\ \hline
FS8: PIN 8 OPEN & & power to chip & & \\
 & & (V+ supply) disconnected & & $NOOP_A$ and $NOOP_B$ \\ \hline
 & & & & \\
FS9: PIN 1 $\stackrel{short}{\longrightarrow}$ PIN 2 & & A -ve 100\% Feed back, unity gain & & $LOW_A$ \\ \hline
FS10: PIN 2 $\stackrel{short}{\longrightarrow}$ PIN 3 & & A inputs shorted, & & \\
 & & output controlled by internal offset & & $LOW_A$ or $HIGH_A$ \\ \hline
FS11: PIN 3 $\stackrel{short}{\longrightarrow}$ PIN 4 & & A + input held to ground & & $LOW_A$ or $HIGH_A$ \\ \hline
FS12: PIN 5 $\stackrel{short}{\longrightarrow}$ PIN 6 & & B inputs shorted, & & \\
 & & output controlled by internal offset & & $LOW_B$ or $HIGH_B$ \\ \hline
FS13: PIN 6 $\stackrel{short}{\longrightarrow}$ PIN 7 & & B -ve 100\% Feed back, unity gain & & $LOW_B$ \\ \hline
FS14: PIN 7 $\stackrel{short}{\longrightarrow}$ PIN 8 & & B output held high & & $HIGH_B$ \\ \hline
\hline
\end{tabular}
\label{tbl:lm358}
\end{table}

\begin{figure}[h]
\centering
\includegraphics[width=200pt]{CH5_Examples/lm258pinout.jpg}
% lm258pinout.jpg: 478x348 pixel, 96dpi, 12.65x9.21 cm, bb=0 0 359 261
\caption{Pinout for an LM358 dual Op-Amp}
\label{fig:lm258}
\end{figure}

%\clearpage

\subsubsection{Failure modes of an Op-Amp}

\label{sec:opamp_fms}
For the purpose of the examples to follow in this document, op-amps
are assigned the following failure modes:
$$ fm(OpAmp) = \{ LOW, HIGH, NOOP, LOW\_SLEW \} . $$
\fmmdglossOPAMP
\subsection{Comparing the component failure mode sources: EN298 vs FMD-91}

The EN298 open and shorted pin technique cannot reveal failure modes due to internal failures,
and that is why it misses $LOW\_SLEW$.
The FMD-91 entries for op-amps are not directly usable as
component {\fms} in FMEA and require interpretation.
However, once a failure mode determination has been carried out, the model can
be re-used throughout the FMEA process.

%%%% Talk about R differences ?? XXXXX

\clearpage

\section{FMEA worked example: milli-volt reader.}
FMEA is a bottom-up procedure which starts with the failure modes of the low level components of a system.
An example analysis will serve to demonstrate it in practice.
Consider a system of a simple milli-volt reader, consisting
of instrumentation amplifiers connected to a micro-processor
that reports its readings via RS-232.
\begin{figure}
\centering
\includegraphics[width=175pt]{./CH2_FMEA/mvamp.png}
% mvamp.png: 561x403 pixel, 72dpi, 19.79x14.22 cm, bb=0 0 561 403
\caption{System diagram of a milli-volt reader, showing an expanded circuit diagram for the component of interest.}
\end{figure}
\fmeagloss

\subsection{FMEA Example: Milli-volt reader}
To undertake an FMEA on the milli-volt reader, consider how the failure of one of its resistors could affect
it, choosing the resistor R1 in the op-amp gain circuitry:
\begin{itemize}
\item \textbf{F - Failures of given component} The resistor (R1) could fail by going OPEN or SHORT (EN298 definition),
\item \textbf{M - Failure Mode} Consider the component failure mode SHORT,
\item \textbf{E - Effects} This will drive the minus input LOW causing a HIGH OUTPUT/READING,
\item \textbf{A - Analysis} The reading will be out of the normal range, i.e. will have an erroneous milli-volt reading.
\end{itemize}
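
The effect stated above can be checked against the ideal gain equation, under an assumption that should be verified against the circuit diagram: that the gain stage is a standard non-inverting amplifier in which R1 connects the inverting (minus) input to ground, and a feedback resistor, here called R2 for illustration, connects the output to the inverting input.
Under that assumption the ideal output is
$$ V_{out} = V_{in} \left( 1 + \frac{R2}{R1} \right) . $$
As $R1 \rightarrow 0$ (the SHORT failure mode) the inverting input is held at ground and the gain grows without bound, so for any positive input the output saturates towards the positive supply rail, which is consistent with the HIGH reading given above.
With R1 OPEN ($R1 \rightarrow \infty$) the same equation would give unity gain, i.e. a reading far lower than expected.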
\fmeagloss

The analysis above has given a result for
one single component failure mode.
A complete FMEA report would have to contain an entry
for each failure mode of all the components in the system under investigation.
In theory it would be necessary to look at the failure~mode
in relation to the entire circuit.
Here, intuition has been used to determine the probable
effect of this failure mode.
For instance, it has been assumed that the resistor R1 going SHORT
will not affect the ADC, the microprocessor or the UART.
\fmmdglossADC
The {\bc} {\fm} R1 SHORT has been examined
and failure reasoning applied,
along a heuristically determined signal path,
to find a putative system level symptom.
\fmmdglossSIGPATH
That is, R1 going SHORT is expected to just give an out of range value
that can be read by the ADC and reported correctly by the software.
Potential side effects of this {\fm} may not have been taken into account.
To put this in more general terms, this failure mode has not been examined
against all other components in the system, only those expected on the signal path.
Examining the {\fm} R1 SHORT against all components in this system would be a more rigorous and complete
approach in looking for system failures.
FMEA where
each failure mode is compared against all other components
is termed exhaustive FMEA (XFMEA).
An indicator of the vagueness of not performing XFMEA, in terms of failure outcome,
is shown in the UML relationship in figure~\ref{fig:component_fm_rel_ana},
which gives a one-to-many mapping between a failure mode and its system level symptoms.
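
The cost of the exhaustive approach can be estimated with a simple count.
As a sketch, assume a system of $N$ {\bcs}, where component $c_i$ has $|fm(c_i)|$ failure modes; the symbol $N$ and the figures below are illustrative and are not taken from any standard.
Each failure mode must be checked against the other $N-1$ components, giving
$$ (N-1) \sum_{i=1}^{N} |fm(c_i)| $$
checks in total.
For a hypothetical circuit of 20 components with two failure modes each (for example resistors, with $fm(R) = \{ OPEN, SHORT \}$), this gives $19 \times 40 = 760$ checks, compared with 40 entries when each failure mode is followed only along the signal path.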

\section{Theoretical Concepts in FMEA}

In this section some fundamental concepts and underlying philosophies of FMEA are examined.

\paragraph{Failure modes of a component and mutual exclusivity.}
It is desirable that the failure modes for a component are mutually exclusive. Were a component able
to fail in several ways at the same time, this would complicate analysis.
It would mean having to consider combinations of internal component failures
as separate failure modes. This concept is discussed in sections~\ref{ch4:mutex}
and~\ref{ch7:mutex}.
\fmmdglossMUTEX
In general, failure modes
for simple components are mutually exclusive,
but large and complex components (such as integrated circuits), especially where they contain separate modules,
could have failure modes that are not mutually exclusive, and these need special handling; see section~\ref{ch7:indfm}.
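
To illustrate why this complicates analysis, a standard counting argument (not drawn from the cited standards) can be applied.
If the $n$ failure modes of a component were not mutually exclusive, then in the worst case every non-empty combination of them would have to be treated as a separate failure mode, giving an upper bound of
$$ 2^{n} - 1 $$
cases to analyse.
For the four op-amp failure modes of equation~\ref{eqn:opampfms} this bound is $2^4 - 1 = 15$ rather than 4, although combinations that are physically contradictory (such as $HIGH$ together with $LOW$) would be discarded in practice.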
|
|
|
|
\paragraph{The signal path.}
|
|
\fmmdglossSIGPATH
|
|
% C Garret does not like the terms afferent and efferent here, try to think of something else
|
|
Most electronic systems are used to process a signal: with signal processing
|
|
there is usually a clear path from the signal coming into the system, it being processed in some way, and a resultant effect on
|
|
an output or control signal. % afferent to transform to efferent path.
|
|
%
|
|
That is, there is an input, some processing and an output.
|
|
%
|
|
In electronics this could be termed a sensor, processing and actuator
|
|
model.
|
|
%
|
|
In software this would be termed afferent, transform and efferent data flow.
|
|
%
|
|
For the purpose of FMEA, the signal path is defined by the components and connections used to process the signal.
|
|
%
|
|
Some circuits have feedback loops or even circular signal paths, but it
|
|
is normal for a signal path to exist.
|
|
%
|
|
%can be identified.
|
|
%
|
|
An FMEA investigation will often take the component {\fm} and examine its effect along this path,
|
|
in the direction of the signal,
|
|
echoing diagnostic/fault~finding methods~\cite{garrett, maikowski}. % loebowski}.
|
|
%
|
|
When fault finding, the signal path is followed, checking for correct behaviour
|
|
along it: when something out of place is found,
|
|
the circuit behaviour is measured in finer granularity,
|
|
until a faulty component or module~\cite{garrett} is identified.
|
|
%
|
|
This style of fault finding is based on experiment:
hopping from module to module, eliminating working ones until a
failure is found~\cite{maikowski}, is efficient in terms of
concentrating effort.
|
|
%
|
|
The rationale and work-culture of those tasked to
perform FMEA generally derive from fault finding, as these personnel have usually performed it~\cite{cbds}[p.97].
|
|
%
|
|
|
|
|
|
FMEA is a theoretical discipline. %AF does not like this!
|
|
%
|
|
It would be very unusual to build a circuit and then simulate
|
|
component failure modes.
|
|
%
|
|
This would be time consuming as it would involve altering/building a circuit for each component {\fm} in
|
|
the system\footnote{Building circuit simulations and simulating component failure modes
|
|
would be a very time consuming process and might only be performed as a final-stage of accident investigation, where the cause is
|
|
required to be proven.}.
|
|
%
|
|
It is not possible, as with fault finding, to verify modules along the signal path for correct behaviour
|
|
and eliminate them from the investigation.
|
|
%
|
|
FMEA is a `thought~experiment', not actual experiment.
|
|
%
|
|
With FMEA there is a need to be more thorough in the consideration of the effects a failure mode may have
|
|
on the other components in a system, than with fault finding.
|
|
%
|
|
The question is by how much.
|
|
%
|
|
Too much and the task becomes impossible due to time/labour constraints.
|
|
%
|
|
Too little and the analysis could become meaningless, because it could miss
|
|
potential system failures.
|
|
%
|
|
For a more complete analysis, the strategy of examining each component {\fm} along the complete signal path,
|
|
forwards and backwards from the placement
|
|
of the component exhibiting the {\fm} under investigation, could be applied.
|
|
%
|
|
% Also, whether following the effects through the signal path {\em only} is acceptable, and instead
|
|
% would looking at its effect on all other components in the system be necessary?
|
|
Is following the effects of a {\fm} {\em only} through the components along the signal path acceptable?
|
|
This could easily ignore side effects; this leads onto the idea of
|
|
looking at a {\fm}'s effects on all other components in the system. % be necessary?
|
|
%is a matter for debate.
|
|
%
|
|
In practice, a compromise is made between the amount of time/money that can be spent
|
|
on analysis relative to the criticality of the project.
|
|
Metrics for measuring the amount of work to undertake for FMEA are examined in section~\ref{sec:xfmea}.
|
|
|
|
\paragraph{Failure Modes and the signal path.}
|
|
\fmmdglossSIGPATH
|
|
In general a component failure mode in an electronic circuit will
|
|
change the circuit topology. For a single failure
|
|
this effect may cause additional complications for the analyst.
|
|
For multiple failures this means
that the analyst
will have to deal with altered circuit topologies
of the electronic circuit for each analysis.
|
|
|
|
|
|
\paragraph{Single component failure mode to system failure relation.}
|
|
%
|
|
%
|
|
% NEED SOME NICE HISTORICAL REFS HERE
|
|
FMEA, due to its inductive bottom-up approach, is good
|
|
at mapping potential single component failures to system level faults/events.
|
|
%
|
|
The concept of the unacceptability of a single component failure causing a system failure % catastrophe,
|
|
is an important and easily understood measurement of safety.
|
|
%
|
|
Statistics for single failures are easy to calculate
|
|
because Mean Time to Failure (MTTF) statistics~\cite{fmd91,mil1991} for commonly used components can be found.
|
|
%
|
|
Also, used in the design phase of a project, FMEA is a useful tool
|
|
for discovering potential failure scenarios~\cite{1778436820050601}.
|
|
%
|
|
From a large system perspective, it may be found that {\bc} {\fms}
|
|
may have more than one possible system event associated with them.
|
|
%
|
|
Often there will be a clear one to one mapping, but
probabilities of failure (as used in FMECA, see section~\ref{sec:FMECA})
could mean a one ({\fm}) to many (system level symptoms) mapping.
|
|
%
|
|
\paragraph{Use of Markov chains to model failure modes.}
|
|
We could represent a failure mode and its possible outcomes using a Markov chain~\cite{probfmea_4338247}.
|
|
%
|
|
Where multiple simultaneous %\footnote{Multiple simultaneous failures are taken to mean failures that occur within the same detection period.}
|
|
failure modes are considered this complicates
|
|
the statistical nature of the Markov chain cause and effect model.
|
|
%
|
|
What we in fact get is the merging, or local interaction of two Markov chains
|
|
for the cause and effect model.
|
|
% Subject Object Wiki answers : Best Answer
|
|
%It is not grammar or vocabulary. It is a philosophical reference.
|
|
%The dichotomy is the surrounding view of self that we act out of. It is often learned with language and not taught [like the alphabet and numbers are taught] in early life through language and the forming of distinctions.
|
|
%The Subject/Object dichotomy is related mostly to the Cartesian model of a 'self'. We can be both the subject that we observe, and the object doing the observing.But it goes beyond that into how we view the world we are in. In balanced thinking, we are both subjective and objective about situations and interactions in daily life, internally and externally. In unbalanced thinking, there is a tilt towards one side or the other. That is, either too subjective; as relating everything to how it affects you personally, [temperamental and self center] or, too objective; not having a sense of who you are in regards to what is occurring, [aloof, distant and apathetic]. It is related in Western philosophy as the basic nature of dualism. How do you know that you learned to live in a subject/object dichotomy?
|
|
%The core of Cartesianism is that you have a mind: a separate function of your'self'. If you have an invisible self called a mind - you are in the subject/object dichotomy. Non-dualism is mostly learned in Eastern philosophies and will refer to the mind as an integer of the self - not separate from it.
|
|
%You can not jump from one to the other. And, they both must be learned as referential contexts to who 'you' are in the world you live in.
|
|
%
|
|
\paragraph{Subjective and Objective thinking in relation to FMEA.}
|
|
\label{sec:subjectiveobjective}
|
|
FMEA is always performed in the context of the use of the equipment.
|
|
In terms of philosophy the context is in the domain of the subjective and the
|
|
logic and reasoning behind failure causation, the objective.
|
|
%
|
|
By using objective reasoning a component level failure can be traced to a system level event,
but only in
the subjective sense can its meaning and/or severity be determined.
|
|
%
|
|
It is worth remembering that
|
|
failure mode analysis performed on the leaks possible from the O ring on the space shuttle
|
|
did not link this failure to the catastrophic failure of the spacecraft~\cite{challenger,sanjeev}.
|
|
%
|
|
This was not a failure in the objective reasoning, but more of the subjective, or the context in which the leak occurred.
|
|
%
|
|
What this means is that for an objectively calculated failure mode outcome, there may be
more than one subjective outcome. %, or definition, for it.
|
|
%
|
|
|
|
This means that objective reasoning can be applied to determine objective effects, but the criticality ---or the seriousness/consequences---
|
|
of those failures depends upon the Equipment Under Control (EUC)
|
|
and its environment.
|
|
%
|
|
For instance a leak of nuclear material %on an
|
|
aboard a spacecraft could have the consequences
|
|
of loss of mission, but a leak on earth could have serious health and environmental consequences.
|
|
This means one line of FMECA describing a system risk is an over simplification (consider that the same
|
|
nuclear material will be present during transport and launch, and when outside earth's environment).
|
|
%
|
|
Subjective appraisal of the outcome of a system failure mode can also
|
|
be subject to management and/or political pressure.
|
|
%
|
|
The two most recent variants of FMEA,
|
|
FMEDA and FMECA have dipped a metaphorical toe into the subjective realm, FMECA with its `criticality~factor' and
|
|
FMEDA with its definition of `dangerous'.
|
|
%
|
|
However, while starting to address the subjective side
|
|
of failure analysis,
|
|
these methodologies
|
|
do not separate the final subjective stage from the objective. % stage of analysis.
|
|
%
|
|
A subjective assessment is made during the analysis of each {\bc} {\fm}
|
|
regardless of the fact that most {\bc} {\fms} cause shared
|
|
system level failures.
|
|
%
|
|
This means that work at the subjective
|
|
level is repeated.
|
|
%
|
|
Detailed work on subjective analysis is beyond the scope of this study.
|
|
|
|
|
|
\paragraph{Multiple Simultaneous Failure Modes.}
|
|
%
|
|
FMEA is less useful for determining events for multiple
|
|
simultaneous
|
|
failures\footnote{Multiple simultaneous failures are taken to mean failures that occur within the same detection period.
|
|
Detection periods are typically determined for the process under control. For instance, for a flame detector in an industrial burner this
|
|
is typically one second.~\cite{en298}}.
|
|
%
|
|
Multiple failures may cause the same system level failure (i.e. two separate failures
could cause the same system failure, and in combination still cause the same failure); this
can be termed a common failure result.
|
|
%
|
|
Work has been performed using component failure statistics and logic to
|
|
offer selected---by virtue of statistical likelihood and common failure result reduction---multiple failures for analysis
|
|
and consideration by an investigating engineer~\cite{FMEAmultiple653556}.
|
|
%
|
|
%We now compound the multiple symptoms from one {\bc} {\fm} possibility
|
|
%with the merging of Markov chains.
|
|
%,this is an additional complication.
|
|
%, of having to change between these two modes of thinking, it becomes more difficult to
|
|
%get a balance between subjective and objective perspectives.
|
|
A complication for multiple failure analysis is that failure modes may cause a change in circuit topology
|
|
meaning the additional failures might have to be analysed with respect to the changed topology.
|
|
%subjective/objective become more cluttered when there are multiple possibilities
|
|
%for the the results of an FMEA line of reasoning.
|
|
Because multiple failures mean dealing with changed topologies,
the objective criteria are additionally complicated, with the subjective
adding yet another layer of complication.
|
|
%
|
|
%
|
|
Traditional FMEA has the translation from objective to subjective
failure modes as an intrinsic part of its process, which can be considered a weakness.
|
|
|
|
\paragraph{Failure modes and their observability criterion: detectable and undetectable.}
|
|
\label{sec:detectable}
|
|
\fmmdglossOBS
|
|
Often the effects of a failure mode may be easy to detect,
|
|
and our equipment can react by raising an alarm or compensating for the resulting fault.
|
|
%
|
|
Some failure modes may cause undetectable failures, for instance a component that causes
|
|
a measured reading to change could have adverse consequences yet not be flagged as a failure.
|
|
%
|
|
This type of failure
|
|
can not be dealt with by passing error indication to higher level modules
|
|
because it simply cannot be detected.
|
|
%
|
|
The system therefore
|
|
has no way of knowing the reading is invalid.
|
|
%
|
|
The term observable has a specific meaning in the field of control engineering~\cite{721666, ACS:ACS1297};
|
|
systems submitted for FMEA are generally related to control systems,
|
|
and so to avoid confusion the terms `detectable' and `undetectable' (as defined in EN61508\cite{en61508})
|
|
will be used for describing the observability of failure modes in this document.
|
|
%\glossary{name={observability}, description={The property of a system failure in relation to a particular component failure mode, where it can be determined whether the readings/actions associated with it are valid, or the by-product of a failure. If we cannot determine that there is a fault present, the system level failure is said to be unobservable.}}
|
|
\fmmdglossOBS
|
|
|
|
|
|
\paragraph{Impracticality of Field Data for Modern Systems.}
|
|
\fmmdglossFIT
|
|
Modern electronic components are generally very reliable, and the systems built from them
|
|
are thus very reliable too. Reliable field data on failures will, therefore, be sparse.
|
|
%
|
|
Should it be wished to prove a continuous demand system for say ${10}^{-7}$ failures\footnote{${10}^{-7}$ failures per hour of operation is the
|
|
threshold for S.I.L. 3 reliability~\cite{en61508}.
|
|
%
|
|
Failure rates are normally measured per $10^9$ hours of operation
|
|
and are known as Failure in Time (FIT) values.
|
|
%
|
|
The maximum FIT value for a SIL 3 system is therefore 100.}
|
|
per hour of operation, even with 1000 correctly monitored units in the field
|
|
there could only be one failure per ten thousand hours expected (i.e. a little under one a year).
|
|
%
|
|
It would be utterly impractical to get statistically significant data for equipment
|
|
at these reliability levels.
|
|
%
|
|
However, FMEA can be used (more specifically the FMEDA variant, see section~\ref{sec:FMEDA}),
|
|
working from known component failure rates, to obtain
|
|
statistical estimates of the equipment reliability.
|
|
\fmmdglossFIT
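%
As a rough worked check of the figures above (the symbols $n$, $\lambda$ and $T_{year}$ are introduced here only for illustration, and a constant failure rate is assumed), take $n = 1000$ units, $\lambda = 10^{-7}$ failures per hour and $T_{year} = 8760$ hours:
\[
n \lambda = 1000 \times 10^{-7} = 10^{-4} \; \mbox{failures per hour}, \qquad n \lambda T_{year} = 10^{-4} \times 8760 \approx 0.88 \; \mbox{failures per year}.
\]
That is, fewer than one failure per year would be expected across the whole monitored population.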
|
|
%
|
|
\paragraph{Forward and Backward Searches.}
|
|
\fmmdglossFS
|
|
\fmmdglossBS
|
|
A forward search starts with possible failure causes
|
|
and uses logic and reasoning to determine system level outcomes.
|
|
%
|
|
Forward search types of fault analysis are said to be `inductive'.
|
|
%
|
|
A backward search starts with (undesirable) system level events and
|
|
works back down to potential causes using de-composition
|
|
of the system and logic.
|
|
%
|
|
FMEA based methodologies are forward searches~\cite{Lutz:1997:RAU:590564.590572} and top down
methodologies such as FTA~\cite{nucfta,nasafta} are backward searches.
%
Backward (or top-down) searches are said to be deductive (i.e. the causes of failure are
deduced).
|
|
|
|
|
|
|
|
\subsection{Reasoning distance.}
|
|
\label{reasoningdistance}
|
|
\fmmdglossRD
|
|
Reasoning distance is the number of stages of logic and reasoning used
|
|
in {\fm} analysis to map a failure cause to its potential outcomes; counted
|
|
by the number of {\fm} to component checks made.
|
|
%
|
|
The basic FMEA example in section~\ref{basicfmea}
|
|
considered one {\fm} against some of the components in the milli-volt reader.
|
|
%
|
|
To create an exhaustive FMEA report on the milli-volt reader, every
|
|
known failure mode of every component within it would have to be examined against all its other components.
|
|
%
|
|
`Reasoning~distance', for one {\fm}, is defined as the number of components checked against it
|
|
to determine its system level symptom(s).
|
|
%
|
|
No current FMEA variant gives guidelines for the components that should
|
|
be included to analyse a {\fm} in a system.
|
|
%
|
|
Were a {\fm} examined against all the other components in a system
|
|
this would give us the maximum reasoning distance.
|
|
%
|
|
This is termed the exhaustive FMEA case for a single {\fm}.
|
|
%does not
|
|
% The exhaustive~reasoning~distance would be
|
|
% the sum of the number of failure modes, against all other components
|
|
% in that system.
|
|
Thus the exhaustive~reasoning~distance for a particular component
|
|
would be to multiply
|
|
the number of failure modes it has by the number of remaining components
|
|
in the system.
|
|
%
|
|
The exhaustive reasoning~distance for a system would be
the sum of these multiplications for all the components it contains.
|
|
%
|
|
If the milli-volt reader had say 100 components, with three failure modes each, this
|
|
would give an exhaustive reasoning distance---for single failure analysis---of $3 \times 100 \times 99$.
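%
This count can be written out as a sketch, using notation assumed here for illustration: let $N$ be the number of components in the system and $f_i$ the number of failure modes of the $i$-th component.
Each {\fm} is checked against the $N-1$ remaining components, so the exhaustive reasoning distance is
\[
\sum_{i=1}^{N} f_i \, (N-1) = 3 \times 100 \times 99 = 29{,}700
\]
when $N = 100$ and $f_i = 3$ for every component.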
|
|
%
|
|
The discussion on reasoning distance provides a metric to examine
|
|
the state explosion problems associated with forward search failure investigation
|
|
methodologies.
|
|
%
|
|
\fmmdglossSTATEEX
|
|
%
|
|
It is apparent that the shorter the reasoning distance, the more precisely theoretical examination
|
|
can determine failure symptoms.
|
|
%
|
|
For instance for a very simple small circuit, a better understanding of failure effects is expected,
|
|
than for a very large system where there are more variables and potential {\fm} interactions.
|
|
%
|
|
%.... general concept... simple ideas about how complex a
|
|
%failure analysis is the more modules and components are involved
|
|
% cite for forward and backward search related to safety critical software
|
|
%{sfmeaforwardbackward}
|
|
\subsection{FMEA and the State Explosion Problem}
|
|
\label{sec:xfmea}
|
|
\paragraph{Problem of which components to check for a given {\bc} {\fm}.}
|
|
\fmmdglossSTATEEX
|
|
%
|
|
FMEA for safety critical certification (i.e. for EN298 and EN61508)~\cite{en298,en61508} has to be applied
|
|
to all known failure modes of all components within a system.
|
|
%
|
|
Each one of these, in a typical report, would be one line of a spreadsheet entry.
|
|
%
|
|
FMEA does not define or specify the scope of the investigation for each component failure mode.
|
|
%
|
|
For instance should the signal path be followed, with all components encountered along that, or should the scope be wider?
|
|
%
|
|
%If we wethe effect of a component {\fm} against all other components
|
|
%in a system, this could be said to be exhaustive analysis.
|
|
|
|
\paragraph{Exhaustive Single Failure FMEA.}
|
|
\fmmdglossXFMEA
|
|
%
|
|
To perform exhaustive FMEA (XFMEA), every possible interaction
|
|
of a failure mode with all other components in a system must be examined.
|
|
%
|
|
In other words, all possible failure scenarios must be considered.
|
|
%
|
|
%to do this completely (all failure modes against all components).
|
|
This is represented in the equation below, %~\ref{eqn:fmea_state_exp},
|
|
where $N$ is the total number of components in the system, $RD_{single}$ is the reasoning~distance and
|
|
$f$ is the number of failure modes per component:
|
|
%
|
|
\begin{equation}
|
|
\label{eqn:fmea_single}
|
|
RD_{single} = N.(N-1).f . % \\
|
|
%(N^2 - N).f
|
|
\end{equation}
|
|
%
|
|
This means an order of $O(N^2)$ checks to perform
|
|
to undertake XFMEA for single failures.
|
|
%
|
|
Even small systems have typically
|
|
100 components, and they typically have 3 or more failure modes each, which would give
|
|
$100 \times 99 \times 3 = 29,700 $ as a reasoning~distance.
|
|
%
|
|
\fmmdglossSTATEEX
|
|
\paragraph{Exhaustive FMEA and double failure scenarios.}
|
|
%
|
|
%\paragraph{Exhaustive Double Failure FMEA}
|
|
For potential double failure
scenarios\footnote{Certain double failure scenarios are already legal
requirements: the European gas burner standard (EN298:2003) demands the checking of
double failure scenarios (for burner lock-out scenarios).}
%
(two components failing within a given time frame) the order becomes $O(N^3)$.
|
|
Where $RD_{double}$ is the reasoning~distance for double failure scenarios:
|
|
\begin{equation}
|
|
\label{eqn:fmea_double}
|
|
RD_{double} = N.(N-1).(N-2).f . % \\
|
|
%(N^2 - N).f
|
|
\end{equation}
|
|
%
|
|
For a theoretical system with 100 components and a fixed 3 failure modes each, this gives a reasoning distance of
|
|
$100 \times 99 \times 98 \times 3 = 2,910,600$. % failure mode scenarios.
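%
To give a sense of the growth, a short sketch: dividing the double failure expression by the single failure expression, with the same $N$ and $f$, gives
\[
\frac{RD_{double}}{RD_{single}} = \frac{N.(N-1).(N-2).f}{N.(N-1).f} = N-2 ,
\]
which for the 100 component example is a factor of 98, i.e. $29{,}700 \times 98 = 2{,}910{,}600$.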
|
|
%
|
|
In practice there is an additional complication here, that of
|
|
the circuit topology changes that {\fms} can cause.
|
|
|
|
\paragraph{Reliance on experts for meaningful FMEA Analysis.}
|
|
Current FMEA methodologies cannot consider---for the reason of state explosion---an exhaustive approach.
|
|
%We define exhaustive FMEA ({\XFMEA}) as examining the effect of every component failure mode
|
|
%against the remaining components in the system under investigation.
|
|
%
|
|
\fmmdglossSTATEEX
|
|
%
|
|
Because, for practical reasons, XFMEA cannot be performed for anything other than a trivial system,
|
|
reliance is placed upon experts on the system under investigation
|
|
to perform a meaningful analysis.
|
|
%
|
|
These experts must use their judgement and experience to choose
|
|
sub-sets of the components in the system to check against each {\fm}.
|
|
%
|
|
Also, %In practise
|
|
these experts have to select the areas they see as most critical for detailed FMEA analysis:
|
|
it is usually impossible, for reasons of time to perform the work,
|
|
to action a detailed level of analysis on all component {\fms}
|
|
on anything but a small hypothetical system.
|
|
|
|
\subsection{Component Tolerance}
|
|
|
|
Component tolerances may need considering when determining if a component has failed.
|
|
Calculations for acceptable ranges to determine failure or acceptable conditions
|
|
must be made where appropriate.
|
|
%
|
|
An example of component tolerance considered for FMEA
|
|
is given in section~\ref{sec:resistortolerance}.
|
|
|
|
\section{FMEA in current usage: Five variants}
|
|
|
|
\paragraph{Five main Variants of FMEA}
|
|
\begin{itemize}
|
|
\item \textbf{PFMEA - Production} Emphasis on cost reduction and product improvement;
|
|
\item \textbf{FMECA - Criticality} Emphasis on minimising the effect of critical systems failing; % Military/Space
|
|
\item \textbf{FMEDA - Statistical Safety} Statistical analysis giving Safety Integrity Levels;
|
|
\item \textbf{DFMEA - Design or Static/Theoretical} Approval of safety critical systems using FMEA and single or double failure prevention;% EN298/EN230/UL1998
|
|
\item \textbf{SFMEA - Software FMEA} --- Usage not enforced by most current standards~\cite{en298,en230,en61508}. %only used in highly critical systems at present.
|
|
\end{itemize}
|
|
|
|
|
|
|
|
|
|
|
|
\section{PFMEA - Production FMEA : 1940's to present}
|
|
\fmmdglossPFMEA
|
|
%
|
|
Production FMEA (or PFMEA), is FMEA used to prioritise, in terms of
|
|
cost, problems to be addressed in product production.
|
|
%
|
|
It generally focuses on known problems:
their statistical frequency multiplied by their cost to fix
gives a Risk Priority Number (RPN)
for the germane component {\fm}.
|
|
%
|
|
Fixing problems with the highest RPN
|
|
will return most cost benefit~\cite{bfmea}.
|
|
%
|
|
An example PFMEA report is presented in table~\ref{tbl:pfmeareport}.
|
|
|
|
% benign example of PFMEA in CARS - make something up.
|
|
\subsection{PFMEA Example}
|
|
\begin{table}[ht]
|
|
\caption{FMEA Calculations} % title of Table
\label{tbl:pfmeareport}
|
|
\centering % used for centering table
|
|
\begin{tabular}{|| l | l | c | c | l ||} \hline
|
|
\textbf{Failure Mode} & \textbf{P} & \textbf{Cost} & \textbf{Symptom} & \textbf{RPN} \\ \hline \hline
|
|
relay 1 n/c & $1 \times 10^{-5}$ & 38.0 & indicators fail & 0.00038 \\ \hline
relay 2 n/c & $1 \times 10^{-5}$ & 98.0 & doorlocks fail & 0.00098 \\ \hline
|
|
% rear end crash & $14.4*10^{-6}$ & 267,700 & fatal fire & 3.855 \\
|
|
% ruptured f.tank & & & & \\ \hline
|
|
\hline
|
|
\end{tabular}
|
|
\end{table}
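
The RPN values in table~\ref{tbl:pfmeareport} follow from the product of frequency and cost.
As a sketch, writing $P$ for the statistical frequency of the {\fm} and $C$ for its cost to fix (symbol names assumed here for illustration):
\[
RPN = P \times C , \qquad \mbox{e.g. for relay 1 n/c:} \quad 1 \times 10^{-5} \times 38.0 = 0.00038 .
\]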
|
|
|
|
|
|
\section{FMECA - Failure Modes Effects and Criticality Analysis}
|
|
\fmmdglossFMECA
|
|
\label{sec:FMECA}
|
|
%\paragraph{ FMECA - Failure Modes Effects and Criticality Analysis.}
|
|
% \begin{figure}
|
|
% \centering
|
|
% %\includegraphics[width=100pt]{./military-aircraft-desktop-computer-wallpaper-missile-launch.jpg}
|
|
% \includegraphics[width=300pt]{./CH2_FMEA/A10_thunderbolt.jpg}
|
|
% % military-aircraft-desktop-computer-wallpaper-missile-launch.jpg: 1024x768 pixel, 300dpi, 8.67x6.50 cm, bb=0 0 246 184
|
|
% \caption{A10 Thunderbolt}
|
|
% \label{fig:f16missile}
|
|
% \end{figure}
|
|
FMECA places emphasis on determining criticality rather than the cost of system failures.
|
|
%
|
|
%
|
|
It applies Bayesian statistics within the FMEA process (i.e. using probabilities of component failures
|
|
and the probability of those failures causing given system level failures)
|
|
to determine the risk of system level events/symptoms.
|
|
%
|
|
%
|
|
The results of these risk probabilities, i.e. for system level failures,
|
|
are then multiplied by the estimated operational time of the system.
|
|
%
|
|
For instance a military or emergency system may be typically operational for
|
|
a given number of hours. The risk against time value, in conjunction with the severity
|
|
of the system level event gives a `criticality~level'.
|
|
%
|
|
%Also the probability of the system failure causing a critical event.
|
|
%
|
|
Bayes' theorem can be seen as a theory on the `probability~of~causes'~\cite{probstatcrash}[p.9].
|
|
%
|
|
A given component failure may for instance, be associated with
|
|
a particular system failure to a calculated, or measured from field~data, statistical probability.
|
|
%
|
|
Applying Bayesian statistics to failure analysis suffers the
problem that correlation does not imply causation~\cite{bayesfrequentist}.
%
However, correlation is evidence for causation, and may be the only evidence to hand;
this is the justification behind its use.
|
|
%
|
|
This implies a weakness in the FMECA philosophy. It means that
|
|
failure causes can be inferred, rather than analytically
|
|
determined, to become part of the failure mode model.
|
|
%
|
|
A history of the usage and development of FMECA may be found in~\cite{FMECAresearch}.
|
|
\fmmdglossFMECA
|
|
|
|
\paragraph{FMECA - Statistical variables.}
|
|
%
|
|
FMECA extends PFMEA, but instead of cost, a criticality or
|
|
seriousness factor is ascribed to putative top level incidents.
|
|
FMECA has three probability factors for component failures, a system operational time and a severity factor.
|
|
|
|
\textbf{FMECA ${\lambda}_{p}$ value.}
|
|
This is the overall failure rate of a base component.
|
|
This will typically be the failure rate per million ($10^6$) or
|
|
billion ($10^9$) hours of operation~\cite{mil1991}.
|
|
|
|
\textbf{FMECA $\alpha$ value.}
|
|
The failure mode probability, usually denoted by $\alpha$, is the probability of
|
|
a particular failure~mode occurring within a component~\cite{fmd91}.
|
|
%, should it fail.
|
|
%A component with N failure modes will thus have
|
|
%have an $\alpha$ value associated with each of those modes.
|
|
%As the $\alpha$ modes are probabilities, the sum of all $\alpha$ modes for a component must equal one.
|
|
%
|
|
\fmmdglossFMECA
|
|
%
|
|
|
|
\textbf{FMECA $\beta$ value.}
|
|
The second probability factor, $\beta$, is the probability that the failure mode
|
|
will cause a given system failure.
|
|
%
|
|
This corresponds to `Bayesian' probability, i.e. given a particular
|
|
component failure mode, the probability of a given system level failure~\cite{nucfta}[VI-19].
|
|
|
|
\textbf{FMECA `t' Value.}
|
|
The time that a system will be operating for, or the working life time of the product is
|
|
represented by the variable $t$.
|
|
%for probability of failure on demand studies,
|
|
%this can be the number of operating cycles or demands expected.
|
|
|
|
\textbf{Severity `s' value.}
|
|
A weighting factor to indicate the seriousness of the putative system level error.
|
|
%Typical classifications are as follows:~\cite{fmd91}
|
|
|
|
The statistical formula to calculate the criticality factor for one component {\fm} is given below:
|
|
%
|
|
\begin{equation}
|
|
C_m = {\beta} . {\alpha} . {{\lambda}_p} . {t} . {s} .
|
|
\end{equation}
|
|
\fmmdglossFMECA
|
|
%
|
|
The highest $C_m$ values would represent the most dangerous or serious
|
|
system level failures.
|
|
The highest $C_m$ values would be at the top of a `to~fix' list
|
|
for a project manager, and some levels of risk may be considered unacceptable
|
|
and require re-design. % of some systems.
|
|
\fmmdglossFMECA
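%
As an illustration only, with hypothetical values that are not taken from any data source
(${\lambda}_p = 10^{-7}$ failures per hour, $\alpha = 0.3$, $\beta = 0.5$, $t = 10^{4}$ hours and $s = 5$),
the criticality factor for one component {\fm} would be:
\[
C_m = 0.5 \times 0.3 \times 10^{-7} \times 10^{4} \times 5 = 7.5 \times 10^{-4} .
\]
A second {\fm} with the same values but $s = 1$ would score $1.5 \times 10^{-4}$, and so would rank lower on the `to~fix' list.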
|
|
|
|
\section{FMEDA - Failure Modes Effects and Diagnostic Analysis}
|
|
%
|
|
%\subsection{ FMEDA - Failure Modes Effects and Diagnostic Analysis}
|
|
% \begin{figure}
|
|
% \centering
|
|
% \includegraphics[width=200pt]{./SIL.png}
|
|
% % SIL.jpg: 350x286 pixel, 72dpi, 12.35x10.09 cm, bb=0 0 350 286
|
|
% \caption{SIL requirements}
|
|
% \end{figure}
|
|
%\subsection{ FMEDA - Failure Modes Effects and Diagnostic Analysis}
|
|
%
|
|
\fmmdglossFMEDA
|
|
%
|
|
\begin{table}[ht]
|
|
\centering
|
|
|
|
%\centering % used for centering table
|
|
\begin{tabular}{|| l | c | c ||} \hline
|
|
\textbf{SIL} & \textbf{Low Demand} & \textbf{Continuous Demand} \\
|
|
& Prob of failing on demand & Prob of failure per hour \\ \hline \hline
|
|
4 & $ 10^{-5}$ to $< 10^{-4}$ & $ 10^{-9}$ to $< 10^{-8}$ \\ \hline
|
|
3 & $ 10^{-4}$ to $< 10^{-3}$ & $ 10^{-8}$ to $< 10^{-7}$ \\ \hline
|
|
2 & $ 10^{-3}$ to $< 10^{-2}$ & $ 10^{-7}$ to $< 10^{-6}$ \\ \hline
|
|
1 & $ 10^{-2}$ to $< 10^{-1}$ & $ 10^{-6}$ to $< 10^{-5}$ \\ \hline
|
|
|
|
\hline
|
|
\end{tabular}
|
|
\caption{Table adapted from EN61508-1:2001 [7.6.2.9 p33], showing statistical tolerance of `dangerous~failures' to
|
|
comply with a given SIL level} % title of Table
|
|
\label{tbl:sil_levels}
|
|
\end{table}
|
|
%
|
|
% \begin{itemize}
|
|
% \item \textbf{Statistical Safety} Safety Integrity Level (SIL) standards (EN61508/IOC5108).
|
|
% \item \textbf{Diagnostics} Diagnostic or self checking elements modelled
|
|
% \item \textbf{Complete Failure Mode Coverage} All failure modes of all components must be in the model
|
|
% \item \textbf{Guidelines} To system architectures and development processes
|
|
% \end{itemize}
|
|
FMEDA is a modern extension of FMEA, in that it recognises the effect of
|
|
self checking features on safety, and provides detailed recommendations for computer/software architecture.
|
|
%
|
|
%
|
|
%
|
|
FMEDA is the fundamental methodology of the statistical (safety integrity level)
type standards (EN61508/IEC61508).
The end result of an EN61508 analysis is an % provides a statistical
overall `level~of~safety', known as a Safety Integrity Level (SIL), assigned to an installed system.
%
This is a simple final result: a SIL from 1 to 4 (where 4 is safest).
|
|
%
|
|
These SIL levels are broadly linked to the concept of an
|
|
acceptance of given probabilities of dangerous
|
|
failures against time, as shown in table~\ref{tbl:sil_levels}.
|
|
%
|
|
The philosophy behind this is that it is recognised that no system can have a perfect
|
|
safety integrity, but that risk and criticality can be matched to acceptable,
|
|
or realistic levels of risk.
|
|
%There are currently four SIL `levels', one to four, with four being the highest level.
|
|
%
|
|
%
|
|
SIL levels are intended to
|
|
classify the statistical safety of installed plant:
|
|
sales terms such as a `SIL~3~sensor' or other `device' given a SIL level, are meaningless.
|
|
%
|
|
SIL analysis is concerned with `safety~loops', not individual modules, sensors, computing devices or actuators.
|
|
%
|
|
In control engineering terms, the safety~loop is the complete
|
|
path from sensors to signal~processing to actuators for a given function
|
|
in the plant.
|
|
%
|
|
This entire loop must be designed to detect and deal with any hazards
|
|
and have measures in place to reduce their effects.
|
|
%
|
|
In EN61508 terminology, a safety~loop is known as a Safety Instrumented Function (SIF).
|
|
%
|
|
\fmmdglossFMEDA
|
|
%
|
|
% for four levels of
|
|
%safety integrity, referred to as Safety Integrity Levels (SIL).
|
|
%For Hardware
|
|
%
|
|
FMEDA requires %does force
|
|
the analyst to consider all hardware components in a system
|
|
and requires that a failure rate is assigned for each {\bc} {\fm};
the failure rate may be statistically mitigated (reduced)
if it can be shown that self-checking measures will not only detect the failure within the SIF, but
also react in a safe way.
That is, the SIF can recognise that it has a fault condition and can take appropriate action.
%
The failure rate for each component {\fm}, which is the reciprocal of its MTTF when the failure rate is assumed constant, is denoted using the symbol `$\lambda$'.
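%
As an illustrative sketch (the figure used is hypothetical, and a constant failure rate is assumed), a failure rate $\lambda$ and an MTTF are related by
$$ MTTF = 1 / \lambda . $$
For example, a {\fm} with a failure rate of 10 failures per $10^{9}$ hours of operation, i.e. $\lambda = 10^{-8}$ per hour, corresponds to
$$ MTTF = 1 / 10^{-8} = 10^{8} \; \mbox{hours} . $$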
|
|
%
|
|
\paragraph{SIL and Software.}
|
|
EN61508 regulation in relation to software provides procedural quality guidelines and constraints (such as forbidding certain
|
|
programming languages and/or features): it does not provide a means to trace failure mode effects in software
|
|
or across the software/hardware interface.
|
|
%
|
|
While procedural guidelines and constraints can improve software reliability, ensuring that reliability targets, for software,
|
|
are actually met for given SIL levels is currently almost impossible~\cite{silsandsoftware}.
|
|
\fmmdglossFMEDA
|
|
|
|
%\subsection{ FMEDA - Failure Modes Effects and Diagnostic Analysis}
|
|
\label{sec:FMEDA}
|
|
\textbf{Failure Mode Classifications and metrics in FMEDA.}
|
|
\begin{itemize}
|
|
\item \textbf{Safe or Dangerous.} Failure modes are classified SAFE or DANGEROUS.
|
|
\item \textbf{Detectable failure modes.} Failure modes are given the attribute DETECTABLE or UNDETECTABLE.
|
|
 \item \textbf{Four attributes for FMEDA Failure Modes.} All failure modes may thus be Safe Detected (SD), Safe Undetected (SU), Dangerous Detected (DD) or Dangerous Undetected (DU).
 \item \textbf{Four statistical properties of a system.} The failure rates for the four classifications of system failures are summed: \\
 $ \sum \lambda_{SD}$, $\sum \lambda_{SU}$, $\sum \lambda_{DD}$, $\sum \lambda_{DU}$.
|
|
\end{itemize}
|
|
|
|
% Failure modes are classified as Safe or Dangerous according
|
|
% to the putative system level failure they will cause.
|
|
% The Failure modes are also classified as Detected or
|
|
% Undetected.
|
|
% This gives us four level failure mode classifications:
|
|
% Safe-Detected (SD), Safe-Undetected (SU), Dangerous-Detected (DD) or Dangerous-Undetected (DU),
|
|
% and the probabilistic failure rate of each classification
|
|
% is represented by lambda variables
|
|
% (i.e. $\lambda_{SD}$, $\lambda_{SU}$, $\lambda_{DD}$, $\lambda_{DU}$).
|
|
|
|
|
|
%\subsection{ FMEDA - Failure Modes Effects and Diagnostic Analysis}
|
|
|
|
\textbf{Diagnostic Coverage.}
|
|
The diagnostic coverage is simply the ratio
of the dangerous detected failure rates
to the failure rates of all dangerous failures,
and is normally expressed as a percentage~\cite{en61508}[2-Annex C].
%
$\Sigma\lambda_{DD}$ represents
the summed failure rates of the dangerous detected base component failure modes, and
$\Sigma\lambda_D = \Sigma\lambda_{DD} + \Sigma\lambda_{DU}$ the summed failure rates of all dangerous base component failure modes,
%
$$ DiagnosticCoverage = \Sigma\lambda_{DD} / \Sigma\lambda_D . $$
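%
As a worked illustration, assume hypothetical summed failure rates for a SIF of
$\Sigma\lambda_{DD} = 90 \times 10^{-9}$ and $\Sigma\lambda_{DU} = 10 \times 10^{-9}$ failures per hour
(these values are invented for the example and are not taken from any standard or component database).
Taking the total dangerous failure rate as the sum of the detected and undetected parts,
$\Sigma\lambda_D = \Sigma\lambda_{DD} + \Sigma\lambda_{DU} = 100 \times 10^{-9}$, gives
$$ DiagnosticCoverage = \frac{90 \times 10^{-9}}{100 \times 10^{-9}} = 0.9 = 90\% . $$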
|
|
\fmmdglossFMEDA
|
|
%
|
|
%
|
|
%
|
|
%\subsection{ FMEDA - Failure Modes Effects and Diagnostic Analysis}
|
|
The \textbf{diagnostic coverage} for safe failures, where $\Sigma\lambda_{SD}$ represents the summed failure rates of the
safe detected base component failure modes,
and $\Sigma\lambda_S = \Sigma\lambda_{SD} + \Sigma\lambda_{SU}$ the summed failure rates of all safe base component failure modes,
is given as
%
$$ SF = \frac{\Sigma\lambda_{SD}}{\Sigma\lambda_S} . $$
|
|
%
|
|
%
|
|
%
|
|
%\subsection{ FMEDA - Failure Modes Effects and Diagnostic Analysis}
|
|
\textbf{Safe Failure Fraction.}
|
|
A key concept in FMEDA is Safe Failure Fraction (SFF).
|
|
This is the ratio of the safe and dangerous detected failure rates
to the failure rates of all safe and dangerous failures.
Again this is usually expressed as a percentage,
|
|
%
|
|
$$ SFF = \big( \Sigma\lambda_S + \Sigma\lambda_{DD} \big) / \big( \Sigma\lambda_S + \Sigma\lambda_D \big) . $$
|
|
%
|
|
SFF measures the proportion of a system's failures that are fail-safe, not how reliable the system is.
|
|
%
|
|
A weakness in this philosophy is that adding extra safe failures (even from unused components)
improves the apparent SFF\footnote{This loophole, the artificial inflation of SFF
by including unnecessary safe functions or unused components,
is closed in the 2010 edition of the standard.}.
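%
As a worked illustration of both the SFF calculation and this weakness, assume hypothetical summed failure rates
(invented for the example) of $\Sigma\lambda_S = 100 \times 10^{-9}$, $\Sigma\lambda_{DD} = 90 \times 10^{-9}$
and $\Sigma\lambda_{DU} = 10 \times 10^{-9}$ failures per hour, so that $\Sigma\lambda_D = 100 \times 10^{-9}$.
Then
$$ SFF = \frac{(100 + 90) \times 10^{-9}}{(100 + 100) \times 10^{-9}} = 0.95 = 95\% . $$
If unused components contributing a further $200 \times 10^{-9}$ of safe failures are added to the design,
the dangerous undetected failure rate is unchanged, yet
$$ SFF = \frac{(300 + 90) \times 10^{-9}}{(300 + 100) \times 10^{-9}} = 0.975 = 97.5\% . $$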
|
|
\fmmdglossFMEDA
|
|
%
|
|
%
|
|
%
|
|
\subsection{ FMEDA - Failure Modes Effects and Diagnostic Analysis}
|
|
To achieve SIL levels, diagnostic coverage and SFF levels are prescribed along with
|
|
hardware architectures and software techniques.
|
|
The overall aim of SIL is to classify the safety of a system,
|
|
by statistically determining how frequently it can fail dangerously.
|
|
\fmmdglossFMEDA
|
|
%
|
|
\subsection{Automotive Safety Integrity Levels}
|
|
%
|
|
\label{sec:asil}
|
|
%
|
|
The EN61508 variant for automotive use, as defined in standard ISO~26262, is known as Automotive SIL (ASIL)~\cite{Kafka20122}.
|
|
%
|
|
Safety instrumented functions (SIFs) for vehicles are assigned ASIL ratings.
|
|
%
|
|
ASIL classifications are rated from A to D, where D is the most safety critical.
|
|
%
|
|
For instance, highly safety critical functions such as the brakes and steering will have the highest ASIL rating, D.
|
|
%
|
|
The automotive industry generally uses bought-in modules, % which must have been tested and approved,
typically built by specialist companies.
%
These modules must themselves have been tested and approved, so for a car manufacturer
designing from scratch is not generally financially feasible.
%
This means that to implement an ASIL SIF, designers will usually have to rely on bought-in modules.
%
However, these bought-in modules may not be rated to the ASIL level required by the SIF.
|
|
% %
|
|
% ASIL functions are therefore often implemented in a modular fashion.
|
|
%
|
|
Because of the modular paradigm forced on designers by having to buy in components,
a process called `ASIL~de-composition' has been developed~\cite{6464473}.
|
|
%
|
|
This allows a highly safety critical function to be implemented
|
|
with lower ASIL rated components, as long as it can be shown that they
|
|
have independent failure causes and implement redundancy. % for the SIF.
|
|
%
|
|
This is in effect a top down de-composition of safety requirements.
|
|
%
|
|
This is rather like the demand for multiple engines on aircraft
|
|
that must make long journeys over the sea to statistically limit
|
|
the likelihood of one failure cause --- i.e. one engine failure --- causing a serious incident.
|
|
%
|
|
The drawback to this redundancy concept is its vulnerability to an unexpected common failure mode~\cite{allfour}.
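%
As an illustrative sketch, assume two redundant channels, each with a hypothetical probability $p = 10^{-3}$
of failing dangerously over some interval.
If the failures are truly independent, the probability that both channels fail is
$$ p^2 = 10^{-6} . $$
If instead a fraction $\beta = 0.1$ of each channel's failures arises from a cause shared by both
(a simple beta-factor style assumption introduced here for illustration, with invented figures),
the probability becomes approximately
$$ \big( (1-\beta) p \big)^2 + \beta p = 8.1 \times 10^{-7} + 10^{-4} \approx 1.0 \times 10^{-4} , $$
which is about one hundred times worse than the independent case, and only roughly ten times better than a single channel.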
|
|
%
|
|
The ASIL philosophy does represent a modular approach to safety analysis.
|
|
%
|
|
This makes it of interest to this study, which later proposes a modular failure mode analysis methodology.
|
|
%
|
|
|
|
|
|
|
|
|
|
|
|
|
|
\section{FMEA used for Safety Critical Approvals}
|
|
\fmmdglossDFMEA
|
|
\subsection{DESIGN FMEA: Safety Critical Approvals FMEA}
|
|
% \begin{figure}[h]
|
|
% \centering
|
|
% \includegraphics[width=300pt,keepaspectratio=true]{./CH2_FMEA/tech_meeting.png}
|
|
% % tech_meeting.png: 350x299 pixel, 300dpi, 2.97x2.53 cm, bb=0 0 84 72
|
|
% \caption{FMEA Meeting}
|
|
% \label{fig:tech_meeting}
|
|
% \end{figure}
|
|
%Static FMEA, Design FMEA, Approvals FMEA
|
|
%
|
|
Experts from the approval house and the equipment manufacturer
|
|
discuss selected component failure modes
|
|
judged to be in critical sections of the product.
|
|
%
|
|
This could be considered as a design check method, deliberately
|
|
looking for weaknesses at a theoretical level.
|
|
%
|
|
Because design FMEA typically takes the form of a meeting and discussion,
it can have the following drawbacks:
|
|
%\subsection{DESIGN FMEA: Safety Critical Approvals FMEA}
|
|
%
|
|
% \begin{figure}[h]
|
|
% \centering
|
|
% \includegraphics[width=70pt,keepaspectratio=true]{./tech_meeting.png}
|
|
% % tech_meeting.png: 350x299 pixel, 300dpi, 2.97x2.53 cm, bb=0 0 84 72
|
|
% \caption{FMEA Meeting}
|
|
% \label{fig:tech_meeting}
|
|
% \end{figure}
|
|
%
|
|
\begin{itemize}
|
|
 \item It is impossible to look at all component failures, let alone apply FMEA exhaustively/rigorously.
 \item In practice, failure scenarios for critical sections are contested, and either justified or extra safety measures are implemented.
 \item The record is often only meeting notes or minutes: it is unusual for detailed technical arguments to be documented.
|
|
\end{itemize}
|
|
%
|
|
%
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
%% SFMEA????
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
|
|
\section{Conclusion}
|
|
\begin{figure}[h]
|
|
\centering
|
|
\includegraphics[width=400pt]{./CH2_FMEA/component_fm_rel_ana_subj_obj.png}
|
|
% component_fm_rel_ana_subj_obj.png: 694x303 pixel, 72dpi, 24.48x10.69 cm, bb=0 0 694 303
|
|
\caption{FMEA UML data representation with subjective system level failure modes.}
|
|
\label{fig:component_fm_rel_ana_subj_obj}
|
|
\end{figure}
|
|
%
|
|
Returning to the FMEA model, the data relationships shown in
|
|
figure~\ref{fig:component_fm_rel_ana} hold for the %five
|
|
variants of FMEA discussed.
|
|
%
|
|
This could be extended, if it is considered that the system level symptoms have subjective
|
|
interpretations.
|
|
%
|
|
With the addition of subjective failure mode symptoms, the UML model for FMEA gains an attribute
|
|
(see figure~\ref{fig:component_fm_rel_ana_subj_obj}).
|
|
%
|
|
The UML data model reveals some undefined qualities of FMEA.
|
|
These raise questions and are discussed below.
|
|
%
|
|
\paragraph{Which, or how many components should be checked for each {\fm} entry?}
|
|
For instance a given {\fm} will have its effect measured in relation
|
|
to some of the components in the system.
|
|
%
|
|
% These components can be chosen by stipulating several criteria,
|
|
% relating this to the signal path or adjacency in the electronic circuit,
|
|
% potential strategies are listed below:
|
|
These components could be chosen by stipulating criteria relating to the signal path or adjacency in the electronic circuit;
potential strategies are listed below:
|
|
%
|
|
\begin{itemize}
|
|
\item Look at all components electronically adjacent (i.e. connected to the affected component),
|
|
\item Look at all components connected (as above) and those once removed (those connected to those connected to the affected component),
|
|
\item Look at components forward of the {\fm} in the signal path,
|
|
\item Look at all components in the signal path,
|
|
\item Look at all components in the signal path including those one connection removed,
|
|
% dependency tree is a logical construct.
|
|
\item Look at all components within pre-determined dependency models~\cite{cbds}[Ch.5],
|
|
\item Look at all components in the system (i.e. XFMEA).
|
|
\end{itemize}
|
|
No current variant of FMEA gives any guidelines for which, or how many, components to check for a given {\fm}.
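%
As a rough sketch of the scale involved at the exhaustive end of this list, assume a hypothetical system of $N$ components,
each with $f$ failure modes, in which every failure mode is checked against every other component.
The number of checks is then
$$ f \times N \times (N-1) . $$
For instance, with $N = 100$ and $f = 3$ (figures invented for illustration) this gives $3 \times 100 \times 99 = 29\,700$ checks,
whereas checking only, say, four adjacent components per failure mode would give $3 \times 100 \times 4 = 1\,200$.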
|
|
\fmmdglossRD
|
|
\paragraph{FMEA gives us objective system level failures/symptoms.} %, what do we do with subjective or contextual failures resulting from this?}
|
|
%
|
|
The two more modern variants of FMEA, FMECA and FMEDA, start to address the problem of subjective/contextual
failure symptoms of a system.
|
|
%
|
|
FMEDA classifies them as dangerous or safe failures.
|
|
%
|
|
FMECA gives us a statistically biased criticality level.
|
|
%
|
|
In both of these methodologies, however, there is no formal stage where objective system failures
are mapped to subjective ones; this process seems to be intertwined with the basic analysis itself.
|
|
%
|
|
%
|
|
\paragraph{Re-use potential of an FMEA report.}
|
|
%
|
|
Each {\fm} entry in an FMEA report should have a reasoning or comments field.
|
|
This should provide a guide to someone re-examining, or trying to re-use results
|
|
on a similar project.
|
|
However, %, as with the components that we should check against a {\fm},
|
|
%there are no guidelines for documenting
|
|
the depth of description for reasoning stages in FMEA entries is in practice variable.
|
|
%FMEA does not stipulat which
|
|
Ideally each FMEA entry would contain a clear reasoning description
|
|
for each {\fm},
|
|
so that the entry can be more easily reviewed or revisited/audited. % than a traditional FMEA report.
|
|
%
|
|
Because FMEA is traditionally performed with one entry per component {\fm}, full reasoning descriptions
|
|
are rare.
|
|
%
|
|
Another effect of a one entry per failure mode model is that the terminology
|
|
may be inconsistent. Failure symptoms, although being the same at a system level, may be
|
|
given different names in the same project.
|
|
%
|
|
These factors mean that re-use, review and checking of a traditional analysis often has to be started from `cold'.
|
|
%
|
|
Work has been performed to assist in incremental FMEA production by use of a software tool
|
|
which in conjunction with circuit simulation
|
|
and a database of component failure modes (providing consistency in terminology)
|
|
speeds up the FMEA process and aids re-use~\cite{incrementalfmea}.
|
|
%
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|