CH2. Describing FMEA.
parent
7f47ceace9
commit
eac3ce5b82
@ -22,26 +22,29 @@ FMEA~\cite{safeware}[pp.341-344] is widely used, and proof of its use is a manda
|
||||
for a large proportion of safety critical products sold in the European Union.
|
||||
The acronym FMEA can be expanded as follows:
|
||||
\begin{itemize}
|
||||
\item \textbf{F - Failures of given component} Consider a component in a system,
|
||||
\item \textbf{M - Failure Mode} Look at one of the ways in which it can fail (i.e. determine a component `failure~mode'),
|
||||
\item \textbf{E - Effects} Determine the effects this failure mode will cause to the system we are examining,
|
||||
\item \textbf{A - Analysis} Analyse how much impact this symptom will have on the environment/people/the system its-self.
|
||||
\item \textbf{F - Failures of given component,} Consider a particular component in a system;
|
||||
\item \textbf{M - Failure Mode,} Look at one of the ways in which it can fail (i.e. determine a component `failure~mode');
|
||||
\item \textbf{E - Effects,} Determine the effects this failure mode will cause to the system we are examining;
|
||||
\item \textbf{A - Analysis,} Analyse how much impact this symptom will have on the environment/people/the system itself.
|
||||
\end{itemize}
|
||||
%
|
||||
FMEA is a broad term; it could mean anything from an informal check on
|
||||
how failures could affect some equipment in an initial brain-storming session
|
||||
in product design, to formal submission as part of safety critical certification.
|
||||
how failures could affect some equipment in %an initial
|
||||
a brain-storming session
|
||||
%in product design,
|
||||
to formal submission as part of safety critical certification.
|
||||
%
|
||||
FMEA is always performed in context. That is, the equipment is always analysed for a particular purpose
|
||||
and in a given environment. An `O' ring for instance can fail by leaking
|
||||
but if fitted to a water seal on a garden hose, the system level failure
|
||||
would be a slight leak at the tap outside the house.
|
||||
Applied to the rocket engine on a space shuttle the failure mode
|
||||
is a catastrophic fire and destruction of the spacecraft~\cite{challenger}.
|
||||
%
|
||||
Applied to the rocket engine on a space shuttle, that same `O' ring failure mode
|
||||
could cause a catastrophic fire and destruction of the spacecraft~\cite{challenger}.
|
||||
%
|
||||
At a lower level, consider a resistor and capacitor forming a potential divider to ground.
|
||||
This could be considered a low pass filter in some electrical environments,
|
||||
but for fixed frequencies the same circuit could be used as a phase changer.
|
||||
This could be considered a low pass filter in some electrical environments~\cite{aoe},
|
||||
but for fixed frequencies the same circuit could be used as a phase changer~\cite{electronicssysapproach}[p.114].
|
||||
The failure modes of the low pass filter could be `no~signal' and `all~pass',
|
||||
but when used as a phase changer, would be `no~signal' and `no~phase~change'.
|
||||
|
||||
@ -84,13 +87,13 @@ FMD-91 entries include general descriptions of internal failures alongside {\fm
|
||||
FMD-91 entries need, in some cases, some interpretation to be mapped to a clear set of
|
||||
component {\fms} suitable for use in FMEA.
|
||||
|
||||
A third document, MIL-1991~\cite{mil1991} often used alongside FMD-91, provides overall reliability statistics for
|
||||
A third document, MIL-1991~\cite{mil1991}, provides overall reliability statistics for
|
||||
component types, but does not detail specific failure modes.
|
||||
%
|
||||
Using MIL1991 in conjunction with FMD-91, we can determine statistics for the failure modes
|
||||
Using MIL-1991 in conjunction with FMD-91 we can determine statistics for the failure modes
|
||||
of component types.
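As a hedged illustration of how the two sources combine (the symbols $\lambda_c$, $\alpha_{fm}$ and $\lambda_{fm}$ are introduced here for illustration only and are not taken from either document): if the reliability data gives a component type an overall failure rate $\lambda_c$, and the failure mode distribution assigns a proportion $\alpha_{fm}$ of those failures to one {\fm}, then the rate for that {\fm} can be estimated as
\[
\lambda_{fm} = \alpha_{fm} \, \lambda_c , \qquad \sum_{fm} \alpha_{fm} = 1 .
\]
For example, with purely hypothetical values $\lambda_c = 10$ failures per $10^9$ hours and $\alpha_{OPEN} = 0.6$, the OPEN {\fm} would be assigned $\lambda_{OPEN} = 6$ failures per $10^9$ hours.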
|
||||
%
|
||||
The FMEDA process from European standard EN61508~\cite{en61508} for instance,
|
||||
The FMEDA process from European standard EN61508~\cite{en61508}
|
||||
requires statistics for Mean Time To Failure (MTTF) for all {\bc} failure modes.
|
||||
|
||||
|
||||
@ -110,6 +113,11 @@ requires statistics for Meantime to Failure (MTTF) for all {\bc} failure modes.
|
||||
% I hope to have chapter 5 finished by the end of March, chapter 5 being the
|
||||
% electronics examples for the FMMD methodology.
|
||||
|
||||
\section{Determining the failure modes of Components.}
|
||||
|
||||
The starting point for FMEA is the set of failure modes of {\bcs}.
|
||||
In order to define FMEA we must start with a discussion of how these failure modes are chosen.
|
||||
%
|
||||
In this section we look in detail at two common electrical components and examine how
|
||||
the two sources of information define their failure mode behaviour.
|
||||
We look at the reasons why some known failure modes % are omitted, or presented in
|
||||
@ -146,7 +154,7 @@ For instance for {\textbf{Resistor,~Fixed,~Film}} we are given the following fai
|
||||
% against {\fms} that the resistor could exhibit.
|
||||
% We can determine these {\fms} by converting the internal failure descriptions
|
||||
% to {\fms} thus:
|
||||
To make this useful for FMEA/FMMD we must assign each failure cause to an arbitrary failure mode descriptor
|
||||
To make this useful for FMEA/FMMD we must assign each failure cause to a symptomatic failure mode descriptor
|
||||
as shown below.
|
||||
%
|
||||
%and map these failure causes to three symptoms,
|
||||
@ -171,7 +179,7 @@ is significantly reduced, enough for some standards to exclude it~\cite{en298}~\
|
||||
|
||||
\paragraph{Resistor failure modes according to EN298.}
|
||||
|
||||
EN298, the European gas burner safety standard, tends to be give failure modes more directly usable by FMEA than FMD-91.
|
||||
EN298, the European gas burner safety standard, tends to give failure modes more directly usable for performing FMEA than FMD-91.
|
||||
EN298 requires that a full FMEA be undertaken, examining all failure modes
|
||||
of all electronic components~\cite{en298}[11.2 5] as part of the certification process.
|
||||
%
|
||||
@ -184,7 +192,7 @@ For resistor types not specifically listed in EN298, the failure modes
|
||||
are considered to be either OPEN or SHORT.
|
||||
The reason that parameter change is not considered for resistors chosen for an EN298 compliant system is that they must be {\em downrated}.
|
||||
That is to say the power and voltage ratings of components must be calculated
|
||||
for maximum possible exposure, with a 40\% margin of error. This reduces the probability
|
||||
for maximum possible exposure, with a 40\% margin of error. This drastically reduces the probability
|
||||
that the resistors will be overloaded,
|
||||
and thus subject to drift/parameter change.
|
||||
|
||||
@ -256,8 +264,9 @@ a signal may entirely be lost.
|
||||
We can map this failure cause to a {\fm}, and we can call it $LOW_{slew}$.
|
||||
|
||||
\paragraph{No Operation - over stress}
|
||||
Here the OP\_AMP has been damaged, and the output may be held HIGH or LOW, or may be effectively tri-stated
|
||||
, i.e. not able to drive circuitry in along the next stages of the signal path: we can call this state NOOP (no Operation).
|
||||
Here the OP\_AMP has been damaged, and the output may be held HIGH or LOW, or may be
|
||||
effectively tri-stated, i.e. not able to drive circuitry along the next stages of
|
||||
the signal path: we can call this state NOOP (no Operation).
|
||||
%
|
||||
We can map this failure cause to three {\fms}, $LOW$, $HIGH$, $NOOP$.
|
||||
|
||||
@ -492,13 +501,16 @@ for one component failure mode.
|
||||
A complete FMEA report would have to contain an entry
|
||||
for each failure mode of all the components in the system under investigation.
|
||||
%
|
||||
Note here that we have had to look at the failure~mode
|
||||
In theory we have had to look at the failure~mode
|
||||
in relation to the entire circuit.
|
||||
We have used intuition to determine the probable
|
||||
effect of this failure mode.
|
||||
For instance we have assumed that the resistor R1 going SHORT
|
||||
will not affect the ADC, the Microprocessor or the UART.
|
||||
%
|
||||
We have taken the {\bc} {\fm} R1 SHORT and then followed the failure reasoning path through to a putative system level symptom.
|
||||
We have not looked in detail at any side effects of this {\fm}.
|
||||
%
|
||||
To put this in more general terms, we have not examined this failure mode
|
||||
against every other component in the system.
|
||||
Perhaps we should: this would be a more rigorous and complete
|
||||
@ -507,6 +519,8 @@ approach in looking for system failures.
|
||||
|
||||
\section{Theoretical Concepts in FMEA}
|
||||
|
||||
In this section we examine some fundamental concepts and underlying philosophies of FMEA.
|
||||
|
||||
\paragraph{The unacceptability of a single component failure causing a catastrophe}
|
||||
% NEED SOME NICE HISTORICAL REFS HERE
|
||||
FMEA, due to its inductive bottom-up approach, is good
|
||||
@ -524,14 +538,24 @@ for unearthing potential failure scenarios.
|
||||
\paragraph{Subjective and Objective thinking in relation to FMEA.}
|
||||
\label{sec:subjectiveobjective}
|
||||
FMEA is always performed in the context of the use of the equipment.
|
||||
In terms of philosophy this is in the domain of the subjective and the objective.
|
||||
We can using objective reasoning trace a component level failure to a system level event,
|
||||
In terms of philosophy the context is in the domain of the subjective and the
|
||||
logic and reasoning behind failure causation, the objective.
|
||||
By using objective reasoning we can trace a component level failure to a system level event,
|
||||
but only in
|
||||
the subjective sense can we determine its severity.
|
||||
Failure mode analysis on the leaks possible from the O ring on the space shuttle
|
||||
the subjective sense can we determine its meaning and severity.
|
||||
It is worth remembering that
|
||||
failure mode analysis performed on the leaks possible from the O ring on the space shuttle
|
||||
did not link this failure to the catastrophic failure of the spacecraft~\cite{challenger,sanjeev}.
|
||||
It is less useful for determining events for multiple
|
||||
This was not a failure in the objective reasoning, but more of the subjective, or the context in which the leak occurred.
|
||||
%
|
||||
FMEA is less useful for determining events for multiple
|
||||
simultaneous\footnote{Multiple simultaneous failures are taken to mean failures that occur within the same detection period.} failures.
|
||||
This is because, with these two modes of thinking, it becomes more difficult to
|
||||
get a balance between subjective and objective perspectives.
|
||||
|
||||
%subjective/objective become more cluttered when there are multiple possibilities
|
||||
%for the the results of an FMEA line of reasoning.
|
||||
|
||||
|
||||
\paragraph{Failure modes, detectable and undetectable}
|
||||
Often the effects of a failure mode may be easy to detect, and our equipment can react by raising an alarm or compensating for the resulting fault.
|
||||
@ -555,7 +579,7 @@ However, we can use FMEA (more specifically the FMEDA variant, see section~\ref{
|
||||
working from known component failure rates, to obtain
|
||||
statistical estimates of the equipment reliability.
|
||||
|
||||
\paragraph{Forward and backward searches}
|
||||
\paragraph{Forward and backward searches.}
|
||||
|
||||
A forward search starts with possible failure causes
|
||||
and uses logic and reasoning to determine system level outcomes.
|
||||
@ -565,9 +589,11 @@ A backward search starts with (undesirable) system level events
|
||||
and works back down to potential causes using de-composition
|
||||
of the system and logic.
|
||||
FMEA based methodologies are forward searches~\cite{Lutz:1997:RAU:590564.590572} and top down
|
||||
methodologies such as FTA~\cite{nucfta,nasafta}
|
||||
methodologies such as FTA~\cite{nucfta,nasafta} are backward searches.
|
||||
Forward (or bottom-up) types of fault analysis are said to be `inductive' (i.e. the results of failure are induced).
\paragraph{Reasoning distance}
Backward (or top-down) searches are said to be `deductive' (i.e. the causes of failure are
deduced).
|
||||
\paragraph{Reasoning distance.}
|
||||
\label{reasoningdistance}
|
||||
A reasoning distance is the number of stages of logic and reasoning
|
||||
required to map a failure cause to its potential outcomes.
|
||||
@ -587,7 +613,7 @@ would give a reasoning distance of 3 * 100 * 99.
|
||||
%{sfmeaforwardbackward}
|
||||
\subsection{FMEA and the State Explosion Problem}
|
||||
|
||||
\paragraph{Rigorous Single Failure FMEA}
|
||||
\paragraph{Rigorous Single Failure FMEA.}
|
||||
|
||||
FMEA for a safety critical certification~\cite{en298,en61508} will have to be applied
|
||||
to all known failure modes of all components within a system.
|
||||
@ -631,14 +657,15 @@ $100*99*98*3=2,910,600$ failure mode scenarios.
|
||||
|
||||
|
||||
\paragraph{Reliance on experts for meaningful FMEA Analysis.}
|
||||
FMEA cannot consider---for practical reasons---a rigorous approach.
|
||||
Current FMEA methodologies cannot consider---for practical reasons---a rigorous approach.
|
||||
We define rigorous FMEA as examining the effect of every component failure mode
|
||||
against the remaining components in the system under investigation.
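To give a sense of scale, the following sketch counts the checks this definition implies. The symbols $N$ (number of components) and $f$ (failure modes per component) are introduced here for illustration, and assuming every component has the same number of failure modes is a simplification. Each of the $N f$ component failure modes must be examined against the $N-1$ remaining components:
\[
N \times (N-1) \times f = 100 \times 99 \times 3 = 29{,}700
\]
for a system of 100 components with three failure modes each. Extending the same count to double simultaneous failures brings in a further factor of $(N-2)$:
\[
N \times (N-1) \times (N-2) \times f = 100 \times 99 \times 98 \times 3 = 2{,}910{,}600 .
\]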
|
||||
%
|
||||
Because we cannot perform rigorous FMEA,
|
||||
we rely on experts in the system under investigation
|
||||
to perform a meaningful FMEA analysis.
|
||||
|
||||
%
|
||||
In practice these experts have to select the areas they see as most critical for detailed FMEA analysis.
|
||||
|
||||
|
||||
|
||||
@ -945,11 +972,13 @@ judged to be in critical sections of the product.
|
||||
|
||||
|
||||
|
||||
\section{Literature Review}
|
||||
\section{Conclusions on current FMEA Methodologies}
|
||||
|
||||
%% FOCUS
|
||||
The focus of this literature review is to establish the practice and applications
|
||||
of FMEA, and to examine its strengths and weaknesses.
|
||||
The focus of this chapter %literature review
|
||||
is to establish the current practice and applications
|
||||
of FMEA.
|
||||
%, and to examine its strengths and weaknesses.
|
||||
%% GOAL
|
||||
Its
|
||||
goal is to identify central issues and to criticise and assess the current
|
||||
@ -960,7 +989,7 @@ concerning approval of product
|
||||
to European safety standards, both the prescriptive~\cite{en298,en230} and statistical~\cite{en61508}.
|
||||
A second perspective is that of a software engineer trained to use formal methods.
|
||||
Examining FMEA methodologies for mathematical properties, influenced by
|
||||
formal methods applied to software, should provide an angle not traditionally considered.
|
||||
formal methods applied to software, should provide a perspective not traditionally considered.
|
||||
%% COVERAGE
|
||||
The literature reviewed has been restricted to published books, European safety standards (as examples
|
||||
of current safety measures applied), and traditional research, from journal and conference papers.
|
||||
@ -1021,26 +1050,34 @@ external influences such as
|
||||
ionising radiation causing bits to be erroneously altered.
|
||||
|
||||
|
||||
\paragraph{FMEA and Modularity}
|
||||
From the 1940s onwards, software has evolved from simple procedural languages (i.e. assembly language/Fortran~\cite{f77} call return)
|
||||
to structured programming (C~\cite{KandR}, Pascal etc.) and then to object oriented models (Java, C++, \ldots).
|
||||
FMEA has undergone no such evolution.
|
||||
In a world where sensor systems, often including embedded software components, are bought in to
|
||||
create complex systems, FMEA still follows a rigid {\bc} {\fm} to system level error model,
|
||||
that is only suitable for simple electro-mechanical systems.
|
||||
|
||||
|
||||
|
||||
%
|
||||
|
||||
|
||||
|
||||
\section{Conclusion}
|
||||
|
||||
\paragraph{Where FMEA is now}
|
||||
\subsection{Where FMEA is now.}
|
||||
FMEA is a useful tool for basic safety: it provides statistics on safety where field data is impractical,
and is very good with single failure modes linked to top level events.
|
||||
FMEA has become part of the safety critical and safety certification industries.
|
||||
%
|
||||
SFMEA is in its infancy, but there is a gap in current
|
||||
SFMEA is in its infancy, and there are corresponding gaps in
|
||||
certification for software: EN61508~\cite{en61508} recommends hardware redundancy architectures in conjunction
|
||||
with FMEDA for hardware: for software it recommends language constraints and quality procedures
|
||||
but no inductive fault finding technique.
|
||||
|
||||
%
|
||||
FMEA has adapted from a cost saving exercise for mass produced items~\cite{bfmea,generic_automotive_fmea_6034891}, to incorporating statistical techniques
|
||||
(FMECA) to allowing for self diagnostic mitigation (FMEDA).
|
||||
However, it is still based on the single component failure mapped to system level failure.
|
||||
%
|
||||
However, it is still based on the concept of single component failures mapped to top~level/system~failures.
|
||||
All these FMEA based methodologies have the following shortcomings:
|
||||
\begin{itemize}
|
||||
\item Impossible to integrate software and hardware models,
|
||||
|
@ -241,7 +241,7 @@ rigorous checking feasible.
|
||||
\centering
|
||||
\includegraphics[width=400pt]{./CH6_Evaluation/components_81_euler.png}
|
||||
% components_81_euler.png: 3056x2532 pixel, 72dpi, 107.81x89.32 cm, bb=0 0 3056 2532
|
||||
\caption{FMMD Hierarchy with number of compnents in each $FG$ fixed to three ($|FG|=3$)}
|
||||
\caption{FMMD Hierarchy with number of components in each $FG$ fixed to three ($|FG|=3$)}
|
||||
\label{fig:three_tree}
|
||||
\end{figure}
|
||||
|
||||
|