CH2. Describing FMEA.

Robin Clark 2013-01-27 09:09:46 +00:00
parent 7f47ceace9
commit eac3ce5b82
2 changed files with 78 additions and 41 deletions


@ -22,26 +22,29 @@ FMEA~\cite{safeware}[pp.341-344] is widely used, and proof of its use is a manda
for a large proportion of safety critical products sold in the European Union.
The acronym FMEA can be expanded as follows:
\begin{itemize}
\item \textbf{F - Failures of given component} Consider a component in a system,
\item \textbf{M - Failure Mode} Look at one of the ways in which it can fail (i.e. determine a component `failure~mode'),
\item \textbf{E - Effects} Determine the effects this failure mode will cause to the system we are examining,
\item \textbf{A - Analysis} Analyse how much impact this symptom will have on the environment/people/the system its-self.
\item \textbf{F - Failures of given component,} Consider a particular component in a system;
\item \textbf{M - Failure Mode,} Look at one of the ways in which it can fail (i.e. determine a component `failure~mode');
\item \textbf{E - Effects,} Determine the effects this failure mode will cause to the system we are examining;
\item \textbf{A - Analysis,} Analyse how much impact this symptom will have on the environment/people/the system itself.
\end{itemize}
%
FMEA is a broad term; it could mean anything from an informal check on
how failures could affect some equipment in an initial brain-storming session
in product design, to formal submission as part of safety critical certification.
how failures could affect some equipment in %an initial
a brain-storming session
%in product design,
to formal submission as part of safety critical certification.
%
FMEA is always performed in context. That is, the equipment is always analysed for a particular purpose
and in a given environment. An `O' ring for instance can fail by leaking
but if fitted to a water seal on a garden hose, the system level failure
would be a slight leak at the tap outside the house.
Applied to the rocket engine on a space shuttle the failure mode
is a catastrophic fire and destruction of the spacecraft~\cite{challenger}.
%
Applied to the rocket engine on a space shuttle, that same `O' ring failure mode
could cause a catastrophic fire and destruction of the spacecraft~\cite{challenger}.
%
At a lower level, consider a resistor and capacitor forming a potential divider to ground.
This could be considered a low pass filter in some electrical environments,
but for fixed frequencies the same circuit could be used as a phase changer.
This could be considered a low pass filter in some electrical environments~\cite{aoe},
but for fixed frequencies the same circuit could be used as a phase changer~\cite{electronicssysapproach}[p.114].
The failure modes of the low pass filter could be `no~signal' and `all~pass',
but when used as a phase changer, they would be `no~signal' and `no~phase' change.
@ -84,13 +87,13 @@ FMD-91 entries include general descriptions of internal failures alongside {\fm
FMD-91 entries need, in some cases, some interpretation to be mapped to a clear set of
component {\fms} suitable for use in FMEA.
A third document, MIL-1991~\cite{mil1991} often used alongside FMD-91, provides overall reliability statistics for
A third document, MIL-1991~\cite{mil1991}, provides overall reliability statistics for
component types, but does not detail specific failure modes.
%
Using MIL1991 in conjunction with FMD-91, we can determine statistics for the failure modes
Using MIL-1991 in conjunction with FMD-91 we can determine statistics for the failure modes
of component types.
%
The FMEDA process from European standard EN61508~\cite{en61508} for instance,
The FMEDA process from European standard EN61508~\cite{en61508}
requires statistics for Mean Time to Failure (MTTF) for all {\bc} failure modes.
@ -110,6 +113,11 @@ requires statistics for Meantime to Failure (MTTF) for all {\bc} failure modes.
% I hope to have chapter 5 finished by the end of March, chapter 5 being the
% electronics examples for the FMMD methodology.
\section{Determining the failure modes of Components.}
The starting point for FMEA is the set of failure modes of {\bcs}.
In order to define FMEA we must start with a discussion on how these failure modes are chosen.
%
In this section we look in detail at two common electrical components and examine how
the two sources of information define their failure mode behaviour.
We look at the reasons why some known failure modes % are omitted, or presented in
@ -146,7 +154,7 @@ For instance for {\textbf{Resistor,~Fixed,~Film}} we are given the following fai
% against {\fms} that the resistor could exhibit.
% We can determine these {\fms} by converting the internal failure descriptions
% to {\fms} thus:
To make this useful for FMEA/FMMD we must assign each failure cause to an arbitrary failure mode descriptor
To make this useful for FMEA/FMMD we must assign each failure cause to a symptomatic failure mode descriptor
as shown below.
%
%and map these failure causes to three symptoms,
@ -171,7 +179,7 @@ is significantly reduced, enough for some standards to exclude it~\cite{en298}~\
\paragraph{Resistor failure modes according to EN298.}
EN298, the European gas burner safety standard, tends to be give failure modes more directly usable by FMEA than FMD-91.
EN298, the European gas burner safety standard, tends to give failure modes more directly usable for performing FMEA than FMD-91.
EN298 requires that a full FMEA be undertaken, examining all failure modes
of all electronic components~\cite{en298}[11.2 5] as part of the certification process.
%
@ -184,7 +192,7 @@ For resistor types not specifically listed in EN298, the failure modes
are considered to be either OPEN or SHORT.
The reason that parameter change is not considered for resistors chosen for an EN298 compliant system is that they must be {\em downrated}.
That is to say the power and voltage ratings of components must be calculated
for maximum possible exposure, with a 40\% margin of error. This reduces the probability
for maximum possible exposure, with a 40\% margin of error. This drastically reduces the probability
that the resistors will be overloaded,
and thus subject to drift/parameter change.
@ -256,8 +264,9 @@ a signal may entirely be lost.
We can map this failure cause to a {\fm}, and we can call it $LOW_{slew}$.
\paragraph{No Operation - over stress}
Here the OP\_AMP has been damaged, and the output may be held HIGH or LOW, or may be effectively tri-stated
, i.e. not able to drive circuitry in along the next stages of the signal path: we can call this state NOOP (no Operation).
Here the OP\_AMP has been damaged, and the output may be held HIGH or LOW, or may be
effectively tri-stated, i.e. not able to drive circuitry in the next stages of
the signal path: we can call this state NOOP (no Operation).
%
We can map this failure cause to three {\fms}, $LOW$, $HIGH$, $NOOP$.
@ -492,13 +501,16 @@ for one component failure mode.
A complete FMEA report would have to contain an entry
for each failure mode of all the components in the system under investigation.
%
Note here that we have had to look at the failure~mode
In theory we have had to look at the failure~mode
in relation to the entire circuit.
We have used intuition to determine the probable
effect of this failure mode.
For instance we have assumed that the resistor R1 going SHORT
will not affect the ADC, the Microprocessor or the UART.
%
We have taken the {\bc} {\fm} R1 SHORT and then followed the failure reasoning path through to a putative system level symptom.
We have not looked in detail at any side effects of this {\fm}.
%
To put this in more general terms, we have not examined this failure mode
against every other component in the system.
Perhaps we should: this would be a more rigorous and complete
@ -507,6 +519,8 @@ approach in looking for system failures.
\section{Theoretical Concepts in FMEA}
In this section we examine some fundamental concepts and underlying philosophies of FMEA.
\paragraph{The unacceptability of a single component failure causing a catastrophe}
% NEED SOME NICE HISTORICAL REFS HERE
FMEA, due to its inductive bottom-up approach, is good
@ -524,14 +538,24 @@ for unearthing potential failure scenarios.
\paragraph{Subjective and Objective thinking in relation to FMEA.}
\label{sec:subjectiveobjective}
FMEA is always performed in the context of the use of the equipment.
In terms of philosophy this is in the domain of the subjective and the objective.
We can using objective reasoning trace a component level failure to a system level event,
In terms of philosophy the context is in the domain of the subjective and the
logic and reasoning behind failure causation, the objective.
By using objective reasoning we can trace a component level failure to a system level event,
but only in
the subjective sense can we determine its severity.
Failure mode analysis on the leaks possible from the O ring on the space shuttle
the subjective sense can we determine its meaning and severity.
It is worth remembering that
failure mode analysis performed on the leaks possible from the `O' ring on the space shuttle
did not link this failure to the catastrophic failure of the spacecraft~\cite{challenger,sanjeev}.
It is less useful for determining events for multiple
This was not a failure in the objective reasoning, but more of the subjective, or the context in which the leak occurred.
%
FMEA is less useful for determining events for multiple
simultaneous\footnote{Multiple simultaneous failures are taken to mean failures that occur within the same detection period.} failures.
This is because, with these two modes of thinking, it becomes more difficult to
get a balance between subjective and objective perspectives.
%subjective/objective become more cluttered when there are multiple possibilities
%for the the results of an FMEA line of reasoning.
\paragraph{Failure modes, detectable and undetectable}
Often the effects of a failure mode may be easy to detect, and our equipment can react by raising an alarm or compensating for the resulting fault.
@ -555,7 +579,7 @@ However, we can use FMEA (more specifically the FMEDA variant, see section~\ref{
working from known component failure rates, to obtain
statistical estimates of the equipment reliability.
\paragraph{Forward and backward searches}
\paragraph{Forward and backward searches.}
A forward search starts with possible failure causes
and uses logic and reasoning to determine system level outcomes.
@ -565,9 +589,11 @@ A backward search starts with (undesirable) system level events
and works back down to potential causes using de-composition
of the system and logic.
FMEA based methodologies are forward searches\cite{Lutz:1997:RAU:590564.590572} and top down
methodologies such as FTA~\cite{nucfta,nasafta}
methodologies such as FTA~\cite{nucfta,nasafta} are backward searches.
Forward (or bottom-up) searches are said to be `inductive'.
\paragraph{Reasoning distance}
Backward (or top-down) searches are said to be `deductive' (i.e. the causes of failure are
deduced).
\paragraph{Reasoning distance.}
\label{reasoningdistance}
A reasoning distance is the number of stages of logic and reasoning
required to map a failure cause to its potential outcomes.
@ -587,7 +613,7 @@ would give a reasoning distance of 3 * 100 * 99.
%{sfmeaforwardbackward}
\subsection{FMEA and the State Explosion Problem}
\paragraph{Rigorous Single Failure FMEA}
\paragraph{Rigorous Single Failure FMEA.}
FMEA for a safety critical certification~\cite{en298,en61508} will have to be applied
to all known failure modes of all components within a system.
@ -631,14 +657,15 @@ $100*99*98*3=2,910,600$ failure mode scenarios.
\paragraph{Reliance on experts for meaningful FMEA Analysis.}
FMEA cannot consider---for practical reasons---a rigorous approach.
Current FMEA methodologies cannot consider---for practical reasons---a rigorous approach.
We define rigorous FMEA as examining the effect of every component failure mode
against the remaining components in the system under investigation.
%
Because we cannot perform rigorous FMEA,
we rely on experts in the system under investigation
to perform a meaningful FMEA analysis.
%
In practice these experts have to select the areas they see as most critical for detailed FMEA analysis.
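%
As an illustrative sketch of why a rigorous approach is impractical, assume a system of $N$ components,
each with $f$ failure modes (the symbols $N$ and $f$ are introduced here for illustration only).
Checking every component failure mode against each of the remaining components would then require
\begin{equation}
  f \times N \times (N-1)
\end{equation}
checks; taking the example figures of $N=100$ and $f=3$, this gives $3 \times 100 \times 99 = 29{,}700$.
Considering double simultaneous failures adds a further factor of $(N-2)$, giving
$3 \times 100 \times 99 \times 98 = 2{,}910{,}600$ failure mode scenarios.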
@ -945,11 +972,13 @@ judged to be in critical sections of the product.
\section{Literature Review}
\section{Conclusions on current FMEA Methodologies}
%% FOCUS
The focus of this literature review is to establish the practice and applications
of FMEA, and to examine its strengths and weaknesses.
The focus of this chapter %literature review
is to establish the current practice and applications
of FMEA.
%, and to examine its strengths and weaknesses.
%% GOAL
Its
goal is to identify central issues and to criticise and assess the current
@ -960,7 +989,7 @@ concerning approval of product
to European safety standards, both the prescriptive~\cite{en298,en230} and statistical~\cite{en61508}.
A second perspective is that of a software engineer trained to use formal methods.
Examining FMEA methodologies for mathematical properties, influenced by
formal methods applied to software, should provide an angle not traditionally considered.
formal methods applied to software, should provide a perspective not traditionally considered.
%% COVERAGE
The literature reviewed has been restricted to published books, European safety standards (as examples
of current safety measures applied), and traditional research, from journal and conference papers.
@ -1021,26 +1050,34 @@ external influences such as
ionising radiation causing bits to be erroneously altered.
\paragraph{FMEA and Modularity}
From the 1940s onwards, software has evolved from simple procedural languages (i.e. assembly language/Fortran~\cite{f77} call return)
to structured programming (C~\cite{KandR}, Pascal etc.) and then to object oriented models (Java, C++, \ldots).
FMEA has undergone no such evolution.
In a world where sensor systems, often including embedded software components, are bought in to
create complex systems, FMEA still follows a rigid {\bc} {\fm} to system level error model,
that is only suitable for simple electro mechanical systems.
%
\section{Conclusion}
\paragraph{Where FMEA is now}
\subsection{Where FMEA is now.}
FMEA is a useful tool for basic safety. It provides statistics on safety where field data is impractical,
and is very good with single failure modes linked to top level events.
FMEA has become part of the safety critical and safety certification industries.
%
SFMEA is in its infancy, but there is a gap in current
SFMEA is in its infancy, and there are corresponding gaps in
certification for software. EN61508~\cite{en61508} recommends hardware redundancy architectures in conjunction
with FMEDA for hardware: for software it recommends language constraints and quality procedures
but no inductive fault finding technique.
%
FMEA has adapted from a cost saving exercise for mass produced items~\cite{bfmea,generic_automotive_fmea_6034891}, to incorporating statistical techniques
(FMECA) to allowing for self diagnostic mitigation (FMEDA).
However, it is still based on the single component failure mapped to system level failure.
%
However, it is still based on the concept of single component failures mapped to top~level/system~failures.
All these FMEA based methodologies have the following shortcomings:
\begin{itemize}
\item Impossible to integrate software and hardware models,


@ -241,7 +241,7 @@ rigorous checking feasible.
\centering
\includegraphics[width=400pt]{./CH6_Evaluation/components_81_euler.png}
% components_81_euler.png: 3056x2532 pixel, 72dpi, 107.81x89.32 cm, bb=0 0 3056 2532
\caption{FMMD Hierarchy with number of compnents in each $FG$ fixed to three ($|FG|=3$)}
\caption{FMMD Hierarchy with number of components in each $FG$ fixed to three ($|FG|=3$)}
\label{fig:three_tree}
\end{figure}