we removal. Depressing cos CH4 and CH7 are full of them. Only a few in CH3

Robin P. Clark 2013-09-11 14:43:47 +01:00
parent 6b5ce8503d
commit 4a96ce3cdd

@ -24,9 +24,9 @@ are examined in the context of two sources of information that define failure mo
To introduce the concept of FMEA, a simple example is given, using a hypothetical four to twenty milli-amp ({\ft}) %milli-amp To introduce the concept of FMEA, a simple example is given, using a hypothetical four to twenty milli-amp ({\ft}) %milli-amp
reader. reader.
% %
The four main current FMEA variants are described %and we develop %conclude by describing concepts The four main current FMEA variants are described along with %and we develop %conclude by describing concepts
the concepts the concepts
that underlie the usage and philosophy of FMEA discussed. that underlie the usage and philosophy of FMEA. %Fof a grou discussed.
% %
The overall process of FMEA is then reviewed and modelled using UML. The overall process of FMEA is then reviewed and modelled using UML.
% %
@ -81,7 +81,7 @@ but for fixed frequencies the same circuit could be used as a phase changer~\cit
The failure modes of the latter, could be `no~signal' and `all~pass', The failure modes of the latter, could be `no~signal' and `all~pass',
but when used as a phase changer, would be `no~signal' and `no~phase' change. but when used as a phase changer, would be `no~signal' and `no~phase' change.
% %
The actual failure modes of a group of components, are therefore defined by the The actual failure modes for a `group~of~components', are therefore defined by the
function that they perform. function that they perform.
% %
% This chapter describes basic concepts of FMEA, uses a simple example to % This chapter describes basic concepts of FMEA, uses a simple example to
@ -162,7 +162,7 @@ The reasons for these differences are examined below using two example component
% %
Typically, when choosing components for a design, engineers will look at manufacturers' data~sheets Typically, when choosing components for a design, engineers will look at manufacturers' data~sheets
which describe functionality, physical dimensions, which describe functionality, physical dimensions,
environmental ranges, tolerances. environmental ranges, tolerances, etc.
% %
It is rare for a data~sheet to list failure modes. It is rare for a data~sheet to list failure modes.
% %
@ -287,7 +287,7 @@ For instance for {\textbf{Resistor,~Fixed,~Film}} the following failure causes a
% against {\fms} that the resistor could exhibit. % against {\fms} that the resistor could exhibit.
% We can determine these {\fms} by converting the internal failure descriptions % We can determine these {\fms} by converting the internal failure descriptions
% to {\fms} thus: % to {\fms} thus:
To make this useful for FMEA/FMMD each failure cause must be mapped to a symptomatic failure mode descriptor To make this useful for FMEA each failure cause must be mapped to a symptomatic failure mode descriptor
as listed below: as listed below:
% %
%and map these failure causes to three symptoms, %and map these failure causes to three symptoms,
@ -564,72 +564,16 @@ The EN298 pinouts failure mode technique cannot reveal failure modes due to inte
and that is why it misses the $LOW_{slew}$. and that is why it misses the $LOW_{slew}$.
% %
The FMD-91 entries for op-amps are not directly usable as The FMD-91 entries for op-amps are not directly usable as
component {\fms} in FMEA or FMMD and require interpretation. component {\fms} in FMEA and require interpretation.
% %
However, once a failure mode analysis has been carried out, the model can However, once a failure mode analysis has been carried out, the model can
be used throughout the FMEA and FMMD process. be used throughout the FMEA process.
%%%% Talk about R differences ?? XXXXX
%For our Op-Amp example could have come up with different symptoms for both sides. Cannot predict the effect of internal errors, for instance ($LOW_{slew}$)
%is missing from the EN298 failure modes set.
% FMD-91
%
% I have been working on two examples of determining failure modes of components.
% One is from the US military document FMD-91, where internal failures
% of components are described (with stats).
%
% The other is EN298 where the failure modes for generic component types are prescribed, or
% determined by a procedure where failure scenarios of all pins OPEN and all adjacent pins shorted
% is applied. These techniques
%
% The FMD-91 entries need, in some cases, some interpretation to be mapped to
% component failure symptoms, but include failure modes that can be due to internal failures.
% The EN298 SHORT/OPEN procedure cannot determine failures due to internal causes but can be applied to any IC.
%
% Could I come in and see you Chris to quickly discuss these.
%
% I hope to have chapter 5 finished by the end of March, chapter 5 being the
% electronics examples for the FMMD methodology.
%%
%% Paragraph using failure modes to build from bottom up
%%
% \subsection{FMEA}
% This talk introduces Failure Mode Effects Analysis, and the different ways it is applied.
% These techniques are discussed, and then
% a refinement is proposed, which is essentially a modularisation of the FMEA process.
% %
%
% \begin{itemize}
% \item Failure
% \item Mode
% \item Effects
% \item Analysis
% \end{itemize}
%
%
%
% % % \begin{itemize}
% % \item Failure
% % \item Mode
% % \item Effects
% % \item Analysis
% % \end{itemize}
\clearpage \clearpage
@ -706,24 +650,35 @@ will not affect the ADC, the Microprocessor or the UART.
%%%%%%%%%%%% WE removal project ends here today 08SEP2013 %%%%%%%% %%%%%%%%%%%% WE removal project ends here today 08SEP2013 %%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% %
We have taken the {\bc} {\fm} R1 SHORT and then followed the failure reasoning path through to a putative system level symptom. The {\bc} {\fm} R1 SHORT has been examined
We have not looked in detail at any side effects of this {\fm}. and failure reasoning applied,
along a heuristically determined signal path,
to find a putative system level symptom.
% %
To put this in more general terms, have not examined this failure mode \fmmdglossSIGPATH
against every other component in the system. That is R1 going SHORT is expected to just give an out of range value
Perhaps we should: this would be a more rigorous and complete that can be read by the ADC and reported correctly by the software.
approach in looking for system failures. We could term FMEA where %
Potential side effects of this {\fm} may not have been factored.
%
To put this in more general terms, this failure mode has not been examined
against all other components in the system, only those expected on the signal path.
%
Examining the {\fm} R1 SHORT against all components in this system would be a more rigorous and complete
approach in looking for system failures.
%
FMEA where
each failure mode is compared against all other components each failure mode is compared against all other components
as exhaustive FMEA (XFMEA). is termed exhaustive FMEA (XFMEA).
% %
An indicator of the potential vagueness, in terms of failure outcome, An indicator of the vagueness of not performing XFMEA, in terms of failure outcome,
is manifested in the UML relationship in figure~\ref{fig:component_fm_rel_ana} is shown in the UML relationship in figure~\ref{fig:component_fm_rel_ana}
giving a one to many mapping for a failure mode and its system level symptom. giving a one to many mapping for a failure mode and its system level symptom.
\section{Theoretical Concepts in FMEA} \section{Theoretical Concepts in FMEA}
In this section we examine some fundamental concepts and underlying philosophies of FMEA. In this section some fundamental concepts and underlying philosophies of FMEA are examined.
\paragraph{Failure modes of a component and mutual exclusivity.} \paragraph{Failure modes of a component and mutual exclusivity.}
It is desirable that the failure modes for a component are mutually exclusive, were a component able It is desirable that the failure modes for a component are mutually exclusive, were a component able
@ -747,10 +702,14 @@ Most electronic systems are used to process a signal: with signal processing
there is usually a clear path from the signal coming into the system, it being processed in some way, and a resultant effect on there is usually a clear path from the signal coming into the system, it being processed in some way, and a resultant effect on
an output or control signal. % afferent to transform to efferent path. an output or control signal. % afferent to transform to efferent path.
% %
That is, there is an input, some processing and an output. In electronics we might term this a sensor, processing and actuator That is, there is an input, some processing and an output.
model. In software we would term this afferent, transform and efferent data flow.
% %
For the purpose of FMEA, we define the signal path as the components used to process the signal. In electronics this could be termed a sensor, processing and actuator
model.
%
In software this would be termed afferent, transform and efferent data flow.
%
For the purpose of FMEA, the signal path is defined by the components and connections used to process the signal.
% %
Some circuits have feedback loops or even circular signal paths, but it Some circuits have feedback loops or even circular signal paths, but it
is normal for a signal path to exist. is normal for a signal path to exist.
@ -761,13 +720,14 @@ An FMEA investigation will often take the component {\fm} and examine its effect
in the direction of the signal, in the direction of the signal,
echoing diagnostic/fault~finding methods~\cite{garrett, maikowski}. % loebowski}. echoing diagnostic/fault~finding methods~\cite{garrett, maikowski}. % loebowski}.
% %
When fault finding, we generally follow the signal path checking for correct behaviour When fault finding, the signal path is followed, checking for correct behaviour
along it: when we find something out of place, we zoom in and measure along it: when something out of place is found,
the circuit behaviour until we find a faulty component or module~\cite{garrett}. the circuit behaviour is measured in finer granularity,
until a faulty component or module~\cite{garrett} is identified.
% %
With this style of fault finding, because it is based on experiment, With this style of fault finding, because it is based on experiment,
we can hop from module to module eliminating working modules, until we find the hopping from module to module eliminating working ones, until
failure~\cite{maikowski}. failure is found~\cite{maikowski}, is effective.
% %
The rationale and work-culture of those tasked to The rationale and work-culture of those tasked to
perform FMEA are generally those of personnel who have performed fault finding~\cite[p.97]{cbds}. perform FMEA are generally those of personnel who have performed fault finding~\cite[p.97]{cbds}.
@ -784,12 +744,12 @@ the system\footnote{Building circuit simulations and simulating component failur
would be a very time consuming process and might only be performed as a final-stage of accident investigation, where the cause is would be a very time consuming process and might only be performed as a final-stage of accident investigation, where the cause is
required to be proven.}. required to be proven.}.
% %
We cannot, as with fault finding, verify modules along the signal path for correct behaviour It is not possible, as with fault finding, to verify modules along the signal path for correct behaviour
and eliminate them from the investigation. and eliminate them from the investigation.
% %
FMEA is a `thought~experiment', not actual experiment. FMEA is a `thought~experiment', not actual experiment.
% %
With FMEA we therefore need to be more thorough in the consideration of the effects a failure mode may have With FMEA there is a need to be more thorough in the consideration of the effects a failure mode may have
on the other components in a system, than with fault finding. on the other components in a system, than with fault finding.
% %
The question is by how much. The question is by how much.
@ -799,9 +759,9 @@ Too much and the task becomes impossible due to time/labour constraints.
Too little and the analysis could become meaningless, because it could miss Too little and the analysis could become meaningless, because it could miss
potential system failures. potential system failures.
% %
For a more complete analysis, we should perhaps examine each component {\fm} along the complete signal path, For a more complete analysis, the strategy of examining each component {\fm} along the complete signal path,
forwards and backwards from the placement forwards and backwards from the placement
of the component exhibiting the {\fm} under investigation. of the component exhibiting the {\fm} under investigation, could be applied.
% %
% Also, whether following the effects through the signal path {\em only} is acceptable, and instead % Also, whether following the effects through the signal path {\em only} is acceptable, and instead
% would looking at its effect on all other components in the system be necessary? % would looking at its effect on all other components in the system be necessary?
@ -835,27 +795,28 @@ at mapping potential single component failures to system level faults/events.
The concept of the unacceptability of a single component failure causing a system failure, % catastrophe, The concept of the unacceptability of a single component failure causing a system failure, % catastrophe,
is an important and easily understood measurement of safety. is an important and easily understood measurement of safety.
% %
It is easy to calculate They are easy to calculate
because we can usually find Mean Time to Failure (MTTF) statistics~\cite{fmd91,mil1991} for commonly used components. because Mean Time to Failure (MTTF) statistics~\cite{fmd91,mil1991} for commonly used components can be found.
% %
Also, used in the design phase of a project, FMEA is a useful tool Also, used in the design phase of a project, FMEA is a useful tool
for discovering potential failure scenarios~\cite{1778436820050601}. for discovering potential failure scenarios~\cite{1778436820050601}.
% %
From a large system perspective, we may find that {\bc} {\fms} From a large system perspective, it may be found that {\bc} {\fms}
may have more than one possible system event associated with them. may have more than one possible system event associated with them.
%
Often there will be a clear one to one mapping, but Often there will be a clear one to one mapping, but
probabilities to failure (as used in FMECA) probabilities to failure (as used in FMECA)
could mean one too many. % mapping. could mean one ({\fm}) to many (system level symptoms). % mapping.
% %
\paragraph{Use of Markov chains to model failure modes.} \paragraph{Use of Markov chains to model failure modes.}
We could represent a failure mode and its possible outcomes using a Markov chain~\cite{probfmea_4338247}. We could represent a failure mode and its possible outcomes using a Markov chain~\cite{probfmea_4338247}.
% %
Where multiple simultaneous %\footnote{Multiple simultaneous failures are taken to mean failures that occur within the same detection period.} Where multiple simultaneous %\footnote{Multiple simultaneous failures are taken to mean failures that occur within the same detection period.}
failure modes are considered this complicates failure modes are considered this complicates
the statistical nature of the Markov chain, cause effect model. the statistical nature of the Markov chain cause and effect model.
% %
What we in fact get is the merging, or local interaction of two Markov chains What we in fact get is the merging, or local interaction of two Markov chains
for our cause effect model. for the cause and effect model.
% Subject Object Wiki answers : Best Answer % Subject Object Wiki answers : Best Answer
%It is not grammar or vocabulary. It is a philosophical reference. %It is not grammar or vocabulary. It is a philosophical reference.
%The dichotomy is the surrounding view of self that we act out of. It is often learned with language and not taught [like the alphabet and numbers are taught] in early life through language and the forming of distinctions. %The dichotomy is the surrounding view of self that we act out of. It is often learned with language and not taught [like the alphabet and numbers are taught] in early life through language and the forming of distinctions.
@ -869,9 +830,9 @@ FMEA is always performed in the context of the use of the equipment.
In terms of philosophy the context is in the domain of the subjective and the In terms of philosophy the context is in the domain of the subjective and the
logic and reasoning behind failure causation, the objective. logic and reasoning behind failure causation, the objective.
% %
By using objective reasoning we trace a component level failure to a system level event, By using objective reasoning a component level failure can be traced to a system level event,
but only in but only in
the subjective sense can we determine its meaning and/or severity. the subjective sense can its meaning and/or severity be determined.
% %
It is worth remembering that It is worth remembering that
failure mode analysis performed on the leaks possible from the O ring on the space shuttle failure mode analysis performed on the leaks possible from the O ring on the space shuttle
@ -879,7 +840,7 @@ did not link this failure to the catastrophic failure of the spacecraft~\cite{ch
% %
This was not a failure in the objective reasoning, but more of the subjective, or the context in which the leak occurred. This was not a failure in the objective reasoning, but more of the subjective, or the context in which the leak occurred.
% %
What this means is that for an objectively calculated failure mode outcome, we may have What this means is that for an objectively calculated failure mode outcome, there may be
more than one subjective outcome. %, or definition, for it. more than one subjective outcome. %, or definition, for it.
% %
@ -952,10 +913,11 @@ and our equipment can react by raising an alarm or compensating for the resultin
Some failure modes may cause undetectable failures, for instance a component that causes Some failure modes may cause undetectable failures, for instance a component that causes
a measured reading to change could have adverse consequences yet not be flagged as a failure. a measured reading to change could have adverse consequences yet not be flagged as a failure.
% %
This type of failure % This type of failure
%would not be flagged as a failure by the system, because can not be dealt with by passing an error indication to higher level modules
can not be dealt with by passing an error indication to higher level modules because it simply cannot be detected.
because we cannot detect it. The system therefore %
The system therefore
has no way of knowing the reading is invalid. has no way of knowing the reading is invalid.
% %
The term observable has a specific meaning in the field of control engineering~\cite{721666, ACS:ACS1297}; The term observable has a specific meaning in the field of control engineering~\cite{721666, ACS:ACS1297};
@ -967,17 +929,24 @@ will be used for describing the observability of failure modes in this document.
\paragraph{Impracticality of Field Data for Modern Systems.} \paragraph{Impracticality of Field Data for Modern Systems.}
\fmmdglossFIT
Modern electronic components are generally very reliable, and the systems built from them Modern electronic components are generally very reliable, and the systems built from them
are thus very reliable too. Reliable field data on failures will, therefore, be sparse. are thus very reliable too. Reliable field data on failures will, therefore, be sparse.
Should we wish to prove a continuous demand system for say ${10}^{-7}$ failures\footnote{${10}^{-7}$ failures per hour of operation is the %
threshold for S.I.L. 3 reliability~\cite{en61508}. Failure rates are normally measured per $10^9$ hours of operation Should it be wished to prove a continuous demand system for say ${10}^{-7}$ failures\footnote{${10}^{-7}$ failures per hour of operation is the
and are known as Failure in Time (FIT) values. The maximum FIT values for a SIL 3 system is therefore 100.} threshold for S.I.L. 3 reliability~\cite{en61508}.
%
Failure rates are normally measured per $10^9$ hours of operation
and are known as Failure in Time (FIT) values.
%
The maximum FIT value for a SIL 3 system is therefore 100.}
per hour of operation, even with 1000 correctly monitored units in the field per hour of operation, even with 1000 correctly monitored units in the field
we could only expect one failure per ten thousand hours (a little over one a year). only one failure per ten thousand hours could be expected (i.e. a little under one a year).
%
It would be utterly impractical to get statistically significant data for equipment It would be utterly impractical to get statistically significant data for equipment
at these reliability levels. at these reliability levels.
However, we can use FMEA (more specifically the FMEDA variant, see section~\ref{sec:FMEDA}), %
However, FMEA can be used (more specifically the FMEDA variant, see section~\ref{sec:FMEDA}),
working from known component failure rates, to obtain working from known component failure rates, to obtain
statistical estimates of the equipment reliability. statistical estimates of the equipment reliability.
\fmmdglossFIT \fmmdglossFIT
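%
As a rough worked check of these figures (a sketch only: the symbols $\lambda$ and $n$, and the figure of 8760 hours in a year, are introduced here for illustration),
a failure rate of $\lambda = 10^{-7}$ per hour across $n = 1000$ monitored units gives
\[
\lambda \times 10^{9} = 100\ \mathrm{FIT}, \qquad
n \lambda = 10^{-4}\ \mathrm{failures\ per\ hour}, \qquad
n \lambda \times 8760 \approx 0.88\ \mathrm{failures\ per\ year}.
\]
%
That is roughly one observed failure per ten thousand fleet hours, which is far too few events to support a statistically significant estimate.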
@ -993,43 +962,44 @@ Forward search types of fault analysis are said to be `inductive'.
A backward search starts with (undesirable) system level events and A backward search starts with (undesirable) system level events and
works back down to potential causes using de-composition works back down to potential causes using de-composition
of the system and logic. of the system and logic.
%
FMEA based methodologies are forward searches\cite{Lutz:1997:RAU:590564.590572} and top down FMEA based methodologies are forward searches\cite{Lutz:1997:RAU:590564.590572} and top down
methodologies such as FTA~\cite{nucfta,nasafta} are backward searches. methodologies such as FTA~\cite{nucfta,nasafta} are backward searches.
% %
Forward search types of fault analysis are said to be `deductive'.
% %
Backward (or bottom-up) searches are said to be inductive (i.e. the results of failure are Backward (or top-down) searches are said to be deductive (i.e. the results of failure are
induced). deduced).
\paragraph{Reasoning distance.} \paragraph{Reasoning distance.}
\label{reasoningdistance} \label{reasoningdistance}
\fmmdglossRD \fmmdglossRD
A reasoning distance, is defined here, as the number of stages of logic and reasoning used Reasoning distance is the number of stages of logic and reasoning used
in {\fm} analysis to map a failure cause to its potential outcomes. in {\fm} analysis to map a failure cause to its potential outcomes; counted
by the number of {\fm} to component checks made.
% %
In our basic FMEA example in section~\ref{basicfmea} The basic FMEA example in section~\ref{basicfmea}
we were asked to consider one failure mode against all the components in the milli-volt reader. considered one {\fm} against some of the components in the milli-volt reader.
% %
To create an exhaustive FMEA report on the milli-volt reader, we would have had to examine every To create an exhaustive FMEA report on the milli-volt reader, every
known failure mode of every component within it---against all its other components. known failure mode of every component within it would have to be examined against all its other components.
% %
We define `reasoning~distance' as the number of components checked against `Reasoning~distance', for one {\fm}, is defined as the number of components checked against it
for a given failure mode to determine a system level symptom. to determine its system level symptom(s).
% %
No current FMEA variant gives guidelines for the components that should No current FMEA variant gives guidelines for the components that should
be included to analyse a {\fm} in a system. be included to analyse a {\fm} in a system.
% %
Were we to examine a {\fm} against all the other components in a system Were a {\fm} examined against all the other components in a system
this would give us the maximum reasoning distance. this would give us the maximum reasoning distance.
% %
We term this the exhaustive FMEA case. This is termed the exhaustive FMEA case for a single {\fm}.
%does not %does not
% The exhaustive~reasoning~distance would be % The exhaustive~reasoning~distance would be
% the sum of the number of failure modes, against all other components % the sum of the number of failure modes, against all other components
% in that system. % in that system.
The exhaustive~reasoning~distance for a particular component Thus the exhaustive~reasoning~distance for a particular component
would be the product of would be the product of
the number of failure modes it has and the number of remaining components the number of failure modes it has and the number of remaining components
in the system. in the system.
@ -1038,16 +1008,20 @@ The exhaustive reasoning~distance for a system would be the
sum of these multiplications for all the components it contains. sum of these multiplications for all the components it contains.
% %
If the milli-volt reader had say 100 components, with three failure modes each, this If the milli-volt reader had say 100 components, with three failure modes each, this
would give an exhaustive reasoning distance---for single failure analysis---of 3 * 100 * 99. would give an exhaustive reasoning distance---for single failure analysis---of $3 \times 100 \times 99$.
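%
Stated as a formula (a sketch: the per-component count $f_i$ is a symbol assumed here for illustration),
for a system of $N$ components in which component $i$ has $f_i$ failure modes, the exhaustive reasoning distance for single failures would be
\[
RD = \sum_{i=1}^{N} f_i \, (N-1) ,
\]
which for $N = 100$ and $f_i = 3$ for every component gives $100 \times 3 \times 99 = 29{,}700$ checks.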
% %
The discussion on reasoning distance leads provides us with a metric to examine The discussion on reasoning distance provides a metric to examine
the state explosion problems associated with forward search failure investigation the state explosion problems associated with forward search failure investigation
methodologies. methodologies.
%
\fmmdglossSTATEEX \fmmdglossSTATEEX
It is apparent that the shorter the reasoning distance, the more precisely our theoretical examination %
is to determine failure symptoms. For instance for a very simple small circuit, we can have a better understanding It is apparent that the shorter the reasoning distance, the more precisely theoretical examination
of failure effects, than for a very large system where there are more variables and potential {\fm} interactions. can determine failure symptoms.
%
For instance for a very simple small circuit, a better understanding of failure effects is expected,
than for a very large system where there are more variables and potential {\fm} interactions.
%
%.... general concept... simple ideas about how complex a %.... general concept... simple ideas about how complex a
%failure analysis is the more modules and components are involved %failure analysis is the more modules and components are involved
% cite for forward and backward search related to safety critical software % cite for forward and backward search related to safety critical software
@ -1056,37 +1030,45 @@ of failure effects, than for a very large system where there are more variables
\label{sec:xfmea} \label{sec:xfmea}
\paragraph{Problem of which components to check for a given {\bc} {\fm}.} \paragraph{Problem of which components to check for a given {\bc} {\fm}.}
\fmmdglossSTATEEX \fmmdglossSTATEEX
FMEA for a safety critical certification~\cite{en298,en61508} will have to be applied %
FMEA for safety critical certification (i.e. for EN298 and EN61508)~\cite{en298,en61508} has to be applied
to all known failure modes of all components within a system. to all known failure modes of all components within a system.
% %
Each one of these, in a typical report, would be one line of a spreadsheet entry. Each one of these, in a typical report, would be one line of a spreadsheet entry.
% %
FMEA does not define or specify the scope of the investigation of each component failure mode. FMEA does not define or specify the scope of the investigation for each component failure mode.
Should we follow the signal path, and all components we encounter along that, or should the scope be wider?
% %
If we were to examine the effect of a component {\fm} against all other components For instance should the signal path be followed, with all components encountered along that, or should the scope be wider?
in a system, this could be said to be exhaustive analysis. %
%If we wethe effect of a component {\fm} against all other components
%in a system, this could be said to be exhaustive analysis.
\paragraph{Exhaustive Single Failure FMEA.} \paragraph{Exhaustive Single Failure FMEA.}
\fmmdglossXFMEA \fmmdglossXFMEA
To perform FMEA exhaustively (i.e. to examine every possible interaction %
of a failure mode with all other components in a system). Or in other words, To perform exhaustive FMEA (XFMEA), every possible interaction
---we would need to look at all possible failure scenarios. of a failure mode with all other components in a system must be examined.
%
In other words, all possible failure scenarios must be considered.
%
%to do this completely (all failure modes against all components). %to do this completely (all failure modes against all components).
This is represented in the equation below, %~\ref{eqn:fmea_state_exp}, This is represented in the equation below, %~\ref{eqn:fmea_state_exp},
where $N$ is the total number of components in the system, $RD_{single}$ is the reasoning~distance and where $N$ is the total number of components in the system, $RD_{single}$ is the reasoning~distance and
$f$ is the number of failure modes per component. $f$ is the number of failure modes per component:
% %
\begin{equation} \begin{equation}
\label{eqn:fmea_single} \label{eqn:fmea_single}
RD_{single} = N.(N-1).f % \\ RD_{single} = N.(N-1).f . % \\
%(N^2 - N).f %(N^2 - N).f
\end{equation} \end{equation}
% %
This would mean an order of $O(N^2)$ number of checks to perform This means an order of $O(N^2)$ checks to perform
to undertake an `exhaustive~FMEA'. Even small systems have typically to undertake XFMEA for single failures.
%
Even small systems have typically
100 components, and they typically have 3 or more failure modes each, which would give 100 components, and they typically have 3 or more failure modes each, which would give
$100*99*3=29,700$ as a reasoning distance. $100 \times 99 \times 3 = 29,700 $ as a reasoning~distance.
%
\fmmdglossSTATEEX \fmmdglossSTATEEX
\paragraph{Exhaustive FMEA and double failure scenarios.} \paragraph{Exhaustive FMEA and double failure scenarios.}
% %
@ -1095,6 +1077,7 @@ For looking at potential double failure
scenarios\footnote{Certain double failure scenarios are already legal scenarios\footnote{Certain double failure scenarios are already legal
requirements---The European Gas burner standard (EN298:2003)---demands the checking of requirements---The European Gas burner standard (EN298:2003)---demands the checking of
double failure scenarios (for burner lock-out scenarios).} double failure scenarios (for burner lock-out scenarios).}
%
(two components failing within a given time frame) the order becomes $O(N^3)$,
where $RD_{double}$ is the reasoning~distance for double failure scenarios:
\begin{equation} \begin{equation}
@ -1102,30 +1085,32 @@ Where $RD_{double}$ is the reasoning~distance for double failure scenarios:
RD_{double} = N.(N-1).(N-2).f . % \\ RD_{double} = N.(N-1).(N-2).f . % \\
%(N^2 - N).f %(N^2 - N).f
\end{equation} \end{equation}
%
For our theoretical 100 components with 3 failure modes each example, this is a reasoning distance of For a theoretical system with 100 components and a fixed 3 failure modes each, this gives a reasoning distance of
$100*99*98*3=2,910,600$. % failure mode scenarios. $100*99*98*3=2,910,600$. % failure mode scenarios.
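%
% Illustrative cross-check only: it rearranges the two reasoning~distance
% equations given above and introduces no new quantities.
As a cross-check on these figures, dividing the double failure expression by the single failure expression
leaves a factor of $(N-2)$:
\begin{equation}
\frac{RD_{double}}{RD_{single}} = \frac{N.(N-1).(N-2).f}{N.(N-1).f} = N-2 .
\end{equation}
For the 100 component example this factor is 98, and $29,700 \times 98 = 2,910,600$,
which agrees with the value calculated directly.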
%
In practice there is an additional complication here, that of In practice there is an additional complication here, that of
the circuit topology changes that {\fms} can cause. the circuit topology changes that {\fms} can cause.
\paragraph{Reliance on experts for meaningful FMEA Analysis.} \paragraph{Reliance on experts for meaningful FMEA Analysis.}
Current FMEA methodologies cannot consider---for the reason of state explosion---an exhaustive approach. Current FMEA methodologies cannot consider---for the reason of state explosion---an exhaustive approach.
We define exhaustive FMEA ({\XFMEA}) as examining the effect of every component failure mode %We define exhaustive FMEA ({\XFMEA}) as examining the effect of every component failure mode
against the remaining components in the system under investigation. %against the remaining components in the system under investigation.
% %
\fmmdglossSTATEEX \fmmdglossSTATEEX
Because we cannot, for practical reasons, perform XFMEA, %
we rely on experts in the system under investigation Because, for practical reasons, XFMEA cannot be performed for anything other than a trivial system,
to perform a meaningful FMEA analysis. reliance is placed upon experts on the system under investigation
to perform a meaningful analysis.
% %
These experts must use their judgement and experience to choose These experts must use their judgement and experience to choose
sub-sets of the components in the system, to check against each {\fm}. sub-sets of the components in the system to check against each {\fm}.
% %
Also, %In practise Also, %In practise
these experts have to select the areas they see as most critical for detailed FMEA analysis: these experts have to select the areas they see as most critical for detailed FMEA analysis:
it is usually impossible, for the reason of time to perform the work, it is usually impossible, for reasons of time to perform the work,
to action a detailed level of analysis on all component {\fms} to action a detailed level of analysis on all component {\fms}
on anything but a non-trivial system. on anything but a small hypothetical system.
\subsection{Component Tolerance} \subsection{Component Tolerance}
@ -1231,9 +1216,10 @@ A history of the usage and development of FMECA may be found in~\cite{FMECAresea
\fmmdglossFMECA \fmmdglossFMECA
\paragraph{ FMECA - Failure Modes Effects and Criticality Analysis.} \paragraph{ FMECA - Failure Modes Effects and Criticality Analysis.}
%
Very similar to PFMEA, but instead of cost, a criticality or Very similar to PFMEA, but instead of cost, a criticality or
seriousness factor is ascribed to putative top level incidents. seriousness factor is ascribed to putative top level incidents.
FMECA has three probability factors for component failures. FMECA has three probability factors for component failures, a system operational time and a severity factor.
\textbf{FMECA ${\lambda}_{p}$ value.} \textbf{FMECA ${\lambda}_{p}$ value.}
This is the overall failure rate of a base component. This is the overall failure rate of a base component.
@ -1250,7 +1236,7 @@ a particular failure~mode occurring within a component~\cite{fmd91}.
% %
\fmmdglossFMECA \fmmdglossFMECA
% %
\paragraph{ FMECA - Failure Modes Effects and Criticality Analysis.}
\textbf{FMECA $\beta$ value.} \textbf{FMECA $\beta$ value.}
The second probability factor $\beta$, is the probability that the failure mode The second probability factor $\beta$, is the probability that the failure mode
will cause a given system failure. will cause a given system failure.
@ -1268,10 +1254,13 @@ represented by the variable $t$.
A weighting factor to indicate the seriousness of the putative system level error. A weighting factor to indicate the seriousness of the putative system level error.
%Typical classifications are as follows:~\cite{fmd91} %Typical classifications are as follows:~\cite{fmd91}
The statistical formula to calculate the criticality factor for one component {\fm} is given below:
%
\begin{equation} \begin{equation}
C_m = {\beta} . {\alpha} . {{\lambda}_p} . {t} . {s} C_m = {\beta} . {\alpha} . {{\lambda}_p} . {t} . {s} .
\end{equation} \end{equation}
\fmmdglossFMECA \fmmdglossFMECA
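%
% Illustrative worked example: every numerical value below is assumed for
% demonstration only and is not taken from any standard or data source.
As an illustration, consider a hypothetical component {\fm} with the assumed values
$\beta = 0.5$, $\alpha = 0.2$, ${\lambda}_p = 10^{-6}$ failures per hour,
$t = 10^{4}$ hours and $s = 10$. The criticality factor would then be:
\begin{equation}
C_m = 0.5 \times 0.2 \times 10^{-6} \times 10^{4} \times 10 = 10^{-2} .
\end{equation}
Because $C_m$ is a simple product, doubling any one of the five factors doubles $C_m$.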
%
The highest $C_m$ values would represent the most dangerous or serious The highest $C_m$ values would represent the most dangerous or serious
system level failures. system level failures.
The highest $C_m$ values would be at the top of a `to~fix' list The highest $C_m$ values would be at the top of a `to~fix' list
@ -1386,7 +1375,7 @@ are actually met for given SIL levels is currently almost impossible~\cite{silsa
\item \textbf{Safe or Dangerous.} Failure modes are classified SAFE or DANGEROUS. \item \textbf{Safe or Dangerous.} Failure modes are classified SAFE or DANGEROUS.
\item \textbf{Detectable failure modes.} Failure modes are given the attribute DETECTABLE or UNDETECTABLE. \item \textbf{Detectable failure modes.} Failure modes are given the attribute DETECTABLE or UNDETECTABLE.
\item \textbf{Four attributes for FMEDA Failure Modes.} All failure modes may thus be Safe Detected(SD), Safe Undetected(SU), Dangerous Detected(DD), Dangerous Undetected(DU) \item \textbf{Four attributes for FMEDA Failure Modes.} All failure modes may thus be Safe Detected(SD), Safe Undetected(SU), Dangerous Detected(DD), Dangerous Undetected(DU)
\item \textbf{Four statistical properties of a system.} We sum the statistics for the four classifications of system failures \\ \item \textbf{Four statistical properties of a system.} The statistics for the four classifications of system failures are summed: \\
$ \sum \lambda_{SD}$, $\sum \lambda_{SU}$, $\sum \lambda_{DD}$, $\sum \lambda_{DU}$. \\ $ \sum \lambda_{SD}$, $\sum \lambda_{SU}$, $\sum \lambda_{DD}$, $\sum \lambda_{DU}$. \\
\end{itemize} \end{itemize}
@ -1438,8 +1427,11 @@ Again this is usually expressed as a percentage,
$$ SFF = \big( \Sigma\lambda_S + \Sigma\lambda_{DD} \big) / \big( \Sigma\lambda_S + \Sigma\lambda_D \big) . $$ $$ SFF = \big( \Sigma\lambda_S + \Sigma\lambda_{DD} \big) / \big( \Sigma\lambda_S + \Sigma\lambda_D \big) . $$
% %
SFF determines how proportionately fail-safe a system is, not how reliable it is. SFF determines how proportionately fail-safe a system is, not how reliable it is.
A weakness in this philosophy; adding extra safe failures (even unused ones) would improve the apparent SFF, this %
apparent loophole is closed in the 2010 edition of the standard. A weakness in this philosophy is that by adding extra safe failures (even unused ones)
the apparent SFF would be improved\footnote{The artificial inflation of SFF,
by including unnecessary safe functions or unused components
(i.e. a loophole) is closed in the 2010 edition of the standard.}.
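%
% Illustrative figures only: the failure rates below are assumed, and the split
% \Sigma\lambda_D = \Sigma\lambda_{DD} + \Sigma\lambda_{DU} is taken as given.
To illustrate this with assumed figures, suppose $\Sigma\lambda_S = 60$, $\Sigma\lambda_{DD} = 30$ and
$\Sigma\lambda_{DU} = 10$ (in arbitrary failure rate units), so that $\Sigma\lambda_D = 40$:
$$ SFF = ( 60 + 30 ) / ( 60 + 40 ) = 90\% . $$
Adding a further 100 units of safe failures, with the dangerous failure rates unchanged, gives
$$ SFF = ( 160 + 30 ) / ( 160 + 40 ) = 95\% , $$
although the dangerous undetected failure rate is still 10 units.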
\fmmdglossFMEDA \fmmdglossFMEDA
% %
% %
@ -1487,12 +1479,16 @@ looking for weaknesses at a theoretical level.
% \end{figure} % \end{figure}
% %
\begin{itemize} \begin{itemize}
\item Impossible to look at all component failures let alone apply FMEA exhaustively/rigorously. \item Impossible to look at all component failures let alone apply FMEA exhaustively/rigorously,
\item In practice, failure scenarios for critical sections are contested, and either justified or extra safety measures implemented. \item In practice, failure scenarios for critical sections are contested, and either justified or extra safety measures implemented,
\item Often meeting notes or minutes only. Unusual for detailed technical arguments to be documented. \item Often meeting notes or minutes only: unusual for detailed technical arguments to be documented.
\end{itemize} \end{itemize}
% %
% %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% SFMEA????
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Conclusion} \section{Conclusion}
\begin{figure}[h] \begin{figure}[h]
\centering \centering
@ -1523,7 +1519,7 @@ relating this to the signal path or adjacency in the electronic circuit,
potential strategies are listed below: potential strategies are listed below:
% %
\begin{itemize} \begin{itemize}
\item look at all components electronically adjacent (i.e. connected to the affected component), \item Look at all components electronically adjacent (i.e. connected to the affected component),
\item Look at all components connected (as above) and those one removed (those connected to those connected to the affected component), \item Look at all components connected (as above) and those one removed (those connected to those connected to the affected component),
\item Look at components forward of the {\fm} in the signal path, \item Look at components forward of the {\fm} in the signal path,
\item Look at all components in the signal path, \item Look at all components in the signal path,
@ -1557,8 +1553,8 @@ However, %, as with the components that we should check against a {\fm},
the depth of description for reasoning stages in FMEA entries is in practice variable. the depth of description for reasoning stages in FMEA entries is in practice variable.
%FMEA does not stipulat which %FMEA does not stipulat which
Ideally each FMEA entry would contain a reasoning description Ideally each FMEA entry would contain a reasoning description
for each component the {\fm} is checked against, for each {\fm},
so that the entry can be more easily reviewed or revisited/audited than a traditional FMEA report. so that the entry can be more easily reviewed or revisited/audited. % than a traditional FMEA report.
% %
Because FMEA is traditionally performed with one entry per component {\fm}, full reasoning descriptions Because FMEA is traditionally performed with one entry per component {\fm}, full reasoning descriptions
are rare. are rare.