CH2 JMC PR and more copy

on the subject of UML applied to the FMEA
process.
Robin Clark 2013-03-31 17:49:22 +01:00
parent e66fdd08a5
commit 5d8eb97000
6 changed files with 85 additions and 41 deletions

View File

@ -3,7 +3,7 @@
#
# Place all .dia files here as .png targets
#
DIA = ftcontext.png
DIA = ftcontext.png component_fm_rel.png component_fm_rel_ana.png component_fm_rel_ana_subj_obj.png
doc: $(DIA)

View File

@ -17,16 +17,17 @@ This chapter introduces Failure Mode Effect Analysis (FMEA).
It starts with a generic conceptual overview of the process.
It then looks at the stages of the FMEA process in greater detail, starting with
how we determine the failure modes associated with components.
%
Two common electrical components, the resistor and the operational amplifier
and examined in the context of two sources of information that define failure modes.
are examined in the context of two sources of information that define failure modes.
%
A simple example of an FMEA is then given, using a hypothetical {\ft} milli-amp reader.
A simple example of an FMEA is given, using a hypothetical {\ft} milli-amp reader.
%
The four main variants are then described and we then develop %conclude by describing concepts
The four main variants are described and we develop %conclude by describing concepts
the concepts
that underlie the usage and philosophy of FMEA.
%
We then return to the overall process of FMEA
We return to the overall process of FMEA
and model it using UML.
%
By using UML we define relationships between the data objects
@ -51,7 +52,7 @@ The acronym FMEA can be expanded as follows:
\item \textbf{F - Failures of given component,} Consider a particular component in a system;
\item \textbf{M - Failure Mode,} Choose a component `failure~mode';
\item \textbf{E - Effects,} Determine the effects this failure mode will cause to the system we are examining;
\item \textbf{A - Analysis,} Analyse how much impact this symptom will have on the environment/people/the system its-self.
\item \textbf{A - Analysis,} Analyse how much impact this symptom will have on the environment/people/the system itself.
\end{itemize}
%
FMEA is a broad term; it could mean anything from an informal check on how
@ -59,7 +60,7 @@ how failures could affect some equipment in %an initial
a brain-storming session
%in product design,
to formal submission as part of safety critical certification.
FMEA is a manual and therefore time intensive process. To reduce amount of work to perform,
FMEA is a manual and therefore time intensive process. To reduce the amount of work to perform,
software packages~\cite{931423, 1778436820050601} and analysis strategies have
been developed~\cite{incrementalfmea, automatingFMEA1281774}.
%
@ -86,31 +87,40 @@ the effectiveness of FMEA.
We begin FMEA with the basic, or starting components.
These components are the sort we buy in or consider as pre-assembled modules.
We need to know how these can fail. So our first relationship
We term these the {\bcs}.
Firstly we need to know how these can fail. So our first relationship
is between a {\bc} and its failure modes, see figure~\ref{fig:component_fm_rel}.
%DIAGRAM of Base components and failure modes
\begin{figure}[h]
\centering
\includegraphics[width=400pt]{./CH3_FMEA_criticism/component_fm_rel.png}
\includegraphics[width=400pt]{./CH2_FMEA/component_fm_rel.png}
% component_fm_rel.png: 368x71 pixel, 72dpi, 12.98x2.50 cm, bb=0 0 368 71
\caption{Base Component to Failure Modes relationship}
\label{fig:component_fm_rel}
\end{figure}
The next stage is reasoning. We take a component failure
mode and analyse its effect on some of the other components in the system.
The result of this is a system level failure, or symptom.
The analysis would typically be one line in a spreadsheet entry.
and analysis to symptom relationship is generally % considered
The next stage is analysis, that is reasoning applied to the system in the event of
a given failure mode.
%
To perform this we need to know how a failure
mode, considering its effect on other components in the system,
will translate to a system level symptom/failure.
%
The result of FMEA is to determine system level failures, or symptoms, for given component failure modes.
%
In practice, an FMEA analysis of a {\bc} {\fm}
would typically be one line in a spreadsheet entry.
%
The analysis to symptom relationship is generally % considered
one-to-one, however here (see figure~\ref{fig:component_fm_rel_ana}), we allow for the possibility
of more than one failure symptom.
%DIAGRAM of reasoning and Symptoms.
\begin{figure}[h]
\centering
\includegraphics[width=400pt]{./CH3_FMEA_criticism/component_fm_rel_ana.png}
\includegraphics[width=400pt]{./CH2_FMEA/component_fm_rel_ana.png}
% component_fm_rel_ana.png: 369x184 pixel, 72dpi, 13.02x6.49 cm, bb=0 0 369 184
\caption{FMEA analysis entry data relationships}
\label{fig:component_fm_rel_ana}
@ -120,16 +130,17 @@ Figure ~\ref{fig:component_fm_rel_ana} defines the data relationships
for FMEA. This model is expanded upon in the conclusion
of this chapter.
\section{Determining the failure modes of components}
\label{sec:determine_fms}
In order to apply any form of FMEA we need to know the ways in which
the components we are using can fail. In practice, this part of the process is guided by
the standards to which we are seeking to conform to.
the standards to which we are seeking to conform.% to.
%
\footnote{A good introduction to hardware and software failure modes may be found in~\cite{sccs}[pp.114-124].}
%
Typically when choosing components for a design, we look at manufacturers' data sheets
which describe functionality, physical dimensions
Typically, when choosing components for a design, we look at manufacturers' data sheets
which describe functionality, physical dimensions,
environmental ranges, tolerances and can indicate how a component may fail/misbehave
under given conditions.
%
@ -187,7 +198,7 @@ requires statistics for Meantime to Failure (MTTF) for all {\bc} failure modes.
The starting point in the FMEA process are the failure modes of the components
we would typically find in a production parts list, which we can term the {\bcs}.
%
In order the define FMEA we must start with a discussion on how these failure modes are chosen.
In order to define FMEA we must start with a discussion on how these failure modes are chosen.
%
In this section we pick %look in detail at
two common electrical components as examples, and examine how
@ -251,7 +262,9 @@ is significantly reduced, enough for some standards to exclude it~\cite{en298}~\
\paragraph{Resistor failure modes according to EN298.}
EN298, the European gas burner safety standard, tends to be give failure modes more directly usable for performing FMEA than FMD-91.
EN298, the European gas burner safety standard, tends to
provide failure modes more directly usable for performing FMEA than FMD-91.
%
EN298 requires that a full FMEA be undertaken, examining all failure modes
of all electronic components~\cite{en298}[11.2 5] as part of the certification process.
%
@ -263,7 +276,8 @@ only requires that the failure mode OPEN be considered for FMEA analysis.
For resistor types not specifically listed in EN298, the failure modes
are considered to be either OPEN or SHORT.
%
The reason that parameter change is not considered for resistors chosen for an EN298 compliant system, is that they must be must be {\em downrated}.
The reason that parameter change is not considered for resistors chosen for an EN298 compliant system, is that they must be {\em downrated}.
%
That is to say the power and voltage ratings of components must be calculated
for maximum possible exposure, with a 40\% margin of error.
%
@ -289,7 +303,7 @@ and thus subject to drift/parameter change.
\subsubsection{Resistor Failure Modes}
\label{sec:res_fms}
The differneces in resistor failure modes between FMD-91 and EN298 are that FMD-91 would
The differences in resistor failure modes between FMD-91 and EN298 are that FMD-91 would
include the failure mode DRIFT. EN298 does not include this, mainly because it imposes circuit design constraints
that effectively side step that problem.
%
@ -345,7 +359,7 @@ The symptom for this is given as a low slew rate. This means that the op-amp
will not react quickly to changes on its input terminals.
This is a failure symptom that may not be of concern in a slow responding system like an
instrumentation amplifier. However, where higher frequencies are being processed,
a signal may entirely be lost.
a signal may be lost entirely.
We can map this failure cause to a {\fm}, and we can call it $LOW_{slew}$.
\paragraph{No Operation - over stress.}
@ -631,7 +645,7 @@ An FMEA investigation will often take the component {\fm} and examine its effect
in the direction of the signal,
echoing diagnostic/fault~finding methods~\cite{garrett, maikowski}. % loebowski}.
%
When fault finding we generally follow the signal path, checking for correct behaviour
When fault finding, we generally follow the signal path checking for correct behaviour
along it: when we find something out of place we zoom in and measure
the circuit behaviour until we find a faulty component or module.
%
@ -652,7 +666,7 @@ component failure modes.
This would be time consuming as it would involve building a circuit for each component {\fm} in
the system\footnote{Building circuit simulations and simulating component failure modes
would be a very time consuming process and might only be performed as a final-stage of accident investigation, where the cause is
required to be proven.}
required to be proven.}.
%
We cannot, as with fault finding, verify modules along the signal path for correct behaviour
and eliminate them from the investigation.
@ -674,8 +688,8 @@ forwards and backwards from the placement
of the component exhibiting the {\fm} under investigation.
%
Also, whether following the effects through the signal path {\em only} is acceptable, and instead
looking at its effect on all other components in the system is necessary,
is a matter for debate.
would looking at its effect on all other components in the system be necessary.
%is a matter for debate.
%
In practice, it is a compromise between the amount of time/money that can be spent
on analysis relative to the criticality of the project.
@ -717,7 +731,7 @@ could mean one too many. % mapping.
\paragraph{Use of Markov chains to model failure modes.}
We could represent a failure mode and its possible outcomes using a Markov chain~\cite{probfmea_4338247}.
%
Where multiple simultaneous%\footnote{Multiple simultaneous failures are taken to mean failures that occur within the same detection period.}
Where multiple simultaneous %\footnote{Multiple simultaneous failures are taken to mean failures that occur within the same detection period.}
failure modes are considered this complicates
the statistical nature of the Markov chain, cause effect model.
%
@ -791,7 +805,7 @@ will be used for describing the observability of failure modes in this document.
\paragraph{Impracticality of Field Data for modern systems.}
Modern electronic components are generally very reliable, and the systems built from them
are thus very reliable too. Reliable field data on failures will, therefore be sparse.
are thus very reliable too. Reliable field data on failures will, therefore, be sparse.
Should we wish to prove a continuous demand system for say ${10}^{-7}$ failures\footnote{${10}^{-7}$ failures per hour of operation is the
threshold for S.I.L. 3 reliability~\cite{en61508}. Failure rates are normally measured per $10^9$ hours of operation
and are known as Failure in Time (FIT) values. The maximum FIT value for a SIL 3 system is therefore 100.}
@ -890,7 +904,8 @@ $100*99*3=29,700$.
\paragraph{Exhaustive Double Failure FMEA}
For looking at potential double failure
scenarios\footnote{Certain double failure scenarios are already legal requirements---The European Gas burner standard (EN298:2003)---demands the checking of
scenarios\footnote{Certain double failure scenarios are already legal
requirements---The European Gas burner standard (EN298:2003)---demands the checking of
double failure scenarios (for burner lock-out scenarios).}
(two components failing within a given time frame), the order becomes $O(N^3)$.
@ -1085,7 +1100,7 @@ It provides a statistical overall level of safety
and allows diagnostic mitigation for self checking etc.
It provides guidelines for the design and architecture
of computer/software systems for four levels of
safety Integrity, referred to as Safety Integrity Levels (SIL).
safety integrity, referred to as Safety Integrity Levels (SIL).
%For Hardware
%
FMEDA does force the user to consider all hardware components in a system
@ -1155,7 +1170,8 @@ Again this is usually expressed as a percentage.
$$ SFF = \big( \Sigma\lambda_S + \Sigma\lambda_{DD} \big) / \big( \Sigma\lambda_S + \Sigma\lambda_D \big) $$
SFF determines how proportionately fail-safe a system is, not how reliable it is!
Weakness in this philosophy; adding extra safe failures (even unused ones) improves the SFF.
A weakness in this philosophy is that adding extra safe failures (even unused ones) improves the SFF; this
apparent loophole is closed in the 2010 edition of the standard.
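%
As an illustrative sketch of this weakness, assume hypothetical failure rates (not taken from any real FMEDA) of
$\Sigma\lambda_S = 300$, $\Sigma\lambda_{DD} = 150$ and $\Sigma\lambda_{DU} = 50$ FIT,
and assume that the dangerous total is the sum of the detected and undetected dangerous rates,
$\Sigma\lambda_D = \Sigma\lambda_{DD} + \Sigma\lambda_{DU} = 200$ FIT.
$$ SFF = \big( 300 + 150 \big) / \big( 300 + 200 \big) = 450 / 500 = 90\% $$
If a further 500 FIT of safe failures were added, with the dangerous failure rates left unchanged, we would obtain
$$ SFF = \big( 800 + 150 \big) / \big( 800 + 200 \big) = 950 / 1000 = 95\% $$
Under these assumed figures the SFF rises from 90\% to 95\%, yet the dangerous undetected failure rate is still 50 FIT.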
@ -1236,21 +1252,48 @@ looking for weaknesses at a theoretical level.
\section{Conclusion}
Returning to the FMEA model, we now show that
figure holds for the five variants of FMEA discussed.
We can however, extend this
with subjective failure mode symptoms (see figure~\ref{fig:component_fm_rel_ana_subj_obj}).
\begin{figure}[h]
\centering
\includegraphics{./CH3_FMEA_criticism/component_fm_rel_ana_subj_obj.png}
\includegraphics[width=400pt]{./CH2_FMEA/component_fm_rel_ana_subj_obj.png}
% component_fm_rel_ana_subj_obj.png: 694x303 pixel, 72dpi, 24.48x10.69 cm, bb=0 0 694 303
\caption{FMEA UML data representation with subjective system level failure modes.}
\label{fig:component_fm_rel_ana_subj_obj}
\end{figure}
Returning to the FMEA model, we now show that the data relationships shown in
figure~\ref{fig:component_fm_rel_ana} hold for the five variants of FMEA discussed.
We can, however, extend this
with subjective failure mode symptoms (see figure~\ref{fig:component_fm_rel_ana_subj_obj}).
The UML data model reveals some undefined qualities of FMEA.
These raise questions and are discussed below.
\paragraph{Which, or how many components should we check for each {\fm} entry?}
For instance, a given {\fm} will have its effect measured in relation
to some of the components in the system.
We could choose these components according to criteria
based on the signal path or on adjacency in the electronic circuit, for example:
\begin{itemize}
\item Look at all components electronically adjacent (i.e. connected to the affected component).
\item Look at all components connected (as above) and those one removed (those connected to those connected to the affected component).
\item Look at components forward of the {\fm} in the signal path.
\item Look at all components in the signal path.
\item Look at all components in the signal path, including those one connection removed.
\item Look at all components in the system.
\end{itemize}
No current variant of FMEA gives any guidelines for which, or how many, components to check for a given {\fm}.
\paragraph{FMEA gives us objective system level failures/symptoms, what do we do with subjective or contextual failures resulting from this?}
The two more modern variants of FMEA, FMECA and FMEDA, start to address the problem of subjective/contextual
failure symptoms of a system.
%
FMEDA classifies them as dangerous or safe failures.
FMECA gives us a statistically biased criticality level.
In both of these methodologies, however, there is no formal stage where we map from an objective to a subjective
system failure; the processes are intertwined with the basic analysis itself.
%FMEA does not stipulat which
% MOVED TO CH3: 15MAR2013
%

View File

@ -3,7 +3,8 @@
#
# Place all .dia files here as .png targets
#
DIA = distcon.png component_fm_rel.png component_fm_rel_ana.png component_fm_rel_ana_subj_obj.png
DIA = distcon.png
#component_fm_rel.png component_fm_rel_ana.png component_fm_rel_ana_subj_obj.png
doc: $(DIA)