Survey needs sorting out:
Needs lots of detail, procedural and historic
parent
d485131a5c
commit
ffc9310ddd
@ -1,7 +1,4 @@
|
|||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
\ifthenelse {\boolean{paper}}
|
\ifthenelse {\boolean{paper}}
|
||||||
{
|
{
|
||||||
\abstract{
|
\abstract{
|
||||||
@ -278,7 +275,8 @@ system level outcomes.
|
|||||||
|
|
||||||
\subsubsection{ FTA weaknesses }
|
\subsubsection{ FTA weaknesses }
|
||||||
\begin{itemize}
|
\begin{itemize}
|
||||||
\item Possibility to miss component failure modes.
|
\item Complex component interaction effects are by definition modelled by FTA, but because of the top down approach, not all
|
||||||
|
base component failure modes are guaranteed to be included in the model.
|
||||||
\item Possibility to miss environmental effects.
|
\item Possibility to miss environmental effects.
|
||||||
\item No possibility to model base component level double failure modes.
|
\item No possibility to model base component level double failure modes.
|
||||||
\end{itemize}
|
\end{itemize}
|
||||||
@ -305,8 +303,10 @@ a prioritised `todo list', with higher $RPN$ values being the most urgent.
|
|||||||
|
|
||||||
\subsubsection{ FMEA weaknesses }
|
\subsubsection{ FMEA weaknesses }
|
||||||
\begin{itemize}
|
\begin{itemize}
|
||||||
\item Possibility to miss the effects of failure modes at SYSTEM level.
|
\item Possibility to miss the effects of base component failure modes at SYSTEM level.
|
||||||
|
(because it is each individual component, rather than each of its failure modes, that is considered for analysis).
|
||||||
\item Possibility to miss environmental effects.
|
\item Possibility to miss environmental effects.
|
||||||
|
\item Complex component interaction effects can be missed.
|
||||||
\item No possibility to model base component level double failure modes.
|
\item No possibility to model base component level double failure modes.
|
||||||
\end{itemize}
|
\end{itemize}
|
||||||
|
|
||||||
@ -359,6 +359,7 @@ Again this essentially produces a prioritised `todo' list.
|
|||||||
\begin{itemize}
|
\begin{itemize}
|
||||||
\item Possibility to miss the effects of failure modes at SYSTEM level.
|
\item Possibility to miss the effects of failure modes at SYSTEM level.
|
||||||
\item Possibility to miss environmental effects.
|
\item Possibility to miss environmental effects.
|
||||||
|
\item Complex component interaction effects can be missed.
|
||||||
\item No possibility to model base component level double failure modes.
|
\item No possibility to model base component level double failure modes.
|
||||||
\end{itemize}
|
\end{itemize}
|
||||||
|
|
||||||
@ -381,9 +382,12 @@ The following gives an outline of the procedure.
|
|||||||
|
|
||||||
|
|
||||||
\subsubsection{Two statistical perspectives}
|
\subsubsection{Two statistical perspectives}
|
||||||
|
\ifthenelse {\boolean{paper}}
|
||||||
|
{
|
||||||
FMEDA is a statistical analysis methodology and is used from one of two perspectives,
|
FMEDA is a statistical analysis methodology and is used from one of two perspectives,
|
||||||
Probability of Failure on Demand (PFD), and Probability of Failure
|
Probability of Failure on Demand (PFD), and Probability of Failure
|
||||||
in continuous Operation, or Failure in Time (FIT).
|
in continuous Operation, or Failure in Time (FIT).
|
||||||
|
|
||||||
\paragraph{Failure in Time (FIT).} Continuous operation is measured in failures per billion ($10^9$) hours of operation.
|
\paragraph{Failure in Time (FIT).} Continuous operation is measured in failures per billion ($10^9$) hours of operation.
|
||||||
For a continuously running nuclear power station, industrial burner or aircraft engine
|
For a continuously running nuclear power station, industrial burner or aircraft engine
|
||||||
we would be interested in its operational FIT values.
|
we would be interested in its operational FIT values.
|
||||||
@ -392,6 +396,12 @@ we would be interested in its operational FIT values.
|
|||||||
automobile braking, or other fail safe measure applied in an emergency, we would be interested in PFD.
|
automobile braking, or other fail safe measure applied in an emergency, we would be interested in PFD.
|
||||||
That is to say, the probability of it failing
|
That is to say, the probability of it failing
|
||||||
to operate correctly on demand.
|
to operate correctly on demand.
|
||||||
|
}
|
||||||
|
{
|
||||||
|
FMEDA is a statistical analysis methodology and is used from one of two perspectives,
|
||||||
|
Probability of Failure on Demand (PFD) (see \ref{survey:pfd}), and Probability of Failure
|
||||||
|
in continuous Operation, or Failure in Time (FIT) (see \ref{survey:fit}).
|
||||||
|
}
|
||||||
|
|
||||||
\subsubsection{The FMEDA Analysis Process}
|
\subsubsection{The FMEDA Analysis Process}
|
||||||
|
|
||||||
@ -406,18 +416,18 @@ environmental conditions. The SYSTEM errors are categorised as `safe' or `danger
|
|||||||
%
|
%
|
||||||
%Statistical data exists for most component types \cite{mil1992}.
|
%Statistical data exists for most component types \cite{mil1992}.
|
||||||
%
|
%
|
||||||
This phase is typically implemented on a spreadsheet
|
%This phase is typically implemented on a spreadsheet
|
||||||
with rows representing each component. A typical component spreadsheet row would
|
%with rows representing each component. A typical component spreadsheet row would
|
||||||
comprise
|
%comprise of
|
||||||
component type, placement,
|
%component type, placement,
|
||||||
part number, environmental stress factors, MTTF, safe/dangerous etc.
|
%part number, environmental stress factors, MTTF, safe/dangerous etc.
|
||||||
%will be a determination of whether the component failing will lead to a `safe'
|
%%will be a determination of whether the component failing will lead to a `safe'
|
||||||
%or `unsafe' condition.
|
%or `unsafe' condition.
|
||||||
|
|
||||||
\paragraph{Overall SYSTEM failure rate.}
|
%\paragraph{Overall SYSTEM failure rate.}
|
||||||
The product failure rate is the sum of all component
|
%The product failure rate is the sum of all component
|
||||||
failure rates. Typically this is the sum of the failure rates (the reciprocals of the MTTF values) for all
|
%failure rates. Typically the sum of all MTTF rates for all
|
||||||
components in an FMEDA spreadsheet.
|
%components in an FMEDA spreadsheet.
|
||||||
%This is the sum of safe and unsafe
|
%This is the sum of safe and unsafe
|
||||||
%failures.
|
%failures.
|
||||||
|
|
||||||
@ -430,16 +440,16 @@ This is done by taking a component failure mode and determining
|
|||||||
if the SYSTEM error it is tied to is dangerous or safe.
|
if the SYSTEM error it is tied to is dangerous or safe.
|
||||||
The decision for this may be
|
The decision for this may be
|
||||||
based on heuristics or field data.
|
based on heuristics or field data.
|
||||||
EN61508 uses the $\lambda$ symbol to represent failure rates.
|
%EN61508 uses the $\lambda$ symbol to represent probabilities.
|
||||||
Because we have statistics for each component failure mode,
|
%Because we have statistics for each component failure mode,
|
||||||
we can now classify these in terms of safe and dangerous lambda values.
|
%we can now now classify these in terms of safe and dangerous lambda values.
|
||||||
The resulting failure rates are labelled `$\lambda_D$' (for
|
%Detectable failure probabilities are labelled `$\lambda_D$' (for
|
||||||
dangerous) and `$\lambda_S$' (for safe) \cite{en61508}.
|
%dangerous) and `$\lambda_S$' (for safe) \cite{en61508}.
|
||||||
|
|
||||||
\paragraph{Determine Detectable and Undetectable Failures.}
|
\paragraph{Determine Detectable and Undetectable Failures.}
|
||||||
Each safe and dangerous failure mode is now
|
Each safe and dangerous failure mode is now
|
||||||
classified as detectable or undetectable.
|
classified as detectable or undetectable.
|
||||||
EN61508 assumes that products have a high level of
|
For the higher integrity levels, EN61508 assumes that products have a high proportion of
|
||||||
self checking features.
|
self checking features.
|
||||||
%
|
%
|
||||||
This gives us four failure mode classifications:
|
This gives us four failure mode classifications:
|
||||||
@ -454,7 +464,7 @@ analysis, the
|
|||||||
% and guess how it will affect an ENTIRE complex SYSTEM
|
% and guess how it will affect an ENTIRE complex SYSTEM
|
||||||
% Admission of failure of the process really !!!!
|
% Admission of failure of the process really !!!!
|
||||||
next step is to investigate using an actual working SYSTEM.
|
next step is to investigate using an actual working SYSTEM.
|
||||||
|
%
|
||||||
Failures are deliberately caused (by physical intervention), and any new SYSTEM level
|
Failures are deliberately caused (by physical intervention), and any new SYSTEM level
|
||||||
failures are added to the model.
|
failures are added to the model.
|
||||||
Heuristics and MTTF failure rates for the components
|
Heuristics and MTTF failure rates for the components
|
||||||
@ -464,38 +474,39 @@ $\lambda_{SD}$, $\lambda_{SU}$, $\lambda_{DD}$, $\lambda_{DU}$).
|
|||||||
These new failures are added to the model.
|
These new failures are added to the model.
|
||||||
%SD, SU, DD, DU.
|
%SD, SU, DD, DU.
|
||||||
|
|
||||||
With these classifications, and statistics for each component
|
%With these classifications, and statistics for each component
|
||||||
we can now calculate statistics for the diagnostic coverage (how good at `self checking' the system is)
|
%we can now calculate statistics for the diagnostic coverage (how good at `self checking' the system is)
|
||||||
and its safe failure fraction (how many of its failures are self detected or safe compared to
|
%and its safe failure fraction (how many of its failures are self detected or safe compared to
|
||||||
all failures possible).
|
%all failures possible).
|
||||||
|
%
|
||||||
The calculations for these are described below.
|
%The calculations for these are described below.
|
||||||
|
|
||||||
\paragraph{Diagnostic Coverage.}
|
|
||||||
The diagnostic coverage is simply the ratio
|
|
||||||
of the dangerous detected probabilities
|
|
||||||
against the probability of all dangerous failures,
|
|
||||||
and is normally expressed as a percentage. $\Sigma\lambda_{DD}$ represents
|
|
||||||
the summed failure rate of dangerous detected base component failure modes, and
|
|
||||||
$\Sigma\lambda_D$ the summed failure rate of all dangerous base component failure modes.
|
|
||||||
|
|
||||||
$$ DiagnosticCoverage = \Sigma\lambda_{DD} / \Sigma\lambda_D $$
|
|
||||||
|
|
||||||
The diagnostic coverage for safe failures, where $\Sigma\lambda_{SD}$ represents the summed failure rate of
|
|
||||||
safe detected base component failure modes,
|
|
||||||
and $\Sigma\lambda_S$ the summed failure rate of all safe base component failure modes,
|
|
||||||
is given as
|
|
||||||
|
|
||||||
$$ SF = \frac{\Sigma\lambda_{SD}}{\Sigma\lambda_S} $$
|
|
||||||
|
|
||||||
|
|
||||||
|
%\paragraph{Diagnostic Coverage.}
|
||||||
|
%The diagnostic coverage is simply the ratio
|
||||||
|
%of the dangerous detected probabilities
|
||||||
|
%against the probability of all dangerous failures,
|
||||||
|
%and is normally expressed as a percentage.
|
||||||
|
%%$\Sigma\lambda_{DD}$ represents
|
||||||
|
%the percentage of dangerous detected base component failure modes, and
|
||||||
|
%$\Sigma\lambda_D$ the total number of dangerous base component failure modes.
|
||||||
|
%
|
||||||
|
%$$ DiagnosticCoverage = \Sigma\lambda_{DD} / \Sigma\lambda_D $$
|
||||||
|
%
|
||||||
|
%The diagnostic coverage for safe failures, where $\Sigma\lambda_{SD}$ represents the percentage of
|
||||||
|
%safe detected base component failure modes,
|
||||||
|
%and $\Sigma\lambda_S$ the total number of safe base component failure modes,
|
||||||
|
%is given as
|
||||||
|
%
|
||||||
|
%$$ SF = \frac{\Sigma\lambda_{SD}}{\Sigma\lambda_S} $$
|
||||||
|
%
|
||||||
|
%
|
||||||
\paragraph{Safe Failure Fraction.}
|
\paragraph{Safe Failure Fraction.}
|
||||||
A key concept in FMEDA is Safe Failure Fraction (SFF).
|
A key concept in FMEDA is Safe Failure Fraction (SFF).
|
||||||
This is the ratio of safe and dangerous detected failures
|
This is the ratio of safe and dangerous detected failures
|
||||||
against all safe and dangerous failure probabilities.
|
against all safe and dangerous failure probabilities.
|
||||||
Again this is usually expressed as a percentage.
|
%Again this is usually expressed as a percentage.
|
||||||
|
|
||||||
$$ SFF = \big( \Sigma\lambda_S + \Sigma\lambda_{DD} \big) / \big( \Sigma\lambda_S + \Sigma\lambda_D \big) $$
|
%$$ SFF = \big( \Sigma\lambda_S + \Sigma\lambda_{DD} \big) / \big( \Sigma\lambda_S + \Sigma\lambda_D \big) $$
|
||||||
|
|
||||||
%This is the ratio of
|
%This is the ratio of
|
||||||
%Step 4 Calculate SFF, SIL and PFD
|
%Step 4 Calculate SFF, SIL and PFD
|
||||||
@ -610,7 +621,8 @@ and its international analog standard IOC5108.
|
|||||||
\subsubsection{ FMEDA weaknesses }
|
\subsubsection{ FMEDA weaknesses }
|
||||||
\begin{itemize}
|
\begin{itemize}
|
||||||
\item Possibility to miss the effects of failure modes at SYSTEM level.
|
\item Possibility to miss the effects of failure modes at SYSTEM level.
|
||||||
\item Statistical nature allows a proportion of undetected failures for given S.I.L. level.
|
\item Statistical nature allows a proportion of undetected failures for a given SIL. These could be catastrophic failures; as long as the perceived probability is low enough, they are considered acceptable under EN61508.
|
||||||
|
\item Complex component interaction effects are more likely to be seen than with FMEA or FMECA (because self-diagnostic capability is considered), but can still be missed.
|
||||||
\item Allows a small proportion of `undetectable' error conditions.
|
\item Allows a small proportion of `undetectable' error conditions.
|
||||||
\item No possibility to model base component level double failure modes.
|
\item No possibility to model base component level double failure modes.
|
||||||
\end{itemize}
|
\end{itemize}
|
||||||
|
@ -81,6 +81,7 @@
|
|||||||
\newcommand{\pin}{\ensuremath{\stackrel{pi}{\longleftrightarrow}}}
|
\newcommand{\pin}{\ensuremath{\stackrel{pi}{\longleftrightarrow}}}
|
||||||
%\newcommand{\pic}{\em pure~intersection~chain}
|
%\newcommand{\pic}{\em pure~intersection~chain}
|
||||||
\newcommand{\pic}{\em pair-wise~intersection~chain}
|
\newcommand{\pic}{\em pair-wise~intersection~chain}
|
||||||
|
\newcommand{\wrt}{\em with-respect~to}
|
||||||
|
|
||||||
%----- Display example text (#1) in typewriter font
|
%----- Display example text (#1) in typewriter font
|
||||||
|
|
||||||
|
@ -12,20 +12,391 @@ A survey of Static Failure Mode analysis Methodologies applicable to saefty crit
|
|||||||
A survey of Static Failure Mode analysis Methodologies applicable to safety critical systems.
|
A survey of Static Failure Mode analysis Methodologies applicable to safety critical systems.
|
||||||
}
|
}
|
||||||
|
|
||||||
\section{FMEA}
|
There are four methodologies in common use for failure mode modelling.
|
||||||
|
These are FTA, FMEA, FMECA
|
||||||
|
and FMEDA (a form of statistical assessment).
|
||||||
|
%
|
||||||
|
These methodologies date from the 1940s onwards, and were designed for
|
||||||
|
different application areas and reasons; all have drawbacks and
|
||||||
|
advantages that are discussed in the next section.
|
||||||
|
%In short
|
||||||
|
%FTA, due to its top down nature, can overlook error conditions. FMEA and the Statistical Methods
|
||||||
|
%lack precision in predicting failure modes at the SYSTEM level.
|
||||||
|
|
||||||
Two meanings: a general one, Fault Mode Effects Analysis, meaning general static diagnosis of a design, looking
|
\ifthenelse {\boolean{paper}}
|
||||||
at faults that can occur and their effect.
|
{
|
||||||
|
paper
|
||||||
|
}
|
||||||
|
{
|
||||||
|
chapter
|
||||||
|
}
|
||||||
|
presents the design considerations that motivated and provided the specification for
|
||||||
|
the FMMD methodology.
|
||||||
|
%
|
||||||
|
|
||||||
|
\subsection { FTA }
|
||||||
|
|
||||||
|
This, like all top~down methodologies, introduces the very serious problem
|
||||||
|
of missing component failure modes \cite[Ch.9]{faa}.
|
||||||
|
%, or modelling at
|
||||||
|
%a too high level of failure mode abstraction.
|
||||||
|
FTA was invented for use on the Minuteman nuclear defence missile
|
||||||
|
systems in the early 1960s and was not designed as a rigorous
|
||||||
|
fault/failure mode methodology.
|
||||||
|
It was designed to look for disastrous top level hazards and
|
||||||
|
determine how they could be caused.
|
||||||
|
It is more like a procedure to
|
||||||
|
be applied when discussing the safety of a system, with a top down hierarchical
|
||||||
|
notation using logic symbols, that guides the analysis.
|
||||||
|
This methodology was designed for
|
||||||
|
experienced engineers sitting around a large diagram and discussing the safety aspects.
|
||||||
|
Also the nature of a large rocket with red wire, and remote detonation
|
||||||
|
failsafes meant that the objective was to iron out common failures
|
||||||
|
not to rigorously detect all possible failures.
|
||||||
|
Consequently it was not designed to guarantee coverage of all component failure modes,
|
||||||
|
and has no rigorous in-built safeguards to ensure coverage of all possible
|
||||||
|
system level outcomes.
|
||||||
|
|
||||||
|
\subsubsection{ FTA weaknesses }
|
||||||
|
\begin{itemize}
|
||||||
|
\item Possibility to miss component failure modes.
|
||||||
|
\item Possibility to miss environmental effects.
|
||||||
|
\item No possibility to model base component level double failure modes.
|
||||||
|
\end{itemize}
|
||||||
|
|
||||||
|
\subsection { FMEA }
|
||||||
|
|
||||||
|
\label{pfmea}
|
||||||
|
This is an early static analysis methodology, and concentrates
|
||||||
|
on SYSTEM level errors which have been investigated.
|
||||||
|
The investigation will typically point to a particular failure
|
||||||
|
of a component.
|
||||||
|
The methodology is now applied to find the significance of the failure.
|
||||||
|
It is based on a simple equation where $S$ ranks the severity (or cost \cite{bfmea}) of the identified SYSTEM failure,
|
||||||
|
$O$ its occurrence\footnote{The occurrence $O$ is the
|
||||||
|
probability of the failure happening.},
|
||||||
|
and $D$ giving the failure's detectability\footnote{Detectability: often failures
|
||||||
|
may occur but not be noticed or cause an effect.
|
||||||
|
Consider an unused feature failing.}. Multiplying these
|
||||||
|
together,
|
||||||
|
gives a risk priority number (RPN), given by $RPN = S \times O \times D$.
|
||||||
|
This gives in effect
|
||||||
|
a prioritised `todo list', with higher $RPN$ values being the most urgent.
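
As an illustration only, consider two identified failures. The rankings below are hypothetical, and the 1 to 10 scale for each factor is an assumption made for this sketch rather than something taken from the cited sources.

$$ RPN_1 = S \times O \times D = 7 \times 3 \times 5 = 105 $$

$$ RPN_2 = S \times O \times D = 4 \times 2 \times 2 = 16 $$

The first failure would therefore appear above the second in the prioritised list.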
|
||||||
|
|
||||||
|
|
||||||
\subsection{Manufacturing Cost Reduction FMEA}
|
\subsubsection{ FMEA weaknesses }
|
||||||
|
\begin{itemize}
|
||||||
|
\item Possibility to miss the effects of failure modes at SYSTEM level.
|
||||||
|
\item Possibility to miss environmental effects.
|
||||||
|
\item No possibility to model base component level double failure modes.
|
||||||
|
\end{itemize}
|
||||||
|
|
||||||
Second, a methodology for reducing cost in manufacturing by taking faults, their frequency
|
\paragraph{Note.} FMEA is sometimes used in its literal sense, that is to say
|
||||||
and their cost, multiplying these together, and then coming up with a priority list
|
Failure Mode Effects analysis, simply looking at a system's internal failure
|
||||||
for fixing known faults.
|
modes and determining what may happen as a result.
|
||||||
"The basics of FMEA by Robin E. McDermott et all"
|
FMEA described in this section (\ref{pfmea}) is sometimes called `production FMEA'.
|
||||||
ISBN 0-527-76320-9.
|
|
||||||
|
|
||||||
|
\subsection{FMECA}
|
||||||
|
|
||||||
|
Failure mode, effects, and criticality analysis (FMECA) extends FMEA.
|
||||||
|
This is a bottom up methodology, which takes component failure modes
|
||||||
|
and traces them to the SYSTEM level failures.
|
||||||
|
%
|
||||||
|
Reliability data for components is used to predict the
|
||||||
|
failure statistics in the design stage.
|
||||||
|
An openly published source for the reliability of generic
|
||||||
|
electronic components was published by the DOD
|
||||||
|
in 1991 (MIL HDK 1991 \cite{mil1991}) and is a typical
|
||||||
|
source for MTTF data.
|
||||||
|
%
|
||||||
|
FMECA has a probability factor for a component error becoming % causing
|
||||||
|
a SYSTEM level error.
|
||||||
|
This is termed the $\beta$ factor.
|
||||||
|
%\footnote{for a given component failure mode there will be a $\beta$ value, the
|
||||||
|
%probability that the component failure mode will cause a given SYSTEM failure}.
|
||||||
|
%
|
||||||
|
This lacks precision, or in other words, determinability prediction accuracy \cite{fafmea},
|
||||||
|
as often the component failure mode cannot be proven to cause a SYSTEM level failure, but is
|
||||||
|
assigned a probability $\beta$ factor by the design engineer. The use of a $\beta$ factor
|
||||||
|
is often justified using Bayes' theorem \cite{probstat}.
|
||||||
|
%Also, it can miss combinations of failure modes that will cause SYSTEM level errors.
|
||||||
|
%
|
||||||
|
The results of FMECA are similar to FMEA, in that component errors are
|
||||||
|
listed according to importance, based on
|
||||||
|
probability of occurrence and criticality.
|
||||||
|
% to prevent the SYSTEM fault of given criticallity.
|
||||||
|
Again this essentially produces a prioritised `todo' list.
|
||||||
|
|
||||||
|
%%-WIKI- Failure mode, effects, and criticality analysis (FMECA) is an extension of failure mode and effects analysis (FMEA).
|
||||||
|
%%-WIKI- FMEA is a a bottom-up, inductive analytical method which may be performed at either the functional or
|
||||||
|
%%-WIKI- piece-part level. FMECA extends FMEA by including a criticality analysis, which is used to chart the
|
||||||
|
%%-WIKI- probability of failure modes against the severity of their consequences. The result highlights failure modes with relatively high probability
|
||||||
|
%%-WIKI- and severity of consequences, allowing remedial effort to be directed where it will produce the greatest value.
|
||||||
|
%%-WIKI- FMECA tends to be preferred over FMEA in space and North Atlantic Treaty Organization (NATO) military applications,
|
||||||
|
%%-WIKI- while various forms of FMEA predominate in other industries.
|
||||||
|
|
||||||
|
|
||||||
|
\subsubsection{ FMECA weaknesses }
|
||||||
|
\begin{itemize}
|
||||||
|
\item Possibility to miss the effects of failure modes at SYSTEM level.
|
||||||
|
\item Possibility to miss environmental effects.
|
||||||
|
\item No possibility to model base component level double failure modes.
|
||||||
|
\end{itemize}
|
||||||
|
|
||||||
|
|
||||||
|
\subsection { FMEDA or Statistical Analysis }
|
||||||
|
|
||||||
|
Failure Modes, Effects, and Diagnostic Analysis (FMEDA)
|
||||||
|
% This
|
||||||
|
is a process that takes all the components in a system,
|
||||||
|
and using the failure modes of those components, the investigating engineer
|
||||||
|
ties them to possible SYSTEM level events/failure modes.
|
||||||
|
%
|
||||||
|
This technique
|
||||||
|
evaluates a product's statistical level of safety
|
||||||
|
taking into account its self-diagnostic ability.
|
||||||
|
The calculations and procedures for FMEDA are
|
||||||
|
described in EN61508 %Part 2 Appendix C
|
||||||
|
\cite[Part 2 App C]{en61508}.
|
||||||
|
The following gives an outline of the procedure.
|
||||||
|
|
||||||
|
|
||||||
|
\subsubsection{Two statistical perspectives}
|
||||||
|
FMEDA is a statistical analysis methodology and is used from one of two perspectives,
|
||||||
|
Probability of Failure on Demand (PFD), and Probability of Failure
|
||||||
|
in continuous Operation, or Failure in Time (FIT).
|
||||||
|
\label{survey:fit}
|
||||||
|
\paragraph{Failure in Time (FIT).} Continuous operation is measured in failures per billion ($10^9$) hours of operation.
|
||||||
|
For a continuously running nuclear power station, industrial burner or aircraft engine
|
||||||
|
we would be interested in its operational FIT values.
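
As a sketch of the unit conversion, assuming a constant failure rate and a hypothetical component rated at 100 FIT (the symbol $\lambda$ is used here for the failure rate per hour):

$$ \lambda = 100 \times 10^{-9} = 10^{-7} \; \mbox{failures per hour}, \qquad MTTF = 1/\lambda = 10^{7} \; \mbox{hours} $$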
|
||||||
|
\label{survey:pfd}
|
||||||
|
\paragraph{Probability of Failure on Demand (PFD).} For instance with an anti-lock system in
|
||||||
|
automobile braking, or other fail safe measure applied in an emergency, we would be interested in PFD.
|
||||||
|
That is to say, the probability of it failing
|
||||||
|
to operate correctly on demand.
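
A commonly used simplified approximation for the average PFD of a single channel can be sketched as follows. Here $\lambda_{DU}$ and $\lambda_{DD}$ are the dangerous undetected and dangerous detected failure rates per hour; $T_{proof}$ (proof test interval) and $T_{repair}$ (repair time), both in hours, are symbols introduced for this illustration, and all numerical values are hypothetical.

$$ PFD \approx \lambda_{DU} \, T_{proof} / 2 + \lambda_{DD} \, T_{repair} $$

$$ PFD \approx (10^{-7} \times 8760)/2 + (9 \times 10^{-7} \times 8) = 4.38 \times 10^{-4} + 7.2 \times 10^{-6} \approx 4.45 \times 10^{-4} $$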
|
||||||
|
|
||||||
|
\subsubsection{The FMEDA Analysis Process}
|
||||||
|
|
||||||
|
\paragraph{Determine SYSTEM level failures from base components}
|
||||||
|
The first stage is to apply FMEA to the SYSTEM.
|
||||||
|
%
|
||||||
|
Each component is analysed in terms of how its failure
|
||||||
|
would affect the system.
|
||||||
|
Failure rates of individual components in the SYSTEM
|
||||||
|
are calculated based on component type and
|
||||||
|
environmental conditions. The SYSTEM errors are categorised as `safe' or `dangerous'.
|
||||||
|
%
|
||||||
|
%Statistical data exists for most component types \cite{mil1992}.
|
||||||
|
%
|
||||||
|
This phase is typically implemented on a spreadsheet
|
||||||
|
with rows representing each component. A typical component spreadsheet row would
|
||||||
|
comprise
|
||||||
|
component type, placement,
|
||||||
|
part number, environmental stress factors, MTTF, safe/dangerous etc.
|
||||||
|
%will be a determination of whether the component failing will lead to a `safe'
|
||||||
|
%or `unsafe' condition.
|
||||||
|
|
||||||
|
\paragraph{Overall SYSTEM failure rate.}
|
||||||
|
The product failure rate is the sum of all component
|
||||||
|
failure rates. Typically this is the sum of the failure rates (the reciprocals of the MTTF values) for all
|
||||||
|
components in an FMEDA spreadsheet.
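
Under the usual assumption of constant failure rates, this summation can be sketched as follows. The symbols $\lambda_i$ and $MTTF_i$ are notation introduced here for the $i$th component, and the three component values are hypothetical.

$$ \lambda_{SYSTEM} = \sum_i \lambda_i = \sum_i \frac{1}{MTTF_i} $$

$$ \lambda_{SYSTEM} = 10 + 25 + 5 = 40 \; \mbox{FIT}, \qquad MTTF_{SYSTEM} = 10^{9}/40 = 2.5 \times 10^{7} \; \mbox{hours} $$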
|
||||||
|
%This is the sum of safe and unsafe
|
||||||
|
%failures.
|
||||||
|
|
||||||
|
\paragraph{Self Diagnostics.}
|
||||||
|
We next evaluate the SYSTEM's self-diagnostic ability.
|
||||||
|
|
||||||
|
%Each component’s failure modes and failure rate are now available.
|
||||||
|
Failure modes are now classified as safe or dangerous.
|
||||||
|
This is done by taking a component failure mode and determining
|
||||||
|
if the SYSTEM error it is tied to is dangerous or safe.
|
||||||
|
The decision for this may be
|
||||||
|
based on heuristics or field data.
|
||||||
|
EN61508 uses the $\lambda$ symbol to represent failure rates.
|
||||||
|
Because we have statistics for each component failure mode,
|
||||||
|
we can now classify these in terms of safe and dangerous lambda values.
|
||||||
|
The resulting failure rates are labelled `$\lambda_D$' (for
|
||||||
|
dangerous) and `$\lambda_S$' (for safe) \cite{en61508}.
|
||||||
|
|
||||||
|
\paragraph{Determine Detectable and Undetectable Failures.}
|
||||||
|
Each safe and dangerous failure mode is now
|
||||||
|
classified as detectable or undetectable.
|
||||||
|
EN61508 assumes that products have a high level of
|
||||||
|
self checking features.
|
||||||
|
%
|
||||||
|
This gives us four failure mode classifications:
|
||||||
|
Safe-Detected (SD), Safe-Undetected (SU), Dangerous-Detected (DD) or Dangerous-Undetected (DU),
|
||||||
|
and the probabilistic failure rate of each classification
|
||||||
|
is represented by lambda variables
|
||||||
|
(i.e. $\lambda_{SD}$, $\lambda_{SU}$, $\lambda_{DD}$, $\lambda_{DU}$).
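
In terms of these four rates, the safe and dangerous totals can be written as follows. This is a restatement of the classification above, not an additional result quoted from the standard.

$$ \lambda_S = \lambda_{SD} + \lambda_{SU}, \qquad \lambda_D = \lambda_{DD} + \lambda_{DU} $$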
|
||||||
|
|
||||||
|
Because it is recognised that some failure modes may not be discovered theoretically during the static
|
||||||
|
analysis, the
|
||||||
|
% admission of how daft it is to take a component failure mode on its own
|
||||||
|
% and guess how it will affect an ENTIRE complex SYSTEM
|
||||||
|
% Admission of failure of the process really !!!!
|
||||||
|
next step is to investigate using an actual working SYSTEM.
|
||||||
|
|
||||||
|
Failures are deliberately caused (by physical intervention), and any new SYSTEM level
|
||||||
|
failures are added to the model.
|
||||||
|
Heuristics and MTTF failure rates for the components
|
||||||
|
are used to calculate probabilities for these new failure modes
|
||||||
|
along with their safety and detectability classifications (i.e.
|
||||||
|
$\lambda_{SD}$, $\lambda_{SU}$, $\lambda_{DD}$, $\lambda_{DU}$).
|
||||||
|
These new failures are added to the model.
|
||||||
|
%SD, SU, DD, DU.
|
||||||
|
|
||||||
|
With these classifications, and statistics for each component
|
||||||
|
we can now calculate statistics for the diagnostic coverage (how good at `self checking' the system is)
|
||||||
|
and its safe failure fraction (how many of its failures are self detected or safe compared to
|
||||||
|
all failures possible).
|
||||||
|
|
||||||
|
The calculations for these are described below.
|
||||||
|
|
||||||
|
\paragraph{Diagnostic Coverage.}
|
||||||
|
The diagnostic coverage is simply the ratio
|
||||||
|
of the dangerous detected probabilities
|
||||||
|
against the probability of all dangerous failures,
|
||||||
|
and is normally expressed as a percentage. $\Sigma\lambda_{DD}$ represents
|
||||||
|
the summed failure rate of dangerous detected base component failure modes, and
|
||||||
|
$\Sigma\lambda_D$ the summed failure rate of all dangerous base component failure modes.
|
||||||
|
|
||||||
|
$$ DiagnosticCoverage = \Sigma\lambda_{DD} / \Sigma\lambda_D $$
|
||||||
|
|
||||||
|
The diagnostic coverage for safe failures, where $\Sigma\lambda_{SD}$ represents the summed failure rate of
|
||||||
|
safe detected base component failure modes,
|
||||||
|
and $\Sigma\lambda_S$ the summed failure rate of all safe base component failure modes,
|
||||||
|
is given as
|
||||||
|
|
||||||
|
$$ SF = \frac{\Sigma\lambda_{SD}}{\Sigma\lambda_S} $$
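
As a worked illustration with hypothetical totals, suppose $\Sigma\lambda_{DD} = 90$ FIT and $\Sigma\lambda_D = 100$ FIT for the dangerous failures, and $\Sigma\lambda_{SD} = 60$ FIT and $\Sigma\lambda_S = 100$ FIT for the safe failures. The two ratios then evaluate to:

$$ DiagnosticCoverage = 90 / 100 = 90\%, \qquad SF = \frac{60}{100} = 60\% $$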
|
||||||
|
|
||||||
|
|
||||||
|
\paragraph{Safe Failure Fraction.}
|
||||||
|
A key concept in FMEDA is Safe Failure Fraction (SFF).
|
||||||
|
This is the ratio of safe and dangerous detected failures
|
||||||
|
against all safe and dangerous failure probabilities.
|
||||||
|
Again this is usually expressed as a percentage.
|
||||||
|
|
||||||
|
$$ SFF = \big( \Sigma\lambda_S + \Sigma\lambda_{DD} \big) / \big( \Sigma\lambda_S + \Sigma\lambda_D \big) $$
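
As a worked illustration with hypothetical totals, suppose $\Sigma\lambda_S = 100$ FIT, $\Sigma\lambda_{DD} = 90$ FIT and $\Sigma\lambda_D = 100$ FIT. The safe failure fraction is then:

$$ SFF = \big( 100 + 90 \big) / \big( 100 + 100 \big) = 190 / 200 = 95\% $$

Only the 10 FIT of dangerous undetected failures count against the fraction in this sketch.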
|
||||||
|
|
||||||
|
%This is the ratio of
|
||||||
|
%Step 4 Calculate SFF, SIL and PFD
|
||||||
|
%The SIL level of the product is finally determined from the Safe Failure Fraction (SFF) and the Probability of Failure on Demand (PFD). The following formulas are used.
|
||||||
|
%SFF = (lSD + lSU + lDD) / (lSD + lSU + lDD + lDU)
|
||||||
|
%PFD = (lDU)(Proof Test Interval)/2 + (lDD)(Down Time or Repair Time)
|
||||||
|
|
||||||
|
% Often a given component failure mode there will be a $\beta$ value, the
|
||||||
|
% probability that the component failure mode will cause a given SYSTEM failure.
|
||||||
|
|
||||||
|
%\paragraph{Risk Mitigation}
|
||||||
|
%
|
||||||
|
%The component may be have its risk factor
|
||||||
|
%reduced by the checking interval (or $\tau$ time between self checking procedures).
|
||||||
|
%
|
||||||
|
%Ultimately this technique calculates a risk factor for each component.
|
||||||
|
%The risk factors of all the components are summed and
|
||||||
|
%%give a value for the `safety level' for the equipment in a given environment.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
\paragraph{Classification into Safety Integrity Levels (SIL).}
|
||||||
|
There are four SIL levels, from 1 to 4 with 4 being the highest safety level.
|
||||||
|
In addition to probabilistic risk factors, the
|
||||||
|
diagnostic coverage and SFF
|
||||||
|
have threshold bands becoming stricter for each level.
|
||||||
|
Demanded software verification and specification techniques and constraints
|
||||||
|
(such as language subsets, s/w redundancy etc)
|
||||||
|
become stricter for each SIL level.
|
||||||
|
%%
|
||||||
|
%% Andrew asked me to expand on this here, but it would take at least two
|
||||||
|
%% pages. I think its more appropriate for the survey.tex chapter.
|
||||||
|
%%
|
||||||
|
|
||||||
|
Thus FMEDA uses statistical methods to determine
|
||||||
|
a safety level (SIL), typically used to meet an acceptable risk
|
||||||
|
value, specified for the environment the SYSTEM must work in.
|
||||||
|
EN61508 defines in general terms,
|
||||||
|
risk assessment and required SIL levels \cite{en61508} [5 Annex A].
|
||||||
|
|
||||||
|
%the probability of
|
||||||
|
%failures occurring, and provide an adaquate risk level.
|
||||||
|
%
|
||||||
|
%A component failure mode, given its MTTF
|
||||||
|
%the probability of detecting the fault and its safety relevant validation time $\tau$,
|
||||||
|
%contributes a simple risk factor that is summed
|
||||||
|
%in to give a final risk result.
|
||||||
|
%
|
||||||
|
Thus an FMEDA
model can be implemented on a spreadsheet, with one component failure mode per row.
Each row records a calculated risk, a fault detection time (if any), an estimated risk importance
and other factors such as de-rating and environmental stress.
From these rows,
all the statistical factors for SIL rating can be produced\footnote{A SIL rating will apply
to an installed plant, i.e. a complete installed and working SYSTEM. SIL ratings for individual components or
sub-systems are meaningless; the nearest equivalents would be the FIT/PFD, SFF and diagnostic coverage figures.}.
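
A minimal sketch of such a spreadsheet is shown below; the components, failure modes,
failure rates and classifications are hypothetical and chosen only to illustrate the layout.

\begin{center}
\begin{tabular}{|l|l|r|l|l|}
\hline
Component & Failure mode & $\lambda$ (FIT) & Classification & Detected \\ \hline
R1 & OPEN  & 10 & dangerous & yes \\ \hline
R1 & SHORT &  5 & safe      & n/a \\ \hline
C1 & SHORT &  5 & dangerous & no  \\ \hline
\end{tabular}
\end{center}

Summing the columns for these assumed rows gives $\Sigma\lambda_S = 5$, $\Sigma\lambda_{DD} = 10$ and
$\Sigma\lambda_D = 15$, so that $SFF = (5 + 10)/(5 + 15) = 75\%$.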

\subsubsection{FMEDA and failure outcome prediction accuracy.}
FMEDA suffers from the same lack of accuracy in predicting
component failure mode outcomes as FMEA (see section \ref{pfmea}).
%
This is because the analyst has to decide how the failure of a particular component will impact on the SYSTEM or top level.
This involves a `leap of faith'. For instance, a resistor in a sensor circuit
may be part of a critical monitoring function.
The analyst is then put in a position
where a failure of that resistor should probably be classified as dangerous.
%
There is no analysis
of how the failed resistor would affect the components close to it, but because the circuitry
is part of a critical section it will most likely
be linked to a dangerous system level failure in an FMEDA study.
%
%%- IS THIS TRUE IS THERE A BETA FACTOR IN FMEDA????
%%-
%A $\beta$ factor, the heuristically defined probability
%of the failure causing the system fault may be applied.
%
%In FMEDA there is no detailed analysis of the failure mode behaviour
%of the component in its local environment
%Component failure modes are traceable directly to the SYSTEM level.
%it becomes more
%guess work than science.
%
With FMEDA, there is no rigorous cause and effect analysis of the failure modes
and how they interact on the micro scale (with the components adjacent to them in terms of functionality).
Unintended side effects that lead to failure can be missed.
Also, component failure modes that are not
dangerous may be wrongly classified as dangerous simply because they exist in a critical
section of the product.

% some critical component failure
%modes, but we can only guess, in most cases what the safety case outcome
%will be if it occurs.

This leads to the practice of partitioning the components within a SYSTEM into different
safety level zones, as recommended in EN61508~\cite{en61508}. This is a vague way of determining
safety, as it can miss effects due to unexpected component interaction.

The statistical analysis methodology is the core philosophy
of the Safety Integrity Levels (SIL) embodied in EN61508 \cite{en61508}
and its international equivalent, IEC61508.

\subsubsection{ FMEDA weaknesses }
\begin{itemize}
\item Possibility to miss the effects of failure modes at SYSTEM level.
\item Statistical nature allows a proportion of undetected failures for a given SIL level.
\item Allows a small proportion of `undetectable' error conditions.
\item No possibility to model base component level double failure modes.
\end{itemize}
%AND then how we can solve all there problems

\subsection{Deterministic FMEA}