\label{sec:chap8}
|
|
|
|
This study has examined the processes and state of the art of the four main FMEA variants.
|
|
%
|
|
It has exposed shortcomings in these methodologies, which can be summed up as an inability to
|
|
model hybrid software and hardware systems in a satisfactory manner, a problem with state explosion
|
|
and difficulty of re-use of analysis because there is no support for modularity.
|
|
%
|
|
The FMECA and FMEDA variants also suffer from embedding subjective and objective assessments of failure modes.
|
|
%
|
|
A modularised FMEA, Failure Mode Modular De-composition (FMMD), has been proposed.
%
This modularised version is supported by the definitions of
{\fms} for {\bcs} already established in the literature~\cite{fmd91,mil1991,en298,en230}.
|
|
%
|
|
A selection of electronic examples was analysed using FMMD
|
|
which deliberately introduced varying circuit
|
|
topologies with conventional and circular signal paths
|
|
and mixed digital and analogue designs.
|
|
%
|
|
For all these examples, the state explosion related performance was compared with that of
|
|
traditional FMEA.
|
|
%
|
|
In all but trivial cases there was a performance gain;
that is to say,
the number of manual analysis operations to perform
was significantly reduced.
%
In addition, the analysis naturally provided modules which could be re-used,
not only in the circuit under analysis but potentially in different and future projects as well.
|
|
|
|
Traditional FMEA methods have been applied to software, but analysis has always been performed separately from
|
|
the electronic FMEA~\cite{sfmeaa,sfmea}. %, and while modular kept strictly to a bottom-up approach.
|
|
%
|
|
Using established concepts from contract programming~\cite{dbcbe} FMMD was extended to analyse software,
|
|
which facilitated a solution to the software/hardware interfacing problem~\cite{sfmeainterface}.
|
|
%
|
|
Two examples of mixed software and hardware systems were analysed as integrated FMMD models
|
|
as proof of concept. The first example, in chapter~\ref{sec:chap6}, was
|
|
presented to the System Safety IET conference in 2012~\cite{syssafe2012}.
|
|
%
|
|
Chapter~\ref{sec:chap7} viewed FMMD from a formal perspective and looked at problems and constraints
|
|
necessary to perform FMEA and FMMD.
|
|
%
|
|
Theoretical performance models were developed (see section~\ref{sec:theoreticalperfmodel}) which showed that with increasing modularisation
|
|
the number of manual checks to perform for analysis fell, which was validated by examining the reasoning distance performance of
|
|
the examples from chapter~\ref{sec:chap5}. % in this regard.
|
|
%
|
|
A unitary state failure mode concept was developed (see section~\ref{sec:unitarystate}), and it was shown that
|
|
the FMMD process naturally enforced this throughout the hierarchy of a model.
|
|
%
|
|
Finally the FMMD process was described algorithmically using set theory in appendix~\ref{sec:algorithmfmmd}.%{app:alg}.
|
|
|
|
In conclusion then, a new method of failure analysis has been devised which improves on established techniques in the following ways:
|
|
% \begin{itemize}
|
|
% \item Must be able to analyse hybrid software/hardware systems,
|
|
% \item no state explosion (which has rendered exhaustive analysis impractical),
|
|
% \item exhaustive checking at a modular level, %(total failure coverage within {\fgs} all interacting component and failure modes checked),
|
|
% \item traceable reasoning system models,% to aid repeatability and checking,
|
|
% \item re-usable i.e. it should be possible to re-use analysis,
|
|
% \item possibility to analyse simultaneous/multiple failures,
|
|
% \item modular --- i.e. usable in a distributed system.
|
|
% % \item
|
|
% \end{itemize}
|
|
|
|
\begin{itemize}
|
|
\item FMMD provides the means to create failure models that integrate software and hardware,
|
|
\item the state explosion related to exhaustive FMEA is solved,
|
|
\item a modular approach to FMEA means that analysis work is re-usable,
|
|
%\item FMMD encourages
|
|
\item distributed systems, and smart instruments, can now be analysed and assessed,
|
|
\item multiple failures can be analysed (without an undue state explosion cost).
|
|
\end{itemize}
|
|
These benefits fall under the following assumptions and constraints:
|
|
\begin{itemize}
|
|
\item Failure modes are available for all {\bcs},
|
|
\item Analysts are capable of finding suitable {\fgs} from electronic schematics,
|
|
\item Software is hierarchical and its elements (functions) can be modelled using contract programming.
|
|
%\item
|
|
\end{itemize}
|
|
|
|
|
|
|
|
Whilst investigating FMMD a number of further areas for research revealed themselves.
|
|
These are presented below.
|
|
|
|
%\section{Conclusion}
|
|
|
|
% It is the authors belief that the practise of FMEA would be improved by taking a modular approach
|
|
% and that it is necessary that software and hardware should be included in the same failure mode models.
|
|
% %
|
|
% The proposed methodology, FMMD, provides the means to do this, and it is the authors hope that this
|
|
% or a variant thereof is taken up and used to improve system safety.
|
|
|
|
\section{Further Work}
|
|
%This section describes areas that the study has revealed where the FMMD methodology may be extended or improved.
|
|
\subsection{How traditional FMEA reports can be derived from an FMMD model.}
|
|
%
|
|
An FMMD model has a data structure (described by UML diagrams, see figure~\ref{fig:cfg}), and by traversing an FMMD hierarchy
|
|
we can map system level failures back to {\bc} {\fms} (or combinations thereof).
|
|
%
|
|
Because we can determine these mappings we can produce reports in the traditional FMEA format ({\bc}~{\fm}~$\mapsto$~{system failure}).
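The traversal can be sketched formally. The notation below is introduced here for illustration only and is not taken from the appendix: $B$ denotes the set of {\bc} {\fms}, and $\mathit{causes}(s)$ denotes the set of component failure modes within a {\fg} that were resolved to the symptom $s$ of its {\dc}. Considering single failures only, the {\bc} {\fms} that can give rise to a failure $s$ are then
\begin{equation}
\mathit{roots}(s) = \left\{ \begin{array}{ll}
\{ s \} & \mbox{if } s \in B, \\
\bigcup_{c \in \mathit{causes}(s)} \mathit{roots}(c) & \mbox{otherwise.}
\end{array} \right.
\end{equation}
Under these assumptions, each pair $(b, s)$ with $b \in \mathit{roots}(s)$, for a system level failure $s$, would supply one row of a report in the traditional FMEA format.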
|
|
%
|
|
With the addition of {\bc} {\fm} statistics~\cite{mil1991} we can provide reliability predictions for system level failures.
|
|
%
|
|
The Pt100 example is revisited for this purpose and analysed for single and double failures, with statistics for {\bcs}
|
|
taken from MIL1991 %~\cite{mil1991},
|
|
in section~\ref{sec:bcstats}.
|
|
%
|
|
With an FMMD failure mode model a top down perspective is possible.
|
|
%
|
|
We could for instance take each system level failure and produce a causation tree for it, tracing back
|
|
to all {\bc} {\fms}.
|
|
%
|
|
This is very closely related to the structure of FTA (top down) failure causation graphs.
|
|
%
|
|
The possibility of automatically producing FTA diagrams from FMMD models
|
|
is examined in section~\ref{sec:fta}.
|
|
%
|
|
|
|
|
|
\subsection{Statistics: From base component failure modes to System level events/failures.}
|
|
\label{sec:bcstats}
|
|
Knowing the statistical likelihood of a component failing can give a good indication
|
|
of the reliability of a system, or in the case of dangerous failures, the Safety Integrity Level
|
|
of a system.
|
|
%
|
|
EN61508~\cite{en61508} requires that statistical data is available and used for all component failure modes
|
|
analysed by FMEDA.
|
|
%
|
|
FMMD, as a bottom up methodology, can use component failure mode statistical data, and incorporate it
|
|
into its hierarchical model.
|
|
%By way of example, the Pt100 analysis %example
|
|
%from section~\{sec:pt100} has been used to demonstrate this.
|
|
Because we can use an FMMD model to generate an FMEA report, with additional {\bc} failure mode statistics
|
|
we can %therefore
|
|
use FMMD to produce an FMEDA report.
|
|
|
|
|
|
\paragraph{Pt100 Example: Single Failures and statistical data} %Mean Time to Failure}
|
|
|
|
To the model of the failure mode behaviour of the Pt100 circuit, developed in an earlier example,
we can add {\bc} {\fm} statistics and determine the probability of symptoms of failure.
|
|
%
|
|
The US Department of Defense handbook on the reliability prediction of electronic
equipment, MIL-HDBK-217F~\cite{mil1991}, gives formulae for calculating
the number of failures per ${10}^6$ hours
for a wide range of generic components\footnote{These figures are based on components from the 1980s, and MIL-HDBK-217F
can give conservative reliability figures when applied to
modern components.}.
|
|
%
|
|
Using the MIL-HDBK-217F\cite{mil1991} specifications for resistor and thermistor
|
|
failure statistics, we calculate the reliability of the Pt100 example (see section~\ref{sec:Pt100}).
|
|
|
|
|
|
\paragraph{Resistor FIT Calculations}
|
|
|
|
The formula given in MIL-HDBK-217F~\cite[9.2]{mil1991} for a generic fixed film non-power resistor
is reproduced in equation~\ref{resistorfit}. The meanings
and values assigned to its coefficients are described in table~\ref{tab:resistor}.
|
|
\glossary{name={FIT}, description={Failure in Time (FIT). The number of times a particular
|
|
failure is expected to occur in a $10^{9}$ hour time period.}}
|
|
|
|
|
|
\fmodegloss
|
|
|
|
\begin{equation}
|
|
% fixed comp resistor{\lambda}_p = {\lambda}_{b}{\pi}_{R}{\pi}_Q{\pi}_E
|
|
{\lambda}_p = {\lambda}_{b}{\pi}_{R}{\pi}_Q{\pi}_E
|
|
\label{resistorfit}
|
|
\end{equation}
|
|
|
|
\begin{table}[ht]
|
|
\caption{Fixed film resistor Failure in time assessment} % title of Table
|
|
\centering % used for centering table
|
|
\begin{tabular}{||c|c|l||}
|
|
\hline \hline
|
|
\em{Parameter} & \em{Value} & \em{Comments} \\
|
|
& & \\ \hline \hline
|
|
${\lambda}_{b}$ & 0.00092 & stress/temp base failure rate $60^{\circ}$C \\ \hline
|
|
%${\pi}_T$ & 4.2 & max temp of $60^o$ C\\ \hline
|
|
${\pi}_R$ & 1.0 & Resistance range $< 0.1M\Omega$\\ \hline
|
|
${\pi}_Q$ & 15.0 & Non-Mil spec component\\ \hline
|
|
${\pi}_E$ & 1.0 & benign ground environment\\ \hline
|
|
|
|
\hline \hline
|
|
\end{tabular}
|
|
\label{tab:resistor}
|
|
\end{table}
|
|
|
|
Applying equation~\ref{resistorfit} with the parameters from table~\ref{tab:resistor}
gives the following number of failures per ${10}^6$ hours:
|
|
|
|
\begin{equation}
|
|
0.00092 \times 1.0 \times 15.0 \times 1.0 = 0.0138 \;{failures}/{{10}^{6} Hours}
|
|
\label{eqn:resistor}
|
|
\end{equation}
|
|
|
|
While MIL-HDBK-217F gives failure rates for a wide range of common components,
it does not specify how the components will fail (in this case OPEN or SHORT).
%
Some standards, notably EN298, consider most types of resistor to fail only in OPEN mode.
|
|
%FMD-97 gives 27\% OPEN and 3\% SHORTED, for resistors under certain electrical and environmental stresses.
|
|
% FMD-91 gives parameter change as a third failure mode, luvvverly 08FEB2011
|
|
This example
compromises and uses a 9:1 OPEN:SHORT ratio for resistor failure.
|
|
%
|
|
Thus for this example resistors are expected to fail OPEN in 90\% of cases and SHORTED
|
|
in the other 10\%.
|
|
A standard fixed film resistor of non-military specification, for use in a benign environment at
temperatures up to {60\oc}, is given a failure rate of 13.8 failures per billion ($10^9$)
hours of operation (see equation~\ref{eqn:resistor}).
In EN61508 terminology, this figure is referred to as a Failure in Time (FIT) value\footnote{FIT values are measured as the number of
failures per billion (${10}^9$) hours of operation (roughly 114,000 years). The smaller the
FIT number, the more reliable the component.}.
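As a worked check of the unit conversion (the subscripted symbols are introduced here for illustration only), a rate quoted in failures per $10^6$ hours is multiplied by $10^3$ to give a FIT value, and the 9:1 OPEN:SHORT assumption then divides the resistor FIT between its two failure modes:
\begin{equation}
\lambda_{\mathrm{FIT}} = 0.0138 \times 10^{3} = 13.8, \qquad
\lambda_{\mathrm{OPEN}} = 0.9 \times 13.8 = 12.42, \qquad
\lambda_{\mathrm{SHORT}} = 0.1 \times 13.8 = 1.38
\end{equation}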
|
|
%
|
|
The formula given for a thermistor in MIL-HDBK-217F~\cite[9.8]{mil1991} is reproduced in
equation~\ref{thermistorfit}. The variable meanings and values are described in table~\ref{tab:thermistor}.
|
|
%
|
|
\begin{equation}
|
|
% fixed comp resistor{\lambda}_p = {\lambda}_{b}{\pi}_{R}{\pi}_Q{\pi}_E
|
|
{\lambda}_p = {\lambda}_{b}{\pi}_Q{\pi}_E
|
|
\label{thermistorfit}
|
|
\end{equation}
|
|
%
|
|
\begin{table}[ht]
|
|
\caption{Bead type Thermistor Failure in time assessment} % title of Table
|
|
\centering % used for centering table
|
|
\begin{tabular}{||c|c|l||}
|
|
\hline \hline
|
|
\em{Parameter} & \em{Value} & \em{Comments} \\
|
|
& & \\ \hline \hline
|
|
${\lambda}_{b}$ & 0.021 & stress/temp base failure rate bead thermistor \\ \hline
|
|
%${\pi}_T$ & 4.2 & max temp of $60^o$ C\\ \hline
|
|
%${\pi}_R$ & 1.0 & Resistance range $< 0.1M\Omega$\\ \hline
|
|
${\pi}_Q$ & 15.0 & Non-Mil spec component\\ \hline
|
|
${\pi}_E$ & 1.0 & benign ground environment\\ \hline
|
|
|
|
\hline \hline
|
|
\end{tabular}
|
|
\label{tab:thermistor}
|
|
\end{table}
|
|
%
|
|
\begin{equation}
|
|
0.021 \times 15.0 \times 1.0 = 0.315 \; {failures}/{{10}^{6} Hours}
|
|
\label{eqn:thermistor}
|
|
\end{equation}
|
|
%
|
|
Thus a bead type thermistor, `non~military~spec', is given a FIT of 315.0.
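The thermistor figure can be divided between failure modes in the same way. This sketch assumes that the 9:1 OPEN:SHORT ratio adopted for resistors is also applied to the thermistor; the subscripted symbols are illustrative only:
\begin{equation}
\lambda_{\mathrm{FIT}} = 0.315 \times 10^{3} = 315, \qquad
\lambda_{\mathrm{OPEN}} = 0.9 \times 315 = 283.5, \qquad
\lambda_{\mathrm{SHORT}} = 0.1 \times 315 = 31.5
\end{equation}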
|
|
%
|
|
Using the RIAC finding we can draw up the following table (table \ref{tab:stat_single}),
|
|
showing the FIT values for all faults considered.
|
|
\glossary{name={FIT}, description={Failure in Time (FIT). The number of times a particular failure is expected to occur in a $10^{9}$ hour time period.}}
|
|
|
|
\begin{table}[ht]
\caption{Pt100 FMEA Single Fault Statistics} % title of Table
\centering % used for centering table
\begin{tabular}{||l|c|c|l||}
\hline \hline
\textbf{Test} & \textbf{Result} & \textbf{Result} & \textbf{FIT} \\
\textbf{Case} & \textbf{sense +} & \textbf{sense -} & \textbf{(failures per $10^9$ hours)} \\
% R & wire & res + & res - & description
\hline
\hline
TC:1 $R_1$ SHORT & High Fault & - & 1.38 \\ \hline
TC:2 $R_1$ OPEN & Low Fault & Low Fault & 12.42\\ \hline
\hline
TC:3 $R_3$ SHORT & Low Fault & High Fault & 31.5 \\ \hline
TC:4 $R_3$ OPEN & High Fault & Low Fault & 283.5 \\ \hline
\hline
TC:5 $R_2$ SHORT & - & Low Fault & 1.38 \\ \hline
TC:6 $R_2$ OPEN & High Fault & High Fault & 12.42 \\ \hline
\hline
\end{tabular}
\label{tab:stat_single}
\end{table}
|
|
|
|
The FIT for the circuit as a whole is the sum of the FIT values for all the
test cases. The Pt100 circuit here has a FIT of 342.6. This corresponds to an MTTF of
about $2.9 \times 10^6$ hours, or roughly 333 years, per circuit.
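The circuit figure can be reproduced from the per-failure-mode FIT values in table~\ref{tab:stat_single}. The conversion to MTTF is a sketch that assumes a constant failure rate and 8760 hours per year; the symbol $\lambda_{\mathrm{Pt100}}$ is introduced here for illustration only:
\begin{equation}
\lambda_{\mathrm{Pt100}} = 1.38 + 12.42 + 31.5 + 283.5 + 1.38 + 12.42 = 342.6
\end{equation}
\begin{equation}
\mathrm{MTTF} = \frac{10^{9}}{342.6} \approx 2.92 \times 10^{6} \; \mathrm{hours} \approx 333 \; \mathrm{years}
\end{equation}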
|
|
|
|
A probabilistic tree can now be drawn, with a FIT value for the Pt100
|
|
circuit and FIT values for all the component fault modes from which it was calculated.
|
|
We can see from this that the most likely fault is the thermistor going OPEN.
|
|
This circuit is around nine times more likely to fail in this way than in the next most likely way, the thermistor going SHORT.
|
|
Were we to need a more reliable temperature sensor, this would probably
|
|
be the fault~mode we would scrutinise first.
|
|
|
|
|
|
\begin{figure}[ht]
|
|
\centering
|
|
\includegraphics[width=400pt,bb=0 0 856 327,keepaspectratio=true]{./CH5_Examples/stat_single.png}
|
|
% stat_single.jpg: 856x327 pixel, 72dpi, 30.20x11.54 cm, bb=0 0 856 327
|
|
\caption{Probabilistic Fault Tree: Pt100 Single Faults}
|
|
\label{fig:stat_single}
|
|
\end{figure}
|
|
|
|
|
|
The Pt100 analysis presents a simple result for single faults.
|
|
The next analysis phase looks at how the circuit will behave under double simultaneous failure
|
|
conditions.
|
|
|
|
|
|
\paragraph{Pt100 Example: Double Failures and statistical data}
|
|
Because we can perform double simultaneous failure analysis under FMMD
|
|
we can also apply failure rate statistics to double failures.
|
|
%
|
|
%%
|
|
%% Need to talk abou the `detection time'
|
|
%% or `Safety Relevant Validation Time' ref can book
|
|
%% EN61508 gives detection calculations to reduce
|
|
%% statistical impacts of failures.
|
|
%%
|
|
%
|
|
If we consider the failure modes to be statistically independent we can calculate
the failure statistics for all the combinations of failures listed in table~\ref{tab:ptfmea2} of the electronic examples chapter (chapter~\ref{sec:chap5}).
|
|
%
|
|
The failure mode of concern, the undetectable {\textbf{FLOATING}} condition
|
|
requires that resistors $R_1$ and $R_2$ fail.
|
|
%
|
|
We can multiply the failure rates
together to estimate the likelihood of both failing.
%
The FIT value of 12.42 for each resistor (the OPEN entries for $R_1$ and $R_2$ in table~\ref{tab:stat_single}) corresponds to
$12.42 \times {10}^{-9}$ failures per hour. Squaring this gives $ 154.3 \times {10}^{-18} $.
%
This is an astronomically small likelihood, so small that it would
probably fall below a threshold to sensibly consider.
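The squared figure is not itself a failure rate; one way to read it is as the approximate probability that both resistors fail within the same one-hour interval. Over a longer exposure time $T$ (a hypothetical parameter introduced here for illustration), and assuming independent failures with a constant rate $\lambda$ and $\lambda T \ll 1$, the probability that both have failed is approximately $(\lambda T)^2$:
\begin{equation}
P_{\mathrm{both}}(T) \approx (\lambda T)^{2}, \qquad
\lambda = 12.42 \times 10^{-9} \; \mathrm{h}^{-1}, \quad T = 10^{5} \; \mathrm{h}
\quad \Rightarrow \quad
P_{\mathrm{both}} \approx (1.242 \times 10^{-3})^{2} \approx 1.54 \times 10^{-6}
\end{equation}
The value $T = 10^{5}$ hours (roughly 11 years) is chosen only as an example. It suggests that a double failure may be less negligible when the first failure can remain latent and undetected for a long period.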
|
|
%
|
|
However, it is very interesting from a failure analysis perspective,
|
|
because here we have found a fault that we cannot detect (at least at this
|
|
level in the FMMD hierarchy).
|
|
%
|
|
This means that should we wish to cope with
|
|
this fault, we need to devise a new way of detecting this
|
|
condition, perhaps in higher levels of the system/FMMD hierarchy.
|
|
%
|
|
\glossary{name={FIT}, description={Failure in Time (FIT). The number of times a particular failure is expected to occur in a $10^{9}$ hour time period. Associated with continuous demand systems under EN61508~\cite{en61508}}}
|
|
%
|
|
%
|
|
\subsection{Deriving FTA diagrams from FMMD models}
|
|
\label{sec:fta}
|
|
%
|
|
Fault Tree Analysis (FTA)~\cite{ftahistory} is a top down methodology that
|
|
draws a fault tree---or top down fault causation diagram---for each given top-level
|
|
failure. With an FMMD model, we can trace all the causes of system failures
|
|
down to the base component level.
|
|
%
|
|
This would be enough to create a fault causation tree, but FTA introduces
|
|
concepts of operational and environmental states, and inhibit gates.
|
|
%
|
|
The FMEA philosophy in relation to these three concepts is to assume that they are worst cases, that they
|
|
{\em may} occur,
|
|
and determine what system failures may arise.
|
|
%
|
|
The FTA perspective is that some safety can be built in
|
|
by preventing certain things happening (inhibit gates), and by considering
|
|
different behaviour due to environmental or operational states~\cite{nucfta,nasafta}.
|
|
%
|
|
If we require FMMD to produce full FTA diagrams, we need to add these
|
|
attributes to the FMMD UML model\footnote{Top down failure mode models, such as FTA, are additionally
|
|
useful in guiding diagnostic analysis.}.
|
|
|
|
|
|
\paragraph{Environment, operational states and inhibit gates: additions to the UML model.}
|
|
|
|
FTA, in addition to using symbols borrowed from digital logic, introduces three new symbols to
model environmental conditions, operational states and inhibit gates; we discuss here how these can be incorporated into
the FMMD model.
|
|
|
|
A system will be expected to perform in a given environment.
|
|
%
|
|
Environment in the context of this study
|
|
means external influences under which the system could be expected to work. % under.
|
|
%
|
|
A typical data sheet for an electrical component will give
|
|
a working temperature range; %, for instance.
|
|
mechanical components could be specified for stress and loading limits.
|
|
It is unusual to have failure modes described in product literature, although
|
|
for complicated components with firmware, errata documents~\cite{pic18f25k80erratta} are sometimes produced.
|
|
|
|
Systems may have distinct operational states. For instance, a safety critical controller
|
|
may have a LOCKOUT state where it has detected a serious problem and will not continue to operate until
|
|
authorised human intervention takes place.
|
|
A safety critical circuit may have a self test mode which could be operated externally;
|
|
a micro-processor may have a SLEEP mode etc.
|
|
%
|
|
To make FMMD compatible with FTA, operational states and environmental conditions should %can %must
|
|
be factored into the UML model.
|
|
%
|
|
An undesired condition may occur where it could be necessary to inhibit some action of the system.
|
|
This is rather like a logical guard criterion. For instance, the gas burner standard EN298
states that a flame detector must confirm that a pilot flame has been established before the main burner fuel can be applied.
|
|
In FTA terms this would be an inhibit condition on the main fuel, i.e. PILOT\_NOT\_CONFIRMED.
|
|
|
|
We now look at the nature of these three attributes and decide how they should fit into the UML
|
|
model for FMMD developed in section~\ref{sec:fmmd_uml}.
|
|
|
|
\paragraph{Environmental Modelling.} The external influences/environment could typically be temperature ranges,
|
|
levels of electrical interference, high voltage contamination on supply
|
|
lines, radiation levels etc.
|
|
Environmental influences will affect specific components in specific ways\footnote{A good example of a part
|
|
affected by environmental conditions, in this case temperature, is the opto-isolator~\cite{tlp181}
|
|
which is typically affected at around {60 \oc}. Most electrical components are more robust to temperature variations.}.
|
|
Environmental analysis is thus applicable to components.
|
|
Environmental influences, such as over-stress due to voltage
|
|
can be eliminated by down-rating components as discussed in section~\ref{sec:determine_fms}.
|
|
With given environmental constraints, we can therefore eliminate some failure modes from the model.
|
|
|
|
|
|
\paragraph{Operational states.}
|
|
Within the field of safety critical engineering, we often encounter
|
|
elements that include test or self-test facilities.
|
|
%
|
|
We also encounter degraded performance
|
|
(such as only performing certain functions in an emergency) and lockout/emergency conditions.
|
|
These can be broadly termed operational states. %, and apply to the
|
|
%functional groups.
|
|
%
|
|
We need to determine which UML class is most appropriate to hold a relationship
|
|
to operational states.
|
|
%
|
|
Consider for instance an electrical circuit that has a TEST line.
|
|
When the TEST line is activated, it supplies a test signal
|
|
which will validate the circuit. This circuit will have two operational states,
|
|
NORMAL and TEST mode.
|
|
%
|
|
It seems more appropriate to apply the operational states to {\fgs}
|
|
which %
|
|
%Functional groupings
|
|
by definition implement functionality, or purpose.
|
|
On this basis we associate operational states with {\fgs}.
|
|
%therefore are the best objects to model
|
|
%operational states.% with.
|
|
|
|
\paragraph{Inhibit Conditions.}
|
|
Inhibit conditions and the symbols used for them are described in~\cite[p.40]{nasafta}. % is required. %desired.
|
|
%
|
|
Some failure modes may only be active given specific environmental conditions
|
|
or when other failures are already active.
|
|
%
|
|
To model this, an `inhibit' class has been added.
|
|
%
|
|
This is an optional attribute of
|
|
a failure mode.
|
|
%
|
|
This inhibit class can be triggered
on a combination of environmental conditions or failure modes.
|
|
%
|
|
In the UML diagram, we therefore link this with
|
|
both environmental conditions and failure modes.
|
|
|
|
|
|
|
|
|
|
|
|
\paragraph{UML Diagram Additional Objects.}
|
|
The additional objects System, Environment, Inhibit and Operational States
are added to the UML diagram in figure~\ref{fig:cfg}, and the result is shown in figure~\ref{fig:cfg2}.
|
|
|
|
\label{completeumlfurtherwork}
|
|
|
|
\begin{figure}[h]
|
|
\centering
|
|
\includegraphics[width=400pt,keepaspectratio=true]{./CH8_Conclusion/master_uml_further_work.png}
|
|
% cfg2.png: 702x464 pixel, 72dpi, 24.76x16.37 cm, bb=0 0 702 464
|
|
\caption{FMMD UML diagram, incorporating Environmental, Operational State and Inhibit gates}
|
|
\label{fig:cfg2}
|
|
\end{figure}
|
|
|
|
\clearpage
|
|
|
|
\subsection{Retrospective Failure Mode analysis and FMMD}
|
|
|
|
Reasons for applying retrospective failure mode analysis include approving previously un-assessed
systems to a safety standard, and determining the failure mode behaviour of an instrument used in
safety critical verification. % verification.
|
|
%
|
|
FMMD can be applied retrospectively to a project, and because of its modular nature, coupled with
|
|
its `bottom-up~work~flow' it
|
|
can reveal previously undetected system failure modes.
|
|
%
|
|
This is because the analyst
|
|
is forced to deal with all component failure modes when applying the FMMD process, and
|
|
all failure modes of the resultant {\dcs} as we progress up a hierarchy.
|
|
%
|
|
FMMD requires that all failure modes of components in a {\fg} are resolved to
|
|
a symptom in the resulting {\dc}.
|
|
%
|
|
Because we can enforce a `complete' analysis, FMMD can find failure modes that were missed by
|
|
other FMEA processes; meaning that the FMMD process can expose un-handled
|
|
failure modes.
|
|
%come to light.
|
|
|
|
We can apply retrospective FMMD to electronic and software hybrid systems as well.
|
|
%
|
|
The {\fms} of the electronic components are established in the literature~\cite{fmd91,mil1991,en298,en230}.
|
|
%
|
|
Each function in the software would have to be assigned a `design~contract'~\cite{dbcbe} (where violations of
|
|
contract clauses will be treated as failure modes in FMMD).
|
|
%
|
|
% By %doing
|
|
% applying contracts and seeing how calling functions deal with
|
|
% the failures in the functions they call, we reveal un-handled the error conditions in
|
|
% the software.
|
|
% By treating hardware interfaces to software as {\dcs}, we automatically have a list of the failure modes
|
|
% of the electronics.
|
|
%%
|
|
With the contracts in place for the software functions, we can then integrate them into the FMMD model.
|
|
%
|
|
FMMD models both software and hardware;
|
|
we can thus verify that all
|
|
failure modes from the electronics module have been dealt
|
|
with by the controlling software.
|
|
%
|
|
If not, they are un-handled error conditions relating to the software/hardware interface.
|
|
%
|
|
% That is the hardware interfaces to software in FMMD is a {\dc},
|
|
% the failure modes of this {\dc} are the list of all known failure modes
|
|
% of the electronics.
|
|
%
|
|
By performing FMMD on a software electronic hybrid system,
|
|
we thus reveal design deficiencies in the software, the electronics and the software/electronics interface.
|
|
%in the hardware/software interface.
|
|
%
|
|
FMEDA does not handle software or the software/hardware interface.
It thus potentially misses many undetected failures (in EN61508 terms, undetected-dangerous and undetected-safe failures).
|
|
In Safety Integrity Level (SIL)~\cite{en61508} terms, by identifying undetectable faults and fixing them, we raise
|
|
the safe failure fraction (SFF).
|
|
|
|
|
|
%
|
|
|
|
\section{Objective and Subjective Reasoning stages}
|
|
%Opportunity for formal definitions and perhaps an interface or process for achieving it....
|
|
The act of applying failure mode effects analysis is viewed from
an `engineering' perspective of cause and effect. This is the realm of the objective.
|
|
%
|
|
The executive decisions about deploying systems are in the domain of management and politics.
|
|
%
|
|
The dangers, or potential negative effects of a safety critical system depend not only on the system itself,
|
|
but on the environment in which it is used
|
|
and other human factors such as the training level of operatives, psychological and logical factors in
|
|
the Human Machine Interface~(HMI)~\cite{stranks2007human}.
|
|
%
|
|
\paragraph{Objective and Subjective Reasoning in FMEA: Three Mile Island nuclear accident example.}
|
|
An example of objective and subjective factors is demonstrated in the accident report on the 1979 Three Mile Island
|
|
nuclear accident~\cite[App.D]{safeware}. Here, a vent valve for the primary reactor coolant (pressurised water) became stuck open.
This condition caused an objectively derived failure mode, `leakage~of~coolant', due to a stuck valve.
%
This, if recognised correctly by the operators, would have led quickly
to a reactor shut-down and
a maintenance procedure to replace the valve.
|
|
%
|
|
The failure was not recognised in time however, and coolant was lost
|
|
until a partial meltdown of the reactor fuel occurred, with a resulting
|
|
leak of radioactive material into the environment.
|
|
%
|
|
For the objective failure mode determined by
|
|
FMEA, that of leakage of coolant,
|
|
we would not reasonably expect this to go unchecked and unresolved for an extended period and cause such a critical failure.
|
|
%
|
|
The criticality level of that accident was therefore subjective. It was not known how the operators
would react, and deficiencies in the Human Machine Interface (HMI) were not a factor in the failure analysis.
|
|
|
|
|
|
\paragraph{Further Work: Objective and Subjective Reasoning in FMEA.}
|
|
%
|
|
We could term the criticality prediction to be in the domain of subjective reasoning. With an objectively defined system level failure
|
|
we are often next required to determine its level of criticality, or how serious the risk posed would be.
|
|
%
|
|
Two methodologies have started to consider this aspect, FMECA~\cite{fmeca} with its criticality and probability factors, and
|
|
FMEDA~\cite{en61508,fmeda} with its classification of dangerous and safe failures.
|
|
%
|
|
It is the author's opinion that more work is required to clarify this area. The scope of FMMD is the objective level only.
|
|
|