\label{sec:chap8}
\section{Further Work}
\subsection{Environment, operational states and inhibit gates: additions to the UML model.}
FTA~\cite{nasafta,nucfta} models environmental, operational state and inhibit gates, and these can be incorporated into
the FMMD model.
A system will be expected to perform in a given environment.
%
Environment in the context of this study
means external influences under which the System could be expected to work. % under.
%
A typical data sheet for an electrical component will give
a working temperature range; %, for instance.
mechanical components could be specified for stress and loading limits.
It is unusual to have failure modes described in product literature, although
for complicated components with firmware, errata documents are sometimes produced.
Systems may have distinct operational states. For instance, a safety critical controller
may have a LOCKOUT state where it has detected a serious problem and will not continue to operate until
authorised human intervention takes place.
A safety critical circuit may have a self test mode which could be operated externally;
a micro-processor may have a SLEEP mode, etc.
%
Operational states and environmental conditions can %must
be factored into the UML model.
\paragraph{Environmental Modelling.} The external influences/environment could typically be temperature ranges,
levels of electrical interference, high voltage contamination on supply
lines, radiation levels etc.
Environmental influences will affect specific components in specific ways\footnote{A good example of a part
affected by environmental conditions, in this case temperature, is the opto-isolator~\cite{tlp181}
which is typically affected at around {60 \oc}. Most electrical components are more robust to temperature variations.}.
Environmental analysis is thus applicable to components.
Environmental influences, such as over stress due to voltage,
can be eliminated by down-rating components as discussed in section~\ref{sec:determine_fms}.
With given environmental constraints, we can therefore eliminate some failure modes from the model.
\paragraph{Operational states.}
Within the field of safety critical engineering, we often encounter
elements that include test or self-test facilities.
%
We also encounter degraded performance
(such as only performing functions in an emergency) and lockout/emergency conditions.
These can be broadly termed operational states. %, and apply to the
%functional groups.
%
We need to determine which UML class is most appropriate to hold a relationship
to operational states.
%
Consider for instance an electrical circuit that has a TEST line.
When the TEST line is activated, it supplies a test signal
which will validate the circuit. This circuit will have two operational states,
NORMAL and TEST mode.
%
It seems most appropriate to apply the operational states to {\fgs},
which %
%Functional groupings
by definition implement functionality, or purpose.
On this basis we associate operational states with {\fgs}.
%therefore are the best objects to model
%operational states.% with.
\paragraph{Inhibit Conditions.}
A third data class may be required in order to model inhibit conditions~\cite[p.40]{nasafta}. %desired.
Some failure modes may only be active given specific environmental conditions
or when other failures are already active.
To model this, an `inhibit' class has been added.
This is an optional attribute of
a failure mode. This inhibit class can be triggered
by a combination of environmental conditions or failure modes.
\paragraph{UML Diagram Additional Objects.}
The additional objects System, Environment and Operational States,
added to the UML diagram in figure~\ref{fig:cfg}, are represented in figure~\ref{fig:cfg2}.
\label{completeumlfurtherwork}
\begin{figure}[h]
\centering
\includegraphics[width=400pt,keepaspectratio=true]{./CH8_Conclusion/master_uml_further_work.png}
% cfg2.png: 702x464 pixel, 72dpi, 24.76x16.37 cm, bb=0 0 702 464
\caption{FMMD UML diagram, incorporating Environmental, Operational State and Inhibit gates}
\label{fig:cfg2}
\end{figure}
%% 31JAN2012
\section{Statistics: From base component failure modes to System level events/failures.}
Knowing the statistical likelihood of a component failing can give a good indication
of the reliability of a system, or in the case of dangerous failures, the Safety Integrity Level
of a system.
EN61508~\cite{en61508} requires that statistical data is available and used for all component failure modes
analysed in a system assigned a SIL.
FMMD, as a bottom up methodology, can use component failure mode statistical data, and incorporate it
into its hierarchical model.
By way of example, the Pt100 analysis %example
from section~\ref{sec:pt100} has been used to demonstrate this.
\subsection{Pt100 Example: Single Failures and statistical data} %Mean Time to Failure}
Now that we have a model for the failure mode behaviour of the Pt100 circuit
we can look at the statistics associated with each of the failure modes.
The DOD electronic reliability of components
document MIL-HDBK-217F~\cite{mil1991} gives formulae for calculating
the number of failures per ${10}^6$ hours
for a wide range of generic components\footnote{These figures are based on components from the 1980s and MIL-HDBK-217F
can give conservative reliability figures when applied to
modern components.}.
%
Using the MIL-HDBK-217F\cite{mil1991} specifications for resistor and thermistor
failure statistics, we calculate the reliability of this circuit.
\paragraph{Resistor FIT Calculations}
The formula given in MIL-HDBK-217F~\cite[9.2]{mil1991} for a generic fixed film non-power resistor
is reproduced in equation \ref{resistorfit}. The meanings
and values assigned to its coefficients are described in table \ref{tab:resistor}.
\glossary{name={FIT}, description={Failure in Time (FIT). The number of times a particular
failure is expected to occur in a $10^{9}$ hour time period.}}
\fmodegloss
\begin{equation}
% fixed comp resistor{\lambda}_p = {\lambda}_{b}{\pi}_{R}{\pi}_Q{\pi}_E
{\lambda}_p = {\lambda}_{b}{\pi}_{R}{\pi}_Q{\pi}_E
\label{resistorfit}
\end{equation}
\begin{table}[ht]
\caption{Fixed film resistor Failure in time assessment} % title of Table
\centering % used for centering table
\begin{tabular}{||c|c|l||}
\hline \hline
\em{Parameter} & \em{Value} & \em{Comments} \\
& & \\ \hline \hline
${\lambda}_{b}$ & 0.00092 & stress/temp base failure rate at $60^{\circ}$C \\ \hline
%${\pi}_T$ & 4.2 & max temp of $60^o$ C\\ \hline
${\pi}_R$ & 1.0 & Resistance range $< 0.1M\Omega$\\ \hline
${\pi}_Q$ & 15.0 & Non-Mil spec component\\ \hline
${\pi}_E$ & 1.0 & benign ground environment\\ \hline
\hline \hline
\end{tabular}
\label{tab:resistor}
\end{table}
Applying equation \ref{resistorfit} with the parameters from table \ref{tab:resistor}
gives the following failures in ${10}^6$ hours:
\begin{equation}
0.00092 \times 1.0 \times 15.0 \times 1.0 = 0.0138 \;\mathrm{failures}/{10}^{6}\,\mathrm{hours}
\label{eqn:resistor}
\end{equation}
While MIL-HDBK-217F gives failure rates for a wide range of common components,
it does not specify how the components will fail (in this case OPEN or SHORT). Some standards, notably EN298, only consider resistors failing in OPEN mode.
%FMD-97 gives 27\% OPEN and 3\% SHORTED, for resistors under certain electrical and environmental stresses.
% FMD-91 gives parameter change as a third failure mode, luvvverly 08FEB2011
This example
compromises by using a 90:10 ratio for resistor failure.
Thus for this example resistors are expected to fail OPEN in 90\% of cases and SHORTED
in the other 10\%.
A standard fixed film resistor, for use in a benign environment, non military spec at
temperatures up to {60\oc} is given a failure rate of 13.8 failures per billion ($10^9$)
hours of operation (see equation \ref{eqn:resistor}).
This figure is referred to as a FIT (Failure in Time) value\footnote{FIT values are measured as the number of
failures per billion (${10}^9$) hours of operation (roughly 114,000 years). The smaller the
FIT number, the less likely the fault~mode.}.
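As a check on the units, the conversion from the MIL-HDBK-217F figure (failures per ${10}^6$ hours)
to a FIT value (failures per ${10}^9$ hours) is a multiplication by ${10}^3$.
For the resistor figure from equation \ref{eqn:resistor} this can be written out as:
\[
0.0138 \;\mathrm{failures}/{10}^{6}\,\mathrm{hours} \times {10}^{3}
= 13.8 \;\mathrm{failures}/{10}^{9}\,\mathrm{hours} = 13.8 \;\mathrm{FIT}
\]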
The formula given for a thermistor in MIL-HDBK-217F~\cite[9.8]{mil1991} is reproduced in
equation \ref{thermistorfit}. The variable meanings and values are described in table \ref{tab:thermistor}.
\begin{equation}
% fixed comp resistor{\lambda}_p = {\lambda}_{b}{\pi}_{R}{\pi}_Q{\pi}_E
{\lambda}_p = {\lambda}_{b}{\pi}_Q{\pi}_E
\label{thermistorfit}
\end{equation}
\begin{table}[ht]
\caption{Bead type Thermistor Failure in time assessment} % title of Table
\centering % used for centering table
\begin{tabular}{||c|c|l||}
\hline \hline
\em{Parameter} & \em{Value} & \em{Comments} \\
& & \\ \hline \hline
${\lambda}_{b}$ & 0.021 & stress/temp base failure rate bead thermistor \\ \hline
%${\pi}_T$ & 4.2 & max temp of $60^o$ C\\ \hline
%${\pi}_R$ & 1.0 & Resistance range $< 0.1M\Omega$\\ \hline
${\pi}_Q$ & 15.0 & Non-Mil spec component\\ \hline
${\pi}_E$ & 1.0 & benign ground environment\\ \hline
\hline \hline
\end{tabular}
\label{tab:thermistor}
\end{table}
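Applying equation \ref{thermistorfit} with the parameters from table \ref{tab:thermistor}
gives the following failures in ${10}^6$ hours:
\begin{equation}
0.021 \times 15.0 \times 1.0 = 0.315 \;\mathrm{failures}/{10}^{6}\,\mathrm{hours}
\label{eqn:thermistor}
\end{equation}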
Thus a bead type, `non~military~spec' thermistor is given a FIT of 315.0.
Using these FIT values and the 90:10 OPEN to SHORTED failure ratio, we can draw up table~\ref{tab:stat_single},
showing the FIT values for all faults considered.
\glossary{name={FIT}, description={Failure in Time (FIT). The number of times a particular failure is expected to occur in a $10^{9}$ hour time period.}}
\begin{table}[h]
\caption{Pt100 FMEA Single Fault Statistics} % title of Table
\centering % used for centering table
\begin{tabular}{||l|c|c|l||}
\hline \hline
\textbf{Test} & \textbf{Result} & \textbf{Result } & \textbf{FIT} \\
\textbf{Case} & \textbf{sense +} & \textbf{sense -} & \textbf{(failures per $10^9$ hours of operation)} \\
% R & wire & res + & res - & description
\hline
\hline
TC:1 $R_1$ SHORT & High Fault & - & 1.38 \\ \hline
TC:2 $R_1$ OPEN & Low Fault & Low Fault & 12.42\\ \hline
\hline
TC:3 $R_3$ SHORT & Low Fault & High Fault & 31.5 \\ \hline
TC:4 $R_3$ OPEN & High Fault & Low Fault & 283.5 \\ \hline
\hline
TC:5 $R_2$ SHORT & - & Low Fault & 1.38 \\ \hline
TC:6 $R_2$ OPEN & High Fault & High Fault & 12.42 \\ \hline
\hline
\end{tabular}
\label{tab:stat_single}
\end{table}
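The per failure mode values in table \ref{tab:stat_single} can be reproduced, as a sketch,
by applying the 90:10 OPEN to SHORTED ratio to the component FIT values.
This assumes that the same ratio is applied to the thermistor $R_3$ as to the resistors $R_1$ and $R_2$,
which is what the tabulated figures imply:
\[
\begin{array}{lcl}
R_1, R_2 \; \mathrm{OPEN} & : & 0.9 \times 13.8 = 12.42 \\
R_1, R_2 \; \mathrm{SHORT} & : & 0.1 \times 13.8 = 1.38 \\
R_3 \; \mathrm{OPEN} & : & 0.9 \times 315.0 = 283.5 \\
R_3 \; \mathrm{SHORT} & : & 0.1 \times 315.0 = 31.5 \\
\end{array}
\]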
The FIT for the circuit as a whole is the sum of the FIT values for all the
test cases. The Pt100 circuit here has a FIT of 342.6. This is an MTTF of
about 333 years per circuit.
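Written out, and assuming a constant failure rate and 8760 hours per year
(both assumptions of this sketch rather than figures taken from the standard), the arithmetic is:
\[
1.38 + 12.42 + 31.5 + 283.5 + 1.38 + 12.42 = 342.6 \;\mathrm{FIT}
\]
\[
\mathrm{MTTF} = \frac{{10}^{9}}{342.6} \approx 2.92 \times {10}^{6} \;\mathrm{hours} \approx 333 \;\mathrm{years}
\]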
A probabilistic tree can now be drawn, with a FIT value for the Pt100
circuit and FIT values for all the component fault modes from which it was calculated.
We can see from this that the most likely fault is the thermistor going OPEN.
This circuit is around nine times more likely to fail in this way than in the next most likely way (the thermistor going SHORT).
Were we to need a more reliable temperature sensor, this would probably
be the fault~mode we would scrutinise first.
\begin{figure}[h]
\centering
\includegraphics[width=400pt,bb=0 0 856 327,keepaspectratio=true]{./CH5_Examples/stat_single.png}
% stat_single.jpg: 856x327 pixel, 72dpi, 30.20x11.54 cm, bb=0 0 856 327
\caption{Probabilistic Fault Tree : Pt100 Single Faults}
\label{fig:stat_single}
\end{figure}
The Pt100 analysis presents a simple result for single faults.
The next analysis phase looks at how the circuit will behave under double simultaneous failure
conditions.
\subsection{Pt100 Example: Double Failures and statistical data}
Because we can perform double simultaneous failure analysis under FMMD
we can also apply failure rate statistics to double failures.
%
%%
%% Need to talk abou the `detection time'
%% or `Safety Relevant Validation Time' ref can book
%% EN61508 gives detection calculations to reduce
%% statistical impacts of failures.
%%
%
If we consider the failure modes to be statistically independent we can calculate
the FIT values for all the combinations of failures in table~\ref{tab:ptfmea2}.
The failure mode of concern, the undetectable {\textbf{FLOATING}} condition
requires that resistors $R_1$ and $R_2$ both fail. We can multiply the failure probabilities
together to find the probability of both failing. The FIT value of 12.42 corresponds to
$12.42 \times {10}^{-9}$ failures per hour. Squaring this gives $ 154.3 \times {10}^{-18} $.
This is an astronomically small probability, so small that it would
probably fall below a threshold to sensibly consider.
However, it is very interesting from a failure analysis perspective,
because here we have found a fault that we cannot detect at this
level. This means that should we wish to cope with
this fault, we need to devise a way of detecting this
condition in higher levels of the system.
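
Under the independence assumption, and reading the hourly failure rate of each resistor as the
probability of that failure occurring within a given hour (an approximation adopted for this sketch),
the figure above is obtained as:
\[
\left( 12.42 \times {10}^{-9} \right)^{2} = 154.3 \times {10}^{-18} \approx 1.54 \times {10}^{-16}
\]
This is the approximate probability of both OPEN failures occurring within the same hour.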
\glossary{name={FIT}, description={Failure in Time (FIT). The number of times a particular failure is expected to occur in a $10^{9}$ hour time period. Associated with continuous demand systems under EN61508~\cite{en61508}}}
\section{Retrospective Failure Mode analysis and FMMD}
The reasons for applying retrospective failure mode analysis could be approving previously un-assessed
systems to a safety standard, or to determine the failure mode behaviour of an instrument used in
safety critical verification. % verification.
%
FMMD can be applied retrospectively to a project, and because of its modular nature, coupled with
its work flow it
can reveal undetected failure modes.
%
FMMD requires that all failure modes of components in a {\fg} are resolved to
a symptom in the resulting {\dc}.
%
%
FMMD can find failure modes that are not
dealt with as a symptom, i.e. were unintentionally ignored
or forgotten. This means that FMMD will root out un-handled
failure modes.
%come to light.
%
We can apply retrospective FMMD to electronic and software hybrid systems as well.
Each function in the software will have to be assigned a `design~contract'~\cite{dbcbe} (where violations of
contract clauses will be treated as failure modes in FMMD).
%
By %doing
applying contracts and seeing how calling functions deal with
the failures in the functions they call, we reveal the un-handled error conditions in
the software.
By treating hardware interfaces to software as {\dcs}, we automatically have a list of the failure modes
of the electronics.
%
FMMD models both software and hardware;
we can thus verify that all
failure modes from the electronics module have been dealt with
by the controlling software. If not, they are un-handled error conditions.
That is, each hardware interface to software in FMMD is a {\dc}, and
the failure modes of this {\dc} are the list of all known failure modes
of the electronics.
%
By performing FMMD on a software electronic hybrid system,
we thus reveal design deficiencies.
%in the hardware/software interface.
In Safety Integrity Level (SIL)~\cite{en61508} terms, by identifying undetectable faults and fixing them, we raise
the safe failure fraction (SFF).
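
For reference, the safe failure fraction is commonly written in terms of failure rates as shown below.
The symbols $\lambda_{S}$ (safe failures), $\lambda_{DD}$ (dangerous detected failures) and
$\lambda_{DU}$ (dangerous undetected failures) are introduced here for illustration only and are not used elsewhere in this study:
\[
SFF = \frac{\lambda_{S} + \lambda_{DD}}{\lambda_{S} + \lambda_{DD} + \lambda_{DU}}
\]
Moving a failure mode from the undetected to the detected category leaves the denominator unchanged
and increases the numerator, which is how making a previously undetectable fault detectable raises the SFF.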
\section{Conclusion}
It is the author's belief that the practice of FMEA would be improved by taking a modular approach
and that it is necessary that software and hardware should be included in the same failure mode models.
%
The proposed methodology, FMMD, provides the means to do this, and it is the author's hope that this
or a variant thereof is taken up and used to improve system safety.