\label{sec:chap8}
\section{Further Work}

\subsection{Environment, operational states and inhibit gates: additions to the UML model}
FTA~\cite{nasafta,nucfta} models environmental conditions, operational states and inhibit gates, and these can be incorporated into
the FMMD model.

A system will be expected to perform in a given environment.
%
Environment, in the context of this study,
means the external influences under which the system could be expected to work.
%
A typical data sheet for an electrical component will give
a working temperature range;
mechanical components could be specified for stress and loading limits.
It is unusual to have failure modes described in product literature, although
for complicated components with firmware, errata documents are sometimes produced.

Systems may have distinct operational states. For instance, a safety critical controller
may have a LOCKOUT state where it has detected a serious problem and will not continue to operate until
authorised human intervention takes place.
A safety critical circuit may have a self test mode which could be operated externally;
a micro-processor may have a SLEEP mode.
%
Operational states and environmental conditions can
be factored into the UML model.
\paragraph{Environmental Modelling.} The external influences, or environment, could typically be temperature ranges,
levels of electrical interference, high voltage contamination on supply
lines, radiation levels etc.
Environmental influences will affect specific components in specific ways\footnote{A good example of a part
affected by environmental conditions, in this case temperature, is the opto-isolator~\cite{tlp181},
which is typically affected at around {60\oc}. Most electrical components are more robust to temperature variations.}.
Environmental analysis is thus applicable to components.
Some environmental influences, such as over-stress due to voltage,
can be eliminated by down-rating components as discussed in section~\ref{sec:determine_fms}.
With given environmental constraints, we can therefore eliminate some failure modes from the model.
\paragraph{Operational states.}
Within the field of safety critical engineering, we often encounter
elements that include test or self-test facilities.
%
We also encounter degraded performance
(such as only performing functions in an emergency) and lockout/emergency conditions.
These can be broadly termed operational states.
%
We need to determine which UML class is most appropriate to hold a relationship
to operational states.
%
Consider for instance an electrical circuit that has a TEST line.
When the TEST line is activated, it supplies a test signal
which will validate the circuit. This circuit will have two operational states,
NORMAL and TEST.
%
It seems most appropriate to apply the operational states to {\fgs},
which by definition implement functionality, or purpose.
On this basis we associate operational states with {\fgs}.
\paragraph{Inhibit Conditions.}
A third data class is needed if inhibit conditions~\cite[p.40]{nasafta} are to be modelled.
Some failure modes may only be active given specific environmental conditions
or when other failures are already active.
To model this, an `inhibit' class has been added.
This is an optional attribute of
a failure mode. The inhibit class can be triggered
by a combination of environmental conditions or other failure modes.
\paragraph{UML Diagram Additional Objects.}
The additional objects System, Environment and Operational States,
added to the UML diagram in figure~\ref{fig:cfg}, are represented in figure~\ref{fig:cfg2}.

\label{completeumlfurtherwork}

\begin{figure}[h]
\centering
\includegraphics[width=400pt,keepaspectratio=true]{./CH8_Conclusion/master_uml_further_work.png}
% cfg2.png: 702x464 pixel, 72dpi, 24.76x16.37 cm, bb=0 0 702 464
\caption{FMMD UML diagram, incorporating Environmental, Operational State and Inhibit gates}
\label{fig:cfg2}
\end{figure}

%% 31JAN2012
\section{Statistics: From base component failure modes to System level events/failures}

Knowing the statistical likelihood of a component failing can give a good indication
of the reliability of a system, or in the case of dangerous failures, the Safety Integrity Level (SIL)
of a system.
EN61508~\cite{en61508} requires that statistical data is available and used for all component failure modes
analysed in a system assigned a SIL.
FMMD, as a bottom-up methodology, can use component failure mode statistical data, and incorporate it
into its hierarchical model.
By way of example, the Pt100 analysis
from section~\ref{sec:pt100} is used to demonstrate this.
\subsection{Pt100 Example: Single Failures and statistical data}

Now that we have a model for the failure mode behaviour of the Pt100 circuit
we can look at the statistics associated with each of the failure modes.

The DOD electronic component reliability
document MIL-HDBK-217F~\cite{mil1991} gives formulae for calculating
the number of failures per ${10}^6$ hours
for a wide range of generic components\footnote{These figures are based on components from the 1980s, and MIL-HDBK-217F
can give conservative reliability figures when applied to
modern components.}.
%
Using the MIL-HDBK-217F~\cite{mil1991} specifications for resistor and thermistor
failure statistics, we calculate the reliability of this circuit.
\paragraph{Resistor FIT Calculations}

The formula given in MIL-HDBK-217F~\cite[9.2]{mil1991} for a generic fixed film non-power resistor
is reproduced in equation~\ref{resistorfit}. The meanings
and values assigned to its coefficients are described in table~\ref{tab:resistor}.
\glossary{name={FIT}, description={Failure in Time (FIT). The number of times a particular
failure is expected to occur in a $10^{9}$ hour time period.}}

\fmodegloss
\begin{equation}
% fixed film resistor
{\lambda}_p = {\lambda}_{b}{\pi}_{R}{\pi}_Q{\pi}_E
\label{resistorfit}
\end{equation}
\begin{table}[ht]
\caption{Fixed film resistor Failure in Time assessment} % title of Table
\centering % used for centering table
\begin{tabular}{||c|c|l||}
\hline \hline
\emph{Parameter} & \emph{Value} & \emph{Comments} \\
 & & \\ \hline \hline
${\lambda}_{b}$ & 0.00092 & stress/temperature base failure rate at $60^{\circ}$C \\ \hline
%${\pi}_T$ & 4.2 & max temp of $60^o$ C\\ \hline
${\pi}_R$ & 1.0 & Resistance range $< 0.1\,M\Omega$\\ \hline
${\pi}_Q$ & 15.0 & Non-Mil spec component\\ \hline
${\pi}_E$ & 1.0 & benign ground environment\\ \hline
\hline \hline
\end{tabular}
\label{tab:resistor}
\end{table}
Applying equation~\ref{resistorfit} with the parameters from table~\ref{tab:resistor}
gives the following failure rate per ${10}^6$ hours:

\begin{equation}
0.00092 \times 1.0 \times 15.0 \times 1.0 = 0.0138 \;\mathrm{failures}/{10}^{6}\,\mathrm{hours}
\label{eqn:resistor}
\end{equation}

While MIL-HDBK-217F gives failure rates for a wide range of common components,
it does not specify how the components will fail (in this case OPEN or SHORT).
Some standards, notably EN298, only consider resistors failing in OPEN mode.
%FMD-97 gives 27\% OPEN and 3\% SHORTED, for resistors under certain electrical and environmental stresses.
% FMD-91 gives parameter change as a third failure mode, luvvverly 08FEB2011
This example
compromises and uses a 90:10 ratio for resistor failure.
Thus for this example resistors are expected to fail OPEN in 90\% of cases and SHORTED
in the other 10\%.
A standard fixed film resistor, for use in a benign environment, non military spec, at
temperatures up to {60\oc} is given a failure rate of 13.8 failures per billion ($10^9$)
hours of operation (see equation~\ref{eqn:resistor}).
This figure is referred to as a Failure in Time (FIT) value\footnote{FIT values are measured as the number of
failures per billion (${10}^9$) hours of operation (roughly 114,000 years). The smaller the
FIT number, the less likely the fault~mode.}.
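The conversion to FIT and the 90:10 split can be written out explicitly.
The following is a worked sketch using only the figures quoted above; the symbols
$\lambda_{R}$, $\lambda_{R,\mathit{OPEN}}$ and $\lambda_{R,\mathit{SHORT}}$ are introduced here
for illustration and are not MIL-HDBK-217F notation.
\begin{eqnarray*}
\lambda_{R} & = & 0.0138 \;\mathrm{failures}/{10}^{6}\,\mathrm{hours} \times {10}^{3} = 13.8 \;\mathrm{FIT} \\
\lambda_{R,\mathit{OPEN}} & = & 0.9 \times 13.8 = 12.42 \;\mathrm{FIT} \\
\lambda_{R,\mathit{SHORT}} & = & 0.1 \times 13.8 = 1.38 \;\mathrm{FIT}
\end{eqnarray*}
The factor of ${10}^{3}$ simply rescales failures per ${10}^{6}$ hours to failures per ${10}^{9}$ hours.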
\paragraph{Thermistor FIT Calculations}

The formula given for a thermistor in MIL-HDBK-217F~\cite[9.8]{mil1991} is reproduced in
equation~\ref{thermistorfit}. The variable meanings and values are described in table~\ref{tab:thermistor}.

\begin{equation}
% thermistor
{\lambda}_p = {\lambda}_{b}{\pi}_Q{\pi}_E
\label{thermistorfit}
\end{equation}
\begin{table}[ht]
\caption{Bead type thermistor Failure in Time assessment} % title of Table
\centering % used for centering table
\begin{tabular}{||c|c|l||}
\hline \hline
\emph{Parameter} & \emph{Value} & \emph{Comments} \\
 & & \\ \hline \hline
${\lambda}_{b}$ & 0.021 & stress/temperature base failure rate, bead thermistor \\ \hline
%${\pi}_T$ & 4.2 & max temp of $60^o$ C\\ \hline
%${\pi}_R$ & 1.0 & Resistance range $< 0.1M\Omega$\\ \hline
${\pi}_Q$ & 15.0 & Non-Mil spec component\\ \hline
${\pi}_E$ & 1.0 & benign ground environment\\ \hline
\hline \hline
\end{tabular}
\label{tab:thermistor}
\end{table}
Applying equation~\ref{thermistorfit} with the parameters from table~\ref{tab:thermistor}
gives:

\begin{equation}
0.021 \times 15.0 \times 1.0 = 0.315 \;\mathrm{failures}/{10}^{6}\,\mathrm{hours}
\label{eqn:thermistor}
\end{equation}

Thus a bead type, `non~military~spec' thermistor is given a FIT of 315.0.
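The thermistor figure can be converted and split in the same way as the resistor figure.
This is a worked sketch which assumes that the 90:10 OPEN to SHORT ratio chosen for resistors
is also applied to the thermistor, as the values in table~\ref{tab:stat_single} imply;
the symbols $\lambda_{T}$, $\lambda_{T,\mathit{OPEN}}$ and $\lambda_{T,\mathit{SHORT}}$ are introduced here for illustration.
\begin{eqnarray*}
\lambda_{T} & = & 0.315 \;\mathrm{failures}/{10}^{6}\,\mathrm{hours} \times {10}^{3} = 315.0 \;\mathrm{FIT} \\
\lambda_{T,\mathit{OPEN}} & = & 0.9 \times 315.0 = 283.5 \;\mathrm{FIT} \\
\lambda_{T,\mathit{SHORT}} & = & 0.1 \times 315.0 = 31.5 \;\mathrm{FIT}
\end{eqnarray*}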
Using the 90:10 OPEN to SHORT ratio adopted above for both the resistors and the thermistor ($R_3$),
we can draw up the following table (table~\ref{tab:stat_single}),
showing the FIT values for all faults considered.

\begin{table}[ht]
\caption{Pt100 FMEA Single Fault Statistics} % title of Table
\centering % used for centering table
\begin{tabular}{||l|c|c|c||}
\hline \hline
\textbf{Test} & \textbf{Result} & \textbf{Result} & \textbf{FIT} \\
\textbf{Case} & \textbf{sense +} & \textbf{sense -} & \textbf{(failures per $10^9$ hours of operation)} \\
% R & wire & res + & res - & description
\hline
\hline
TC:1 $R_1$ SHORT & High Fault & - & 1.38 \\ \hline
TC:2 $R_1$ OPEN & Low Fault & Low Fault & 12.42 \\ \hline
\hline
TC:3 $R_3$ SHORT & Low Fault & High Fault & 31.5 \\ \hline
TC:4 $R_3$ OPEN & High Fault & Low Fault & 283.5 \\ \hline
\hline
TC:5 $R_2$ SHORT & - & Low Fault & 1.38 \\ \hline
TC:6 $R_2$ OPEN & High Fault & High Fault & 12.42 \\ \hline
\hline
\end{tabular}
\label{tab:stat_single}
\end{table}
The FIT for the circuit as a whole is the sum of the FIT values for all the
test cases. The Pt100 circuit here has a FIT of 342.6. This is an MTTF of
about 333 years per circuit.
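The circuit total and its MTTF follow directly from table~\ref{tab:stat_single}.
The following is a worked sketch, assuming a constant failure rate (so that MTTF is the reciprocal of the failure rate)
and 8760 hours per year; the symbol $\lambda_{Pt100}$ is introduced here for illustration.
\begin{eqnarray*}
\lambda_{Pt100} & = & 1.38 + 12.42 + 31.5 + 283.5 + 1.38 + 12.42 = 342.6 \;\mathrm{FIT} \\
\mathrm{MTTF} & = & \frac{{10}^{9}}{342.6} \;\mathrm{hours} \approx 2.92 \times {10}^{6} \;\mathrm{hours} \approx 333 \;\mathrm{years}
\end{eqnarray*}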
A probabilistic tree can now be drawn (figure~\ref{fig:stat_single}), with a FIT value for the Pt100
circuit and FIT values for all the component fault modes from which it was calculated.
We can see from this that the most likely fault is the thermistor going OPEN.
With a FIT of 283.5, against 31.5 for the next most likely fault, this circuit is
around nine times more likely to fail in this way than in any other single way.
Were we to need a more reliable temperature sensor, this would probably
be the fault~mode we would scrutinise first.
\begin{figure}[ht]
\centering
\includegraphics[width=400pt,bb=0 0 856 327,keepaspectratio=true]{./CH5_Examples/stat_single.png}
% stat_single.jpg: 856x327 pixel, 72dpi, 30.20x11.54 cm, bb=0 0 856 327
\caption{Probabilistic Fault Tree: Pt100 Single Faults}
\label{fig:stat_single}
\end{figure}
The Pt100 analysis presents a simple result for single faults.
The next analysis phase looks at how the circuit will behave under double simultaneous failure
conditions.
\subsection{Pt100 Example: Double Failures and statistical data}
Because we can perform double simultaneous failure analysis under FMMD
we can also apply failure rate statistics to double failures.
%
%%
%% Need to talk about the `detection time'
%% or `Safety Relevant Validation Time' ref can book
%% EN61508 gives detection calculations to reduce
%% statistical impacts of failures.
%%
%
If we consider the failure modes to be statistically independent we can calculate
the FIT values for all the combinations of failures in table~\ref{tab:ptfmea2}.
The failure mode of concern, the undetectable {\textbf{FLOATING}} condition,
requires that resistors $R_1$ and $R_2$ both fail. We can multiply the two failure rates
together to estimate how often both fail together. The FIT value of 12.42 corresponds to
$12.42 \times {10}^{-9}$ failures per hour. Squaring this gives $154.3 \times {10}^{-18}$.
This is an astronomically small figure, so small that it would
probably fall below a threshold to sensibly consider.
However, it is very interesting from a failure analysis perspective,
because here we have found a fault that we cannot detect at this
level. This means that should we wish to cope with
this fault, we need to devise a way of detecting this
condition in higher levels of the system.
\glossary{name={FIT}, description={Failure in Time (FIT). The number of times a particular failure is expected to occur in a $10^{9}$ hour time period. Associated with continuous demand systems under EN61508~\cite{en61508}}}
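The squaring step can be made dimensionally explicit.
The following is a sketch under stated assumptions, not a calculation taken from EN61508:
both resistor failures are statistically independent with the same constant rate
$\lambda = 12.42 \times {10}^{-9}$ failures per hour, and $\tau$ is a hypothetical exposure interval
(for example the time between checks), chosen here as one hour so as to reproduce the figure above.
For $\lambda\tau \ll 1$ the probability that one resistor has failed within $\tau$ is approximately $\lambda\tau$, so
\begin{displaymath}
P(R_1 \wedge R_2) \approx (\lambda\tau)^2 = (12.42 \times {10}^{-9} \times 1)^2 \approx 1.543 \times {10}^{-16} .
\end{displaymath}
The result grows with the square of $\tau$: taking $\tau = 8760$ hours (one year) as an illustrative value
gives $(1.088 \times {10}^{-4})^2 \approx 1.18 \times {10}^{-8}$, which is still very small.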
\section{Retrospective Failure Mode analysis and FMMD}

The reasons for applying retrospective failure mode analysis could be to approve previously un-assessed
systems to a safety standard, or to determine the failure mode behaviour of an instrument used in
safety critical verification.
%
FMMD can be applied retrospectively to a project and, because of its modular nature coupled with
its work flow, it
can reveal undetected failure modes.
%
FMMD requires that all failure modes of components in a {\fg} are resolved to
a symptom in the resulting {\dc}.
%
FMMD can therefore find failure modes that are not
dealt with as a symptom, i.e. were unintentionally ignored
or forgotten. This means that FMMD will root out un-handled
failure modes.
%
We can apply retrospective FMMD to electronic and software hybrid systems as well.
Each function in the software will have to be assigned a `design~contract'~\cite{dbcbe} (where violations of
contract clauses will be treated as failure modes in FMMD).
%
By
applying contracts and seeing how calling functions deal with
the failures in the functions they call, we reveal un-handled error conditions in
the software.
By treating hardware interfaces to software as {\dcs}, we automatically have a list of the failure modes
of the electronics: the failure modes of each such {\dc} are all the known failure modes
of the electronics it represents.
%
FMMD models both software and hardware;
we can thus verify that all
failure modes from the electronics module have been dealt with
by the controlling software. If not, they are un-handled error conditions.
%
By performing FMMD on a software and electronic hybrid system,
we thus reveal design deficiencies.
In Safety Integrity Level (SIL)~\cite{en61508} terms, by identifying undetectable faults and fixing them, we raise
the safe failure fraction (SFF).
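The effect on the SFF can be seen from its usual definition.
The following is a sketch, with the symbols $\lambda_{S}$ (safe failure rate), $\lambda_{DD}$ (dangerous detected failure rate)
and $\lambda_{DU}$ (dangerous undetected failure rate) used as commonly written for EN61508; they do not appear elsewhere in this chapter.
\begin{displaymath}
\mathrm{SFF} = \frac{\sum\lambda_{S} + \sum\lambda_{DD}}{\sum\lambda_{S} + \sum\lambda_{DD} + \sum\lambda_{DU}}
\end{displaymath}
Making a previously undetectable dangerous failure detectable moves its rate from $\lambda_{DU}$ to $\lambda_{DD}$:
the numerator grows while the denominator is unchanged, so the SFF rises.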
\section{Objective and Subjective Reasoning stages}
Opportunity for formal definitions and perhaps an interface or process for achieving it....
\section{Conclusion}

It is the author's belief that the practice of FMEA would be improved by taking a modular approach
and that it is necessary that software and hardware should be included in the same failure mode models.
%
The proposed methodology, FMMD, provides the means to do this, and it is the author's hope that this
or a variant thereof is taken up and used to improve system safety.