went through red-penning of CH2 and CH3

Robin Clark 2013-03-15 16:19:58 +00:00
parent aa843fc851
commit 7eec6c8de4
4 changed files with 300 additions and 192 deletions

View File

@ -23,7 +23,7 @@ for a large proportion of safety critical products sold in the European Union.
The acronym FMEA can be expanded as follows:
\begin{itemize}
\item \textbf{F - Failures of given component,} Consider a particular component in a system;
\item \textbf{M - Failure Mode,} Look at one of the ways in which it can fail (i.e. determine a component `failure~mode');
\item \textbf{M - Failure Mode,} Choose a component `failure~mode';
\item \textbf{E - Effects,} Determine the effects this failure mode will cause to the system we are examining;
\item \textbf{A - Analysis,} Analyse how much impact this symptom will have on the environment/people/the system itself.
\end{itemize}
@ -33,7 +33,7 @@ how failures could affect some equipment in %an initial
a brain-storming session
%in product design,
to formal submission as part of safety critical certification.
FMEA is a time intensive process. To reduce amount of work to perform,
FMEA is a manual and therefore time intensive process. To reduce the amount of work to perform,
software packages~\cite{931423, 1778436820050601} and analysis strategies have
been developed~\cite{incrementalfmea, automatingFMEA1281774}.
%
@ -72,12 +72,13 @@ under given conditions.
How base components could fail internally is not of interest to an FMEA investigation.
The FMEA investigator needs to know what failure behaviour a component may exhibit. %, or in other words, its modes of failure.
%
A large body of literature exists which gives guidance for determining component {\fms}.
A large body of literature exists giving guidance for the determination of component {\fms}.
%
For this study FMD-91~\cite{fmd91} and the gas burner standard EN298~\cite{en298} are examined.
%Some standards prescribe specific failure modes for generic component types.
In EN298 failure modes for most generic component types are listed, or if not listed,
determined by considering all pins OPEN and all adjacent pins shorted.
are determined using a procedure where we consider
all pins open and then all adjacent pins shorted.
%a procedure where failure scenarios of all pins OPEN and all adjacent pins shorted
%are examined.
%
@ -118,11 +119,11 @@ requires statistics for Meantime to Failure (MTTF) for all {\bc} failure modes.
\section{Determining the failure modes of Components.}
The starting point for FMEA are the failure modes of {\bcs}.
The starting point in the FMEA process is the set of failure modes of {\bcs}.
In order to define FMEA we must start with a discussion on how these failure modes are chosen.
%
In this section we look in detail at two common electrical components and examine how
the two sources of information define their failure mode behaviour.
the two chosen sources of {\fm} information define their failure mode behaviour.
We look at the reasons why some known failure modes % are omitted, or presented in
%specific but unintuitive ways.
%We compare the US. military published failure mode specifications wi
@ -172,7 +173,7 @@ as shown below.
\item Lead damage 1.9\% $\mapsto$ OPEN.
\end{itemize}
%
The main causes of drift are overloading of components.
We note that the main cause of resistor value drift is overloading. % of components.
This is borne out in the FMD-91~\cite[232]{fmd91} entry for a resistor network where the failure
modes do not include drift.
%
@ -235,7 +236,7 @@ $$ fm(R) = \{ OPEN, SHORT \} . $$
\centering
\includegraphics[width=200pt]{CH5_Examples/lm258pinout.jpg}
% lm258pinout.jpg: 478x348 pixel, 96dpi, 12.65x9.21 cm, bb=0 0 359 261
\caption{Pinout for an LM358 dual OpAmp}
\caption{Pinout for an LM358 dual Op-Amp}
\label{fig:lm258}
\end{figure}
@ -247,10 +248,10 @@ For the purpose of example, we look at
a typical op-amp designed for instrumentation and measurement, the dual packaged version of the LM358~\cite{lm358}
(see figure~\ref{fig:lm258}), and use this to compare the failure mode derivations from FMD-91 and EN298.
\paragraph{ Failure Modes of an OpAmp according to FMD-91 }
\paragraph{ Failure Modes of an Op-Amp according to FMD-91 }
%Literature suggests, latch up, latch down and oscillation.
For OpAmp failures modes, FMD-91\cite{fmd91}{3-116] states,
For Op-Amp failure modes, FMD-91~\cite[3-116]{fmd91} states,
\begin{itemize}
\item Degraded Output 50\% Low Slew rate - poor die attach
\item No Operation - overstress 31.3\%
@ -260,11 +261,11 @@ For OpAmp failures modes, FMD-91\cite{fmd91}{3-116] states,
Again these are mostly internal causes of failure, more of interest to the component manufacturer
than a designer looking for the symptoms of failure.
We need to translate these failure causes within the OpAmp into {\fms}.
We need to translate these failure causes within the Op-Amp into {\fms}.
We can look at each failure cause in turn, and map it to potential {\fms} suitable for use in FMEA
investigations.
\paragraph{OpAmp failure cause: Poor Die attach}
\paragraph{Op-Amp failure cause: Poor Die attach}
The symptom for this is given as a low slew rate. This means that the op-amp
will not react quickly to changes on its input terminals.
This is a failure symptom that may not be of concern in a slow responding system like an
@ -273,7 +274,7 @@ a signal may entirely be lost.
We can map this failure cause to a {\fm}, and we can call it $LOW_{slew}$.
\paragraph{No Operation - over stress}
Here the OP\_AMP has been damaged, and the output may be held HIGH or LOW, or may be
Here the Op-Amp has been damaged, and the output may be held HIGH or LOW, or may be
effectively tri-stated, i.e. not able to drive circuitry in the next stages of
the signal path: we can call this state NOOP (no Operation).
%
@ -286,18 +287,18 @@ We map this failure cause to $HIGH$ or $LOW$.
\paragraph{Open $V_+$}
This failure cause will mean that the minus input will have the very high gain
of the OpAmp applied to it, and the output will be forced HIGH or LOW.
of the Op-Amp applied to it, and the output will be forced HIGH or LOW.
We map this failure cause to $HIGH$ or $LOW$.
\paragraph{Collecting OpAmp failure modes from FMD-91}
We can define an OpAmp, under FMD-91 definitions to have the following {\fms}.
\paragraph{Collecting Op-Amp failure modes from FMD-91}
We can define an Op-Amp, under FMD-91 definitions, to have the following {\fms}.
\begin{equation}
\label{eqn:opampfms}
fm(OpAmp) = \{ HIGH, LOW, NOOP, LOW_{slew} \}
\end{equation}
\paragraph{Failure Modes of an OpAmp according to EN298}
\paragraph{Failure Modes of an Op-Amp according to EN298}
EN298 does not specifically define Op-Amp failure modes; these can be determined
by following a procedure for `integrated~circuits' outlined in
@ -377,7 +378,7 @@ that we got from FMD-91, listed in equation~\ref{eqn:opampfms}.
%\clearpage
\subsubsection{Failure modes of an OpAmp}
\subsubsection{Failure modes of an Op-Amp}
\label{sec:opamp_fms}
For the purpose of the examples to follow, the op-amp will
@ -397,7 +398,7 @@ component {\fms} in FMEA or FMMD and require interpretation.
%For our OpAmp example could have come up with different symptoms for both sides. Cannot predict the effect of internal errors, for instance ($LOW_{slew}$)
%For our Op-Amp example could have come up with different symptoms for both sides. Cannot predict the effect of internal errors, for instance ($LOW_{slew}$)
%is missing from the EN298 failure modes set.
@ -463,11 +464,11 @@ component {\fms} in FMEA or FMMD and require interpretation.
FMEA is a procedure which starts with the failure modes of the low level components of a system, an example
FMEA is a bottom-up procedure which starts with the failure modes of the low level components of a system; an example
analysis will serve to demonstrate it in practice.
\paragraph{ FMEA Example: Milli-volt reader}
Example: Let us consider a system, in this case a milli-volt reader, consisting
\paragraph{ FMEA Example: Milli-volt reader.}
Example: Let us consider a system, in this case a simple milli-volt reader, consisting
of instrumentation amplifiers connected to a micro-processor
that reports its readings via RS-232.
\begin{figure}
@ -585,6 +586,7 @@ and our equipment can react by raising an alarm or compensating for the resultin
%
Some failure modes may cause undetectable failures, for instance a component that causes
a measured reading to change could have adverse consequences yet not be flagged as a failure.
%
This type of failure would not be flagged as a failure by the system, because
it has no way of knowing the reading is invalid.
%
@ -624,6 +626,9 @@ methodologies such as FTA~\cite{nucfta,nasafta} are backward searches.
Forward (or bottom-up) search types of fault analysis are said to be `inductive' (i.e. the results of failure are
induced).
Backward (or top-down) searches are said to be `deductive'.
\paragraph{Reasoning distance.}
\label{reasoningdistance}
A reasoning distance is the number of stages of logic and reasoning
@ -641,6 +646,9 @@ in that system.
If the milli-volt reader had say 100 components, with three failure modes each, this
would give a reasoning distance of 3 * 100 * 99.
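As a worked instance of this figure, and assuming (as shorthand introduced here for illustration only) that $N$ is the number of components and $f$ the number of failure modes per component, with each failure mode checked against the $N-1$ other components, the calculation can be sketched as:
$$ f \times N \times (N-1) = 3 \times 100 \times 99 = 29700 . $$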
The discussion on reasoning distance provides us with a metric to examine
the state explosion problems associated with forward search failure investigation
methodologies.
%.... general concept... simple ideas about how complex a
%failure analysis is the more modules and components are involved
@ -652,7 +660,12 @@ would give a reasoning distance of 3 * 100 * 99.
FMEA for a safety critical certification~\cite{en298,en61508} will have to be applied
to all known failure modes of all components within a system.
FMEA does not define or specify the scope of the investigation of each component failure mode.
Should we follow the signal path, and all components we encounter along that, or should the scope be wider?
If we were to examine the effect of a component {\fm} against all other components
in a system, this could be said to be exhaustive analysis.
\paragraph{Exhaustive Single Failure FMEA.}
To perform FMEA exhaustively (i.e. to examine every possible interaction
of a failure mode with all other components in a system)
we would need to look at all possible failure scenarios.
@ -691,7 +704,7 @@ For our theoretical 100 components with 3 failure modes each example, this is
$100*99*98*3=2,910,600$ failure mode scenarios.
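The arithmetic behind this figure can be checked step by step. The symbols $N$ (number of components) and $f$ (failure modes per component) are shorthand assumed here for illustration:
$$ N(N-1)(N-2)f = 100 \times 99 \times 98 \times 3 = 970200 \times 3 = 2910600 . $$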
\paragraph{Reliance of experts for meaningful FMEA Analysis.}
\paragraph{Reliance on experts for meaningful FMEA Analysis.}
Current FMEA methodologies cannot consider---for the reason of state explosion---an exhaustive approach.
We define exhaustive FMEA ({\XFMEA}) as examining the effect of every component failure mode
against the remaining components in the system under investigation.
@ -700,7 +713,9 @@ Because we cannot perform XFMEA,
we rely on experts in the system under investigation
to perform a meaningful FMEA analysis.
%
In practise these experts have to select the areas they see as most critical for detailed FMEA analysis.
In practice these experts have to select the areas they see as most critical for detailed FMEA analysis:
it is usually impossible to perform a detailed level of analysis on all component {\fms}
of anything but a trivial system.
\subsection{Component Tolerance}
@ -714,10 +729,10 @@ is given in section~\ref{sec:resistortolerance}.
\paragraph{Five main Variants of FMEA}
\begin{itemize}
\item \textbf{PFMEA - Production} Car Manufacture etc
\item \textbf{FMECA - Criticality} Military/Space
\item \textbf{FMEDA - Statistical safety} EN61508/IOC1508 Safety Integrity Levels
\item \textbf{DFMEA - Design or static/theoretical} EN298/EN230/UL1998
\item \textbf{PFMEA - Production} Emphasis on cost reduction and product improvement;
\item \textbf{FMECA - Criticality} Emphasis on minimising the effect of critical systems failing; % Military/Space
\item \textbf{FMEDA - Statistical safety} Statistical analysis giving Safety Integrity Levels;
\item \textbf{DFMEA - Design or static/theoretical} Approval of safety critical systems using FMEA and single or double failure prevention;% EN298/EN230/UL1998
\item \textbf{SFMEA - Software FMEA --- only used in highly critical systems at present}
\end{itemize}
@ -766,11 +781,16 @@ will return most cost benefit.
% \caption{A10 Thunderbolt}
% \label{fig:f16missile}
% \end{figure}
Emphasis on determining criticality rather than the cost of system failures.
Applies some Bayesian statistics (probabilities of component failures and those
FMECA places emphasis on determining criticality rather than the cost of system failures.
%
It applies some Bayesian statistics (probabilities of component failures
thereby causing given system level failures),
%
and also considers the probability of the system failure causing a critical event.
%
Applying Bayesian statistics to failure analysis suffers the
problem that correlation does not imply causation~\cite{bayesfrequentist}.
problem that correlation does not imply causation~\cite{bayesfrequentist}.
%
However, correlation is evidence for causation, and may be the only evidence to hand;
this is the justification behind its use.
A history of the usage and development of FMECA may be found in~\cite{FMECAresearch}.
@ -829,7 +849,7 @@ for a project manager.
\subsection{ FMEDA - Failure Modes Effects and Diagnostic Analysis}
%\subsection{ FMEDA - Failure Modes Effects and Diagnostic Analysis}
% \begin{figure}
% \centering
% \includegraphics[width=200pt]{./SIL.png}
@ -859,16 +879,18 @@ type standards (EN61508/IOC5108).
It provides a statistical overall level of safety
and allows diagnostic mitigation for self checking etc.
It provides guidelines for the design and architecture
of computer/software systems for the four levels of
safety Integrity.
of computer/software systems for four levels of
safety Integrity, referred to as Safety Integrity Levels (SIL).
%For Hardware
%
FMEDA does force the user to consider all hardware components in a system
by requiring that a MTTF value is assigned for each failure~mode;
by requiring that an MTTF value is assigned for each base component failure~mode;
the MTTF may be statistically mitigated (improved)
if it can be shown that self-checking will detect failure modes.
For software it provides procedural quality guidelines and constraints (such as forbidding certain
programming languages and/or features.
%
EN61508 in relation to software provides procedural quality guidelines and constraints (such as forbidding certain
programming languages and/or features): it does not provide a means to trace failure mode effects in software
or across the software/hardware interface.
%\subsection{ FMEDA - Failure Modes Effects and Diagnostic Analysis}
@ -988,11 +1010,14 @@ It has a simple final result, a Safety Integrity Level (SIL) from 1 to 4 (wher
\caption{FMEA Meeting}
\label{fig:tech_meeting}
\end{figure}
Static FMEA, Design FMEA, Approvals FMEA
%Static FMEA, Design FMEA, Approvals FMEA
Experts from Approval House and Equipment Manufacturer
discuss selected component failure modes
judged to be in critical sections of the product.
%
This could be considered as a design check method, deliberately
looking for weaknesses at a theoretical level.
@ -1010,127 +1035,128 @@ judged to be in critical sections of the product.
% \end{figure}
\begin{itemize}
\item Impossible to look at all component failures let alone apply FMEA rigorously.
\item Impossible to look at all component failures let alone apply FMEA exhaustively/rigorously.
\item In practice, failure scenarios for critical sections are contested, and either justified or extra safety measures implemented.
\item Often Meeting notes or minutes only. Unusual for detailed arguments to be documented.
\item Often Meeting notes or minutes only. Unusual for detailed technical arguments to be documented.
\end{itemize}
\section{Conclusions on current FMEA Methodologies}
%% FOCUS
The focus of this chapter %literature review
is to establish the current practice and applications
of FMEA.
%, and to examine its strengths and weaknesses.
%% GOAL
Its
goal is to identify central issues and to criticise and assess the current
FMEA methodologies.
%% PERSPECTIVE
The perspective of the author, is as a practitioner of static failure mode analysis techniques
concerning approval of product
to European safety standards, both the prescriptive~\cite{en298,en230} and statistical~\cite{en61508}.
A second perspective is that of a software engineer trained to use formal methods.
Examining FMEA methodologies for mathematical properties, influenced by
formal methods applied to software, should provide a perspective not traditionally considered.
%% COVERAGE
The literature reviewed, has been restricted to published books, European safety standards (as examples
of current safety measures applied), and traditional research, from journal and conference papers.
%% ORGANISATION
The review is organised by concept, that is, FMEA can be applied to hardware, software, software~interfacing and
to multiple failure scenarios etc. Methodologies related to FMEA are briefly covered for the sake of context.
%% AUDIENCE
% Well duh! PhD supervisors and examiners....
% \subsection{Related Methodologies}
% FTA --- HAZOP --- ALARP --- Event Tree Analysis --- bow tie concept
% \subsection{Hardware FMEA (HFMEA)}
% \subsection{Multiple Failure scenarios and FMEA}
% \subsection{Software FMEA (SFMEA)}
\paragraph{Current work on Software FMEA}
SFMEA usually does not seek to integrate
hardware and software models, but to perform
FMEA on the software in isolation~\cite{procsfmea}.
%
Work has been performed using databases
to track the relationships between variables
and system failure modes~\cite{procsfmeadb}, to %work has been performed to
introduce automation into the FMEA process~\cite{appswfmea} and to provide code analysis
automation~\cite{modelsfmea}. Although the SFMEA and hardware FMEAs are performed separately,
some schools of thought aim for Fault Tree Analysis (FTA)~\cite{nasafta,nucfta} (top down - deductive)
and FMEA (bottom-up inductive)
to be performed on the same system to provide insight into the
software hardware/interface~\cite{embedsfmea}.
%
Although this
would give a better picture of the failure mode behaviour, it
is by no means a rigorous approach to tracing errors that may occur in hardware
through to the top (and therefore ultimately controlling) layer of software.
\paragraph{Current FMEA techniques are not suitable for software}
The main FMEA methodologies are all based on the concept of taking
base component {\fms}, and translating them into system level events/failures~\cite{sfmea,sfmeaa}.
%
In a complicated system, mapping a component failure mode to a system level failure
will mean a long reasoning distance; that is to say the actions of the
failed component will have to be traced through
several sub-systems, gauging its effects with and on other components.
%
With software at the higher levels of these sub-systems,
we have yet another layer of complication.
%
%In order to integrate software, %in a meaningful way
%we need to re-think the
%FMEA concept of simply mapping a base component failure to a system level event.
%
SFMEA regards, in place of hardware components, the variables used by the programs to be their equivalent~\cite{procsfmea}.
The failure modes of these variables, are that they could become erroneously over-written,
calculated incorrectly (due to a mistake by the programmer, or a fault in the micro-processor on which it is running), or
external influences such as
ionising radiation causing bits to be erroneously altered.
\paragraph{FMEA and Modularity}
From the 1940's onwards, software has evolved from a simple procedural languages (i.e. assembly language/Fortran~\cite{f77} call return)
to structured programming ( C~\cite{DBLP:books/ph/KernighanR88}, pascal etc) and then to object oriented models (Java C++...).
FMEA has undergone no such evolution.
%
In a world where sensor systems, often including embedded software components, are brought in to
create complex systems, FMEA still follows a rigid {\bc} {\fm} to system level error model,
that is only suitable for simple electro mechanical systems.
%
%
% MAYBE MOVE THIS TO CH3, FMEA CRITICISM
% 30JAN2013
%
\subsection{Where FMEA is now.}
FMEA useful tool for basic safety --- provides statistics on safety where field data impractical ---
very good with single failure modes linked to top level events.
FMEA has become part of the safety critical and safety certification industries.
%
SFMEA is in its infancy, and there are corresponding gaps in
certification for software, EN61508~\cite{en61508}, recommends hardware redundancy architectures in conjunction
with FMEDA for hardware: for software it recommends language constraints and quality procedures
but no inductive fault finding technique.
%
FMEA has adapted from a cost saving exercise for mass produced items~\cite{bfmea,generic_automotive_fmea_6034891}, to incorporating statistical techniques
(FMECA) to allowing for self diagnostic mitigation (FMEDA).
%
However, it is still based on the concept of single component failures mapped to top~level/system~failures.
All these FMEA based methodologies have the following short comings:
\begin{itemize}
\item Impossible to integrate Software and hardware models,
\item State explosion problem exacerbated by increasing complexity due to density of modern electronics,
\item Impossibility to consider all multiple component failure modes~\cite{FMEAmultiple653556}
\end{itemize}
% MOVED TO CH3: 15MAR2013
%
% \section{Conclusions on current FMEA Methodologies}
%
% %% FOCUS
% The focus of this chapter %literature review
% is to establish the current practice and applications
% of FMEA.
% %, and to examine its strengths and weaknesses.
% %% GOAL
% Its
% goal is to identify central issues and to criticise and assess the current
% FMEA methodologies.
% %% PERSPECTIVE
% The perspective of the author, is as a practitioner of static failure mode analysis techniques
% concerning approval of product
% to European safety standards, both the prescriptive~\cite{en298,en230} and statistical~\cite{en61508}.
% A second perspective is that of a software engineer trained to use formal methods.
% Examining FMEA methodologies for mathematical properties, influenced by
% formal methods applied to software, should provide a perspective not traditionally considered.
% %% COVERAGE
% The literature reviewed, has been restricted to published books, European safety standards (as examples
% of current safety measures applied), and traditional research, from journal and conference papers.
% %% ORGANISATION
% The review is organised by concept, that is, FMEA can be applied to hardware, software, software~interfacing and
% to multiple failure scenarios etc. Methodologies related to FMEA are briefly covered for the sake of context.
% %% AUDIENCE
% % Well duh! PhD supervisors and examiners....
%
% % \subsection{Related Methodologies}
% % FTA --- HAZOP --- ALARP --- Event Tree Analysis --- bow tie concept
% % \subsection{Hardware FMEA (HFMEA)}
% % \subsection{Multiple Failure scenarios and FMEA}
% % \subsection{Software FMEA (SFMEA)}
%
% \paragraph{Current work on Software FMEA}
%
% SFMEA usually does not seek to integrate
% hardware and software models, but to perform
% FMEA on the software in isolation~\cite{procsfmea}.
% %
% Work has been performed using databases
% to track the relationships between variables
% and system failure modes~\cite{procsfmeadb}, to %work has been performed to
% introduce automation into the FMEA process~\cite{appswfmea} and to provide code analysis
% automation~\cite{modelsfmea}. Although the SFMEA and hardware FMEAs are performed separately,
% some schools of thought aim for Fault Tree Analysis (FTA)~\cite{nasafta,nucfta} (top down - deductive)
% and FMEA (bottom-up inductive)
% to be performed on the same system to provide insight into the
% software hardware/interface~\cite{embedsfmea}.
% %
% Although this
% would give a better picture of the failure mode behaviour, it
% is by no means a rigorous approach to tracing errors that may occur in hardware
% through to the top (and therefore ultimately controlling) layer of software.
%
% \paragraph{Current FMEA techniques are not suitable for software}
%
% The main FMEA methodologies are all based on the concept of taking
% base component {\fms}, and translating them into system level events/failures~\cite{sfmea,sfmeaa}.
% %
% In a complicated system, mapping a component failure mode to a system level failure
% will mean a long reasoning distance; that is to say the actions of the
% failed component will have to be traced through
% several sub-systems, gauging its effects with and on other components.
% %
% With software at the higher levels of these sub-systems,
% we have yet another layer of complication.
% %
% %In order to integrate software, %in a meaningful way
% %we need to re-think the
% %FMEA concept of simply mapping a base component failure to a system level event.
% %
% SFMEA regards, in place of hardware components, the variables used by the programs to be their equivalent~\cite{procsfmea}.
% The failure modes of these variables, are that they could become erroneously over-written,
% calculated incorrectly (due to a mistake by the programmer, or a fault in the micro-processor on which it is running), or
% external influences such as
% ionising radiation causing bits to be erroneously altered.
%
%
% \paragraph{FMEA and Modularity}
% From the 1940's onwards, software has evolved from a simple procedural languages (i.e. assembly language/Fortran~\cite{f77} call return)
% to structured programming ( C~\cite{DBLP:books/ph/KernighanR88}, pascal etc) and then to object oriented models (Java C++...).
% FMEA has undergone no such evolution.
% %
% In a world where sensor systems, often including embedded software components, are brought in to
% create complex systems, FMEA still follows a rigid {\bc} {\fm} to system level error model,
% that is only suitable for simple electro mechanical systems.
%
%
%
% %
%
% %
% % MAYBE MOVE THIS TO CH3, FMEA CRITICISM
% % 30JAN2013
% %
%
% \subsection{Where FMEA is now.}
% FMEA useful tool for basic safety --- provides statistics on safety where field data impractical ---
% very good with single failure modes linked to top level events.
% FMEA has become part of the safety critical and safety certification industries.
% %
% SFMEA is in its infancy, and there are corresponding gaps in
% certification for software, EN61508~\cite{en61508}, recommends hardware redundancy architectures in conjunction
% with FMEDA for hardware: for software it recommends language constraints and quality procedures
% but no inductive fault finding technique.
% %
% FMEA has adapted from a cost saving exercise for mass produced items~\cite{bfmea,generic_automotive_fmea_6034891}, to incorporating statistical techniques
% (FMECA) to allowing for self diagnostic mitigation (FMEDA).
% %
% However, it is still based on the concept of single component failures mapped to top~level/system~failures.
% All these FMEA based methodologies have the following short comings:
% \begin{itemize}
% \item Impossible to integrate Software and hardware models,
% \item State explosion problem exacerbated by increasing complexity due to density of modern electronics,
% \item Impossibility to consider all multiple component failure modes~\cite{FMEAmultiple653556}
% \end{itemize}

View File

@ -10,6 +10,7 @@ In the 1980s FMEA was extended again (FMEDA~\cite{fmeda}) to provide statistics
for predicting failure rates.
However a typical entry in each of the above methodologies, starts with a
particular component failure mode and associates it with a system---or top level---failure symptom.
This means that we have one analysis case per component failure mode for all the components in the system under investigation.
This analysis philosophy has not changed since FMEA was first used.
@ -187,6 +188,122 @@ utterly anachronistic in the distributed real time system environment.
FMEA is no longer fit for purpose!
%
\section{Conclusions on current FMEA Methodologies}
%% FOCUS
The focus of this chapter %literature review
is to establish the current practice and applications
of FMEA.
%, and to examine its strengths and weaknesses.
%% GOAL
Its
goal is to identify central issues and to criticise and assess the current
FMEA methodologies.
%% PERSPECTIVE
The perspective of the author is that of a practitioner of static failure mode analysis techniques
concerning approval of products
to European safety standards, both the prescriptive~\cite{en298,en230} and statistical~\cite{en61508}.
A second perspective is that of a software engineer trained to use formal methods.
Examining FMEA methodologies for mathematical properties, influenced by
formal methods applied to software, should provide a perspective not traditionally considered.
%% COVERAGE
The literature reviewed has been restricted to published books, European safety standards (as examples
of current safety measures applied), and traditional research from journal and conference papers.
%% ORGANISATION
The review is organised by concept, that is, FMEA can be applied to hardware, software, software~interfacing and
to multiple failure scenarios etc. Methodologies related to FMEA are briefly covered for the sake of context.
%% AUDIENCE
% Well duh! PhD supervisors and examiners....
% \subsection{Related Methodologies}
% FTA --- HAZOP --- ALARP --- Event Tree Analysis --- bow tie concept
% \subsection{Hardware FMEA (HFMEA)}
% \subsection{Multiple Failure scenarios and FMEA}
% \subsection{Software FMEA (SFMEA)}
\paragraph{Current work on Software FMEA}
SFMEA usually does not seek to integrate
hardware and software models, but to perform
FMEA on the software in isolation~\cite{procsfmea}.
%
Work has been performed using databases
to track the relationships between variables
and system failure modes~\cite{procsfmeadb}, to %work has been performed to
introduce automation into the FMEA process~\cite{appswfmea} and to provide code analysis
automation~\cite{modelsfmea}. Although the SFMEA and hardware FMEAs are performed separately,
some schools of thought aim for Fault Tree Analysis (FTA)~\cite{nasafta,nucfta} (top down - deductive)
and FMEA (bottom-up inductive)
to be performed on the same system to provide insight into the
software/hardware interface~\cite{embedsfmea}.
%
Although this
would give a better picture of the failure mode behaviour, it
is by no means a rigorous approach to tracing errors that may occur in hardware
through to the top (and therefore ultimately controlling) layer of software.
\paragraph{Current FMEA techniques are not suitable for software}
The main FMEA methodologies are all based on the concept of taking
base component {\fms}, and translating them into system level events/failures~\cite{sfmea,sfmeaa}.
%
In a complicated system, mapping a component failure mode to a system level failure
will mean a long reasoning distance; that is to say the actions of the
failed component will have to be traced through
several sub-systems, gauging its effects with and on other components.
%
With software at the higher levels of these sub-systems,
we have yet another layer of complication.
%
%In order to integrate software, %in a meaningful way
%we need to re-think the
%FMEA concept of simply mapping a base component failure to a system level event.
%
SFMEA regards the variables used by the programs as the equivalent of hardware components~\cite{procsfmea}.
The failure modes of these variables are that they could become erroneously over-written,
calculated incorrectly (due to a mistake by the programmer, or a fault in the micro-processor on which the program is running), or
corrupted by external influences such as
ionising radiation causing bits to be erroneously altered.
\paragraph{FMEA and Modularity}
From the 1940s onwards, software has evolved from simple procedural languages (e.g. assembly language/Fortran~\cite{f77} call return)
to structured programming (C~\cite{DBLP:books/ph/KernighanR88}, Pascal etc.) and then to object oriented models (Java, C++ etc.).
FMEA has undergone no such evolution.
%
In a world where sensor systems, often including embedded software components, are brought in to
create complex systems, FMEA still follows a rigid {\bc} {\fm} to system level error model,
that is only suitable for simple electro-mechanical systems.
%
%
% MAYBE MOVE THIS TO CH3, FMEA CRITICISM
% 30JAN2013
%
\subsection{Where FMEA is now.}
FMEA is a useful tool for basic safety: it provides statistics on safety where field data is impractical, and is
very good with single failure modes linked to top level events.
FMEA has become part of the safety critical and safety certification industries.
%
SFMEA is in its infancy, and there are corresponding gaps in
certification for software: EN61508~\cite{en61508} recommends hardware redundancy architectures in conjunction
with FMEDA for hardware; for software it recommends language constraints and quality procedures
but no inductive fault finding technique.
%
FMEA has adapted from a cost saving exercise for mass produced items~\cite{bfmea,generic_automotive_fmea_6034891}, to incorporating statistical techniques
(FMECA), and then to allowing for self diagnostic mitigation (FMEDA).
%
However, it is still based on the concept of single component failures mapped to top~level/system~failures.
All these FMEA based methodologies have the following shortcomings:
\begin{itemize}
\item Impossible to integrate Software and hardware models,
\item State explosion problem exacerbated by increasing complexity due to density of modern electronics,
\item Impossible to consider all multiple component failure modes~\cite{FMEAmultiple653556}.
\end{itemize}

View File

@ -1,24 +0,0 @@
# Makefile to create all graphics file etc
#
# Place all .dia files here as .png targets
#
DIA =
doc: $(DIA)
#
#bib:
#
# bibtex HR230003_combined_o2_co_sensor
#
%.png:%.dia
dia -t png $<
copy:
echo $@

View File

@ -1,11 +0,0 @@
\label{sec:chap8}
%%
%% CH8 finishing up and appendixes
%%
\printglossary
\addcontentsline{toc}{chapter}{Glossary}