\label{sec:chap3}

\section{Historical Origins of FMEA}

\subsection{FMEA designed for simple electro-mechanical systems}
FMEA traces its roots to the 1940s, when it was used to identify the most costly
failures arising from car mass-production~\cite{bfmea}.
It was later modified slightly to include the severity of the top level failure (FMECA~\cite{fmeca}).
In the 1980s FMEA was extended again (FMEDA~\cite{fmeda}) to provide statistics
for predicting failure rates.
However, a typical entry in each of the above methodologies starts with a
particular component failure mode and associates it with a system (or top level) failure symptom.
This analysis philosophy has not changed since FMEA was first used.

\subsection{FMEA does not support modularity}
It is common practice in the process control industry to buy in sub-systems, typically sensors and actuators connected to an industrially hardened computer bus, e.g. CANbus~\cite{can,canspec} or modbus~\cite{modbus}.
Most sensor systems are now `smart', that is to say, they contain programmatic elements
even if their outputs are %they supply
analogue signals. For instance, a liquid level sensor that
supplies a {\ft} output would typically have been implemented
in analogue electronics before the 1980s. After that time, it became common to use a micro-processor
based system to read the sensor and convert the reading to a current ({\ft}) output.
For the non-safety critical systems integrator this brings the advantages
that come with using a digital system (increased accuracy, self checking, ease of
calibration etc.). For a safety critical systems integrator it can be very problematic when it
comes to approvals. Even if the sensor manufacturer will let us see the internal workings and software,
we have the problem of tracing the FMEA reasoning through the sensor, through the sensor's software
and then through the system being integrated.
This problem is compounded by the fact that traditional FMEA cannot integrate software into FMEA models~\cite{sfmea,safeware}.

\section{Reasoning Distance used to measure Comparison Complexity}

Because of state explosion, traditional FMEA cannot ensure that every failure mode of
every component is checked against all the other components in the system which
it may affect.
%
FMEA is therefore performed using heuristics to decide
which components should be checked for the effects of a given component failure mode.
We could term the number of checks made for each failure mode
on aspects of the system the reasoning distance.
%
In practice FMEA may be performed by following the signal path
of the component failure mode to its system level effect. This is less than ideal,
as it can easily miss interactions with adjacent components that could cause
other system level symptoms.
%
Were we to compare the reasoning distance with the theoretical maximum, the sum of all failure
modes in a system multiplied by the number of components in it, we could arrive at a comparison complexity figure.
This figure would allow us to compare the maximum number of checks (i.e. exhaustive
analysis) with the number actually performed.
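As a minimal sketch of how such a figure might be formed, assume a system of $N$ components in which component $c_i$ has $f_i$ failure modes; these symbols are introduced here for illustration only. Following the description above, the theoretical maximum number of checks would be
\[ R_{max} = N \sum_{i=1}^{N} f_i , \]
or $(N-1) \sum_{i=1}^{N} f_i$ if a component is not checked against itself. Writing $R$ for the total number of checks actually made, one candidate comparison figure is the ratio $R / R_{max}$, which approaches $1$ as the analysis approaches exhaustive coverage.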

\paragraph{The ideal of exhaustive FMEA (XFMEA)}
Exhaustively checking every component failure mode in a system
against all other components is the ideal for finding all possible system level failures.
While this is impossible for all but trivial systems, it should be possible
for small groups of components that work together to provide a well defined function.
We could term such a group a `{\fg}'.
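To illustrate with purely hypothetical figures, take the counting rule described earlier for the theoretical maximum (the total number of failure modes multiplied by the number of components). A system of 100 components, each with three failure modes, would need
\[ (100 \times 3) \times 100 = 30\,000 \]
checks, whereas a group of five such components would need only
\[ (5 \times 3) \times 5 = 75 . \]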

\section{Re-use of FMEA analysis}

Given the {\bc} {\fm} to system level failure mode paradigm, it is
difficult to re-use FMEA analysis.
Several strategies to aid re-use have been proposed~\cite{rudov2009language, reuse_of_fmea}, but
the fundamental problem remains: after any change
to the component base in a system, it is very difficult to
determine which FMEA test scenarios must be re-worked.

\section{Software and FMEA}

Traditional FMEA deals only with electrical and mechanical components, i.e. it has no provision for software.
Modern control systems nearly always have a significant software/firmware element,
and the inability to model software with current FMEA methodologies
is a cause for criticism~\cite[Ch.12]{safeware}. Similar difficulties in integrating mechanical and electronic/software
failure models are discussed in~\cite{SMR:SMR580}.

\paragraph{Current work on Software FMEA}

SFMEA usually does not seek to integrate
hardware and software models, but to perform
FMEA on the software in isolation~\cite{procsfmea}.
%
Work has been performed using databases
to track the relationships between variables
and system failure modes~\cite{procsfmeadb}, to %work has been performed to
introduce automation into the FMEA process~\cite{appswfmea} and to provide code analysis
automation~\cite{modelsfmea}. Although SFMEA and hardware FMEA are performed separately,
some schools of thought aim for Fault Tree Analysis (FTA)~\cite{nasafta,nucfta} (top-down, deductive)
and FMEA (bottom-up, inductive)
to be performed on the same system to provide insight into the
hardware/software interface~\cite{embedsfmea}.
%
Although this
would give a better picture of the failure mode behaviour, it
is by no means a rigorous approach to tracing errors that may occur in hardware
through to the top (and therefore ultimately controlling) layer of software.

\subsection{The rise of the smart instrument}
%% AWE --- Atomic Weapons Establishment have this problem....
A smart instrument is defined as one that uses a micro-processor and software
in conjunction with its sensing electronics, rather than
analogue electronics only.
%
It is termed `smart' because it has some software, or intelligence, incorporated into it.
%
An AVO-8 multi-meter, circa 1970, uses only analogue electronics, and we can determine
using FMEA how component failures within it could affect readings.
%
A modern multi-meter will have a small dedicated micro-processor and sensing electronics, all on the same chip,
with firmware to read the user controls and display results on an LCD.
%
For quality control, many safety critical processes require regular inspections
and measurements of physical characteristics of materials and machinery.
%
For highly critical systems, e.g. in the nuclear industry, the instruments used to perform these measurements must be analysed
using FMEA, to ensure that failure modes within the instrument cannot lead to invalid measurements.
%
Most modern instruments now use highly integrated electronics coupled to micro-controllers, which read and filter the measurements
and interface to an LCD readout.
%
For highly critical systems, this means that traditional FMEA cannot be used to validate
the design of such instruments.
%
Being more modern, these instruments are likely to be more reliable and
accurate than the analogue instruments in use some twenty years ago, but this cannot be validated
to a high level of confidence by traditional FMEA.

\subsection{Distributed real time systems}

Distributed real time systems are control systems where
smart sensors communicate over a communications bus with
a master controller.
%
Most modern cars follow this information technology pattern and use CANbus~\cite{canspec,can}.
%
For instance, in a modern car there will be no mechanical linkage from the pedal to the engine; instead, the throttle pedal will be linked to a sensor that determines how
far the pedal is pressed.
This sensor will be read by a micro-controller, and the reading passed, via CANbus, to the Engine Control Unit (ECU),
which will use that information (along with information from other sensors) to adjust the power required from the engine.
This adjustment could be direct, or could be another CANbus message passed to a micro-controller regulating engine function.
In terms of FMEA (see figure~\ref{fig:distcon}), our reasoning path spans four interface layers of electronics to software.
Traditional FMEA does not cater for the hardware/software interface, and here we have the additional complications
%with the additional complications
of the communications protocol used to transmit data, and the failure mode characteristics
of the communications physical layer.

%(figure~\ref{fig:distcon}
The failure reasoning paths of a distributed real time system, with their multiple passes across the hardware/software
interface, mean that traditional FMEA, for these systems,
is impossible to perform.
%
The base component failure mode to system failure paradigm is
utterly anachronistic in the distributed real time system environment.

\begin{figure}[h]
 \centering
 \includegraphics[width=400pt]{./CH3_FMEA_criticism/distcon.png}
 % distcon.png: 1622x656 pixel, 72dpi, 57.22x23.14 cm, bb=0 0 1622 656
 \caption{Distributed Control System FMEA reasoning path for a single failure.}
 \label{fig:distcon}
\end{figure}

\section{FMEA: general criticism and conclusion}

%\subsection{FMEA - General Criticism}

\begin{itemize}
 \item FMEA type methodologies were designed for the simple electro-mechanical systems of the 1940s to 1960s.
 \item Reasoning distance: component failure to system level symptom.
 \item State explosion: it is impossible to perform FMEA exhaustively. %rigorously
 \item It is difficult to re-use previous analysis work.
 \item It is very difficult to model simultaneous failures.
 \item Software and hardware models are separate.
 \item Distributed real time systems are very difficult to meaningfully analyse with FMEA.
\end{itemize}

FMEA is no longer fit for purpose!
%

\subsection{FMEA - Better Methodology - Wish List}

We now form a wish list, stating the features that we would want
in an improved FMEA methodology:
\begin{itemize}
 \item No state explosion making analysis impractical,
 \item Rigorous (total failure coverage within {\fgs}, with all interacting components and failure modes checked),
 \item Reasoning traceable in system models,
 \item Re-usable, i.e. it should be possible to re-use analysis performed previously,
 \item It must be possible to analyse simultaneous/multiple failures,
 \item Modular, i.e. usable in a distributed system.
\end{itemize}

%FMEDA is a modern extension of FMEA, in that it will allow for
%self checking features, and provides detailed recommendations for computer/software architecture,
%but