diff --git a/submission_thesis/CH3_FMEA_criticism/Makefile b/submission_thesis/CH3_FMEA_criticism/Makefile
index 5743eec..cdaf773 100644
--- a/submission_thesis/CH3_FMEA_criticism/Makefile
+++ b/submission_thesis/CH3_FMEA_criticism/Makefile
@@ -3,7 +3,7 @@
 #
 # Place all .dia files here as .png targets
 #
-DIA =
+DIA = distcon.png
 doc: $(DIA)
diff --git a/submission_thesis/CH3_FMEA_criticism/copy.tex b/submission_thesis/CH3_FMEA_criticism/copy.tex
index 478e72e..c1b5ea5 100644
--- a/submission_thesis/CH3_FMEA_criticism/copy.tex
+++ b/submission_thesis/CH3_FMEA_criticism/copy.tex
@@ -1,9 +1,10 @@
 \label{sec:chap3}
 \section{Historical Origins of FMEA}
+
 \subsection{FMEA designed for simple electro-mechanical systems}
 FMEA traces it roots to the 1940s when it was used to identify the most costly
-failures arising from car mass-production~\cite{pfmea}.
+failures arising from car mass-production~\cite{bfmea}.
 It was later modified slightly to include severity of the top level failure (FMECA~\cite{fmeca}).
 In the 1980s FMEA was extended again (FMEDA~\cite{fmeda}) to provide statistics for predicting failure rates.
@@ -31,11 +32,119 @@ This problem is compounded by the fact that traditional FMEA cannot integrate so
 \section{Reasoning Distance used to measure Comparison Complexity}
+Traditional FMEA cannot ensure that each failure mode of every
+component is checked against all the other components in the system which
+it may affect, due to state explosion.
+FMEA is therefore performed using heuristics to decide
+which components to check for the effects of each component failure mode.
+We could term the number of checks made for each failure mode
+on aspects of the system the reasoning distance.
+Were we to compare the reasoning distance with the theoretical maximum, the sum of all failure
+modes in a system, multiplied by the number of components in it, we could arrive at a comparison complexity figure.
+This figure would allow us to compare the maximum number of checks (i.e. rigorous analysis)
+with the number actually performed.
+
+\section{Software and FMEA}
+
+Traditional FMEA deals only with electrical and mechanical components, i.e. it makes no provision for software.
+Modern control systems nearly always have a significant software/firmware element,
+and not being able to model software with current FMEA methodologies
+is a cause for criticism~\cite[Ch.12]{safeware}. Similar difficulties in integrating mechanical and electronic/software
+failure models are discussed in~\cite{SMR:SMR580}.
-\section{FMEA - General Criticism}
+\paragraph{Current work on Software FMEA}
-\subsection{FMEA - General Criticism}
+SFMEA usually does not seek to integrate
+hardware and software models, but to perform
+FMEA on the software in isolation~\cite{procsfmea}.
+%
+Work has been performed using databases
+to track the relationships between variables
+and system failure modes~\cite{procsfmeadb}, to %work has been performed to
+introduce automation into the FMEA process~\cite{appswfmea} and to provide code analysis
+automation~\cite{modelsfmea}. Although the SFMEA and hardware FMEAs are performed separately,
+some schools of thought aim for Fault Tree Analysis (FTA)~\cite{nasafta,nucfta} (top-down, deductive)
+and FMEA (bottom-up, inductive)
+to be performed on the same system to provide insight into the
+software/hardware interface~\cite{embedsfmea}.
+%
+Although this
+would give a better picture of the failure mode behaviour, it
+is by no means a rigorous approach to tracing errors that may occur in hardware
+through to the top (and therefore ultimately controlling) layer of software.
+
+
+\subsection{The rise of the smart instrument}
+%% AWE --- Atomic Weapons Establishment have this problem....
+A smart instrument is defined as one that uses a micro-processor and software
+in conjunction with its sensing electronics, rather than
+analogue electronics only.
+%
+It is termed `smart' because it has some software, or intelligence, incorporated into it.
+%
+An AVO-8 multi-meter, circa 1970, uses only analogue electronics, and we can determine
+using FMEA how component failures within it could affect readings.
+%
+A modern multi-meter will have a small dedicated micro-processor and sensing electronics, all on the same chip,
+with firmware to read the user controls, and display results on an LCD.
+%
+For quality control, many safety critical processes require regular inspections
+and measurements of physical characteristics of materials and machinery.
+%
+For highly critical systems, e.g. in the nuclear industry, the instruments used to perform these measurements must be analysed using
+FMEA, to ensure that failure modes within the instrument cannot lead to invalid measurements.
+%
+Most modern instruments now use highly integrated electronics coupled to micro-controllers, which read and filter the measurements,
+and interface to an LCD readout.
+%
+For highly critical systems, this means that traditional FMEA cannot be used to validate
+the design of these instruments.
+%
+Being more modern, these instruments are likely to be more reliable and
+accurate than the analogue instruments in use some twenty years ago, but this cannot be validated
+to a high level of reliability by traditional FMEA.
+
+\subsection{Distributed real time systems}
+
+Distributed real time systems are control systems where
+smart sensors communicate over a communications bus to
+a master controller.
+%
+Most modern cars follow this pattern and use CANbus~\cite{canspec,can}.
+%
+For instance, the throttle pedal will be linked to a sensor to determine how
+far the pedal is pressed. This sensor will be read by a micro-controller, and the reading passed, via CANbus, to the Engine Control Unit (ECU)
+which will use that information (along with information from other sensors) to adjust the power required from the engine.
+In terms of FMEA (see figure~\ref{fig:distcon}), our reasoning path spans four interface layers of electronics to software.
+Traditional FMEA does not cater for the software/hardware interface, and here we have the additional complications
+%with the additional complications
+of the communications protocol used to transmit data, and the failure mode characteristics
+of the communications physical layer.
+
+As shown in figure~\ref{fig:distcon},
+the failure reasoning paths for a typical section of a distributed real time system mean that traditional FMEA
+is almost impossible to perform.
+%
+The base component failure mode to system failure paradigm is utterly anachronistic in the distributed real time system environment.
+
+
+\begin{figure}[h]
+ \centering
+ \includegraphics[width=400pt]{./CH3_FMEA_criticism/distcon.png}
+ % distcon.png: 1622x656 pixel, 72dpi, 57.22x23.14 cm, bb=0 0 1622 656
+ \caption{Distributed Control System FMEA reasoning path for a single failure.}
+ \label{fig:distcon}
+\end{figure}
+
+
+
+
+
+
+\section{FMEA: General Criticism and Conclusion}
+
+%\subsection{FMEA - General Criticism}
 \begin{itemize}
 \item FMEA type methodologies were designed for simple electro-mechanical systems of the 1940's to 1960's.
@@ -43,26 +152,30 @@ This problem is compounded by the fact that traditional FMEA cannot integrate so
 \item State explosion - impossible to perform rigorously
 \item Difficult to re-use previous analysis work
 \item Very Difficult to model simultaneous failures.
-
+ \item Software and hardware models are separate.
+ \item Distributed real time systems are very difficult to meaningfully analyse with FMEA.
 \end{itemize}
+FMEA is no longer fit for purpose!
 %
-\subsection{FMEA - Better Methodology - Wish List}
+%\subsection{FMEA - Better Methodology - Wish List}
 \subsection{FMEA - Better Methodology - Wish List}
+We now form a wish list, stating the features that we would want
+in an improved FMEA methodology:
 \begin{itemize}
-
- \item State explosion
- \item Rigorous (total coverage)
- \item Reasoning Traceable
- \item Re-useable
- \item Simultaneous failures
+ \item No state explosion making analysis impractical,
+ \item Rigorous (total failure coverage within {\fgs}, with all interacting components and failure modes checked),
+ \item Reasoning traceable in system models,
+ \item Re-usable, i.e. it should be possible to re-use analysis performed previously,
+ \item Able to analyse simultaneous/multiple failures,
+ \item Modular, i.e. usable in a distributed system.
 % \item
 \end{itemize}
diff --git a/submission_thesis/CH3_FMEA_criticism/distcon.dia b/submission_thesis/CH3_FMEA_criticism/distcon.dia
new file mode 100644
index 0000000..ba7ef63
Binary files /dev/null and b/submission_thesis/CH3_FMEA_criticism/distcon.dia differ