Sunday edit, Chapter 3, criticism of FMEA

leading to wish list.
2013-02-17 14:07:17 +00:00 · b2edbec678
commit b2edbec678
parent f9a8f958d4
3 changed files with 125 additions and 12 deletions
--- a/submission_thesis/CH3_FMEA_criticism/Makefile
+++ b/submission_thesis/CH3_FMEA_criticism/Makefile
@@ -3,7 +3,7 @@
 #
 # Place all .dia files here as .png targets
 #
-DIA =
+DIA = distcon.png
 doc: $(DIA)
--- a/submission_thesis/CH3_FMEA_criticism/copy.tex
+++ b/submission_thesis/CH3_FMEA_criticism/copy.tex
@@ -1,9 +1,10 @@
 \label{sec:chap3}
 \section{Historical Origins of FMEA}
 \subsection{FMEA designed for simple electro-mechanical systems}
 FMEA traces its roots to the 1940s when it was used to identify the most costly
-failures arising from car mass-production~\cite{pfmea}.
+failures arising from car mass-production~\cite{bfmea}.
 It was later modified slightly to include severity of the top level failure (FMECA~\cite{fmeca}).
 In the 1980s FMEA was extended again (FMEDA~\cite{fmeda}) to provide statistics
 for predicting failure rates.
@@ -31,11 +32,119 @@ This problem is compounded by the fact that traditional FMEA cannot integrate so
 \section{Reasoning Distance used to measure Comparison Complexity}
 Traditional FMEA cannot ensure that every failure mode of every
 component is checked against all the other components in the system that
 it may affect, due to state explosion.
 FMEA is therefore performed using heuristics to decide
 which components to check for the effects of a given component failure mode.
 We term the number of checks made for each failure mode
 against other aspects of the system the reasoning distance.
 Were we to compare the reasoning distance with the theoretical maximum (the sum of all failure
 modes in a system multiplied by the number of components in it), we would arrive at a comparison complexity figure.
 This figure would allow us to compare the maximum number of checks (i.e. rigorous analysis)
 with the number actually performed.
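 %
 As a sketch only, this comparison can be written symbolically; the notation used here
 is assumed for illustration and is not defined elsewhere in this chapter.
 Let $C$ be the set of components in the system, $fm(c)$ the set of failure modes of a component $c$,
 and $k_f$ the number of checks actually made for a failure mode $f$.
 The theoretical maximum number of checks, as described above, is then
 \begin{equation}
   RD_{max} = |C| \sum_{c \in C} |fm(c)| ,
 \end{equation}
 the reasoning distance actually covered is $RD = \sum_{f} k_f$, and one possible
 comparison complexity figure (an assumption; other definitions are possible) is the ratio $RD / RD_{max}$.
 For a hypothetical system of 10 components, each with 3 failure modes,
 $RD_{max} = 10 \times 30 = 300$; an analysis making 4 checks per failure mode
 gives $RD = 30 \times 4 = 120$, a ratio of $0.4$.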
 \section{Software and FMEA}
 Traditional FMEA deals only with electrical and mechanical components, i.e. it does not have provision for software.
 Modern control systems nearly always have a significant software/firmware element,
 and not being able to model software with current FMEA methodologies 
 is a cause for criticism~\cite[Ch.12]{safeware}. Similar difficulties in integrating mechanical and electronic/software
 failure models are discussed in~\cite{SMR:SMR580}.
-\section{FMEA - General Criticism}
+\paragraph{Current work on Software FMEA}
-\subsection{FMEA - General Criticism}
+SFMEA usually does not seek to integrate
 hardware and software models, but to perform
 FMEA on the software in isolation~\cite{procsfmea}.
 %
 Work has been performed using databases
 to track the relationships between variables 
 and system failure modes~\cite{procsfmeadb}, to %work has been performed to 
 introduce automation into the FMEA process~\cite{appswfmea} and to provide code analysis
 automation~\cite{modelsfmea}. Although the SFMEA and hardware FMEAs are performed separately,
 some schools of thought aim for Fault Tree Analysis (FTA)~\cite{nasafta,nucfta} (top-down, deductive)
 and FMEA (bottom-up, inductive)
 to be performed on the same system to provide insight into the
 software/hardware interface~\cite{embedsfmea}.
 %
 Although this
 would give a better picture of the failure mode behaviour, it
 is by no means a rigorous approach to tracing errors that may occur in hardware
 through to the top (and therefore ultimately controlling) layer of software.
 \subsection{The rise of the smart instrument}
 %% AWE --- Atomic Weapons Establishment have this problem....
 A smart instrument is defined as one that uses a micro-processor and software 
 in conjunction with its sensing electronics, rather than
 analogue electronics only.
 %
 It is termed `smart' because it has some software, or intelligence, incorporated into it.
 %
 An AVO-8 multi-meter, circa 1970, uses only analogue electronics, and we can determine
 using FMEA how component failures within it could affect readings.
 %
 A modern multi-meter will have a small dedicated micro-processor and sensing electronics, all on the same chip,
 with firmware to read the user controls, and display results on an LCD.
 %
 For quality control, many safety critical processes require regular inspections
 and measurements of physical characteristics of materials and machinery.
 %
 For highly critical systems, e.g. in the nuclear industry, the instruments used to perform these measurements must be analysed using
 FMEA, to ensure that failure modes within the instrument cannot lead to invalid measurements.
 %
 Most modern instruments now use highly integrated electronics coupled to micro-controllers, which read and filter the measurements,
 and interface to an LCD readout.
 %
 For highly critical systems, this means that traditional FMEA cannot be used to validate
 the design of these instruments.
 %
 Being more modern, these instruments are likely to be more reliable and
 accurate than the analogue instruments in use some twenty years ago, but this cannot be validated
 to a high level of reliability by traditional FMEA.
 \subsection{Distributed real time systems}
 Distributed real time systems are control systems where 
 smart sensors communicate over a communications bus to
 a master controller. 
 %
 Most modern cars follow this pattern and use CANbus~\cite{canspec,can}.
 %
 For instance, the throttle pedal will be linked to a sensor to determine how
 far the pedal is pressed. This sensor will be read by a micro-controller, and passed, via CANbus, to the Engine Control Unit (ECU)
 which will use that information (along with information from other sensors) to adjust the power required from the engine.
 In terms of FMEA, see figure~\ref{fig:distcon}, our reasoning path spans four interface layers of electronics to software.
 Traditional FMEA does not cater for the software/hardware interface, and here we have the additional complications
 %with the additional complications
 of the communications protocol used to transmit data, and the failure mode characteristics
 of the communications physical layer.
 The failure reasoning paths for a typical section of a distributed real time system
 (figure~\ref{fig:distcon}) mean that traditional FMEA
 is almost impossible to perform.
 %
 The base component failure mode to system failure paradigm is utterly anachronistic in the distributed real time system environment.
 \begin{figure}[h]
 \centering
 \includegraphics[width=400pt]{./CH3_FMEA_criticism/distcon.png}
 % distcon.png: 1622x656 pixel, 72dpi, 57.22x23.14 cm, bb=0 0 1622 656
 \caption{Distributed Control System FMEA reasoning path for a single failure.}
 \label{fig:distcon}
 \end{figure}
 \section{FMEA - General Criticism - Conclusion}
 %\subsection{FMEA - General Criticism}
 \begin{itemize}
   \item FMEA type methodologies were designed for simple electro-mechanical systems of the 1940s to 1960s.
@@ -43,26 +152,30 @@ This problem is compounded by the fact that traditional FMEA cannot integrate so
   \item State explosion - impossible to perform rigorously
   \item Difficult to re-use previous analysis work
   \item Very difficult to model simultaneous failures.
-  
+   \item Software and hardware models are separate.
   \item Distributed real time systems are very difficult to meaningfully analyse with FMEA.
 \end{itemize}
 FMEA is no longer fit for purpose!
 %
-\subsection{FMEA - Better Methodology - Wish List}
+%\subsection{FMEA - Better Methodology - Wish List}
 \subsection{FMEA - Better Methodology - Wish List}
 We now form a wish list, stating the features that we would want
 in an improved FMEA methodology:
 \begin{itemize}
-  
+    \item No state explosion making analysis impractical,  
-    \item State explosion  
+   \item  Rigorous (total failure coverage within {\fgs}, all interacting components and failure modes checked),
-   \item  Rigorous (total coverage)
+    \item Reasoning Traceable in system models,  
-    \item Reasoning Traceable  
+    \item Re-usable, i.e. it should be possible to re-use analysis performed previously,
-    \item Re-useable
+    \item It must be possible to analyse simultaneous/multiple failures, 
-  \item Simultaneous failures
+    \item Modular --- i.e. usable in a distributed system.
  % \item  
 \end{itemize}
--- a/submission_thesis/CH3_FMEA_criticism/distcon.dia
+++ b/submission_thesis/CH3_FMEA_criticism/distcon.dia