Sunday edit, Chapter 3, criticism of FMEA

leading to wish list.
2013-02-17 14:07:17 +00:00 · 2013-02-17 14:07:17 +00:00 · b2edbec678
commit b2edbec678
parent f9a8f958d4
3 changed files with 125 additions and 12 deletions
--- a/submission_thesis/CH3_FMEA_criticism/Makefile
+++ b/submission_thesis/CH3_FMEA_criticism/Makefile
@@ -3,7 +3,7 @@
 #
 # Place all .dia files here as .png targets
 #
-DIA =
+DIA = distcon.png


 doc: $(DIA)
--- a/submission_thesis/CH3_FMEA_criticism/copy.tex
+++ b/submission_thesis/CH3_FMEA_criticism/copy.tex
@@ -1,9 +1,10 @@
 \label{sec:chap3}

 \section{Historical Origins of FMEA}
+
 \subsection{FMEA designed for simple electro-mechanical systems}
 FMEA traces its roots to the 1940s when it was used to identify the most costly
-failures arising from car mass-production~\cite{pfmea}.
+failures arising from car mass-production~\cite{bfmea}.
 It was later modified slightly to include severity of the top level failure (FMECA~\cite{fmeca}).
 In the 1980s FMEA was extended again (FMEDA~\cite{fmeda}) to provide statistics
 for predicting failure rates.
@@ -31,11 +32,119 @@ This problem is compounded by the fact that traditional FMEA cannot integrate so

 \section{Reasoning Distance used to measure Comparison Complexity}

+Because of state explosion, traditional FMEA cannot ensure that every failure mode
+of every component is checked against all the other components in the system
+that it may affect.
+In practice, FMEA therefore relies on heuristics to decide
+which components are checked for the effects of a given component failure mode.
+We term the number of checks made for each failure mode
+against aspects of the system the reasoning distance.
+Comparing the reasoning distance with the theoretical maximum (the total number of failure
+modes in the system multiplied by the number of components in it) gives a comparison complexity figure.
+This figure lets us compare the number of checks required for a rigorous analysis
+with the number actually performed.
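+
+As a sketch of how this figure could be written down (the symbols $C$, $fm(c)$, $RD_{max}$, $RD$ and $CC$
+are notation assumed here for illustration only), let $C$ be the set of components in the system
+and $fm(c)$ the set of failure modes of component $c$. The theoretical maximum described above is then
+\begin{equation}
+ RD_{max} = |C| \cdot \sum_{c \in C} |fm(c)| ,
+\end{equation}
+and, writing $RD$ for the number of checks actually performed, one possible
+comparison complexity figure (the choice of a ratio is an assumption) is
+\begin{equation}
+ CC = \frac{RD}{RD_{max}} .
+\end{equation}
+For example, a system of three components with two failure modes each has six failure modes in total,
+giving $RD_{max} = 3 \times 6 = 18$.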
+
+\section{Software and FMEA}
+
+Traditional FMEA deals only with electrical and mechanical components, i.e. it has no provision for software.
+Modern control systems nearly always have a significant software/firmware element,
+and the inability to model software with current FMEA methodologies
+is a cause for criticism~\cite[Ch.12]{safeware}. Similar difficulties in integrating mechanical and electronic/software
+failure models are discussed in~\cite{SMR:SMR580}.


-\section{FMEA - General Criticism}
+\paragraph{Current work on Software FMEA}

-\subsection{FMEA - General Criticism}
+SFMEA usually does not seek to integrate
+hardware and software models, but to perform
+FMEA on the software in isolation~\cite{procsfmea}.
+%
+Work has been performed using databases
+to track the relationships between variables 
+and system failure modes~\cite{procsfmeadb}, to %work has been performed to 
+introduce automation into the FMEA process~\cite{appswfmea} and to provide code analysis
+automation~\cite{modelsfmea}. Although SFMEA and hardware FMEA are performed separately,
+some schools of thought aim for Fault Tree Analysis (FTA)~\cite{nasafta,nucfta} (top-down, deductive)
+and FMEA (bottom-up, inductive)
+to be performed on the same system to provide insight into the
+software/hardware interface~\cite{embedsfmea}.
+%
+Although this
+would give a better picture of the failure mode behaviour, it
+is by no means a rigorous approach to tracing errors that may occur in hardware
+through to the top (and therefore ultimately controlling) layer of software.
+
+
+\subsection{The rise of the smart instrument}
+%% AWE --- Atomic Weapons Establishment have this problem....
+A smart instrument is defined as one that uses a micro-processor and software 
+in conjunction with its sensing electronics, rather than
+analogue electronics only.
+%
+It is termed `smart' because it has some software, or intelligence, incorporated into it.
+%
+An AVO-8 multi-meter, circa 1970, uses only analogue electronics, and we can determine
+using FMEA how component failures within it could affect readings.
+%
+A modern multi-meter will have a small dedicated micro-processor and sensing electronics, all on the same chip,
+with firmware to read the user controls, and display results on an LCD.
+%
+For quality control, many safety critical processes require regular inspections
+and measurements of physical characteristics of materials and machinery.
+%
+For highly critical systems, e.g. in the nuclear industry, the instruments used to perform these measurements must be analysed using
+FMEA, to ensure that failure modes within the instrument cannot lead to invalid measurements.
+%
+Most modern instruments now use highly integrated electronics coupled to micro-controllers, which read and filter the measurements,
+and interface to an LCD readout.
+%
+For highly critical systems, this means that traditional FMEA cannot be used to validate
+the design of such instruments.
+%
+Being more modern, these instruments are likely to be more reliable and
+accurate than the analogue instruments in use some twenty years ago, but this cannot be validated
+to a high level of reliability by traditional FMEA.
+
+\subsection{Distributed real time systems}
+
+Distributed real time systems are control systems where 
+smart sensors communicate over a communications bus to
+a master controller. 
+%
+Most modern cars follow this pattern and use CANbus~\cite{canspec,can}.
+%
+For instance, the throttle pedal will be linked to a sensor to determine how
+far the pedal is pressed. This sensor will be read by a micro-controller, and passed, via CANbus, to the Engine Control Unit (ECU)
+which will use that information (along with information from other sensors) to adjust the power required from the engine.
+In terms of FMEA (see figure~\ref{fig:distcon}), our reasoning path spans four interface layers of electronics to software.
+Traditional FMEA does not cater for the software/hardware interface, and here we have the additional complications
+%with the additional complications
+of the communications protocol used to transmit data, and the failure mode characteristics
+of the communications physical layer.
+
+The failure reasoning paths for a typical section of a distributed real time system (figure~\ref{fig:distcon}) mean that traditional FMEA
+is almost impossible to perform.
+%
+The base component failure mode to system failure paradigm is utterly anachronistic in the distributed real time system environment.
+
+
+\begin{figure}[h]
+ \centering
+ \includegraphics[width=400pt]{./CH3_FMEA_criticism/distcon.png}
+ % distcon.png: 1622x656 pixel, 72dpi, 57.22x23.14 cm, bb=0 0 1622 656
+ \caption{Distributed Control System FMEA reasoning path for a single failure.}
+ \label{fig:distcon}
+\end{figure}
+
+
+
+
+
+
+\section{FMEA --- General Criticism --- Conclusion}
+
+%\subsection{FMEA - General Criticism}

 \begin{itemize}
   \item FMEA type methodologies were designed for simple electro-mechanical systems of the 1940s to 1960s.
@@ -43,26 +152,30 @@ This problem is compounded by the fact that traditional FMEA cannot integrate so
   \item State explosion - impossible to perform rigorously
   \item Difficult to re-use previous analysis work
   \item Very difficult to model simultaneous failures.
-  
+   \item Software and hardware models are separate.
+   \item Distributed real time systems are very difficult to meaningfully analyse with FMEA.
 \end{itemize}

+FMEA is no longer fit for purpose!
 %




-\subsection{FMEA - Better Methodology - Wish List}
+%\subsection{FMEA - Better Methodology - Wish List}
 

 \subsection{FMEA - Better Methodology - Wish List}

+We now form a wish list, stating the features that we would want
+in an improved FMEA methodology:
 \begin{itemize}
-  
-    \item State explosion  
-   \item  Rigorous (total coverage)
-    \item Reasoning Traceable  
-    \item Re-useable
-  \item Simultaneous failures
+    \item No state explosion making analysis impractical,
+    \item Rigorous (total failure coverage within {\fgs}; all interacting components and failure modes checked),
+    \item Reasoning traceable in system models,
+    \item Re-usable, i.e. it should be possible to re-use analysis performed previously,
+    \item It must be possible to analyse simultaneous/multiple failures,
+    \item Modular --- i.e. usable in a distributed system.
  % \item  
 \end{itemize}

--- a/submission_thesis/CH3_FMEA_criticism/distcon.dia
+++ b/submission_thesis/CH3_FMEA_criticism/distcon.dia