From 7eec6c8de4559399b2208d406e45f65324c8e1a4 Mon Sep 17 00:00:00 2001 From: Robin Clark Date: Fri, 15 Mar 2013 16:19:58 +0000 Subject: [PATCH] went though red-penning of CH2 and CH3 --- submission_thesis/CH2_FMEA/copy.tex | 340 ++++++++++-------- submission_thesis/CH3_FMEA_criticism/copy.tex | 117 ++++++ .../CH8_finish_appendixes/Makefile | 24 -- .../CH8_finish_appendixes/copy.tex | 11 - 4 files changed, 300 insertions(+), 192 deletions(-) delete mode 100644 submission_thesis/CH8_finish_appendixes/Makefile delete mode 100644 submission_thesis/CH8_finish_appendixes/copy.tex diff --git a/submission_thesis/CH2_FMEA/copy.tex b/submission_thesis/CH2_FMEA/copy.tex index da84c1a..026cbd0 100644 --- a/submission_thesis/CH2_FMEA/copy.tex +++ b/submission_thesis/CH2_FMEA/copy.tex @@ -23,7 +23,7 @@ for a large proportion of safety critical products sold in the European Union. The acronym FMEA can be expanded as follows: \begin{itemize} \item \textbf{F - Failures of given component,} Consider a particular component in a system; - \item \textbf{M - Failure Mode,} Look at one of the ways in which it can fail (i.e. determine a component `failure~mode'); + \item \textbf{M - Failure Mode,} Choose a component `failure~mode'; \item \textbf{E - Effects,} Determine the effects this failure mode will cause to the system we are examining; \item \textbf{A - Analysis,} Analyse how much impact this symptom will have on the environment/people/the system its-self. \end{itemize} @@ -33,7 +33,7 @@ how failures could affect some equipment in %an initial a brain-storming session %in product design, to formal submission as part of safety critical certification. -FMEA is a time intensive process. To reduce amount of work to perform, +FMEA is a manual and therefore time intensive process. To reduce the amount of work to perform, software packages~\cite{931423, 1778436820050601} and analysis strategies have been developed~\cite{incrementalfmea, automatingFMEA1281774}. 
% @@ -72,12 +72,13 @@ under given conditions. How base components could fail internally, is not of interest to an FMEA investigation. The FMEA investigator needs to know what failure behaviour a component may exhibit. %, or in other words, its modes of failure. % -A large body of literature exists which gives guidance for determining component {\fms}. +A large body of literature exists giving guidance for the determination of component {\fms}. % For this study FMD-91~\cite{fmd91} and the gas burner standard EN298~\cite{en298} are examined. %Some standards prescribe specific failure modes for generic component types. In EN298 failure modes for most generic component types are listed, or if not listed, -determined by considering all pins OPEN and all adjacent pins shorted. +are determined using a procedure where we consider +all pins open and then all adjacent pins shorted. %a procedure where failure scenarios of all pins OPEN and all adjacent pins shorted %are examined. % @@ -118,11 +119,11 @@ requires statistics for Meantime to Failure (MTTF) for all {\bc} failure modes. \section{Determining the failure modes of Components.} -The starting point for FMEA are the failure modes of {\bcs}. +The starting point in the FMEA process are the failure modes of {\bcs}. In order the define FMEA we must start with a discussion on how these failure modes are chosen. % In this section we look in detail at two common electrical components and examine how -the two sources of information define their failure mode behaviour. +the two chosen sources of {\fm} information define their failure mode behaviour. We look at the reasons why some known failure modes % are omitted, or presented in %specific but unintuitive ways. %We compare the US. military published failure mode specifications wi @@ -172,7 +173,7 @@ as shown below. \item Lead damage 1.9\% $\mapsto$ OPEN. \end{itemize} % -The main causes of drift are overloading of components. 
+We note that the main cause of resistor value drift is overloading. % of components. This is borne out in in the FMD-91~\cite{fmd91}[232] entry for a resistor network where the failure modes do not include drift. % @@ -235,7 +236,7 @@ $$ fm(R) = \{ OPEN, SHORT \} . $$ \centering \includegraphics[width=200pt]{CH5_Examples/lm258pinout.jpg} % lm258pinout.jpg: 478x348 pixel, 96dpi, 12.65x9.21 cm, bb=0 0 359 261 - \caption{Pinout for an LM358 dual OpAmp} + \caption{Pinout for an LM358 dual Op-Amp} \label{fig:lm258} \end{figure} @@ -247,10 +248,10 @@ For the purpose of example, we look at a typical op-amp designed for instrumentation and measurement, the dual packaged version of the LM358~\cite{lm358} (see figure~\ref{fig:lm258}), and use this to compare the failure mode derivations from FMD-91 and EN298. -\paragraph{ Failure Modes of an OpAmp according to FMD-91 } +\paragraph{ Failure Modes of an Op-Amp according to FMD-91 } %Literature suggests, latch up, latch down and oscillation. -For OpAmp failures modes, FMD-91\cite{fmd91}{3-116] states, +For Op-Amp failure modes, FMD-91~\cite{fmd91}[3-116] states, \begin{itemize} \item Degraded Output 50\% Low Slew rate - poor die attach \item No Operation - overstress 31.3\% @@ -260,11 +261,11 @@ For OpAmp failures modes, FMD-91\cite{fmd91}{3-116] states, Again these are mostly internal causes of failure, more of interest to the component manufacturer than a designer looking for the symptoms of failure. -We need to translate these failure causes within the OpAmp into {\fms}. +We need to translate these failure causes within the Op-Amp into {\fms}. We can look at each failure cause in turn, and map it to potential {\fms} suitable for use in FMEA investigations. -\paragraph{OpAmp failure cause: Poor Die attach} +\paragraph{Op-Amp failure cause: Poor Die attach} The symptom for this is given as a low slew rate. This means that the op-amp will not react quickly to changes on its input terminals. 
This is a failure symptom that may not be of concern in a slow responding system like an @@ -273,7 +274,7 @@ a signal may entirely be lost. We can map this failure cause to a {\fm}, and we can call it $LOW_{slew}$. \paragraph{No Operation - over stress} -Here the OP\_AMP has been damaged, and the output may be held HIGH or LOW, or may be +Here the Op-Amp has been damaged, and the output may be held HIGH or LOW, or may be effectively tri-stated, i.e. not able to drive circuitry in along the next stages of the signal path: we can call this state NOOP (no Operation). % @@ -286,18 +287,18 @@ We map this failure cause to $HIGH$ or $LOW$. \paragraph{Open $V_+$} This failure cause will mean that the minus input will have the very high gain -of the OpAmp applied to it, and the output will be forced HIGH or LOW. +of the Op-Amp applied to it, and the output will be forced HIGH or LOW. We map this failure cause to $HIGH$ or $LOW$. -\paragraph{Collecting OpAmp failure modes from FMD-91} -We can define an OpAmp, under FMD-91 definitions to have the following {\fms}. +\paragraph{Collecting Op-Amp failure modes from FMD-91} +We can define an Op-Amp, under FMD-91 definitions, to have the following {\fms}. \begin{equation} \label{eqn:opampfms} fm(OpAmp) = \{ HIGH, LOW, NOOP, LOW_{slew} \} \end{equation} -\paragraph{Failure Modes of an OpAmp according to EN298} +\paragraph{Failure Modes of an Op-Amp according to EN298} EN298 does not specifically define OP\_AMPS failure modes; these can be determined by following a procedure for `integrated~circuits' outlined in @@ -377,7 +378,7 @@ that we got from FMD-91, listed in equation~\ref{eqn:opampfms}. %\clearpage -\subsubsection{Failure modes of an OpAmp} +\subsubsection{Failure modes of an Op-Amp} \label{sec:opamp_fms} For the purpose of the examples to follow, the op-amp will @@ -397,7 +398,7 @@ component {\fms} in FMEA or FMMD and require interpretation. -%For our OpAmp example could have come up with different symptoms for both sides. 
Cannot predict the effect of internal errors, for instance ($LOW_{slew}$) +%For our Op-Amp example could have come up with different symptoms for both sides. Cannot predict the effect of internal errors, for instance ($LOW_{slew}$) %is missing from the EN298 failure modes set. @@ -463,11 +464,11 @@ component {\fms} in FMEA or FMMD and require interpretation. -FMEA is a procedure which starts with the failure modes of the low level components of a system, an example +FMEA is a bottom-up procedure which starts with the failure modes of the low level components of a system, an example analysis will serve to demonstrate it in practise. - \paragraph{ FMEA Example: Milli-volt reader} -Example: Let us consider a system, in this case a milli-volt reader, consisting + \paragraph{ FMEA Example: Milli-volt reader.} +Example: Let us consider a system, in this case a simple milli-volt reader, consisting of instrumentation amplifiers connected to a micro-processor that reports its readings via RS-232. \begin{figure} @@ -585,6 +586,7 @@ and our equipment can react by raising an alarm or compensating for the resultin % Some failure modes may cause undetectable failures, for instance a component that causes a measured reading to change could have adverse consequences yet not be flagged as a failure. +% This type of failure would not be flagged as a failure by the system, because it has no way of knowing the reading is invalid. % @@ -624,6 +626,9 @@ methodologies such as FTA~\cite{nucfta,nasafta} are backward searches. Forward search types of fault analysis is said to be `deductive'. Backward (or bottom-up) searches are said to be inductive (i.e. the results of failure are induced). + + + \paragraph{Reasoning distance.} \label{reasoningdistance} A reasoning distance is the number of stages of logic and reasoning @@ -641,6 +646,9 @@ in that system. If the milli-volt reader had say 100 components, with three failure modes each, this would give a reasoning distance of 3 * 100 * 99. 
+The discussion on reasoning distance provides us with a metric to examine +the state explosion problems associated with forward search failure investigation +methodologies. %.... general concept... simple ideas about how complex a %failure analysis is the more modules and components are involved @@ -652,7 +660,12 @@ would give a reasoning distance of 3 * 100 * 99. FMEA for a safety critical certification~\cite{en298,en61508} will have to be applied to all known failure modes of all components within a system. +FMEA does not define or specify the scope of the investigation of each component failure mode. +Should we follow the signal path, and all components we encounter along it, or should the scope be wider? +If we were to examine the effect of a component {\fm} against all other components +in a system, this could be said to be an exhaustive analysis. +\paragraph{Exhaustive Single Failure FMEA.} To perform FMEA exhaustively (i.e. to examine every possible interaction of a failure mode with all other components in a system). Or in other words, ---we would need to look at all possible failure scenarios. @@ -691,7 +704,7 @@ For our theoretical 100 components with 3 failure modes each example, this is $100*99*98*3=2,910,600$ failure mode scenarios. -\paragraph{Reliance of experts for meaningful FMEA Analysis.} +\paragraph{Reliance on experts for meaningful FMEA Analysis.} Current FMEA methodologies cannot consider---for the reason of state explosion---an exhaustive approach. We define exhaustive FMEA ({\XFMEA}) as examining the effect of every component failure mode against the remaining components in the system under investigation. @@ -700,7 +713,9 @@ Because we cannot perform XFMEA, we rely on experts in the system under investigation to perform a meaningful FMEA analysis. % -In practise these experts have to select the areas they see as most critical for detailed FMEA analysis. 
+In practice these experts have to select the areas they see as most critical for detailed FMEA analysis: +it is usually impossible to perform a detailed level of analysis on all component {\fms} +on anything but a trivial system. \subsection{Component Tolerance} @@ -714,10 +729,10 @@ is given in section~\ref{sec:resistortolerance}. \paragraph{Five main Variants of FMEA} \begin{itemize} - \item \textbf{PFMEA - Production} Car Manufacture etc - \item \textbf{FMECA - Criticality} Military/Space - \item \textbf{FMEDA - Statistical safety} EN61508/IOC1508 Safety Integrity Levels - \item \textbf{DFMEA - Design or static/theoretical} EN298/EN230/UL1998 + \item \textbf{PFMEA - Production} Emphasis on cost reduction and product improvement; + \item \textbf{FMECA - Criticality} Emphasis on minimising the effect of critical systems failing; % Military/Space + \item \textbf{FMEDA - Statistical safety} Statistical analysis giving Safety Integrity Levels; + \item \textbf{DFMEA - Design or static/theoretical} Approval of safety critical systems using FMEA and single or double failure prevention;% EN298/EN230/UL1998 \item \textbf{SFMEA - Software FMEA --- only used in highly critical systems at present} \end{itemize} @@ -766,11 +781,16 @@ will return most cost benefit. % \caption{A10 Thunderbolt} % \label{fig:f16missile} % \end{figure} -Emphasis on determining criticality rather than the cost of system failures. -Applies some Bayesian statistics (probabilities of component failures and those +FMECA places emphasis on determining criticality rather than the cost of system failures. +% +It applies some Bayesian statistics (probabilities of component failures thereby causing given system level failures). +% +It also considers the probability of the system failure causing a critical event. +% Applying Bayesian statistics to failure analysis, suffers the -problem that correlation does not imply causation~\cite{bayesfrequentist}. 
+problem that correlation does not imply causation~\cite{bayesfrequentist}. +% However, correlation is evidence for causation, and maybe the only evidence to hand and this is the justification behind its use. A history of the usage and development of FMECA may be found in~\cite{FMECAresearch}. @@ -829,7 +849,7 @@ for a project manager. -\subsection{ FMEDA - Failure Modes Effects and Diagnostic Analysis} +%\subsection{ FMEDA - Failure Modes Effects and Diagnostic Analysis} % \begin{figure} % \centering % \includegraphics[width=200pt]{./SIL.png} @@ -859,16 +879,18 @@ type standards (EN61508/IOC5108). It provides a statistical overall level of safety and allows diagnostic mitigation for self checking etc. It provides guidelines for the design and architecture -of computer/software systems for the four levels of -safety Integrity. +of computer/software systems for four levels of +safety Integrity, referred to as Safety Integrity Levels (SIL). %For Hardware % FMEDA does force the user to consider all hardware components in a system -by requiring that a MTTF value is assigned for each failure~mode; +by requiring that a MTTF value is assigned for each base component failure~mode; the MTTF may be statistically mitigated (improved) if it can be shown that self-checking will detect failure modes. -For software it provides procedural quality guidelines and constraints (such as forbidding certain -programming languages and/or features. +% +EN61508 in relation to software provides procedural quality guidelines and constraints (such as forbidding certain +programming languages and/or features): it does not provide a means to trace failure mode effects in software +or across the software/hardware interface. 
%\subsection{ FMEDA - Failure Modes Effects and Diagnostic Analysis} @@ -988,11 +1010,14 @@ It has a simple final result, a Safety Integrity Level (SIL) from 1 to 4 (wher \caption{FMEA Meeting} \label{fig:tech_meeting} \end{figure} -Static FMEA, Design FMEA, Approvals FMEA +%Static FMEA, Design FMEA, Approvals FMEA Experts from Approval House and Equipment Manufacturer discuss selected component failure modes judged to be in critical sections of the product. +% +This could be considered as a design check method, deliberately +looking for weaknesses at a theoretical level. @@ -1010,127 +1035,128 @@ judged to be in critical sections of the product. % \end{figure} \begin{itemize} - \item Impossible to look at all component failures let alone apply FMEA rigorously. + \item Impossible to look at all component failures let alone apply FMEA exhaustively/rigorously. \item In practice, failure scenarios for critical sections are contested, and either justified or extra safety measures implemented. - \item Often Meeting notes or minutes only. Unusual for detailed arguments to be documented. + \item Often Meeting notes or minutes only. Unusual for detailed technical arguments to be documented. \end{itemize} - -\section{Conclusions on current FMEA Methodologies} - -%% FOCUS -The focus of this chapter %literature review -is to establish the current practice and applications -of FMEA. -%, and to examine its strengths and weaknesses. -%% GOAL -Its -goal is to identify central issues and to criticise and assess the current -FMEA methodologies. -%% PERSPECTIVE -The perspective of the author, is as a practitioner of static failure mode analysis techniques -concerning approval of product -to European safety standards, both the prescriptive~\cite{en298,en230} and statistical~\cite{en61508}. -A second perspective is that of a software engineer trained to use formal methods. 
-Examining FMEA methodologies for mathematical properties, influenced by -formal methods applied to software, should provide a perspective not traditionally considered. -%% COVERAGE -The literature reviewed, has been restricted to published books, European safety standards (as examples -of current safety measures applied), and traditional research, from journal and conference papers. -%% ORGANISATION -The review is organised by concept, that is, FMEA can be applied to hardware, software, software~interfacing and -to multiple failure scenarios etc. Methodologies related to FMEA are briefly covered for the sake of context. -%% AUDIENCE -% Well duh! PhD supervisors and examiners.... - -% \subsection{Related Methodologies} -% FTA --- HAZOP --- ALARP --- Event Tree Analysis --- bow tie concept -% \subsection{Hardware FMEA (HFMEA)} -% \subsection{Multiple Failure scenarios and FMEA} -% \subsection{Software FMEA (SFMEA)} - -\paragraph{Current work on Software FMEA} - -SFMEA usually does not seek to integrate -hardware and software models, but to perform -FMEA on the software in isolation~\cite{procsfmea}. -% -Work has been performed using databases -to track the relationships between variables -and system failure modes~\cite{procsfmeadb}, to %work has been performed to -introduce automation into the FMEA process~\cite{appswfmea} and to provide code analysis -automation~\cite{modelsfmea}. Although the SFMEA and hardware FMEAs are performed separately, -some schools of thought aim for Fault Tree Analysis (FTA)~\cite{nasafta,nucfta} (top down - deductive) -and FMEA (bottom-up inductive) -to be performed on the same system to provide insight into the -software hardware/interface~\cite{embedsfmea}. -% -Although this -would give a better picture of the failure mode behaviour, it -is by no means a rigorous approach to tracing errors that may occur in hardware -through to the top (and therefore ultimately controlling) layer of software. 
- -\paragraph{Current FMEA techniques are not suitable for software} - -The main FMEA methodologies are all based on the concept of taking -base component {\fms}, and translating them into system level events/failures~\cite{sfmea,sfmeaa}. -% -In a complicated system, mapping a component failure mode to a system level failure -will mean a long reasoning distance; that is to say the actions of the -failed component will have to be traced through -several sub-systems, gauging its effects with and on other components. -% -With software at the higher levels of these sub-systems, -we have yet another layer of complication. -% -%In order to integrate software, %in a meaningful way -%we need to re-think the -%FMEA concept of simply mapping a base component failure to a system level event. -% -SFMEA regards, in place of hardware components, the variables used by the programs to be their equivalent~\cite{procsfmea}. -The failure modes of these variables, are that they could become erroneously over-written, -calculated incorrectly (due to a mistake by the programmer, or a fault in the micro-processor on which it is running), or -external influences such as -ionising radiation causing bits to be erroneously altered. - - -\paragraph{FMEA and Modularity} -From the 1940's onwards, software has evolved from a simple procedural languages (i.e. assembly language/Fortran~\cite{f77} call return) -to structured programming ( C~\cite{DBLP:books/ph/KernighanR88}, pascal etc) and then to object oriented models (Java C++...). -FMEA has undergone no such evolution. -% -In a world where sensor systems, often including embedded software components, are brought in to -create complex systems, FMEA still follows a rigid {\bc} {\fm} to system level error model, -that is only suitable for simple electro mechanical systems. 
- - - -% - -% -% MAYBE MOVE THIS TO CH3, FMEA CRITICISM -% 30JAN2013 -% - -\subsection{Where FMEA is now.} -FMEA useful tool for basic safety --- provides statistics on safety where field data impractical --- -very good with single failure modes linked to top level events. -FMEA has become part of the safety critical and safety certification industries. -% -SFMEA is in its infancy, and there are corresponding gaps in -certification for software, EN61508~\cite{en61508}, recommends hardware redundancy architectures in conjunction -with FMEDA for hardware: for software it recommends language constraints and quality procedures -but no inductive fault finding technique. -% -FMEA has adapted from a cost saving exercise for mass produced items~\cite{bfmea,generic_automotive_fmea_6034891}, to incorporating statistical techniques -(FMECA) to allowing for self diagnostic mitigation (FMEDA). -% -However, it is still based on the concept of single component failures mapped to top~level/system~failures. -All these FMEA based methodologies have the following short comings: -\begin{itemize} - \item Impossible to integrate Software and hardware models, - \item State explosion problem exacerbated by increasing complexity due to density of modern electronics, - \item Impossibility to consider all multiple component failure modes~\cite{FMEAmultiple653556} -\end{itemize} +% MOVED TO CH3: 15MAR2013 +% +% \section{Conclusions on current FMEA Methodologies} +% +% %% FOCUS +% The focus of this chapter %literature review +% is to establish the current practice and applications +% of FMEA. +% %, and to examine its strengths and weaknesses. +% %% GOAL +% Its +% goal is to identify central issues and to criticise and assess the current +% FMEA methodologies. 
+% %% PERSPECTIVE +% The perspective of the author, is as a practitioner of static failure mode analysis techniques +% concerning approval of product +% to European safety standards, both the prescriptive~\cite{en298,en230} and statistical~\cite{en61508}. +% A second perspective is that of a software engineer trained to use formal methods. +% Examining FMEA methodologies for mathematical properties, influenced by +% formal methods applied to software, should provide a perspective not traditionally considered. +% %% COVERAGE +% The literature reviewed, has been restricted to published books, European safety standards (as examples +% of current safety measures applied), and traditional research, from journal and conference papers. +% %% ORGANISATION +% The review is organised by concept, that is, FMEA can be applied to hardware, software, software~interfacing and +% to multiple failure scenarios etc. Methodologies related to FMEA are briefly covered for the sake of context. +% %% AUDIENCE +% % Well duh! PhD supervisors and examiners.... +% +% % \subsection{Related Methodologies} +% % FTA --- HAZOP --- ALARP --- Event Tree Analysis --- bow tie concept +% % \subsection{Hardware FMEA (HFMEA)} +% % \subsection{Multiple Failure scenarios and FMEA} +% % \subsection{Software FMEA (SFMEA)} +% +% \paragraph{Current work on Software FMEA} +% +% SFMEA usually does not seek to integrate +% hardware and software models, but to perform +% FMEA on the software in isolation~\cite{procsfmea}. +% % +% Work has been performed using databases +% to track the relationships between variables +% and system failure modes~\cite{procsfmeadb}, to %work has been performed to +% introduce automation into the FMEA process~\cite{appswfmea} and to provide code analysis +% automation~\cite{modelsfmea}. 
Although the SFMEA and hardware FMEAs are performed separately, +% some schools of thought aim for Fault Tree Analysis (FTA)~\cite{nasafta,nucfta} (top down - deductive) +% and FMEA (bottom-up inductive) +% to be performed on the same system to provide insight into the +% software hardware/interface~\cite{embedsfmea}. +% % +% Although this +% would give a better picture of the failure mode behaviour, it +% is by no means a rigorous approach to tracing errors that may occur in hardware +% through to the top (and therefore ultimately controlling) layer of software. +% +% \paragraph{Current FMEA techniques are not suitable for software} +% +% The main FMEA methodologies are all based on the concept of taking +% base component {\fms}, and translating them into system level events/failures~\cite{sfmea,sfmeaa}. +% % +% In a complicated system, mapping a component failure mode to a system level failure +% will mean a long reasoning distance; that is to say the actions of the +% failed component will have to be traced through +% several sub-systems, gauging its effects with and on other components. +% % +% With software at the higher levels of these sub-systems, +% we have yet another layer of complication. +% % +% %In order to integrate software, %in a meaningful way +% %we need to re-think the +% %FMEA concept of simply mapping a base component failure to a system level event. +% % +% SFMEA regards, in place of hardware components, the variables used by the programs to be their equivalent~\cite{procsfmea}. +% The failure modes of these variables, are that they could become erroneously over-written, +% calculated incorrectly (due to a mistake by the programmer, or a fault in the micro-processor on which it is running), or +% external influences such as +% ionising radiation causing bits to be erroneously altered. +% +% +% \paragraph{FMEA and Modularity} +% From the 1940's onwards, software has evolved from a simple procedural languages (i.e. 
assembly language/Fortran~\cite{f77} call return) +% to structured programming ( C~\cite{DBLP:books/ph/KernighanR88}, pascal etc) and then to object oriented models (Java C++...). +% FMEA has undergone no such evolution. +% % +% In a world where sensor systems, often including embedded software components, are brought in to +% create complex systems, FMEA still follows a rigid {\bc} {\fm} to system level error model, +% that is only suitable for simple electro mechanical systems. +% +% +% +% % +% +% % +% % MAYBE MOVE THIS TO CH3, FMEA CRITICISM +% % 30JAN2013 +% % +% +% \subsection{Where FMEA is now.} +% FMEA useful tool for basic safety --- provides statistics on safety where field data impractical --- +% very good with single failure modes linked to top level events. +% FMEA has become part of the safety critical and safety certification industries. +% % +% SFMEA is in its infancy, and there are corresponding gaps in +% certification for software, EN61508~\cite{en61508}, recommends hardware redundancy architectures in conjunction +% with FMEDA for hardware: for software it recommends language constraints and quality procedures +% but no inductive fault finding technique. +% % +% FMEA has adapted from a cost saving exercise for mass produced items~\cite{bfmea,generic_automotive_fmea_6034891}, to incorporating statistical techniques +% (FMECA) to allowing for self diagnostic mitigation (FMEDA). +% % +% However, it is still based on the concept of single component failures mapped to top~level/system~failures. 
+% All these FMEA based methodologies have the following short comings: +% \begin{itemize} +% \item Impossible to integrate Software and hardware models, +% \item State explosion problem exacerbated by increasing complexity due to density of modern electronics, +% \item Impossibility to consider all multiple component failure modes~\cite{FMEAmultiple653556} +% \end{itemize} diff --git a/submission_thesis/CH3_FMEA_criticism/copy.tex b/submission_thesis/CH3_FMEA_criticism/copy.tex index 0ff5553..dcdd440 100644 --- a/submission_thesis/CH3_FMEA_criticism/copy.tex +++ b/submission_thesis/CH3_FMEA_criticism/copy.tex @@ -10,6 +10,7 @@ In the 1980s FMEA was extended again (FMEDA~\cite{fmeda}) to provide statistics for predicting failure rates. However a typical entry in each of the above methodologies, starts with a particular component failure mode and associates it with a system---or top level---failure symptom. +This means that we have one analysis case per component failure mode for all the components in the system under investigation. This analysis philosophy has not changed since FMEA was first used. @@ -187,6 +188,122 @@ utterly anachronistic in the distributed real time system environment. FMEA is no longer fit for purpose! % +\section{Conclusions on current FMEA Methodologies} + +%% FOCUS +The focus of this chapter %literature review +is to establish the current practice and applications +of FMEA. +%, and to examine its strengths and weaknesses. +%% GOAL +Its +goal is to identify central issues and to criticise and assess the current +FMEA methodologies. +%% PERSPECTIVE +The perspective of the author, is as a practitioner of static failure mode analysis techniques +concerning approval of product +to European safety standards, both the prescriptive~\cite{en298,en230} and statistical~\cite{en61508}. +A second perspective is that of a software engineer trained to use formal methods. 
+Examining FMEA methodologies for mathematical properties, influenced by +formal methods applied to software, should provide a perspective not traditionally considered. +%% COVERAGE +The literature reviewed has been restricted to published books, European safety standards (as examples +of current safety measures applied), and traditional research from journal and conference papers. +%% ORGANISATION +The review is organised by concept, that is, FMEA can be applied to hardware, software, software~interfacing and +to multiple failure scenarios etc. Methodologies related to FMEA are briefly covered for the sake of context. +%% AUDIENCE +% Well duh! PhD supervisors and examiners.... + +% \subsection{Related Methodologies} +% FTA --- HAZOP --- ALARP --- Event Tree Analysis --- bow tie concept +% \subsection{Hardware FMEA (HFMEA)} +% \subsection{Multiple Failure scenarios and FMEA} +% \subsection{Software FMEA (SFMEA)} + +\paragraph{Current work on Software FMEA} + +SFMEA usually does not seek to integrate +hardware and software models, but to perform +FMEA on the software in isolation~\cite{procsfmea}. +% +Work has been performed using databases +to track the relationships between variables +and system failure modes~\cite{procsfmeadb}, to %work has been performed to +introduce automation into the FMEA process~\cite{appswfmea} and to provide code analysis +automation~\cite{modelsfmea}. Although the SFMEA and hardware FMEAs are performed separately, +some schools of thought aim for Fault Tree Analysis (FTA)~\cite{nasafta,nucfta} (top-down, deductive) +and FMEA (bottom-up, inductive) +to be performed on the same system to provide insight into the +software/hardware interface~\cite{embedsfmea}. +% +Although this +would give a better picture of the failure mode behaviour, it +is by no means a rigorous approach to tracing errors that may occur in hardware +through to the top (and therefore ultimately controlling) layer of software. 
+ +\paragraph{Current FMEA techniques are not suitable for software} + +The main FMEA methodologies are all based on the concept of taking +base component {\fms}, and translating them into system level events/failures~\cite{sfmea,sfmeaa}. +% +In a complicated system, mapping a component failure mode to a system level failure +will mean a long reasoning distance; that is to say, the actions of the +failed component will have to be traced through +several sub-systems, gauging its effects with and on other components. +% +With software at the higher levels of these sub-systems, +we have yet another layer of complication. +% +%In order to integrate software, %in a meaningful way +%we need to re-think the +%FMEA concept of simply mapping a base component failure to a system level event. +% +SFMEA treats the variables used by programs as the equivalent of hardware components~\cite{procsfmea}. +The failure modes of these variables are that they could be erroneously over-written, +calculated incorrectly (due to a mistake by the programmer, or a fault in the micro-processor on which the program is running), or +corrupted by external influences such as +ionising radiation causing bits to be erroneously altered. + + +\paragraph{FMEA and Modularity} +From the 1940s onwards, software has evolved from simple procedural languages (e.g. assembly language/Fortran~\cite{f77} call return) +to structured programming (C~\cite{DBLP:books/ph/KernighanR88}, Pascal etc.) and then to object oriented models (Java, C++ etc.). +FMEA has undergone no such evolution. +% +In a world where sensor systems, often including embedded software components, are brought in to +create complex systems, FMEA still follows a rigid {\bc} {\fm} to system level error model +that is only suitable for simple electro-mechanical systems. 
+ + + +% + +% +% MAYBE MOVE THIS TO CH3, FMEA CRITICISM +% 30JAN2013 +% + +\subsection{Where FMEA is now.} +FMEA is a useful tool for basic safety: it provides statistics on safety where field data is impractical, and is +very good with single failure modes linked to top level events. +FMEA has become part of the safety critical and safety certification industries. +% +SFMEA is in its infancy, and there are corresponding gaps in +certification for software. EN61508~\cite{en61508} recommends hardware redundancy architectures in conjunction +with FMEDA for hardware; for software it recommends language constraints and quality procedures +but no inductive fault finding technique. +% +FMEA has evolved from a cost saving exercise for mass produced items~\cite{bfmea,generic_automotive_fmea_6034891}, through the incorporation of statistical techniques +(FMECA), to allowing for self diagnostic mitigation (FMEDA). +% +However, it is still based on the concept of single component failures mapped to top~level/system~failures. 
+All these FMEA based methodologies have the following shortcomings: +\begin{itemize} + \item It is impossible to integrate software and hardware models. + \item The state explosion problem is exacerbated by the increasing complexity due to the density of modern electronics. + \item It is impossible to consider all multiple component failure modes~\cite{FMEAmultiple653556}. +\end{itemize} diff --git a/submission_thesis/CH8_finish_appendixes/Makefile b/submission_thesis/CH8_finish_appendixes/Makefile deleted file mode 100644 index 5743eec..0000000 --- a/submission_thesis/CH8_finish_appendixes/Makefile +++ /dev/null @@ -1,24 +0,0 @@ - -# Makefile to create all graphics file etc -# -# Place all .dia files here as .png targets -# -DIA = - - -doc: $(DIA) - -# -#bib: -# -# bibtex HR230003_combined_o2_co_sensor -# - - -%.png:%.dia - dia -t png $< - - - -copy: - echo $@ diff --git a/submission_thesis/CH8_finish_appendixes/copy.tex b/submission_thesis/CH8_finish_appendixes/copy.tex deleted file mode 100644 index 4e0d10d..0000000 --- a/submission_thesis/CH8_finish_appendixes/copy.tex +++ /dev/null @@ -1,11 +0,0 @@ -\label{sec:chap8} - -%% -%% CH8 finishing up and appendixes -%% - -\printglossary - - -\addcontentsline{toc}{chapter}{Glossary} -