Good Friday morning

2013-03-29 15:38:36 +00:00 · a7aa5e3854
commit a7aa5e3854
parent 0ea57ac50c
2 changed files with 236 additions and 155 deletions
--- a/submission_thesis/CH2_FMEA/copy.tex
+++ b/submission_thesis/CH2_FMEA/copy.tex
@ -3,20 +3,32 @@
 \label{sec:chap2}

 The generic and statistical European Safety Standard, EN61508:6\cite{en61508}[B.6.6]
-describes Failure Mode Effect Analysis (FMEA) as: 
+describes FMEA as: 
 \begin{quotation}
-"To analyse a system design, by examining all possible sources of failure
+``To analyse a system design, by examining all possible sources of failure
 of a system's components and determining the effects of these failures
-on the behaviour and safety of the system."
+on the behaviour and safety of the system.''
 \end{quotation}.

+\section*{Introduction}
+This chapter introduces Failure Mode Effect Analysis (FMEA).
+%It begins with a simple example to demonstrate the basic concept of FMEA
+%and then 
+It starts by looking at how we determine the failure modes associated with components.
+Two common electrical components, the resistor and the operational amplifier,
+are examined in the context of two sources of information that define failure modes.
+A simple example of an FMEA is then given.
+The four main variants are then described and finally we conclude by describing concepts
+that underlie the usage and philosophy of FMEA.


-\section{FMEA}
+
+
+\section{FMEA Basic concept.}
 \label{basicfmea}
 %\subsection{FMEA}
 %\tableofcontents[currentsection]
-\paragraph{FMEA basic concept.}
+%\paragraph{FMEA basic concept.}

 FMEA~\cite{safeware}[pp.341-344] is widely used, and proof of its use is a mandatory legal requirement
 for a large proportion of safety critical products sold in the European Union.
@ -62,15 +74,16 @@ the effectiveness of FMEA.
 In order to apply any form of FMEA  we need to know the ways in which 
 the components we are using can fail.
 %
-A good introduction to hardware and software failure modes may be found in~\cite{sccs}[pp.114-124].
+\footnote{A good introduction to hardware and software failure modes may be found in~\cite{sccs}[pp.114-124].}
 %
 Typically when choosing components for a design, we look at manufacturers' data sheets 
 which describe functionality, physical dimensions
 environmental ranges, tolerances and can indicate how a component may fail/misbehave
 under given conditions.
 %
-How base components could fail internally, is not of interest to an FMEA investigation.
-The FMEA investigator needs to know what failure behaviour a component may exhibit. %, or in other words, its modes of failure.
+How %base 
+components could fail internally is not of interest to an FMEA investigation.
+The FMEA investigator needs to know what failure behaviour a component could exhibit. %, or in other words, its modes of failure.
 %
 A large body of literature exists giving guidance for the determination of  component {\fms}.
 %
@ -90,7 +103,7 @@ FMD-91 entries include general descriptions of  internal failures alongside {\fm
 %
 FMD-91 entries need, in some cases, some interpretation to be mapped to a clear set of
 component {\fms} suitable for use in FMEA.
-
+%
 A third document, MIL-1991~\cite{mil1991} provides overall reliability statistics for 
 component types, but does not detail specific failure modes.
 %
@ -119,10 +132,13 @@ requires statistics for Meantime to Failure (MTTF) for all {\bc} failure modes.

 \section{Determining the failure modes of Components.}

-The starting point in the  FMEA process  are the failure modes of {\bcs}.
+The starting point in the FMEA process is the set of failure modes of the components 
+we would typically find in a production parts list, which we can term the {\bcs}.
+%
 In order the define FMEA we must start with a discussion on how these failure modes are chosen.
 %
-In this section we look in detail at two common electrical components and examine how
+In this section we pick %look in detail at 
+two common electrical components as examples, and examine how
 the two chosen sources of {\fm} information define their failure mode behaviour.
 We look at the reasons why some known failure modes % are omitted, or presented in 
 %specific but unintuitive ways.
@ -130,8 +146,8 @@ We look at the reasons why some known failure modes % are omitted, or presented
 can be found in one source but not in the others and vice versa.
 %
 Finally we compare and contrast the failure modes determined for these components
-from the FMD-91 reference source and from the guidelines of the 
-European burner standard EN298.
+from the FMD-91~\cite{fmd91} reference source and from the guidelines of the 
+European burner standard EN298~\cite{en298}.

 \subsection{Failure mode determination for generic resistor.}
 \label{sec:resistorfm}
@ -221,6 +237,10 @@ and thus subject to drift/parameter change.

 \subsubsection{Resistor Failure Modes}
 \label{sec:res_fms}
+The difference in resistor failure modes between FMD-91 and EN298 is that FMD-91 would
+include the failure mode DRIFT. EN298 does not include this, mainly because it imposes circuit design constraints
+that effectively sidestep that problem.
+%
 For this study we will take the conservative view from EN298, and consider the failure
 modes for a generic resistor to be both OPEN and SHORT.
 i.e.
@ -268,7 +288,7 @@ We need to translate these failure causes within the Op-Amp into {\fms}.
 We can look at each failure cause in turn, and map it to potential {\fms} suitable for use in FMEA
 investigations.

-\paragraph{Op-Amp failure cause: Poor Die attach}
+\paragraph{Op-Amp failure cause: Poor Die attach.}
 The symptom for this is given as a low slew rate. This means that the op-amp
 will not react quickly to changes on its input terminals.
 This is a failure symptom that may not be of concern in a slow responding system like an
@ -276,24 +296,24 @@ instrumentation amplifier. However, where higher frequencies are being processed
 a signal may entirely be lost.
 We can map this failure cause to a {\fm}, and we can call it $LOW_{slew}$.

-\paragraph{No Operation - over stress}
+\paragraph{No Operation - over stress.}
 Here the OP-Amp has been damaged, and the output may be held HIGH or LOW, or may be 
 effectively tri-stated, i.e. not able to drive circuitry in along the next stages of 
 the signal path: we can call this state NOOP (no Operation).
 %
 We can map this failure cause to three {\fms}, $LOW$, $HIGH$, $NOOP$. 

-\paragraph{Shorted $V_+$  to $V_-$}
+\paragraph{Shorted inputs: $V_+$  to $V_-$.}
 Due to the high intrinsic gain of an op-amp, and the effect of offset currents,
 this will force the output HIGH or LOW.
 We map this failure cause to $HIGH$ or $LOW$.

-\paragraph{Open $V_+$}
+\paragraph{Open input: $V_+$.}
 This failure cause will mean that the minus input will have the very high gain
 of the Op-Amp applied to it, and the output will be forced HIGH or LOW.
 We map this failure cause to $HIGH$ or $LOW$.

-\paragraph{Collecting Op-Amp failure modes from FMD-91}
+\paragraph{Collecting Op-Amp failure modes from FMD-91.}
 We can define an Op-Amp, under FMD-91 definitions to have the following {\fms}.
 \begin{equation}
 \label{eqn:opampfms}
@ -301,7 +321,7 @@ We can define an Op-Amp, under FMD-91 definitions to have the following {\fms}.
 \end{equation}


-\paragraph{Failure Modes of an Op-Amp according to EN298}
+\paragraph{Failure Modes of an Op-Amp according to EN298.}

 EN298 does not specifically define  OP\_AMPS failure modes; these can be determined
 by following a  procedure for `integrated~circuits' outlined in
@ -470,7 +490,7 @@ component {\fms} in FMEA or FMMD and require interpretation.
 FMEA is a bottom-up procedure which starts with the failure modes of the  low level components of a system, an example 
 analysis will serve to demonstrate it in practise.

- \paragraph{ FMEA Example: Milli-volt reader.}
+ \section{FMEA worked example: milli-volt reader.}
 Example: Let us consider a system, in this case a simple milli-volt reader, consisting
 of instrumentation amplifiers connected to a micro-processor
 that reports its readings via RS-232.
@ -542,6 +562,7 @@ In this section we examine some fundamental concepts and underlying philosophies

 \paragraph{The signal path.}

+% C Garret does not like the terms afferent and efferent here, try to think of something else
 Most electronic systems are used to process a signal: with signal processing
 there is usually a clear afferent to transform to efferent path.
 %
@ -558,9 +579,6 @@ An FMEA investigation will often take the component {\fm} and examine its effect
 in the direction of the signal,
 echoing diagnostic/fault~finding methods~\cite{garrett, maikowski}. % loebowski}.
 %
-The rationale and work-culture of those tasked to 
-perform FMEA are generally personnel who have performed fault finding.
-%
 When fault finding we generally follow the signal path, checking for correct behaviour
 along it: when we find something out of place we zoom in and measure 
 the circuit behaviour until we find a faulty component or module.
@ -568,6 +586,10 @@ the circuit behaviour until we find a faulty component or module.
 With this style of fault finding, because it is based on experiment, 
 we can hop from module to module eliminating working modules, until we find the 
 failure.
+%
+Those tasked to perform FMEA generally share this rationale and work-culture: 
+they are typically personnel who have performed fault finding.
+%


 FMEA is a theoretical discipline.
@ -575,15 +597,23 @@ FMEA is a theoretical discipline.
 It  would be very unusual to build a circuit and then simulate
 component failure modes.
 %
-This would be  time consuming as it would involve building a circuit for each component {\fm} in the system.
+This would be time consuming as it would involve building a circuit for each component {\fm} in 
+the system.\footnote{Building circuit simulations and simulating component failure modes
+would be a very time consuming process and might only be performed as a final stage of an accident investigation, where the cause is 
+required to be proven.}
 %
 We cannot, as with fault finding, verify modules along the signal path for correct behaviour
 and eliminate them from the investigation.
 %
-With FMEA we therefore need to be more thorough.
+FMEA is a `thought~experiment', not an actual experiment.
+%
+With FMEA we therefore need to be more thorough in the consideration of the effects a failure mode may have
+on the other components in a system, than with fault finding.
 %
 The question is by how much.
+%
 Too much and the task becomes impossible due to time/labour constraints.
+%
 Too little and the analysis could become meaningless because it misses
 potential system failures.
 %
@ -594,10 +624,21 @@ of the component exhibiting the {\fm} under investigation.
 Also, whether following the effects through the signal path {\em only} is acceptable, and instead
 looking at its effect on all other components in the system is necessary,
 is a matter for debate.
+%
 In practise, it is a compromise between the amount of time/money  that can be spent
 on analysis relative to the criticality of the project.
 Metrics from measuring the amount of work to undertake for FMEA are examined in section~\ref{sec:xfmea}.

+\paragraph{Failure Modes and the signal path.}
+
+In general, a component failure mode in an electronic circuit will
+change the circuit topology. For a single failure,
+this effect may cause additional complications for the analyst.
+For multiple failures, this means 
+that the analyst 
+will have to deal with an altered circuit topology
+for each analysis.
+

 \paragraph{Single component failure mode to system failure relation.}

@ -619,11 +660,12 @@ From a whole system perspective, we may find that {\bc} {\fms}
 may have more than one possible system event associated with them.
 Often there will be a clear one to one mapping, but 
 probabilities to failure (as used in FMECA)
-could mean one to many.% mapping.
+could mean one to many. % mapping.
 %
+\paragraph{Use of Markov chains to model failure modes.}
 We could represent a failure mode and its possible outcomes using a Markov chain~\cite{probfmea_4338247}.
 %
-Where multiple simultaneous\footnote{Multiple simultaneous failures are taken to mean failures that occur within the same detection period.} 
+Where multiple simultaneous%\footnote{Multiple simultaneous failures are taken to mean failures that occur within the same detection period.} 
 failure modes are considered this complicates
 the statistical nature of the Markov chain, cause effect model.
 %
@ -734,15 +776,22 @@ required to map a failure cause to its potential outcomes.
 In our basic FMEA example in section~\ref{basicfmea}
 we were asked to consider one failure mode against all the components in the milli-volt reader.
 %
-To create a complete FMEA report on the milli-volt reader we would have had to examine every 
+To create an exhaustive FMEA report on the milli-volt reader, we would have had to examine every 
 known failure mode of every component within it---against all its other components.
 %
-The reasoning~distance is defined as  the sum of the number of failure modes, against all other components
+We define `reasoning~distance' as the number of components checked against
+for a given failure mode to determine a system level symptom.
+%
+No current FMEA variant gives guidelines for the components that should 
+be included to analyse a {\fm} in a system.
+%does not
+The exhaustive~reasoning~distance would be
+the sum of the number of failure modes, against all other components
 in that system.
 %
 If the milli-volt reader had say 100 components, with three failure modes each, this
-would give a reasoning distance of 3 * 100 * 99.
-
+would give an exhaustive reasoning distance of 3 * 100 * 99.
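+%
+As a worked check of this figure (a sketch only: the symbols $N$ for the number of components
+and $f$ for the number of failure modes per component are assumed here for illustration),
+each of the $f \times N$ failure modes is checked against the $N-1$ other components:
+\begin{equation}
+ f \times N \times (N-1) = 3 \times 100 \times 99 = 29,700 .
+\end{equation}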
+%
 The discussion on reasoning distance leads provides us with a metric to examine
 the state explosion problems associated with forward search failure investigation
 methodologies.
@ -799,9 +848,10 @@ double failure scenarios (for burner lock-out scenarios).}
  %(N^2 - N).f 
 \end{equation}
 
-For our theoretical 100 components with 3 failure modes each example, this is
-$100*99*98*3=2,910,600$ failure mode scenarios.
-
+For our theoretical example of 100 components with 3 failure modes each, this is a reasoning distance of
+$100*99*98*3=2,910,600$. % failure mode scenarios.
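+%
+To put the double failure reasoning distance of $2,910,600$ in proportion (a rough check, not part of the formal argument),
+it can be compared with the single failure figure of $3*100*99=29,700$ for the same theoretical system:
+\begin{equation}
+ \frac{100 \times 99 \times 98 \times 3}{100 \times 99 \times 3} = 98 .
+\end{equation}
+That is, considering double failures multiplies the exhaustive reasoning distance by the number of remaining components.
+%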
+In practice there is an additional concern here, that of 
+the circuit topology changes that {\fms} can cause.

 \paragraph{Reliance on experts for meaningful FMEA Analysis.}
 Current FMEA methodologies cannot consider---for the reason of state explosion---an exhaustive approach.
@ -818,7 +868,7 @@ on anything but a non-trivial system.

 \subsection{Component Tolerance}

-Component tolerances may need considered when determining if a component has failed.
+Component tolerances may need considering when determining if a component has failed.
 Calculations for acceptable ranges to determine failure or acceptable conditions
 must be made where appropriate.
 %
@ -846,13 +896,14 @@ is given in section~\ref{sec:resistortolerance}.

 Production FMEA (or PFMEA), is FMEA used to prioritise, in terms of
 cost, problems to be addressed in product production.
-
-It focuses on known problems, determines the 
-frequency they occur and their cost to fix.
-This is multiplied together and called an RPN
-number.
+%
+It generally focuses on known problems: multiplying their 
+statistical frequency %they occur 
+by their cost to fix gives a Risk Priority Number (RPN)
+for the component {\fm}.
+%
 Fixing problems with the highest RPN number
-will return most cost benefit.
+will return most cost benefit~\cite{bfmea}.
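+%
+Written as a formula (a sketch of the description above; the symbols $F$ for the statistical frequency of a problem
+and $C$ for its cost to fix are assumed here for illustration and are not notation taken from the cited source):
+\begin{equation}
+ RPN = F \times C .
+\end{equation}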

 % benign example of PFMEA in CARS - make something up.
 \subsection{PFMEA Example}
@ -872,7 +923,7 @@ will return most cost benefit.

 \section{FMECA - Failure Modes Effects and Criticality Analysis}
 
-\subsection{ FMECA - Failure Modes Effects and Criticality Analysis}
+\paragraph{ FMECA - Failure Modes Effects and Criticality Analysis.}
 % \begin{figure}
 %  \centering
 %  %\includegraphics[width=100pt]{./military-aircraft-desktop-computer-wallpaper-missile-launch.jpg}
@ -883,10 +934,16 @@ will return most cost benefit.
 % \end{figure}
 FMECA places emphasis on determining criticality rather than the cost of system failures.
 %
-Applies some Bayesian statistics (probabilities of component failures
-thereby causing given system level failures).
+It applies Bayesian statistics (probabilities of component failures
+and the probability of those failures causing given system level failures)
+to determine the risk of system level events/symptoms.
+The results of the probabilities for the system level failures
+are multiplied by the operational time of the system.
+For instance, a military or emergency system may typically be operational for
+a given number of hours. This, in conjunction with the severity
+of the system level event, gives us a level of criticality.
 %
-Also the probability of the system failure causing a critical event.
+%Also the probability of the system failure causing a critical event.
 %
 Applying Bayesian statistics to failure analysis, suffers the 
 problem that correlation does not imply causation~\cite{bayesfrequentist}.
@ -895,9 +952,7 @@ However, correlation is evidence for causation, and maybe the only evidence to h
 and this is the justification behind its use.
 A history of the usage and development of FMECA may be found in~\cite{FMECAresearch}.

-
-
-\subsection{ FMECA - Failure Modes Effects and Criticality Analysis}
+\paragraph{ FMECA - Failure Modes Effects and Criticality Analysis.}
 Very similar to PFMEA, but instead of cost, a criticality or
 seriousness factor is ascribed to putative top level incidents.
 FMECA has three probability factors for component failures.
@ -917,7 +972,7 @@ a particular failure~mode occurring within a component.  reference FMD-91.



-\subsection{ FMECA - Failure Modes Effects and Criticality Analysis}
+\paragraph{ FMECA - Failure Modes Effects and Criticality Analysis.}
 \textbf{FMECA $\beta$ value.}
 The second probability factor $\beta$, is the probability that the failure mode
 will cause a given system failure.
@ -938,17 +993,15 @@ A weighting factor to indicate the seriousness of the putative system level erro
 C_m  =  {\beta} .  {\alpha} . {{\lambda}_p} . {t} . {s}
 \end{equation}

-Highest $C_m$ values would be at the top of a `to~do' list
-for a project manager.
-
-
+The highest $C_m$ values would represent the most dangerous or serious
+system level failures.
+The highest $C_m$ values would be at the top of a `to~fix' list
+for a project manager, and some levels of risk may be considered unacceptable
+and require re-design of some systems.
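+%
+As an illustration of the calculation only, with hypothetical values (not taken from any data source) of
+$\beta = 0.5$, $\alpha = 0.2$, ${\lambda}_p = 10^{-6}$ failures per hour, $t = 1000$ hours and $s = 10$:
+\begin{equation}
+ C_m = 0.5 \times 0.2 \times 10^{-6} \times 1000 \times 10 = 10^{-3} .
+\end{equation}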


 \section{FMEDA - Failure Modes Effects and Diagnostic Analysis}

-
-
-
 %\subsection{ FMEDA - Failure Modes Effects and Diagnostic Analysis}
 % \begin{figure}
 %  \centering
--- a/submission_thesis/CH3_FMEA_criticism/copy.tex
+++ b/submission_thesis/CH3_FMEA_criticism/copy.tex
@ -1,8 +1,20 @@
 \label{sec:chap3}

+\section*{Introduction}
+
+This chapter examines FMEA in a critical light.
+The problems with the scope---or required reasoning distance---of detail to apply 
+for FMEA analysis are examined. The impossibility of integrating software
+and hardware in FMEA failure models, and the impossibility of performing meaningful
+multiple failure analysis are examined.
+Additional problems, such as the inability to easily re-use and validate (through
+traceable reasoning) FMEA models, are presented.
+Finally we conclude with a list of deficiencies in current FMEA methodologies, and present a wish list
+for an improved methodology.
+
 \section{Historical Origins of FMEA}

-\subsection{FMEA designed for simple electro-mechanical systems}
+\subsection{FMEA: {\bc} {\fm} to system level failure modelling}
 FMEA traces it roots to the 1940s when it was used to identify the most costly
 failures arising from car mass-production~\cite{bfmea}.
 It was later modified slightly to include severity of the top level failure (FMECA~\cite{fmeca}).
@ -14,6 +26,13 @@ This means that we have one analysis case per component failure mode for all the
 This analysis philosophy has not changed since FMEA was first used.


+\subsection{FMEA does not support Traceable Reasoning}
+An FMEA report normally assigns one line of a spreadsheet to
+each {\bc} {\fm}.  
+This means that the reasoning involved in determining the system level failure/symptom  is described (if at all) very briefly.
+Ideally supporting documentation would give the reasoning and calculations behind each analysis case,
+but the structure of current FMEA reports does not encourage this.
+
 \subsection{FMEA does not support modularity.}
 It is a common practise in the process control industry to buy in sub-systems, 
 typically sensors and actuators connected to an industrially hardened computer bus, i.e. CANbus~\cite{can,canspec}, modbus~\cite{modbus} etc.
@ -64,10 +83,19 @@ We could term such a group a `{\fg}'.

 Given the {\bc} {\fm} to system level failure mode paradigm it is 
 difficult to re-use FMEA analysis.
+%
 Several strategies to aid re-use have been proposed~\cite{rudov2009language, reuse_of_fmea}, but
 the fundamental problem remains, that, with any changes 
 to the component base in a system, it is very difficult to
 determine which FMEA test scenarios must be re-worked.
+%
+It is common in safety critical systems to have repeated circuit topologies.
+For instance we may have several signal input and output
+structures that are repeated.
+%
+The failure mode behaviour of these repeated structures will be the same.
+However, with the {\bc} {\fm} to system level failure mode mapping, 
+work is likely to be repeated.


 \section{software and FMEA}
@ -82,7 +110,7 @@ Similar difficulties in integrating mechanical and electronic/software
 failure models are discussed in ~\cite{SMR:SMR580,swassessment}.


-\paragraph{Current work on Software FMEA}
+\paragraph{Current work on Software FMEA.}

 SFMEA usually does not seek to integrate
 hardware and software models, but to perform
@ -204,104 +232,104 @@ utterly anachronistic in the distributed real time system environment.

 FMEA is no longer fit for purpose!
 %
-
-\section{Conclusions on current FMEA Methodologies}
-
-%% FOCUS
-The focus of this chapter %literature review 
-is to establish the current practice and applications
-of FMEA.
-%, and to examine its strengths and weaknesses.
-%% GOAL
-Its 
-goal is to identify central issues and to criticise and assess  the current 
-FMEA methodologies.
-%% PERSPECTIVE
-The perspective of the author, is as a practitioner of static failure mode analysis techniques
-concerning approval of product 
-to European safety standards, both the prescriptive~\cite{en298,en230} and statistical~\cite{en61508}.
-A second perspective is that of a software engineer trained to use formal methods.
-Examining FMEA methodologies for mathematical properties, influenced by
-formal methods applied to software, should provide a perspective not traditionally considered.
-%% COVERAGE
-The literature reviewed, has been restricted to published books, European safety standards (as examples
-of current safety measures applied), and traditional research, from journal and conference papers.
-%% ORGANISATION
-The review is organised by concept, that is, FMEA can be applied to hardware, software, software~interfacing and
-to multiple failure scenarios etc. Methodologies related to FMEA are briefly covered for the sake of context.
-%% AUDIENCE
-% Well duh! PhD supervisors and examiners....
-
-% \subsection{Related Methodologies}
-% FTA --- HAZOP  --- ALARP  --- Event Tree Analysis --- bow tie concept
-% \subsection{Hardware FMEA (HFMEA)}
-% \subsection{Multiple Failure scenarios and FMEA}
-% \subsection{Software FMEA (SFMEA)}
-
-\paragraph{Current work on Software FMEA}
-
-SFMEA usually does not seek to integrate
-hardware and software models, but to perform
-FMEA on the software in isolation~\cite{procsfmea}.
-%
-Work has been performed using databases
-to track the relationships between variables 
-and system failure modes~\cite{procsfmeadb}, to %work has been performed to 
-introduce automation into the FMEA process~\cite{appswfmea} and to provide code analysis
-automation~\cite{modelsfmea}. Although the SFMEA and hardware FMEAs are performed separately,
-some schools of thought aim for Fault Tree Analysis (FTA)~\cite{nasafta,nucfta} (top down - deductive) 
-and FMEA (bottom-up inductive)
-to be performed on the same system to provide insight into the
-software hardware/interface~\cite{embedsfmea}.
-%
-Although this
-would give a better picture of the failure mode behaviour, it
-is by no means a rigorous approach to tracing errors that may occur in hardware
-through to the top (and therefore ultimately controlling) layer of software~\cite{swassessment}.
-
-\paragraph{Current FMEA techniques are not suitable for software}
-
-The main FMEA methodologies are all based on the concept of taking  
-base component {\fms}, and translating them into system level events/failures~\cite{sfmea,sfmeaa}.
-%
-In a complicated system, mapping a component failure mode to a system level failure
-will mean a long reasoning distance; that is to say the actions of the 
-failed component will have to be traced through
-several sub-systems, gauging its effects with and on other components. 
-%
-With software at the higher levels of these sub-systems,
-we have yet another layer of complication.
-%
-%In order to integrate software, %in a meaningful way 
-%we need to re-think the 
-%FMEA concept of simply mapping a base component failure to a system level event.
-%
-SFMEA regards, in place of hardware components, the variables used by the programs to be their equivalent~\cite{procsfmea}.
-The failure modes of these variables, are that they could become erroneously over-written, 
-calculated incorrectly (due to a mistake by the programmer, or a fault in the micro-processor on which it is running), or
-external influences such as
-ionising radiation causing bits to be erroneously altered.
-
-
-\paragraph{FMEA and Modularity}
-From the 1940's onwards, software has evolved from a simple procedural languages (i.e. assembly language/Fortran~\cite{f77} call return)
-to structured programming ( C~\cite{DBLP:books/ph/KernighanR88}, pascal etc) and then to object oriented models (Java C++...).
-FMEA has undergone no such evolution.
-%
-In a world where sensor systems, often including embedded software components, are brought in to
-create complex systems, FMEA still follows a rigid {\bc} {\fm} to system level error model,
-that is only suitable for simple electro mechanical systems.
-
-
-
-%  
-
-%
-% MAYBE MOVE THIS TO CH3, FMEA CRITICISM
+% 
+% \section{Conclusions on current FMEA Methodologies}
+% 
+% %% FOCUS
+% The focus of this chapter %literature review 
+% is to establish the current practice and applications
+% of FMEA.
+% %, and to examine its strengths and weaknesses.
+% %% GOAL
+% Its 
+% goal is to identify central issues and to criticise and assess  the current 
+% FMEA methodologies.
+% %% PERSPECTIVE
+% The perspective of the author, is as a practitioner of static failure mode analysis techniques
+% concerning approval of product 
+% to European safety standards, both the prescriptive~\cite{en298,en230} and statistical~\cite{en61508}.
+% A second perspective is that of a software engineer trained to use formal methods.
+% Examining FMEA methodologies for mathematical properties, influenced by
+% formal methods applied to software, should provide a perspective not traditionally considered.
+% %% COVERAGE
+% The literature reviewed, has been restricted to published books, European safety standards (as examples
+% of current safety measures applied), and traditional research, from journal and conference papers.
+% %% ORGANISATION
+% The review is organised by concept, that is, FMEA can be applied to hardware, software, software~interfacing and
+% to multiple failure scenarios etc. Methodologies related to FMEA are briefly covered for the sake of context.
+% %% AUDIENCE
+% % Well duh! PhD supervisors and examiners....
+% 
+% % \subsection{Related Methodologies}
+% % FTA --- HAZOP  --- ALARP  --- Event Tree Analysis --- bow tie concept
+% % \subsection{Hardware FMEA (HFMEA)}
+% % \subsection{Multiple Failure scenarios and FMEA}
+% % \subsection{Software FMEA (SFMEA)}
+% 
+% \paragraph{Current work on Software FMEA}
+% 
+% SFMEA usually does not seek to integrate
+% hardware and software models, but to perform
+% FMEA on the software in isolation~\cite{procsfmea}.
+% %
+% Work has been performed using databases
+% to track the relationships between variables 
+% and system failure modes~\cite{procsfmeadb}, to %work has been performed to 
+% introduce automation into the FMEA process~\cite{appswfmea} and to provide code analysis
+% automation~\cite{modelsfmea}. Although the SFMEA and hardware FMEAs are performed separately,
+% some schools of thought aim for Fault Tree Analysis (FTA)~\cite{nasafta,nucfta} (top down - deductive) 
+% and FMEA (bottom-up inductive)
+% to be performed on the same system to provide insight into the
+% software hardware/interface~\cite{embedsfmea}.
+% %
+% Although this
+% would give a better picture of the failure mode behaviour, it
+% is by no means a rigorous approach to tracing errors that may occur in hardware
+% through to the top (and therefore ultimately controlling) layer of software~\cite{swassessment}.
+% 
+% \paragraph{Current FMEA techniques are not suitable for software}
+% 
+% The main FMEA methodologies are all based on the concept of taking  
+% base component {\fms}, and translating them into system level events/failures~\cite{sfmea,sfmeaa}.
+% %
+% In a complicated system, mapping a component failure mode to a system level failure
+% will mean a long reasoning distance; that is to say the actions of the 
+% failed component will have to be traced through
+% several sub-systems, gauging its effects with and on other components. 
+% %
+% With software at the higher levels of these sub-systems,
+% we have yet another layer of complication.
+% %
+% %In order to integrate software, %in a meaningful way 
+% %we need to re-think the 
+% %FMEA concept of simply mapping a base component failure to a system level event.
+% %
+% SFMEA regards, in place of hardware components, the variables used by the programs to be their equivalent~\cite{procsfmea}.
+% The failure modes of these variables, are that they could become erroneously over-written, 
+% calculated incorrectly (due to a mistake by the programmer, or a fault in the micro-processor on which it is running), or
+% external influences such as
+% ionising radiation causing bits to be erroneously altered.
+% 
+% 
+% \paragraph{FMEA and Modularity}
+% From the 1940's onwards, software has evolved from a simple procedural languages (i.e. assembly language/Fortran~\cite{f77} call return)
+% to structured programming ( C~\cite{DBLP:books/ph/KernighanR88}, pascal etc) and then to object oriented models (Java C++...).
+% FMEA has undergone no such evolution.
+% %
+% In a world where sensor systems, often including embedded software components, are brought in to
+% create complex systems, FMEA still follows a rigid {\bc} {\fm} to system level error model,
+% that is only suitable for simple electro mechanical systems.
+% 
+% 
+% 
+% %  
+% 
+% %
+% % MAYBE MOVE THIS TO CH3, FMEA CRITICISM
 % 30JAN2013
 %

-\subsection{Where FMEA is now.}
+\subsection{FMEA Criticism: Conclusions.}
 FMEA useful tool for basic safety --- provides statistics on safety where field data impractical ---
 very good with single failure modes linked to top level events. 
 FMEA has become part of the safety critical and safety certification industries.
@ -319,7 +347,7 @@ All these FMEA based methodologies have the following short comings:
 \begin{itemize}
 \item Impossible to integrate Software and hardware models,
 \item State explosion problem exacerbated by increasing complexity due to density of modern electronics,
- \item Impossibility to consider all multiple component failure modes~\cite{FMEAmultiple653556}
+ \item Impossible to consider all multiple component failure modes~\cite{FMEAmultiple653556}
 \end{itemize}


@ -333,7 +361,7 @@ We now form a wish list, stating the features that we would want
 in an improved FMEA methodology,
 \begin{itemize}
    \item No state explosion making analysis impractical,  
-   \item  Rigorous (total failure coverage within {\fgs} all interacting component and failure modes checked),
+   \item  Exhaustive checking (total failure coverage within {\fgs}: all interacting components and failure modes checked),
    \item Reasoning Traceable in system models,  
    \item Re-useable i.e. it should be possible to re-use analysis performed previously,
    \item It must be possible to analyse simultaneous/multiple failures,