diff --git a/mybib.bib b/mybib.bib index a1d9e58..97c9f1e 100644 --- a/mybib.bib +++ b/mybib.bib @@ -380,6 +380,13 @@ year = {2012}, YEAR = "2005" } +@BOOK{easw, + AUTHOR = "Nancy Leveson", + TITLE = "Engineering a Safer World ISBN: 978-0-262-01662-9", + PUBLISHER = "MIT Press", + YEAR = "2011" +} + @BOOK{scse, AUTHOR = "Fortescue, Swinerd, Stark", TITLE = "Spacecraft Systems Engineering ISBN:978-0-470-75012-4", diff --git a/submission_thesis/CH1_introduction/copy.tex b/submission_thesis/CH1_introduction/copy.tex index 19c375f..7132d32 100644 --- a/submission_thesis/CH1_introduction/copy.tex +++ b/submission_thesis/CH1_introduction/copy.tex @@ -1,26 +1,20 @@ -\section{Copy dot tex} +\section{Introduction} + +MSc project Euler/Spider Diagram editor --- Euler/Spider Diagrams +could be used to model failure modes in components. +--- 2005 paper --- need for static analysis because of +high reliability of modern safety critical systems. + +\section{Practical Experience: Safety Critical Product Approvals} + +FMEA performed on selected areas perceived as critical +by test house. +Blanket measures, RAM ROM checks, EMC, electrical and environmental stress testing + +\subsection{Practical limitations of testing for certification vs. rigorous approach} + +State explosion problem considering a failure mode of a given component against +all other components in the system. + +Impossible to perform double simultaneous failure analysis (as demanded by EN298~\cite{en298}). 
-sample text -sample text -sample text -sample text -sample text -sample text -sample text -sample text -sample text -sample text -sample text -sample text -sample text -sample text -sample text -sample text -sample text -sample text -sample text -sample text -sample text -sample text -sample text -sample text diff --git a/submission_thesis/CH2_FMEA/copy.tex b/submission_thesis/CH2_FMEA/copy.tex index cf90759..1879abd 100644 --- a/submission_thesis/CH2_FMEA/copy.tex +++ b/submission_thesis/CH2_FMEA/copy.tex @@ -1,7 +1,7 @@ -EN61508:6\cite{en61508}[B.6.6] -describes FMEA as: +The generic and statistical European Safety Standard, EN61508:6\cite{en61508}[B.6.6] +describes Failure Mode Effect Analysis (FMEA) as: \begin{quotation} "To analyse a system design, by examining all possible sources of failure of a system's components and determining the effects of these failures @@ -10,26 +10,34 @@ on the behaviour and safety of the system." \section{Concepts} -Forward and backward searching... -forward search starts with possible failure causes -and works out what could happen +\paragraph{Forward and backward searches} -backward search uses possible failures and works back down (and not necessarily to -base components in a system) -Reasoning distance .... general concept... simple ideas about how complex a -failure analysis is the more modules and components are involved +A forward search starts with possible failure causes +and uses logic and reasoning to determine system level outcomes. +A backward search starts with system level events +works back down (and not necessarily to +base components in a system) using de-composition of +of the system and logic. +FMEA based methodologies are forward searches\cite{Lutz:1997:RAU:590564.590572} and top down +methodologies such as FTA~\cite{nucfta,nasafta} + +\paragraph{Reasoning distance} +A reasoning distance is the number of stages of logic and reasoning +required to map a failure cause to its potential outcomes. +%.... 
general concept... simple ideas about how complex a +%failure analysis is the more modules and components are involved % cite for forward and backward search related to safety critical software -\cite{Lutz:1997:RAU:590564.590572} %{sfmeaforwardbackward} + %{sfmeaforwardbackward} -\section{F.M.E.A.} +\section{FMEA} -\subsection{FMEA} +%\subsection{FMEA} %\tableofcontents[currentsection] FMEA is a broad term; it could mean anything from an informal check on how how failures could affect some equipment in an initial brain-storming session -in product design, to formal submissions as part of safety critical certification. +in product design, to formal submission as part of safety critical certification. % This chapter describes basic concepts of FMEA, uses a simple example to demonstrate a single FMEA analysis stage, describes the four main variants of FMEA in use today @@ -143,9 +151,9 @@ approach in looking for system failures. \subsection{The unacceptability of a single component failure causing a catastrophe} FMEA, due to its inductive bottom-up approach, is very good -at finding potential component failures that could have catastrophic implications. +at finding potential single component failures that could have catastrophic implications. Used in the design phase of a project FMEA is an invaluable tool -for unearthing these type of failure scenario. +for unearthing these failure scenarios. It is less useful for determining catastrophic events for multiple simultaneous\footnote{Multiple simultaneous failures are taken to mean failure that occur within the same detection period.} failures. @@ -154,18 +162,18 @@ simultaneous\footnote{Multiple simultaneous failures are taken to mean failure t Modern electronic components, are generally very reliable, and the systems built from them are thus very reliable too. Reliable field data on failures will, therefore be sparse. 
Should we wish to prove a continuous demand system for say ${10}^{-7}$ failures\footnote{${10}^{-7}$ failures per hour of operation is the -threshold for S.I.L. 3 reliability~\cite{en61508}.} +threshold for SIL 3 reliability~\cite{en61508}. Failure rates are normally measured per $10^9$ hours of operation +and are known as Failure in Time (FIT) values. The maximum FIT value for a SIL 3 system is therefore 100.} per hour of operation, even with 1000 correctly monitored units in the field we could only expect one failure per ten thousand hours (a little over one a year). It would be utterly impractical to get statistically significant data for equipment at these reliability levels. -However, we can use FMEA (more specifically the FMEDA variant, see section~\ref{sec:FMEDA}), working from known component failure rates, to obtain +However, we can use FMEA (more specifically the FMEDA variant, see section~\ref{sec:FMEDA}), +working from known component failure rates, to obtain statistical estimates of the equipment reliability. -\subsection{Rigorous FMEA --- State Explosion Problem} - - +\subsection{FMEA and the State Explosion Problem} \paragraph{Rigorous Single Failure FMEA} @@ -251,14 +259,8 @@ number. Fixing problems with the highest RPN number will return most cost benefit. - - - - % benign example of PFMEA in CARS - make something up. \subsection{PFMEA Example} - - \begin{table}[ht] \caption{FMEA Calculations} % title of Table %\centering % used for centering table @@ -268,97 +270,22 @@ will return most cost benefit. relay 2 n/c & $1*10^{-5}$ & 98.0 & doorlocks fail & 0.00098 \\ \hline % rear end crash & $14.4*10^{-6}$ & 267,700 & fatal fire & 3.855 \\ % ruptured f.tank & & & & \\ \hline - - \hline \end{tabular} \end{table} -%Savings: 180 burn deaths, 180 serious burn injuries, 2,100 burned vehicles. Unit Cost: $200,000 per death, $67,000 per injury, $700 per vehicle. -%Total Benefit: 180 X ($200,000) + 180 X ($67,000) + $2,100 X ($700) = $49.5 million. 
-%COSTS -%Sales: 11 million cars, 1.5 million light trucks. -%Unit Cost: $11 per car, $11 per truck. -%Total Cost: 11,000,000 X ($11) + 1,500,000 X ($11) = $137 million. - - - - - - - - -%\subsection{Production FMEA : Example Ford Pinto : 1975} - - \subsection{PFMEA Example: Ford Pinto: 1975} - -\begin{figure}[h] - \centering - \includegraphics[width=300pt]{./CH2_FMEA/ad_ford_pinto_mpg_red_3_1975.jpg} - % ad_ford_pinto_mpg_red_3_1975.jpg: 720x933 pixel, 96dpi, 19.05x24.69 cm, bb=0 0 540 700 - \caption{Ford Pinto Advert} - \label{fig:fordpintoad} -\end{figure} - - - - - - -\begin{figure}[h] - \centering - \includegraphics[width=300pt]{./CH2_FMEA/burntoutpinto.png} - % burntoutpinto.png: 376x250 pixel, 72dpi, 13.26x8.82 cm, bb=0 0 376 250 - \caption{Burnt Out Pinto} - \label{fig:burntoutpinto} -\end{figure} - - - - - - -\begin{table}[ht] -\caption{FMEA Calculations} % title of Table -%\centering % used for centering table -\begin{tabular}{|| l | l | c | c | l ||} \hline - \textbf{Failure Mode} & \textbf{P} & \textbf{Cost} & \textbf{Symptom} & \textbf{RPN} \\ \hline \hline - relay 1 n/c & $1*10^{-5}$ & 38.0 & indicators fail & 0.00038 \\ \hline - relay 2 n/c & $1*10^{-5}$ & 98.0 & doorlocks fail & 0.00098 \\ \hline - rear end crash & $14.4*10^{-6}$ & 267,700 & fatal fire & 3.855 \\ - ruptured f.tank & & & allow & \\ \hline - - rear end crash & $1$ & $11$ & recall & 11.0 \\ - ruptured f.tank & & & fix tank & \\ \hline - -\hline -\end{tabular} -\end{table} - - - -% don't think this is relevant for the thesis: http://www.youtube.com/watch?v=rcNeorjXMrE - - - - - - \section{FMECA - Failure Modes Effects and Criticality Analysis} - - - -\subsection{ FMECA - Failure Modes Effects and Criticallity Analysis} -\begin{figure} - \centering - %\includegraphics[width=100pt]{./military-aircraft-desktop-computer-wallpaper-missile-launch.jpg} - \includegraphics[width=300pt]{./CH2_FMEA/A10_thunderbolt.jpg} - % military-aircraft-desktop-computer-wallpaper-missile-launch.jpg: 1024x768 
pixel, 300dpi, 8.67x6.50 cm, bb=0 0 246 184 - \caption{A10 Thunderbolt} - \label{fig:f16missile} -\end{figure} +\subsection{ FMECA - Failure Modes Effects and Criticality Analysis} +% \begin{figure} +% \centering +% %\includegraphics[width=100pt]{./military-aircraft-desktop-computer-wallpaper-missile-launch.jpg} +% \includegraphics[width=300pt]{./CH2_FMEA/A10_thunderbolt.jpg} +% % military-aircraft-desktop-computer-wallpaper-missile-launch.jpg: 1024x768 pixel, 300dpi, 8.67x6.50 cm, bb=0 0 246 184 +% \caption{A10 Thunderbolt} +% \label{fig:f16missile} +% \end{figure} Emphasis on determining criticality of failure. Applies some Bayesian statistics (probabilities of component failures and those thereby causing given system level failures). @@ -538,7 +465,7 @@ by statistically determining how frequently it can fail dangerously. \subsection{ FMEDA - Failure Modes Effects and Diagnostic Analysis} -{ + \begin{table}[ht] \caption{FMEA Calculations} % title of Table %\centering % used for centering table @@ -612,7 +539,36 @@ judged to be in critical sections of the product. -\section{Software FMEA (SFMEA)} +\section{Literature Review} + +%% FOCUS +The focus of this literature review is to establish the practice and applications +of FMEA, and to examine its strengths and weaknesses. +%% GOAL +Its +goal is to identify central issues and to criticise and assess the current +FMEA methodologies. +%% PERSPECTIVE +The perspective of the author is that of a practitioner of static failure mode analysis techniques +concerning approval of products +to European safety standards, both the prescriptive~\cite{en298,en230} and the statistical~\cite{en61508}. +A second perspective is that of a software engineer trained to use formal methods. +Examining FMEA methodologies for mathematical properties, influenced by +formal methods applied to software, should provide an angle not traditionally considered. 
+%% COVERAGE +The literature reviewed has been restricted to published books, European safety standards (as examples +of current safety measures applied), and traditional research from journal and conference papers. +%% ORGANISATION +The review is organised by concept: FMEA can be applied to hardware, software, software~interfacing and +to multiple failure scenarios. Methodologies related to FMEA are briefly covered for the sake of context. +%% AUDIENCE +% Well duh! PhD supervisors and examiners.... + +\subsection{Related Methodologies} +FTA --- HAZOP --- ALARP --- Event Tree Analysis --- bow tie concept +\subsection{Hardware FMEA (HFMEA)} +\subsection{Multiple Failure Scenarios and FMEA} +\subsection{Software FMEA (SFMEA)} \paragraph{Current work on Software FMEA} @@ -635,7 +591,7 @@ would give a better picture of the failure mode behaviour, it is by no means a rigorous approach to tracing errors that may occur in hardware through to the top (and therefore ultimately controlling) layer of software. -\subsection{Current FMEA techniques are not suitable for software} +\paragraph{Current FMEA techniques are not suitable for software} The main FMEA methodologies are all based on the concept of taking base component {\fms}, and translating them into system level events/failures~\cite{sfmea,sfmeaa}. @@ -659,468 +615,29 @@ external influences such as ionising radiation causing bits to be erroneously altered. -\paragraph{A more-complete Failure Mode Model} -% HFMEA -% SFMEA -% VARIABLE CURRUPTION -% MICRO PROCESSOR FAULTS -% INTERFACE ANALYSIS -% -% add them all together --- a load of bollocks, lots of impressive inches of reports that no one will be bothered to read.... -% -In order to obtain a more complete failure mode model of -a hybrid electronic/software system we need to analyse -the hardware, the software, the hardware the software runs on (i.e. the software's medium), -and the software/hardware interface. 
-% -HFMEA is a well established technique and needs no further description in this paper. - -\section{Example for analysis} % : How can we apply FMEA} - -For the purpose of example, we chose a simple common safety critical industrial circuit -that is nearly always used in conjunction with a programmatic element. -A common method for delivering a quantitative value in analogue electronics is -to supply a current signal to represent the value to be sent~\cite{aoe}[p.934]. -Usually, $4mA$ represents a zero or starting value and $20mA$ represents the full scale, -and this is referred to as {\ft} signalling. -% -{\ft} has an electrical advantage as well because the current in an electronic loop is constant~\cite{aoe}[p.20]. -Thus resistance in the wires between the source and the receiving end is not an issue -that can alter the accuracy of the signal. -% -This circuit has many advantages for safety. If the signal becomes disconnected -it reads an out of range $0mA$ at the receiving end. This is outside the {\ft} range, -and is therefore easy to detect as an error rather than an incorrect value. -% -Should the driving electronics go wrong at the source end, it will usually -supply far too little or far too much current, making an error condition easy to detect. -% -At the receiving end, one needs a resistor to convert the -current signal into a voltage that we can read with an ADC.% -%we only require one simple component to convert the - - -%BLOCK DIAGRAM HERE WITH FT CIRCUIT LOOP - -\begin{figure}[h] - \centering - \includegraphics[width=250pt]{./CH2_FMEA/ftcontext.png} - % ftcontext.png: 767x385 pixel, 72dpi, 27.06x13.58 cm, bb=0 0 767 385 - \caption{Context Diagram for {\ft} loop} - \label{fig:ftcontext} -\end{figure} - - -The diagram in figure~\ref{fig:ftcontext} shows some equipment which is sending a {\ft} -signal to a micro-controller system. 
-The signal is locally driven over a load resistor, and then read into the micro-controller via -an ADC and its multiplexer. -With the voltage detected at the ADC the multiplexer we read the intended quantitative -value from the external equipment. - -\subsection{Simple Software Example} - - -Consider a software function that reads a {\ft} input, and returns a value between 0 and 999 (i.e. per mil $\permil$) -representing the value intended by the current detected, with an additional error indication flag to indicate the validity -of the value returned. -% -Let us assume the {\ft} detection is via a \ohms{220} resistor, and that we read a voltage -from an ADC into the software. -Let us define any value outside the 4mA to 20mA range as an error condition. -% -As a voltage, we use ohms law~\cite{aoe} to determine the voltage ranges: $V=IR$, $$0.004A * \ohms{220} = 0.88V $$ -and $$0.020A * \ohms{220} = 4.4V \;.$$ -% -Our acceptable voltage range is therefore -% -$$(V \ge 0.88) \wedge (V \le 4.4) \; .$$ - -This voltage range forms our input requirement. -% -We can now examine a software function that performs a conversion from the voltage read to -a per~mil representation of the {\ft} input current. -% -For the purpose of example the `C' programming language~\cite{DBLP:books/ph/KernighanR88} is -used\footnote{ C coding examples use the Misra~\cite{misra} and SIL-3 recommended language constraints~\cite{en61508}.}. -We initially assume a function \textbf{read\_ADC} which returns a floating point %double precision -value representing the voltage read (see code sample in figure~\ref{fig:code_read_4_20_input}). - - -%%{\vbox{ -\begin{figure}[h+] - -\footnotesize -\begin{verbatim} -/***********************************************/ -/* read_4_20_input() */ -/***********************************************/ -/* Software function to read 4mA to 20mA input */ -/* returns a value from 0-999 proportional */ -/* to the current input. 
*/ -/***********************************************/ -int read_4_20_input ( int * value ) { - double input_volts; - int error_flag; - - /* set ADC MUX with input to read from */ - input_volts = read_ADC(INPUT_4_20_mA); - - if ( input_volts < 0.88 || input_volts > 4.4 ) { - error_flag = 1; /* Error flag set to TRUE */ - } - else { - *value = (input_volts - 0.88) * ( 4.4 - 0.88 ) * 999.0; - error_flag = 0; /* indicate current input in range */ - } - /* ensure: value is proportional (0-999) to the - 4 to 20mA input */ - return error_flag; -} -\end{verbatim} -%} -%} - -\caption{Software Function: \textbf{read\_4\_20\_input}} -\label{fig:code_read_4_20_input} -%\label{fig:420i} -\end{figure} - -We now look at the function called by \textbf{read\_4\_20\_input}, \textbf{read\_ADC}, which returns a -voltage for a given ADC channel. -% -This function -deals directly with the hardware in the micro-controller on which we are running the software. -% -Its job is to select the correct channel (ADC multiplexer) and then to initiate a -conversion by setting an ADC 'go' bit (see code sample in figure~\ref{fig:code_read_ADC}). -% -It takes the raw ADC reading and converts it into a -floating point\footnote{the type `double' or `double precision' is a -standard C language floating point type~\cite{DBLP:books/ph/KernighanR88}.} -voltage value. 
- - - - - -%{\vbox{ -\begin{figure}[h+] - -\footnotesize -\begin{verbatim} -/***********************************************/ -/* read_ADC() */ -/***********************************************/ -/* Software function to read voltage from a */ -/* specified ADC MUX channel */ -/* Assume 10 ADC MUX channels 0..9 */ -/* ADC_CHAN_RANGE = 9 */ -/* Assume ADC is 12 bit and ADCRANGE = 4096 */ -/* returns voltage read as double precision */ -/***********************************************/ -double read_ADC( int channel ) { - int timeout = 0; - - /* return out of range result */ - /* if invalid channel selected */ - if ( channel > ADC_CHAN_RANGE ) - return -2.0; - /* set the multiplexer to the desired channel */ - ADCMUX = channel; - ADCGO = 1; /* initiate ADC conversion hardware */ - /* wait for ADC conversion with timeout */ - while ( ADCGO == 1 || timeout < 100 ) - timeout++; - if ( timeout < 100 ) - dval = (double) ADCOUT * 5.0 / ADCRANGE; - else - dval = -1.0; /* indicate invalid reading */ - /* return voltage as a floating point value */ - /* ensure: value is voltage input to within 0.1% */ - return dval; -} -\end{verbatim} -\caption{Software Function: \textbf{read\_ADC}} -\label{fig:code_read_ADC} -\end{figure} -%} -%} - - -We now have a very simple software structure, a call tree, where {\em read\_4\_20\_input} -calls {\em read\_ADC}, which in turn interacts with the hardware/electronics. -%shown in figure~\ref{fig:ct1}. -% -% \begin{figure}[h] -% \centering -% \includegraphics[width=56pt]{./ct1.png} -% % ct1.png: 151x224 pixel, 72dpi, 5.33x7.90 cm, bb=0 0 151 224 -% \caption{Call tree for software example} -% \label{fig:ct1} -% \end{figure} -% -This software is above the hardware in the conceptual call tree---from a programmatic perspective---%in software terms---the -software is reading values from the `lower~level' electronics. -% -%FMEA is always a bottom-up process and so we must begin with this hardware. 
-% -The hardware is simply a load resistor, connected across an ADC input -pin on the micro-controller and ground. -% -We can identify the resistor and the ADC module of the micro-controller as -the base components in this design. -% -We now apply FMMD starting with the hardware. - - -\section{Failure Mode effects Analysis} - -Four emerging and current techniques are now used to -apply FMEA to the hardware, the software, the software medium and the software hardware insterface. - -\subsection{Hardware FMEA} - -The hardware FMEA requires that for each component we consider all failure modes -and the putative effect those failure modes would have on the system. -The electronic components in our {\ft} system are the load resistor, -the multiplexer and the analogue to digital converter. - -{ -\tiny -\begin{table}[h+] -\caption{Hardware FMEA {\ft}} % title of Table -\label{tbl:r420i} - -\begin{tabular}{|| l | c | l ||} \hline - \textbf{Failure} & \textbf{failure} & \textbf{System} \\ - \textbf{Scenario} & \textbf{effect} & \textbf{Failure} \\ \hline - \hline - $R$ & OPEN~\cite{en298}[Ann.A] & $LOW$ \\ - & & $READING$ \\ \hline - - $R$ & SHORT~\cite{en298}[Ann.A] & $HIGH$ \\ - & & $READING$ \\ \hline - - - - $MUX$ & read wrong & $VAL\_ERROR$ \\ - & input ~\cite{fmd91}[3-102] & \\ \hline - - - - $ADC$ & ADC output & $VAL\_ERROR$ \\ - & erronous ~\cite{fmd91}[3-109] & \\ \hline -\hline -\end{tabular} -\end{table} -} - -The last two failures both lead to the system failure of $VAL\_ERROR$ . -They could lead to low or high reading as well, but we would only be able to determine this -from knowledge of the software systems criteria for these. -\clearpage -\subsection{Software FMEA - variables in place of components} - -For software FMEA, we take the variables used by the system, -and examine what could happen if they are corrupted in various ways~\cite{procsfmea, embedsfmea}. 
-From the function $read\_4\_20\_input()$ we have the variables $error\_flag$, -$input\_volts$ and $value$: from the function $read\_ADC()$, $timeout$, $ADCMUX$, $ADCGO$, $dval$. -We must now determine putative system failure modes for these variables becoming corrupted, this is performed in table~\ref{tbl:sfmea}. - - -{ -\tiny -\begin{table}[h+] -\caption{SFMEA {\ft}} % title of Table -\label{tbl:sfmea} - -\begin{tabular}{|| l | c | l ||} \hline - \textbf{Failure} & \textbf{failure} & \textbf{System} \\ - \textbf{Scenario} & \textbf{effect} & \textbf{Failure} \\ \hline - \hline - $error\_flag$ & set FALSE & $VAL\_ERROR$ \\ - & & \\ \hline - - $error\_flag$ & set TRUE & invalid \\ - & & error flag \\ \hline - - $input\_volts$ & corrupted & $VAL\_ERROR$ \\ - & & \\ \hline - - - $value $ & corrupted & $VAL\_ERROR$ \\ - & & \\ \hline - - - - $timeout $ & corrupted & $VAL\_ERROR$ \\ - & & \\ \hline - - - $ADCMUX $ & corrupted & $VAL\_ERROR$ \\ - & & \\ \hline - - - - $ADCGO $ & corrupted & $VAL\_ERROR$ \\ - & & \\ \hline - - $dval $ & corrupted & $VAL\_ERROR$ \\ - & & \\ \hline - - - - -\hline -\end{tabular} -\end{table} xe -} -\clearpage -\subsection{Software FMEA - failure modes of the medium ($\mu P$) of the software} - -Microprocessors/Microcontrollers have sets of known failure modes, these include RAM, ROM -EEPROM failure\footnote{EEPROM failure is not applicable for this example.} and -oscillator clock timing - - - -{ -\tiny -\begin{table}[h+] -\caption{SFMEA {\ft}} % title of Table -\label{tbl:sfmeaup} - -\begin{tabular}{|| l | c | l ||} \hline - \textbf{Failure} & \textbf{failure} & \textbf{System} \\ - \textbf{Scenario} & \textbf{effect} & \textbf{Failure} \\ \hline - \hline - $RAM$ & variable & All errors \\ - & corruption & from table~\ref{tbl:sfmea} \\ \hline - - $RAM$ & proxegram flow & process \\ - & & halts / crashes \\ \hline - - $OSC$ & stopped & process \\ - & & halts \\ \hline - - $OSC$ & too & ADC \\ - & fast & value errors \\ \hline - - $OSC$ & 
too & ADC \\ - & slow & value errors \\ \hline - - $ROM$ & program & All errors \\ - & corruption & from table~\ref{tbl:sfmea} \\ \hline - - $ROM$ & constant & All errors \\ - & /data corruption & from table~\ref{tbl:sfmea} \\ \hline - -\hline -\end{tabular} -\end{table} -} - -\clearpage -\subsection{Software FMEA - The software/hardware interface} - -As FMEA is applied separately to software and hardware -the interface between them is an undefined factor. -Ozarin~\cite{sfmeainterface,procsfmea} recommends that an FMEA report be written -to focus on the software/hardware interface. -The software/hardware interface has -specific problems common to many systems and configurations -and these are described in~\cite{sfmeainterface}. -%An interface FMEA is performed in table~\ref{hwswinterface}. -% -The hardware to software interface for the {\ft} example is handled -by the 'C' function $read\_ADC()$ -(see code sample in figure~\ref{fig:code_read_ADC}). -% -% An FMEA of the `software~medium' is given in table~\ref{tbl:sfmeaup}. -\paragraph{Timing and Synchronisation.} -The $ADCOUT$ register, where the raw ADC value is read -is an internal register used by the ADC and presented -as a readable memory location when the ADC -has finished updating it. -Reading it at the wrong time would -cause an invalid value to be read. -The synchronisation is performed by polling an $ADCGO$ -bit, a flag mapped to memory by which the ADC indicates that the data is ready. - -\paragraph{Interrupt Contention.} -Were an interrupt to also attempt to read from the ADC -the ADCMUX could be altered, causing the non-interrupt -routine to read from the wrong channel. - -\paragraph{Data Formatting.} -The ADC may use a big-endian or little endian integer -format. It may also right or left justify the bits in its value. 
- - - -\subsection{SFMEA Conclusion} -% -This paper has picked a very simple example (the industry standard {\ft} -input circuit and software) to demonstrate -SFMEA and HFMEA methodologies used to describe a failure mode model. -%Even a modest system would be far too large to analyse in conference paper -%and this -% -%The {\dc} representing the {\ft} reader -%shows that by taking a -%modular approach for FMEA, i.e. FMMD, we can integrate -Our model is described by four FMEA reports; and these % we can model the failure mode behaviour from -model the system from several failure mode perspectives. -% -With traditional FMEA methods the reasoning~distance is large, because -it stretches from the component failure mode to the top---or---system level failure. -% -With these four analysis reports -we do not have stages along the `reasoning~path' linking the failure modes from the -electronics to those in the software. -%Software is often written `defensively' but t -%Each {\fg} to {\dc} transition represents a -%reasoning stage. -% -% -%For this reason applying traditional FMEA to software stretches -%the reasoning distance even further. -% -In fact many these reasoning paths overlap---or even by-pass one another--- -it is very difficult to gauge cause and effect. -For instance, hardware failures are not analysed in the context of how they will -be handled (or missed) by the software. -% -System outputs commanded from software may not take into account particular -hardware limitations etc. - -The interface FMEA does serve to provide a useful -check-list to ensure data and synchronisation conventions used by the hardware -and software are not mismatched. However, the fact it is perceived as required %The fact its required -highlights the the miss-matches possible between the two types of analysis -which could run deeper than the mere interface level. 
- -However, while these techniques ensure that the software and hardware is -viewed and analysed from several perspectives, it cannot be termed a homogeneous -failure mode model. -% For instance -% were the ADC to have a small value error, say adding -% a small percentage onto the value, we would be unable to -% detect this under the analysis conditions for this model, or -% be able to pinpoint it. % \section{Conclusion} +\paragraph{Where FMEA is now} FMEA useful tool for basic safety --- provides statistics on safety where field data impractical --- very good with single failure modes linked to top level events. FMEA has become part of the safety critical and safety certification industries. - +% SFMEA is in its infancy, but there is a gap in current -certification for software, EN61508, recommends hardware redundancy architectures in conjunction +certification for software, EN61508~\cite{en61508}, recommends hardware redundancy architectures in conjunction with FMEDA for hardware: for software it recommends language constraints and quality procedures -but no inductive fault finding technique. \ No newline at end of file +but no inductive fault finding technique. + +FMEA has adapted from a cost saving exercise for mass produced items, to incorporating statistical techniques +(FMECA) to allowing for self diagnostic mitigation (FMEDA). +However, it is still based on the single component failure mapped to system level failure. +All these FMEA based methodologies have the following short comings: +\begin{itemize} + \item Impossible to integrate Software and hardware models, + \item State explosion problem exacerbated by increasing complexity due to density of modern electronics, + \item Impossibility to consider all multiple component failure modes +\end{itemize}