%%% CHAPTER 6 \label{sec:chap6} \section{Software and Hardware Failure Mode Concepts} \label{sec:elecsw} In this chapter we show that FMMD can be applied to both software and electronics enabling us to build complete failure models of typical modern safety critical systems. With modular FMEA i.e. FMMD %(FMMD) we have the concepts of failure~modes of components, {\fgs} and symptoms of failure for a functional group. % A programmatic function has similarities with a {\fg} as defined by the FMMD process. % An FMMD {\fg} is placed into a hierarchy. A software function is typically placed into the hierarchy of its call-tree. A software function calls other functions and uses data sources via hardware interaction, which could be viewed as its `components': it has outputs, i.e. it can perform actions on data or hardware. %which will be used by other functions that may call it. % We show that we can map a software function to a {\fg} in FMMD: its failure modes are the failure modes of the software components (other functions it calls) and the hardware from which it reads values. Its outputs are the data it changes, or the hardware actions it performs. %% %% Talk about how software specification will often say how hardware %% will react and how to interpret readings---but they do not %% always cover the failure modes of the hardware being interfaced too. % When we have analysed a software function---using failure conditions of its inputs as failure modes---we can determine its symptoms of failure (i.e. how calling functions will see its failure mode behaviour). % We apply the FMMD process to software functions by viewing them in terms of their failure mode behaviour. % As software already fits into a hierarchy we have one less analysis decision to make, compared to analysing electronics. 
%
For electronic and mechanical systems, although we may be guided by the original designers' concepts of modularity and sub-systems, applying FMMD means deciding on the members of the {\fgs} and the subsequent hierarchy.
%
With software already written, the hierarchies are given.
%
To apply FMMD to software, we collect the elements used by a software function, along with the function itself, to form a {\fg}.
When we have analysed the failure mode behaviour of this {\fg} and have its failure mode symptoms, we can create a {\dc}.
That {\dc} can be used by functions that call the function we have just analysed, until we form a complete failure mode hierarchy of the system under investigation.

\subsection{Software, a natural hierarchy}

Software written for safety critical systems is usually constrained to be modular~\cite{en61508}[3] and non-recursive~\cite{misra}[15.2]. %{iec61511}.
Because of this we can assume direct call trees\footnote{A typical embedded system will have a run time call tree, and (possibly multiple) interrupt sourced call trees.}.
Functions call functions from the top down and eventually call the lowest level library or IO functions that interact with hardware/electronics.

What is potentially difficult with a software function is deciding what its failure modes and symptoms are.
With electronic components, we can use literature to point us to suitable sets of {\fms}~\cite{fmd91}~\cite{mil1991}~\cite{en298}. %~\cite{en61508}~\cite{en298}.
With software, only some library functions are well known and rigorously documented enough to have the equivalent of known failure modes.
Most software is `bespoke'.
%
We need a different strategy to describe the failure mode behaviour of software functions.
We can use definitions from contract programming to assist here.

\subsection{Contract programming description}

Contract programming is a discipline~\cite{dbcbe} for building software functions in a controlled and traceable way.
Each function is subject to pre-conditions (constraints on its inputs), post-conditions (constraints on its outputs) and function-wide invariants (rules).

\paragraph{Mapping contract `pre-condition' violations to component failure modes.}

A pre-condition, or requirement, for a contract software function defines the correct ranges of input conditions for the function to operate successfully.
%
% C Garret said this was unclear so I have added the following two sentences.
%
%If we consider a software function to be a {\fg} in the FMMD sense, i.e.
We can consider a software function to be a collection of code, functions called and values/variables used.
In this way it is similar to an electronic circuit, which is a collection of components connected in a specific way.
Using this analogy for software, the connections are the function's code, and the called functions and variables are the components.
%
Erroneous behaviour from called functions and variables/inputs has the same effect as component failure modes on an electronic {\fg}.
%
%
If we consider the called functions and variables/inputs to be components of a function, we can build a modular and hierarchical failure mode model from existing software.
%
Thus for FMMD applied to software, we consider a violation of a pre-condition to be equivalent to a failure mode of `one of its components'.

\paragraph{Mapping contract `post-condition' violations to symptoms.}

A post condition is a definition of correct behaviour of a function.
% A violated post condition is a symptom of failure, or derived failure mode, from a function. % Post conditions could be either actions performed (i.e. the state of hardware changed) or an output value of a function. In pure contract programming, a violation of a pre-condition would cause the function to \textbf{not} be executed. % In implementation code, a pre-condition violation should cause an error to be generated, and thus a post condition to fail. % A function can fail for reasons other than corruption of its input data (i.e. failure caused by variables it uses or return values from functions it calls). % Variables can become corrupted, by radiation affecting RAM~\cite{5488118,5963919} or by another software function erroneously overwriting variables~\cite{swseatbelt}. % Current work on software FMEA generally focuses on mapping variable corruption to failure modes~\cite{procsfmea,procsfmeadb,sfmeaauto,sfmea}. However, errors other than variable corruption can occur. For instance a microprocessor may have subtle bugs in its instruction set, or incorrectly handled interrupt contention which could cause side effects in software. For the failure mode model of any software function, we must consider that all failure modes defined by post condition violations could simply occur. %`components'. \paragraph{Mapping contract `invariant' violations to symptoms and failure modes.} Invariants in contract programming may apply to inputs to the function (where violations can be considered {\fms} in FMMD terminology), and to outputs (where violations can be considered {failure symptoms} in FMMD terminology). \subsection{Combined Hardware/Software FMMD} For the purpose of example, we chose a simple common safety critical industrial circuit that is nearly always used in conjunction with a programmatic element. A common method for delivering a quantitative value in analogue electronics is to supply a current signal to represent the value to be sent~\cite{aoe}[p.934]. 
Usually, $4mA$ represents a zero or starting value and $20mA$ represents the full scale, and this is referred to as {\ft} signalling.
%
{\ft} signalling has intrinsic electrical safety advantages.
%
Because the current in a loop is constant~\cite{aoe}[p.20], resistance in the wires between the source and receiving end does not alter the accuracy of the signal.
%
%This circuit has many advantages for safety.
If the signal becomes disconnected it reads $0mA$ at the receiving end: as this is outside the {\ft} range, it is easily detectable as an error condition rather than an incorrect value.
%
Should the driving electronics go wrong at the source end, it will usually supply far too little or far too much current, also making error conditions easy to detect.
%
At the receiving end, we only require one simple component to convert the current signal into a voltage that we can read with an ADC: a resistor, whose behaviour is defined by Ohm's law.
% the humble resistor!

%BLOCK DIAGRAM HERE WITH FT CIRCUIT LOOP

\begin{figure}[h]
 \centering
 \includegraphics[width=230pt]{./CH5_Examples/ftcontext.png}
 % ftcontext.png: 767x385 pixel, 72dpi, 27.06x13.58 cm, bb=0 0 767 385
 \caption{Context Diagram for {\ft} loop}
 \label{fig:ftcontext}
\end{figure}

The diagram in figure~\ref{fig:ftcontext} shows some equipment which is sending a {\ft} signal to a micro-controller system.
The signal is locally driven over a load resistor, and then read into the micro-controller via an ADC and its multiplexer.
With the voltage determined at the ADC, we read the intended quantitative value from the external equipment.

\section{Simple Software Example: Reading a {\ft} input into software}

Consider a software function that reads a {\ft} input, and returns a value between 0 and 999 (i.e. per mil $\permil$) representing the current detected, with an additional error indication flag.
%
Let us assume the {\ft} detection is via a \ohms{220} resistor, and that we read a voltage from an ADC into the software.
Let us define any value outside the 4mA to 20mA range as an error condition.
%
As we read a voltage, we use Ohm's law~\cite{aoe} to determine the mA current detected: $V=IR$, $0.004A \times \ohms{220} = 0.88V$ and $0.020A \times \ohms{220} = 4.4V$.
%
Our acceptable voltage range is therefore

$$(V \ge 0.88) \wedge (V \le 4.4) \; .$$

This voltage range forms our input requirement and can be considered as an invariant condition.
%
We can now examine a software function that performs a conversion from the voltage read to a per~mil representation of the {\ft} input current.
%
For the purpose of example the `C' programming language~\cite{DBLP:books/ph/KernighanR88} is used.
We initially assume a function \textbf{read\_ADC} that returns a floating point %double precision
value which represents the voltage read (see code sample in figure~\ref{fig:code_read_4_20_input}).

%%{\vbox{
\begin{figure}[h]
\footnotesize
\begin{verbatim}
/***********************************************/
/* read_4_20_input()                           */
/***********************************************/
/* Software function to read 4mA to 20mA input */
/* returns a value from 0-999 proportional     */
/* to the current input.                       */
/***********************************************/
int read_4_20_input ( int * value ) {
    double input_volts;
    int error_flag;

    /* require: input from ADC to be
       between 0.88 and 4.4 volts */
    input_volts = read_ADC(INPUT_4_20_mA);

    if ( input_volts < 0.88 || input_volts > 4.4 ) {
        error_flag = 1; /* Error flag set to TRUE */
    }
    else {
        /* scale the 0.88V to 4.4V span onto 0-999:  */
        /* divide by the span, then multiply by 999  */
        *value = (int) ( (input_volts - 0.88) / ( 4.4 - 0.88 ) * 999.0 );
        error_flag = 0; /* indicate current input in range */
    }
    /* ensure: value is proportional (0-999) to the
       4 to 20mA input */
    return error_flag;
}
\end{verbatim}
%}
%}\clearpage
\caption{Software Function: \textbf{read\_4\_20\_input}}
\label{fig:code_read_4_20_input}
%\label{fig:420i}
\end{figure}

\clearpage

We now look at the function called by \textbf{read\_4\_20\_input}, \textbf{read\_ADC}, which returns a voltage for a given ADC channel.
%
This function deals directly with the hardware in the micro-controller on which the software is running. %software on.
%
The software's job is to select the correct channel (ADC multiplexer) and then to initiate a conversion by setting an ADC `go' bit (see code sample in figure~\ref{fig:code_read_ADC}).
%
It takes the raw ADC reading and converts it into a floating point\footnote{the type, `double' or `double precision', is a standard C language floating point type~\cite{DBLP:books/ph/KernighanR88}.} voltage value.
%{\vbox{
\begin{figure}[h]
\footnotesize
\begin{verbatim}
/***********************************************/
/* read_ADC()                                  */
/***********************************************/
/* Software function to read voltage from a    */
/* specified ADC MUX channel                   */
/* Assume 10 ADC MUX channels 0..9             */
/* ADC_CHAN_RANGE = 9                          */
/* Assume ADC is 12 bit and ADCRANGE = 4096    */
/* returns voltage read as double precision    */
/***********************************************/
double read_ADC( int channel ) {
    int timeout = 0;
    double dval;

    /* require:
       a) input channel from ADC to be
          in valid ADC range
       b) voltage ref is 0.1% of 5V */

    /* return out of range result */
    /* if invalid channel selected */
    if ( channel < 0 || channel > ADC_CHAN_RANGE )
        return -2.0;

    /* set the multiplexer to the desired channel */
    ADCMUX = channel;
    ADCGO = 1; /* initiate ADC conversion hardware */

    /* wait for ADC conversion with timeout:        */
    /* stop when the go bit clears OR on time out   */
    while ( ADCGO == 1 && timeout < 100 )
        timeout++;

    if ( timeout < 100 )
        dval = (double) ADCOUT * 5.0 / ADCRANGE;
    else
        dval = -1.0; /* indicate invalid reading */

    /* return voltage as a floating point value */
    /* ensure: value is voltage input to within 0.1% */
    return dval;
}
\end{verbatim}
\caption{Software Function: \textbf{read\_ADC}}
\label{fig:code_read_ADC}
\end{figure}
%}
%}
\clearpage

We now have a very simple software structure, a call tree, shown in figure~\ref{fig:ct1}.

\begin{figure}[h]
 \centering
 \includegraphics[width=100pt]{./CH5_Examples/ct1.png}
 % ct1.png: 151x224 pixel, 72dpi, 5.33x7.90 cm, bb=0 0 151 224
 \caption{Call tree for software example}
 \label{fig:ct1}
\end{figure}

This software is above the hardware in the conceptual call tree: from a programmatic perspective, the software is reading values from the `lower~level' electronics.
%
FMEA is always a bottom-up process and so we must begin with this hardware.
%
The hardware is simply a load resistor, connected across an ADC input pin on the micro-controller and ground.
% We can identify the resistor and the ADC module of the micro-controller as the base components in this design. % We now apply FMMD starting with the hardware. \subsection{FMMD Process} \paragraph{Functional Group - Convert mA to Voltage - CMATV} This functional group contains the load resistor and the physical Analogue to Digital Converter (ADC). Our functional group, $G_1$ is thus the set of base components: $G_1 = \{R, ADC\}$. We now determine the {\fms} of all the components in $G_1$. For the resistor we can use a failure mode set from the literature~\cite{en298}. Where the function $fm$ returns a set of failure modes for a given component we can state: $$ fm(R) = \{OPEN,SHORT\}. $$ \vbox{ For the ADC we can determine the following failure modes: \begin{itemize} \item STUCKAT --- The ADC outputs a constant value, \item MUXFAIL --- The ADC cannot select its input channel correctly, \item LOW --- The ADC output is always LOW, or zero ADC counts, \item HIGH --- The ADC output is always HIGH, or max ADC counts. \end{itemize} } We can use the function $fm$ to define the {\fms} of an ADC thus: $$ fm(ADC) = \{ STUCKAT, MUXFAIL,LOW, HIGH \}. $$ With these failure modes, we can analyse our first functional group, see table~\ref{tbl:cmatv}. 
{
\tiny
\begin{table}[h]
\centering
\caption{$G_1$: Failure Mode Effects Analysis} % title of Table
\label{tbl:cmatv}
\begin{tabular}{|| l | c | l ||} \hline
 \textbf{Failure} & \textbf{Failure } & \textbf{Derived Component} \\
 \textbf{cause} & \textbf{Effect} & \textbf{Failure Mode} \\ \hline \hline
 1: $R_{OPEN}$ & resistor open, & $HIGH$ \\
 & voltage on pin high & \\ \hline
 2: $R_{SHORT}$ & resistor shorted, & $LOW$ \\
 & voltage on pin low & \\ \hline \hline
 3: $ADC_{STUCKAT}$ & ADC reads out & $V\_ERR$ \\
 & fixed value & \\ \hline
 4: $ADC_{MUXFAIL}$ & ADC may read & $V\_ERR$ \\
 & wrong channel & \\ \hline
 5: $ADC_{LOW}$ & output low & $LOW$ \\ \hline
 6: $ADC_{HIGH}$ & output high & $HIGH$ \\ \hline
 7: post condition fails & software fails & $V\_ERR$ \\ \hline \hline \hline
\end{tabular}
\end{table}
}

We now collect the symptoms for the hardware functional group, $\{ HIGH , LOW, V\_ERR \} $, and create a {\dc}, called $CMATV$, to represent it.
%We can express this using the `$\derivec$' function thus:
%$$ CMATV = \; \derivec (G_1) .$$
As its failure modes are the symptoms of failure from the functional group we state:

$$fm ( CMATV ) = \{ HIGH , LOW, V\_ERR \} .$$

\paragraph{Functional Group - Software - Read\_ADC - RADC}
\label{readADC}

The software function $Read\_ADC$ uses the ADC hardware analysed as the {\dc} CMATV above.

The code fragment in figure~\ref{fig:code_read_ADC} states pre-conditions, as {\em /* require: a) input channel from ADC to be in valid ADC range b) voltage ref is 0.1\% of 5V */}.
%
From the above contractual programming requirements, we see that the function must be sent the correct channel number.
%
A violation of this can be considered a {\fm} of the function, which we can call $ CHAN\_NO $.
%
The reference voltage for the ADC has a 0.1\% accuracy requirement.
%
If the reference value is outside this, it is also a {\fm} of this function, which we can call $VREF$.

Taken as a component for use in FMEA/FMMD our function has two failure modes.
We can therefore treat it as a generic component, $Read\_ADC$, by stating:

$$ fm(Read\_ADC) = \{ CHAN\_NO, VREF \} $$

As we have a failure mode model for our function, we use it in conjunction with the ADC hardware {\dc} CMATV, to form a {\fg} $G_2$, where $G_2 =\{ CMATV, Read\_ADC \}$.
%
We analyse this hardware/software combined {\fg}.

{
\tiny
\begin{table}[h]
\caption{$G_2$: Failure Mode Effects Analysis} % title of Table
\label{tbl:radc}
\begin{tabular}{|| l | c | l ||} \hline
 \textbf{Failure} & \textbf{Failure } & \textbf{Derived Component} \\
 \textbf{cause} & \textbf{Effect} & \textbf{Failure Mode} \\ \hline
 1: ${CHAN\_NO}$ & wrong voltage & $VV\_ERR$ \\
 & read & \\ \hline
 2: ${VREF}$ & ADC volt-ref & $VV\_ERR$ \\
 & incorrect & \\ \hline \hline
 3: $CMATV_{V\_ERR}$ & voltage value & $VV\_ERR$ \\
 & incorrect & \\ \hline
 4: $CMATV_{HIGH}$ & output high & $HIGH$ \\ \hline
 5: $CMATV_{LOW}$ & output low & $LOW$ \\ \hline
 6: post condition fails & software fails & $VV\_ERR$ \\ \hline \hline \hline
\end{tabular}
\end{table}
}

We now collect the symptoms of failure for the {\fg} analysed (see table~\ref{tbl:radc}) as $\{ VV\_ERR, HIGH, LOW \}$.
We must also account for a violation of the postcondition of the function.
This postcondition, {\em /* ensure: value is voltage input to within 0.1\% */}, corresponds to $VV\_ERR$, and is already in the {\fm} set for this {\fg}.
%We can now create a {\dc} called $RADC$ thus: $$RADC = \; \derivec(G_2)$$ which has the following
%{\fms}:
We can now create a {\dc} called $RADC$ thus:

$$ fm(RADC) = \{ VV\_ERR, HIGH, LOW \} .$$

\paragraph{Functional Group - Software - voltage to per mil - VTPM }

This function sits on top of the $RADC$ {\dc} determined above.
We look at the pre-conditions for the function $read\_4\_20\_input$ (which we abbreviate to $RI$) to determine its {\fms}.
Its pre-condition is {\em /* require: input from ADC to be between 0.88 and 4.4 volts */}.
We can map a violation of this pre-condition to the {\fm} $VRNGE$:
%As this function has one pre-condition we can state,

$$ fm(read\_4\_20\_input) = \{ VRNGE \} .$$

We can now form a functional group with the {\dc} $RADC$ and the software component $read\_4\_20\_input$, i.e. $G_3 = \{read\_4\_20\_input, RADC\} $.

{
\tiny
\begin{table}[h]
\caption{$G_3$: Read\_4\_20: Failure Mode Effects Analysis} % title of Table
\label{tbl:r420i}
\begin{tabular}{|| l | c | l ||} \hline \hline
 \textbf{Failure} & \textbf{Failure } & \textbf{Derived Component} \\
 \textbf{cause} & \textbf{Effect} & \textbf{Failure Mode} \\ \hline
 1: $RI_{VRNGE}$ & voltage & $OUT\_OF\_$ \\
 & outside range & $RANGE$ \\ \hline
 2: $RADC_{VV\_ERR}$ & voltage & $VAL\_ERR$ \\
 & incorrect & \\ \hline \hline
 3: $RADC_{HIGH}$ & voltage value & $VAL\_ERR$ \\
 & incorrect & \\ \hline
 4: $RADC_{LOW}$ & ADC low voltage & $OUT\_OF\_$ \\
 & so out of range & $RANGE$ \\
 & i.e. $< 0.88V$ & \\ \hline
 5: post condition fails & software fails & $VAL\_ERR$ \\ \hline \hline \hline
\end{tabular}
\end{table}
}

The failure symptoms for the {\fg} are $\{OUT\_OF\_RANGE, VAL\_ERR\}$.
The postcondition for the function $read\_4\_20\_input$, {\em /* ensure: value is proportional (0-999) to the 4 to 20mA input */} corresponds to the $VAL\_ERR$ and is already in the set of failure modes. % \paragraph{Final Functional Group} For single failures these are the two ways in which this function can fail. An $OUT\_OF\_RANGE$ will be flagged by the error flag variable. The $VAL\_ERR$ will simply mean that the value read is incorrect. We can finally make a {\dc} to represent a failure mode model for our function $read\_4\_20\_input$. %thus: % $$ R420I = \; \derivec(G_3) .$$ This new {\dc} has the following {\fms}: $$fm(R420I) = \{OUT\_OF\_RANGE, VAL\_ERR\} .$$ % % Using the derived components, CMATV and VTPM we create % a new functional group. This % integrates FMEA's from software and eletronics % into the same failure mode model. We can now represent the software/hardware FMMD analysis as a hierarchical diagram, see figure~\ref{fig:eulerswhw}. % see figure~\ref{fig:hd}. % HTR 27OCT2012 % \begin{figure}[h] % HTR 27OCT2012 % \centering % HTR 27OCT2012 % \includegraphics[width=200pt]{./CH5_Examples/hd.png} % HTR 27OCT2012 % % hd.png: 363x520 pixel, 72dpi, 12.81x18.34 cm, bb=0 0 363 520 % HTR 27OCT2012 % \caption{FMMD hierarchy with hardware and software elements} % HTR 27OCT2012 % \label{fig:hd} % HTR 27OCT2012 % \end{figure} \begin{figure}[h] \centering \includegraphics[width=300pt]{./CH5_Examples/eulerswhw.png} % eulerswhw.png: 510x344 pixel, 72dpi, 17.99x12.14 cm, bb=0 0 510 344 \caption{Electronics and Software shown in an integrated failure mode model---an Euler diagram showing relationship between {\dcs} determined from electronics and software---the two outermost contours are software functions, and the inner two are electronic {\dcs}.} \label{fig:eulerswhw} \end{figure} \subsection{Conclusion: {\ft} Reader Software/Hardware FMMD Model} The {\dc} representing the {\ft} reader in software shows that by FMMD, we can integrate software and 
electro-mechanical FMMD models.
With this analysis we have a complete `reasoning~path' linking the failure modes from the electronics to those in the software.
Each functional group to {\dc} transition represents a reasoning stage.
%
Each reasoning stage will have an associated analysis report.
%
With traditional FMEA methods the reasoning~distance is large, because it stretches from the component failure mode to the %top---or---system
top or system level failure.
For this reason applying traditional FMEA to software stretches the reasoning distance even further.
This is exacerbated by the fact that traditional SFMEA is performed separately from HFMEA~\cite{sfmea,sfmeaa}.
Additionally, even the software/hardware interfacing is treated as a separate FMEA task~\cite{sfmeainterface,embedsfmea,procsfmea}.

We now have a {\dc} for a {\ft} input in software.
Typically, more than one such input could be present in a real-world system.
Not only have we integrated electronics and software in an FMEA, we can also re-use the analysis for each {\ft} input in the system.

The unsolved symptoms, or unobservable errors, i.e. $VAL\_ERR$, could be addressed by another software function that reads other known signals via the MUX (e.g. voltage references).
This strategy would detect the $ADC_{STUCKAT}$ and $ADC_{MUXFAIL}$ failure modes.

A software specification for a hardware interface will concentrate on how to interpret raw readings, or what signals to apply for actuators.
Using FMMD we can determine an accurate failure model for the interface as well~\cite{sfmeainterface}.
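
As a minimal sketch of the reference-channel strategy mentioned above, and not part of the analysed system, the following C fragment reads two MUX channels that are assumed to be wired to known reference voltages and checks each reading against its expected value.
The channel numbers, reference voltages, tolerance and the names \texttt{adc\_self\_check}, \texttt{healthy\_adc} and \texttt{stuck\_adc} are all hypothetical; in the real system the reader function passed in would be \textbf{read\_ADC}.

{\footnotesize
\begin{verbatim}
#include <stdbool.h>

/* Hypothetical MUX channels wired to known references (assumed). */
#define REF_CHAN_LOW    8
#define REF_CHAN_HIGH   9
#define REF_VOLTS_LOW   1.25   /* assumed reference voltage   */
#define REF_VOLTS_HIGH  2.50   /* assumed reference voltage   */
#define REF_TOLERANCE   0.05   /* assumed acceptance band (V) */

/* Any function returning a voltage for a channel, e.g. read_ADC. */
typedef double (*adc_read_fn)(int channel);

/* True when v lies within +/- tol of the expected value. */
static bool within(double v, double expected, double tol)
{
    double diff = v - expected;
    if (diff < 0.0)
        diff = -diff;
    return diff <= tol;
}

/* Returns true only if both references read back as expected. */
bool adc_self_check(adc_read_fn read_volts)
{
    double lo = read_volts(REF_CHAN_LOW);
    double hi = read_volts(REF_CHAN_HIGH);

    return within(lo, REF_VOLTS_LOW, REF_TOLERANCE)
        && within(hi, REF_VOLTS_HIGH, REF_TOLERANCE);
}

/* Stand-in readers, for illustration only. */
double healthy_adc(int channel)
{
    if (channel == REF_CHAN_LOW)  return REF_VOLTS_LOW;
    if (channel == REF_CHAN_HIGH) return REF_VOLTS_HIGH;
    return 0.0;
}

double stuck_adc(int channel)
{
    (void) channel;     /* a stuck ADC ignores the channel */
    return 1.25;        /* same value on every channel     */
}
\end{verbatim}
}

With these stand-ins, \texttt{adc\_self\_check(healthy\_adc)} succeeds and \texttt{adc\_self\_check(stuck\_adc)} fails, because a stuck ADC cannot match two different references at once.
A multiplexer fault would typically show up the same way, as at least one reference reading falls outside its band.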
% HTR == HATE TO REMOVE
%HTR 18NOV2012 We can represent %the hierarchy in figure~\ref{fig:hd} algebraically,
%HTR 18NOV2012 the analysis hierarchy algebraically using the `$\derivec$' function:
%HTR 18NOV2012 %using the groups as intermediate stages:
%HTR 18NOV2012 \begin{eqnarray*}
%HTR 18NOV2012 G_1 &=& \{R,ADC\} \\
%HTR 18NOV2012 CMATV &=& \;\derivec (G_1) \\
%HTR 18NOV2012 G_2 &=& \{CMATV, read\_ADC \} \\
%HTR 18NOV2012 RADC &=& \; \derivec (G_2) \\
%HTR 18NOV2012 G_3 &=& \{ RADC, read\_4\_20\_input \} \\
%HTR 18NOV2012 R420I &=& \; \derivec (G_3) \\
%HTR 18NOV2012 \end{eqnarray*}
%HTR 18NOV2012 or, a nested definition,
%HTR 18NOV2012 $$ \derivec \Big( \derivec \big( \derivec(R,ADC), read\_4\_20\_input \big), read\_4\_20\_input \Big). $$
%\section
%HTR 18NOV2012 This nested structure means that we have multiple traceable
%HTR 18NOV2012 stages of failure mode reasoning in our analysis. Traditional FMEA would have only one stage
%HTR 18NOV2012 of reasoning for each component failure mode.

\section{Closed Loop Control Hardware/Software Hybrid Example}

It is desirable to model a complete standalone system with FMMD, and ideally one that is a hybrid of software and hardware.

Temperature control is a first order differential problem, and is often addressed using the Proportional Integral Differential (PID) algorithm~\cite{dcods}[p.66].
Traditionally this was performed in analogue electronics with trimmer potentiometers providing the P and I parameters.
Since the introduction of micro-processors, it has been possible to implement PID programmatically.
An FMMD analysis of a PID temperature controller is an analysis of a realistic standalone system that does not become an unwieldy task.

\paragraph{The PID Temperature Control Algorithm.}

PID control starts with a setpoint, or desired value for a process (here the temperature).
It reads the process value and determines an error value for it.
The aim of the PID controller is to minimise this error term, by setting an output value, which is fed back into the process (in this example the amount of power to supply the heater).
The error value is integrated and multiplied by an I constant.
A differential of the error value is calculated and multiplied by a D constant.
The error value itself is multiplied by a P constant, and all three of these are added to obtain the output required.
%
A mathematical description of PID with frequency domain modelling (Laplace transforms etc.) may be found in~\cite{dcods}[Ch.3.3].
%

\subsection{Design Stage: Implementation on a micro-controller.}

When designing a computer program it is often useful to start with a structured analysis `Yourdon' context diagram~\cite{Yourdon:1989:MSA:62004}, see figure~\ref{fig:context_diagram_PID}.

\begin{figure}[h]
 \centering
 \includegraphics[width=300pt]{./CH5_Examples/context_diagram_PID.png}
 % context_diagram_PID.png: 818x324 pixel, 72dpi, 28.86x11.43 cm, bb=0 0 818 324
 \caption{Yourdon Context Diagram for PID Temperature Controller.}
 \label{fig:context_diagram_PID}
\end{figure}

Using figure~\ref{fig:context_diagram_PID} we review the system in terms of its data flow, starting with the data sources (the Pt100 inputs) and the data sinks (the heater output and the LED indicators).
%
We have two voltage inputs (see section~\ref{sec:Pt100}) from the Pt100 temperature sensor.
For the Pt100 sensor, we will need to read the voltages it outputs, and for this we require an ADC and MUX.
%
For the output, we can use a Pulse Width Modulator (PWM) (this is a common module found on micro-controllers allowing a variable power output~\cite{aoe}[p.360]).
PWMs, ADCs and MUXs are commonly built into cheap micro-controllers~\cite{pic18f2523}[Ch.15].
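
The PID calculation described in the algorithm paragraph above can be sketched in C as follows.
This is an illustrative discrete-time form only, assuming a fixed sample interval \texttt{dt} supplied by the caller; the names \texttt{pid\_state} and \texttt{pid\_step} are hypothetical and are not taken from the analysed design.

{\footnotesize
\begin{verbatim}
/* Hypothetical controller state: the three constants plus the   */
/* running integral and the previous error (for the differential) */
typedef struct {
    double kp;          /* P constant */
    double ki;          /* I constant */
    double kd;          /* D constant */
    double integral;    /* accumulated error * dt */
    double prev_error;  /* error from the previous call */
} pid_state;

/* One PID update: returns the output demand for this interval. */
/* dt is the time since the previous call and must be non-zero. */
double pid_step(pid_state *pid, double setpoint,
                double process_value, double dt)
{
    double error      = setpoint - process_value;
    double derivative = (error - pid->prev_error) / dt;

    pid->integral  += error * dt;
    pid->prev_error = error;

    /* sum of the proportional, integral and differential terms */
    return pid->kp * error
         + pid->ki * pid->integral
         + pid->kd * derivative;
}
\end{verbatim}
}

The sketch omits output limiting and integral wind-up protection, which a practical controller would normally need.
It also makes the dependence on a correct interval explicit: both the integral and differential terms are scaled by \texttt{dt}.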
We refine the Yourdon diagram, with the afferent data flow coming through the MUX and ADC on the micro-controller, and the efferent channelled through a PWM module, %again built into the micro-controller,
%
and add more detail, see figure~\ref{fig:context_diagram2_PID}.

\begin{figure}[h]
 \centering
 \includegraphics[width=300pt]{./CH5_Examples/context_diagram2_PID.png}
 % context_diagram_PID.png: 818x324 pixel, 72dpi, 28.86x11.43 cm, bb=0 0 818 324
 \caption{Refined Yourdon Context Diagram for PID Temperature Controller.}
 \label{fig:context_diagram2_PID}
\end{figure}

The Yourdon methodology allows us to zoom into data transform bubbles and analyse them in more detail.
%
We define the controlling software by zooming into its transform bubble.
We have the inputs and outputs from the software.
We refine the data flow within the software and thus define software functions. %, and
%this in terms of software functions.
%
We follow the data streams through the process, creating transform bubbles as required.

In all `bare~metal'\footnote{`Bare~metal' is a term used to indicate a micro-processor controlled system that does not use a traditional operating system.} software architectures, we need a rudimentary operating system, often referred to as the `monitor'.
%
We bear in mind that PID, because the algorithm depends heavily on integral calculus, is time sensitive; we therefore need to call it at precise intervals determined by its proportional, integral and differential (PID) coefficients.
%
Most micro-controllers feature several general purpose timers~\cite{pic18f2523}.
We can use an internal timer in conjunction with the monitor function to call the PID algorithm at a specified interval.
%

\paragraph{Data flow model to programmatic call tree.}

The Yourdon methodology also gives us a guide as to which software function should be called to control the process, or in `C' terms be the main function.
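
The timer-driven calling of the PID algorithm by the monitor, described above, might be sketched in C as follows.
This is an assumption-laden illustration: it presumes a free-running 16 bit timer register, and the names \texttt{monitor\_state} and \texttt{pid\_due} and the value of \texttt{PID\_INTERVAL\_TICKS} are hypothetical.

{\footnotesize
\begin{verbatim}
#include <stdint.h>
#include <stdbool.h>

/* Assumed PID calling interval, in timer ticks. */
#define PID_INTERVAL_TICKS 1000u

/* Hypothetical monitor bookkeeping: timer value at the last PID call. */
typedef struct {
    uint16_t last_run;
} monitor_state;

/* Called repeatedly by the monitor with the current timer value. */
/* Returns true when at least one interval has elapsed, and then  */
/* records the current time as the start of the next interval.    */
bool pid_due(monitor_state *m, uint16_t now)
{
    /* Unsigned 16 bit subtraction gives the elapsed ticks even   */
    /* when the timer has wrapped around once since last_run.     */
    uint16_t elapsed = (uint16_t)(now - m->last_run);

    if (elapsed >= PID_INTERVAL_TICKS) {
        m->last_run = now;
        return true;
    }
    return false;
}
\end{verbatim}
}

For example, with \texttt{last\_run} at 65000 a reading of 65500 is not yet due, while a reading of 464 (after the timer has wrapped) is exactly 1000 ticks later and is due.
The sketch cannot tell whether the timer has wrapped more than once, so the monitor must poll more often than the timer period.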
%
\begin{figure}[h]
 \centering
 \includegraphics[width=300pt]{./CH5_Examples/context_software.png}
 % context_software.png: 1023x500 pixel, 72dpi, 36.09x17.64 cm, bb=0 0 1023 500
 \caption{Context diagram of the software in the PID temperature controller}
 \label{fig:contextsoftware}
\end{figure}

Using figure~\ref{fig:contextsoftware} we can now pick the transform bubble we want to be the `main' or controlling function in the software.
This can be thought of as picking one bubble and holding it up.
The other bubbles hang underneath, forming the software call tree hierarchy, see figure~\ref{fig:context_calltree}.
From examining the diagram, and following common embedded programming practice, this is clearly going to be the monitor function.

\begin{figure}[h]
 \centering
 \includegraphics[width=300pt]{./CH5_Examples/context_calltree.png}
 % context_calltree.png: 800x783 pixel, 72dpi, 28.22x27.62 cm, bb=0 0 800 783
 \caption{Software Yourdon diagram converted to programmatic call tree.}
 \label{fig:context_calltree}
\end{figure}
%

\paragraph{Software Algorithm.}

The monitor function will orchestrate the control process.
Firstly it will examine the timer value, and when appropriate, call the PID function.
The PID function calls determine\_set\_point\_error, which calls convert\_ADC\_to\_T, which in turn calls Read\_ADC (the function developed in the earlier example), which reads from physical hardware.
%
With the set point error value, the PID function will return an output control value to its calling function (i.e. the PID demand, which will be returned to the monitor function).
%
%On returning to the monitor function, it will return the PID demand value.
The PID demand value will be applied via the PWM.

We now have a rudimentary closed loop control system incorporating both hardware and software.
%
By using the Yourdon methodology we obtain the programmatic design, i.e. we define a call tree structure.
%
We now have all the components, i.e.
hardware elements and software functions that will be used in the temperature controller.
We list these and begin, from the bottom up, to apply FMMD analysis.

\clearpage
\subsection{FMMD Analysis of PID temperature Controller}

To summarise from the design stage, the identified electronic components are:
\begin{itemize}
 \item ADCMUX --- Electronics, analysed in previous example.
 \item TIMER --- Internal micro-controller timer
 \item HEATER --- Heating element, essentially a resistor.
 \item Pt100 --- Pt100 Temperature sensor, as analysed in section~\ref{sec:Pt100}.
 \item PWM --- Internal micro-controller pulse width modulation module
 \item General Purpose I/O (GPIO) --- I/O used to source LED current
 \item LEDs --- Indication LEDs via GPIO
 \item micro-controller --- the medium for running the software
\end{itemize}

\subsection{Temperature Controller Hardware Elements FMMD}

\paragraph{ADCMUX and Read\_ADC}
We re-use this derived component from section~\ref{readADC}.
$$ fm(RADC) = \{ VV\_ERR, HIGH, LOW \} .$$

\paragraph{TIMER}
The internal timer in use is a register which, when read, returns an incremented time value.
Using two's complement mathematics, by subtracting the time we last read it, we can calculate the interval between readings
(assuming the timer has not wrapped around more than once).
We can say that a timer can fail by incrementing its value at an incorrect rate, or by ceasing to increment.
$$ fm(TIMER) = \{ STOPPED, INCORRECT\_INTERVAL \}$$

\paragraph{HEATER}
A heating element is typically some configuration of resistive wire.
It therefore has the same failure modes as a resistor and we can state
$$fm(HEATER) = \{ OPEN, SHORT \}$$

\paragraph{Pt100 Platinum Temperature Sensor}
The Pt100 four wire configuration is analysed in section~\ref{sec:Pt100}
$$ fm(Pt100) = \{ OUT\_OF\_RANGE \} $$

\paragraph{PWM}
%The PWM, in use, is a hardware register written to with an integer value~\cite{pic182523}[Ch.15].
From a programmatic perspective, a PWM output is a register to which software writes an unsigned magnitude value~\cite{pic182523}[Ch.15].
The PWM hardware module applies this using a mark space ratio proportional to that value, providing a means of varying the amount of power supplied.
When the PWM action is halted, or fails, the digital output pin associated with it will typically be held in a high or low state.
We therefore state:
$$ fm(PWM) = \{ HIGH, LOW \}.$$

\paragraph{Micro-Controller}
The micro-controller is a complex piece of highly integrated electronics.
Typically, along with a micro-processor with PROM and RAM, it has many I/O modules including UARTs, PWM, ADCMUX, CAN, general purpose I/O and interrupt lines, to name but a few.
In this project we are using the ADCMUX, TIMER, PWM and general purpose computing facilities.
We therefore have to consider the general~computing, CLOCK, PROM and RAM failure modes.
$$fm (micro-controller) =\{ PROM\_FAULT, RAM\_FAULT, CPU\_FAULT, ALU\_FAULT, CLOCK\_STOPPED \}.$$

\subsection{Temperature Controller Software Elements FMMD}

Identified software components:
\begin{itemize}
 \item --- Monitor (which calls the PID algorithm and output\_control, and sets the status LEDs)
 \item --- PID (which calls determine\_set\_point\_error)
 \item --- determine\_set\_point\_error (which calls convert\_ADC\_to\_T)
 \item --- convert\_ADC\_to\_T (which calls Read\_ADC, which we can re-use from the last example)
 \item --- Read\_ADC
 \item --- output\_control (which sets the PWM hardware according to the PID demand value)
\end{itemize}

With the call tree structure defined (see figure~\ref{fig:context_calltree}), we can now analyse these components from the bottom up,
starting with the afferent flow: the reading in of the temperature and its conversion to a PID calculated heater output demand.

\subsubsection{Afferent flow FMMD analysis, Pt100, temperature, set point error, PID output demand.}

We start with the afferent flow from the Pt100.
%with the software, and consider the hardware elements
%used (if any) by each software function.
Starting at the bottom, we form a {\fg} with the function Read\_ADC and the Pt100.
This gives us a {\dc} which we call Read\_Pt100.
%
{
\tiny
\begin{table}[h]
\centering
\caption{ Read\_Pt100: Failure Mode Effects Analysis} % title of Table
\label{tbl:readPt100}
\begin{tabular}{|| l | c | l ||} \hline
 % \textbf{Failure} & \textbf{failure} & \textbf{Symptom} \\
 % \textbf{Scenario} & \textbf{effect} & \textbf{RADC } \\ \hline \hline
 \textbf{Failure} & \textbf{Failure } & \textbf{Derived Component} \\
 \textbf{cause} & \textbf{Effect} & \textbf{Failure Mode} \\ \hline
 FC1: $RI_{VRGE}$ & voltage & $VOLTAGE\_HIGH$ \\
 & outside range & \\ \hline
 FC2: $RADC_{VV\_ERR}$ & voltage & $VAL\_ERR$ \\
 & incorrect & \\ \hline
 FC3: $RADC_{HIGH}$ & voltage value & $VAL\_ERR$ \\
 & incorrect & \\ \hline
 FC4: $RADC_{LOW}$ & ADC may read & $VOLTAGE\_LOW$ \\ \hline
 FC5: post condition fails & software failure & $VAL\_ERR$ \\
 in function Read\_ADC & & \\ \hline
\end{tabular}
\end{table}
}
%
The {\dc} Read\_Pt100 is a failure mode model of the Read\_ADC function and the Pt100 hardware, and has the following failure modes:
$$ fm (Read\_Pt100) = \{ VOLTAGE\_HIGH, VAL\_ERR, VOLTAGE\_LOW \}. $$

We move along the afferent flow, and we come to the convert\_ADC\_to\_T function.
This will call Read\_ADC twice, once for the higher Pt100 value and once for the lower. % and once for to read a current sense.
We then calculate the resistance of the Pt100 element and, with this---using a polynomial or a lookup table~\cite{eurothermtables}---calculate the temperature.

The pre-conditions for the function are that:
\begin{itemize}
% \item The current calculated is within pre-defined bounds i.e. Pt100\_current,
 \item The lower Pt100 value is within an acceptable voltage range i.e. Pt100\_lower\_voltage,
 \item The higher Pt100 value is within an acceptable voltage range i.e.
Pt100\_higher\_voltage,
 \item The lower and higher values agree to within a given tolerance i.e. Pt100\_high\_low\_mismatch.
\end{itemize}
Any violation of these pre-conditions is equivalent to a failure mode.
Note that a temperature outside the pre-defined range will also cause these errors.
The post-condition is that it returns a temperature within a given tolerance of the temperature at the sensor.
A failure of this post-condition can be termed temp\_incorrect.

\clearpage
We apply FMMD to the {\fg} formed by Read\_Pt100 and the function convert\_ADC\_to\_T.
We can call the resulting {\dc} Get\_Temperature.
{
\tiny
\begin{table}[h]
\centering
\caption{ Get\_Temperature: Failure Mode Effects Analysis} % title of Table
\label{tbl:gettemperature}
\begin{tabular}{|| l | c | l ||} \hline
 % \textbf{Failure} & \textbf{failure} & \textbf{Symptom} \\
 % \textbf{Scenario} & \textbf{effect} & \textbf{RADC } \\ \hline \hline
 \textbf{Failure} & \textbf{Failure } & \textbf{Derived Component} \\
 \textbf{cause} & \textbf{Effect} & \textbf{Failure Mode} \\ \hline
 FC1: $Pt100:Voltage\_High$ & Pt100 voltage too high & Pt100\_out\_of\_range \\
 & Pt100\_higher\_voltage & \\
 & OR Pt100\_current & \\ \hline
 FC2: $Pt100:Voltage\_Low$ & Pt100 voltage too low & Pt100\_out\_of\_range \\
 & Pt100\_lower\_voltage & \\
 & OR Pt100\_current & \\ \hline
 FC3: $Pt100\_high\_low\_mismatch$ & temperature can be calculated & Pt100\_out\_of\_range \\
 & from either high or low & \\
 & reading, but should correlate & \\ \hline
 % FC4: $Pt100\_current$ & the current applied is & Pt100\_out\_of\_range \\
 % & necessary to calculate resistance, & \\
 % & but should be within given bounds & \\ \hline
 %
 %
 FC4: $Pt100:VAL\_ERR$ & could cause an out of & temp\_incorrect \\
 & range error, but may also & \\
 & cause us to read an & \\
 & incorrect temperature & \\ \hline
 FC5: post condition fails & software failure & temp\_incorrect \\
 in function convert\_ADC\_to\_T & & \\ \hline
\end{tabular}
\end{table}
}
We
collect the failure symptoms for the {\dc} Get\_Temperature and can state:
$$fm(Get\_Temperature) = \{ Pt100\_out\_of\_range, temp\_incorrect \}$$

\clearpage
Following the afferent flow further, we come to a function to determine the control error value.
This is simply the target temperature subtracted from the measured temperature.
We thus form a {\fg} with our newly {\dc} Get\_Temperature and the function determine\_set\_point\_error.
%
The pre-condition for determine\_set\_point\_error is that the temperature read by it is accurate,
and its post-condition is to return the correct control error value.
Most failure modes from a Pt100 are observable.
We can therefore divide failure of the post-condition into two variants: a known incorrect error value, KnownIncorrectErrorValue,
where we can detect that the Pt100 value is suspect, and IncorrectErrorValue, where we simply have an incorrect error value.
{
\tiny
\begin{table}[h]
\centering
\caption{ GetError: Failure Mode Effects Analysis} % title of Table
\label{tbl:geterror}
\begin{tabular}{|| l | c | l ||} \hline
 % \textbf{Failure} & \textbf{failure} & \textbf{Symptom} \\
 % \textbf{Scenario} & \textbf{effect} & \textbf{RADC } \\ \hline \hline
 \textbf{Failure} & \textbf{Failure } & \textbf{Derived Component} \\
 \textbf{cause} & \textbf{Effect} & \textbf{Failure Mode} \\ \hline
 FC1: $ Pt100\_out\_of\_range $ & pre-condition violated & KnownIncorrectErrorValue \\
 & observable/detectable & \\
 & failure mode & \\ \hline
 FC2: $temp\_incorrect$ & pre-condition violated & IncorrectErrorValue \\
 & unobservable & \\
 & undetectable failure mode & \\ \hline
 FC3: post condition fails & software failure & IncorrectErrorValue \\
 in function determine\_set\_point\_error & & \\ \hline
\end{tabular}
\end{table}
}
We collect failure mode symptoms, and can create a new {\dc} GetError where
$$fm(GetError) = \{ KnownIncorrectErrorValue, IncorrectErrorValue \}.$$

We now follow the afferent path to the PID algorithm.
Here we assume that the PID constants are fixed (i.e.
are not parameters). We use the $GetError$ {\dc} and the PID function to form a {\fg}. The pre-condition for the PID function is that % are that it is called %iat the correct frequency and that it receives the correct error value. The post-condition is that it outputs correct control values. % RESP FOR TIMEING IS ON CALLING FUNCTION AND IS A SEPARATE ERROR- TGHINK ABOUT JITTER..... % and controll values..... Jitter might not matter, wrong int times would % controlling function provdes context of use. Those familiar with the PID algorithm may realise that digital signal processing algorithms are sensitive to calling frequency. Were this function to be called at an incorrect rate its output would be wrong (the differential and integral parameters would effectively have been changed). % However this problem is a failure mode for the function calling it. % The calling function sets the context for the PID algorithm (i.e. what it is used for). If this PID were to be used, say as some form of low pass filter, we could consider jitter for instance. % In a control environment with PID, jitter would not be a significant factor. % This harks back to the context of use (see section~\ref{sec:subjectiveobjective}) discussion, the subjective being the context the {\dc} is used for/in, and the objective being the logic and process of the failure mode analysis. 
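
To see why an incorrect calling rate effectively changes the integral and differential parameters, consider one common discrete form of the PID algorithm.
This is an illustrative sketch only: the symbols $K_p$, $K_i$, $K_d$, $e_k$, $u_k$, $\Delta t$ and $\alpha$ are introduced here as assumptions and are not taken from the controller software.
Assuming the function is written for a design calling interval $\Delta t$, its output at call $k$ would be
$$ u_k = K_p e_k + K_i \Delta t \sum_{j=0}^{k} e_j + K_d \frac{e_k - e_{k-1}}{\Delta t} . $$
If the function is actually called at an interval $\Delta t' = \alpha \Delta t$, the same arithmetic can be rewritten in terms of the true interval as
$$ u_k = K_p e_k + \frac{K_i}{\alpha} \Delta t' \sum_{j=0}^{k} e_j + \alpha K_d \frac{e_k - e_{k-1}}{\Delta t'} . $$
Under these assumptions the proportional term is unaffected, while the effective integral coefficient becomes $K_i/\alpha$ and the effective differential coefficient becomes $\alpha K_d$.
Calling the function at half the intended rate ($\alpha = 2$), for example, would halve the integral action and double the differential action.
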
{
\tiny
\begin{table}[h]
\centering
\caption{ PID: Failure Mode Effects Analysis} % title of Table
\label{tbl:pidfunction}
\begin{tabular}{|| l | c | l ||} \hline
 % \textbf{Failure} & \textbf{failure} & \textbf{Symptom} \\
 % \textbf{Scenario} & \textbf{effect} & \textbf{RADC } \\ \hline \hline
 \textbf{Failure} & \textbf{Failure } & \textbf{Derived Component} \\
 \textbf{cause} & \textbf{Effect} & \textbf{Failure Mode} \\ \hline
 FC1: $ KnownIncorrectErrorValue $ & pre-condition violated & KnownControlValueErrorV \\
 & observable/detectable & \\
 & failure mode & \\ \hline
 FC2: $ IncorrectErrorValue $ & pre-condition violated & IncorrectControlErrorV \\
 & unobservable & \\
 & undetectable failure mode & \\ \hline
 FC3: post condition fails & software failure & IncorrectControlErrorV \\
 in function PID & & \\ \hline
\end{tabular}
\end{table}
}
We now create a PID {\dc}, with the following failure modes:
$$ fm(PID) = \{ KnownControlValueErrorV, IncorrectControlErrorV \} .$$

\begin{figure}[h]
 \centering
 \includegraphics[width=400pt]{./CH5_Examples/euler_afferent_PID.png}
 % euler_afferent_PID.png: 1002x342 pixel, 72dpi, 35.35x12.06 cm, bb=0 0 1002 342
 \caption{Euler diagram representing the hierarchy of FMMD analysis applied to the afferent branch of call tree for the PID temperature controller example.}
 \label{fig:euler_afferent_PID}
\end{figure}

We have now modelled the software call tree for the afferent flow; we represent this as an Euler diagram in figure~\ref{fig:euler_afferent_PID}.
Two call tree branches remain: the LED indication branch and the PWM/heater output.

\subsubsection{Efferent flow, PID demand value to PWM output}

The monitor function calls the output\_control function with the PID demand.
The output\_control function then sets the PWM hardware register, which causes the mark space output of the PWM module to apply the demanded power.
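
The relationship that output\_control relies on can be sketched as follows.
This is an idealised model under stated assumptions, not a description of the PWM module used here:
$d$ is the value written to an $n$-bit PWM register, $D$ the resulting mark space ratio expressed as the fraction of time the output is high,
$V$ the supply voltage switched onto the heater and $R$ the heater resistance (all symbols are introduced for illustration).
$$ D = \frac{d}{2^{n}-1}, \qquad \bar{P} = D \, \frac{V^{2}}{R} . $$
For example, with $n = 10$ and $d = 256$ this model gives $D = 256/1023 \approx 0.25$, i.e. roughly a quarter of full power.
The two PWM failure modes correspond to the extremes of this model: an output held HIGH behaves as $D = 1$ and an output held LOW as $D = 0$, whatever value the software writes.
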
We form a {\fg} with the heating element, the PWM module and the output\_control function to model this branch of the efferent flow.
We apply FMMD analysis to this {\fg} in table~\ref{tbl:heateroutput}.
For the output\_control function, we have a pre-condition that the PWM module is configured and working, and has the correct clock frequency.
A second pre-condition is that the heating element is connected and working.
The post-condition is that it sets the correct value into the PWM register to implement the power output demand.
{
\tiny
\begin{table}[h]
\centering
\caption{ HeaterOutput: Failure Mode Effects Analysis} % title of Table
\label{tbl:heateroutput}
\begin{tabular}{|| l | c | l ||} \hline
 % \textbf{Failure} & \textbf{failure} & \textbf{Symptom} \\
 % \textbf{Scenario} & \textbf{effect} & \textbf{RADC } \\ \hline \hline
 \textbf{Failure} & \textbf{Failure } & \textbf{Derived Component} \\
 \textbf{cause} & \textbf{Effect} & \textbf{Failure Mode} \\ \hline
 FC1: PWM stuck $HIGH$ & pre-condition violated & HeaterOnFull \\
 & PWM module not working & \\ \hline
 FC2: PWM stuck $LOW$ & pre-condition violated & HeaterOff \\
 & PWM module not working & \\ \hline
 FC3: HEATER $SHORT$ & heating element resistor & HeaterOff \\
 & SHORT no heating effect & \\ \hline
 FC4: HEATER $OPEN$ & heating element resistor & HeaterOff \\
 & OPEN no heating effect & \\ \hline
 FC5: output\_control post & The software supplies the wrong & HeaterOutputIncorrect \\
 condition failure & value to the PWM register & \\ \hline
\end{tabular}
\end{table}
}
We now create a {\dc} called HeaterOutput with the following failure modes:
$$fm(HeaterOutput) = \{ HeaterOnFull, HeaterOff, HeaterOutputIncorrect \}$$

\begin{figure}[h]
 \centering
 \includegraphics[width=300pt]{./CH5_Examples/euler_heater_output.png}
 % euler_heater_output.png: 392x141 pixel, 72dpi, 13.83x4.97 cm, bb=0 0 392 141
 \caption{Euler diagram showing HeaterOutput with its two hardware components, PWM and HEATER, and its software
component output\_control.}
 \label{fig:eulerheateroutput}
\end{figure}

\subsubsection{Efferent flow: status LEDs}

The status LEDs will be controlled by general purpose I/O (GPIO) pins.
%
We could have, say, three LEDs: one flashing with a human readable mark space ratio representing the heater output,
one flashing at a regular interval to indicate that the processor is alive,
and another flashing at an interval related to the temperature (to indicate whether the temperature readings are within expected ranges).
%
Each LED should flash in normal operation, and any LED being permanently on or off would indicate to the operator that an error had occurred.
%
The pre-condition for the software function setLEDs is that the GPIO is connected to working LEDs.
%
The post-condition is that setLEDs will supply correct indication by flashing the LEDs.
%
We form a {\fg} from the GPIO, the LEDs and the software function setLEDs.
%
We apply FMMD analysis to this {\fg} in table~\ref{tbl:ledoutput}.
{
\tiny
\begin{table}[h]
\centering
\caption{ LEDOutput: Failure Mode Effects Analysis} % title of Table
\label{tbl:ledoutput}
\begin{tabular}{|| l | c | l ||} \hline
 % \textbf{Failure} & \textbf{failure} & \textbf{Symptom} \\
 % \textbf{Scenario} & \textbf{effect} & \textbf{RADC } \\ \hline \hline
 \textbf{Failure} & \textbf{Failure } & \textbf{Derived Component} \\
 \textbf{cause} & \textbf{Effect} & \textbf{Failure Mode} \\ \hline
 FC1: Temp LED fails & LED will not light & FailureIndicated \\
 & & \\ \hline
 FC2: Processor LED fails & LED will not light & FailureIndicated \\
 & & \\ \hline
 FC3: PWM LED fails & LED will not light & FailureIndicated \\
 & & \\ \hline
 FC4: GPIO stuck HIGH & LED permanently OFF & FailureIndicated \\ \hline
 FC5: GPIO stuck LOW & LED permanently ON & FailureIndicated \\ \hline
 FC6: Software setLEDs & Incorrect Indication & IndicationError \\
 fails to set outputs correctly & Post condition failure & \\ \hline
\end{tabular}
\end{table}
}
\begin{figure}[h]
 \centering
 \includegraphics[width=300pt]{./CH5_Examples/euler_led_output.png}
 % euler_heater_output.png: 392x141 pixel, 72dpi, 13.83x4.97 cm, bb=0 0 392 141
 \caption{Euler diagram showing LEDOutput with its three LEDs and GPIO hardware elements, and its software component setLEDs.}
 \label{fig:eulerledoutput}
\end{figure}

Our {\dc} for the setLEDs function, GPIO and LEDs has the following failure modes:
$$ fm(LEDOutput) = \{FailureIndicated, IndicationError \} $$

\subsubsection{Final Analysis Stage: PID Temperature Controller}

The possibility of each software function failing its post-condition without a direct underlying cause from one of its components
has been included in each analysis stage involving software.
This is because software can fail in ways that are not attributable to any of the components it uses.
The common causes for software failing are:
\begin{itemize}
 \item Value/RAM corruption, typically from interrupt contention problems or accidental overwriting~\cite{swseatbelt}, but also from external sources such as radiation changing bits/values at runtime~\cite{5963919, 5488118};
 \item Address bus errors leading to program errors (program sequence);
 \item ROM memory failures;
 \item Unintended behaviour of software.
\end{itemize}
Because the software is running on a medium, that of the processor or micro-controller, our design at the final or highest level (see table~\ref{tbl:pid})
must include all possible failure modes of this medium, i.e.
$$fm (micro-controller) =\{ PROM\_FAULT, RAM\_FAULT, CPU\_FAULT, ALU\_FAULT, CLOCK\_STOPPED \}.$$

We perform the final FMMD stage by forming a functional group with the {\dcs} determined previously:
%
\begin{itemize}
 \item PID
 \item HeaterOutput
 \item LEDOutput
 \item and the function `monitor'.
\end{itemize}
The post-condition for the monitor function is that it implements the PID control task correctly.
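
The timing part of the monitor function's task can be sketched in C as follows.
This is a minimal sketch under stated assumptions rather than the controller's actual code: the 16-bit timer width, the names read\_timer, ticks\_since and monitor\_poll, the variable fake\_timer\_ticks (a stand-in for the hardware timer register) and the value of PID\_INTERVAL\_TICKS are all hypothetical.
It shows how unsigned subtraction yields the correct interval across a single timer wrap-around, which is the property the monitor relies on when deciding that a PID step is due.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a 16-bit free-running hardware timer register. */
static uint16_t fake_timer_ticks;

/* Hypothetical accessor; on a real target this would read the timer register. */
static uint16_t read_timer(void)
{
    return fake_timer_ticks;
}

/* Ticks elapsed since `last`. The cast back to uint16_t performs the
 * subtraction modulo 2^16, so the result is correct provided the timer
 * has wrapped around at most once since `last` was recorded. */
static uint16_t ticks_since(uint16_t last)
{
    return (uint16_t)(read_timer() - last);
}

/* Assumed PID calling interval, in timer ticks (illustrative value only). */
#define PID_INTERVAL_TICKS 1000u

/* One pass of the monitor's timing check. Returns true when a PID step is
 * due and advances *last_run by exactly one interval, so that lateness in
 * one pass does not accumulate as drift. */
static bool monitor_poll(uint16_t *last_run)
{
    if (ticks_since(*last_run) >= PID_INTERVAL_TICKS) {
        *last_run = (uint16_t)(*last_run + PID_INTERVAL_TICKS);
        return true;
    }
    return false;
}
```

In terms of the failure modes identified for the TIMER, a STOPPED timer would mean monitor\_poll never reports that a step is due, while INCORRECT\_INTERVAL would cause the PID function to be called at the wrong rate; neither is detectable from within this sketch alone.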
{
\tiny
\begin{table}[h]
\centering
\caption{ Standalone temperature controller: Failure Mode Effects Analysis} % title of Table
\label{tbl:pid}
\begin{tabular}{|| l | c | l ||} \hline
 % \textbf{Failure} & \textbf{failure} & \textbf{Symptom} \\
 % \textbf{Scenario} & \textbf{effect} & \textbf{RADC } \\ \hline \hline
 \textbf{Failure} & \textbf{Failure } & \textbf{Derived Component} \\
 \textbf{cause} & \textbf{Effect} & \textbf{Failure Mode} \\ \hline
 FC1: PID KnownControlValueErrorV & As error is detectable/ & ControlFailureIndicated \\
 & observable error can be indicated & \\ \hline
 FC2: PID IncorrectControlErrorV & undetectable/unobservable & ControlFailure \\
 & failure PID will not control properly & \\ \hline
 FC3: HeaterOutput & Heater will constantly & ControlFailureIndicated \\
 HeaterOnFull & apply maximum power & \\ \hline
 FC4: HeaterOutput & heater will supply & ControlFailureIndicated \\
 HeaterOff & no power & \\ \hline
 FC5: HeaterOutput & with incorrect power applied & ControlFailure \\
 HeaterOutputIncorrect & control will not be effective & \\ \hline
 FC6: LEDOutput & failure of LED system & KnownIndicationError \\
 FailureIndicated & where failure is observable & \\ \hline
 FC7: LEDOutput & failure of LED system & UnknownIndicationError \\
 IndicationError & where failure is unobservable & \\ \hline
 %% PROM\_FAULT, RAM\_FAULT, CPU\_FAULT, ALU\_FAULT, CLOCK\_STOPPED
 FC8: micro-controller & un-defined behaviour & ControlFailure \\
 PROM\_FAULT & & \\ \hline
 FC9: micro-controller & un-defined behaviour & ControlFailure \\
 RAM\_FAULT & & \\ \hline
 FC10: micro-controller & un-defined behaviour & ControlFailure \\
 CPU\_FAULT & & \\ \hline
 FC11: micro-controller & incorrect arithmetic & ControlFailure \\
 ALU\_FAULT & performed in processing & \\ \hline
 FC12: micro-controller & processor will not run & ControlFailureIndicated \\
 CLOCK\_STOPPED & indicator LEDs will not flash & \\ \hline
 FC13: monitor: & post-condition fails & ControlFailure \\
 software fails & & \\ \hline
\hline
\end{tabular}
\end{table}
}
We can now create a {\dc} for the standalone temperature controller, and give it the name TempController.
It will have the following failure modes:
$$fm ( TempController ) = \{ ControlFailureIndicated, ControlFailure, KnownIndicationError, UnknownIndicationError \}.$$
We can now represent this failure mode analysis as an Euler diagram, see figure~\ref{fig:euler_temp_controller}.

\begin{figure}[h]
 \centering
 \includegraphics[width=300pt]{./CH5_Examples/euler_temp_controller.png}
 % euler_temp_controller.png: 714x251 pixel, 72dpi, 25.19x8.85 cm, bb=0 0 714 251
 \caption{Euler diagram of the temperature controller final analysis stage, showing the hybrid software/hardware {\dcs} and the function at the head of the call tree `monitor'.}
 \label{fig:euler_temp_controller}
\end{figure}

\subsection{Conclusion: Standalone system, PID Temperature Controller}

The PID temperature control example above shows that complete hybrid software/electronic systems can be modelled using FMMD.
The analysis has revealed system level failure modes that are unhandled, and some that are unobservable; the FMMD analysis identifies which failure modes these are.
For the failure modes caused by electronics we can apply reliability statistics.
%
For software errors, we could, if necessary, add extra functions to provide self checking.
We could follow EN61508 high reliability software measures such as duplication of functions with checking functions arbitrating them (diverse programming~\cite{en61508}[C.3.5]).
%
We could, for instance, validate the processor clocking with an external watchdog and a simple communications protocol.
For PROM and RAM faults we can implement measures such as checksums and RAM complement checking.
%
By applying FMMD to these extra safety measures we can ensure that no single failure could lead to a system failure, something impossible with current FMEA techniques.
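
As an illustration of the checksum and RAM complement measures mentioned above, the following C fragment gives a minimal sketch.
It is not taken from the controller software: the type guarded\_u16 and the functions guarded\_write, guarded\_read and checksum16 are hypothetical names introduced here, and the simple additive checksum is chosen only for brevity (a CRC is a common, stronger alternative).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical guarded variable: the value is stored together with its
 * bitwise complement so that corruption of either copy can be detected. */
typedef struct {
    uint16_t value;
    uint16_t complement;
} guarded_u16;

/* Store a value and its complement. */
static void guarded_write(guarded_u16 *g, uint16_t v)
{
    g->value = v;
    g->complement = (uint16_t)~v;
}

/* Read back a guarded value. Returns false, leaving *out untouched, if the
 * stored value and complement no longer agree (a detected RAM fault). */
static bool guarded_read(const guarded_u16 *g, uint16_t *out)
{
    if ((uint16_t)~g->value != g->complement) {
        return false;
    }
    *out = g->value;
    return true;
}

/* Illustrative 16-bit additive checksum over a memory image, e.g. the
 * program memory, to be compared against a value stored at build time. */
static uint16_t checksum16(const uint8_t *data, size_t len)
{
    uint16_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum = (uint16_t)(sum + data[i]);
    }
    return sum;
}
```

With checks of this kind in place, some occurrences of RAM\_FAULT and PROM\_FAULT would become detectable by the software, and the checking functions themselves could be included as components in a further FMMD analysis stage.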
%OK STOP AT PID and follow the other data flows until we are ready to bring them to the top: i.e. % %the monitor program....... %\clearpage