diff --git a/papers/fmmd_software_hardware/software_fmmd.tex b/papers/fmmd_software_hardware/software_fmmd.tex index e46ef81..e7345ef 100644 --- a/papers/fmmd_software_hardware/software_fmmd.tex +++ b/papers/fmmd_software_hardware/software_fmmd.tex @@ -136,34 +136,37 @@ failure mode of the component or sub-system}}} %endurance and Electro Magnetic Compatibility (EMC) testing. Theoretical, or 'static testing', %is often also required. % -Failure Mode Effects Analysis (FMEA), is a is a bottom-up technique that aims to assess the effect all +Failure Mode Effects Analysis (FMEA) is a bottom-up technique that aims to assess the effect of all component failure modes on a system. It is used both as a design tool (to determine weaknesses), and is a requirement of certification of safety critical products. FMEA has been successfully applied to mechanical, electrical and hybrid electro-mechanical systems. Work on software FMEA (SFMEA) is beginning, but at present no technique for SFMEA that -integrates hardware and software models known to the authors exists. +integrates hardware and software models% known to the authors +exists. % -Software generally, sits on top of most modern safety critical control systems +Software generally sits on top of most modern safety critical control systems and defines its most important system wide behaviour and communications. Currently standards that demand FMEA for hardware (e.g. EN298, EN61508), -do not specify it for Software, but instead specify, good practise, +do not specify it for software, but instead specify good practice, review processes and language feature constraints. -This is a weakness; where FMEA % scientifically +%This is a weakness; w +Where FMEA % scientifically traces component {\fms} to resultant system failures, software has been left in a non-analytical limbo of best practises and constraints. 
% -If software FMEA were possible, electro-mechanical-software hybrids could +If software and hardware integrated FMEA were possible, electro-mechanical-software hybrids could be modelled; and could thus be `complete' failure mode models. %Failure modes in components in say a sensor, could be traced %up through the electronics and then through the controlling software. Presently FMEA, stops at the glass ceiling of the computer program. -This paper presents an FMEA methodology which can be applied to software, and is compatible -and integrate-able with FMEA performed on mechanical and electronic systems. +This paper presents a modular variant of FMEA, the Failure Mode Modular De-Composition (FMMD) methodology, which +can be applied to software, and is compatible +and integrable with FMMD performed on mechanical and electronic systems. } \today @@ -206,7 +209,8 @@ traditional FMEA being associated with the manufacturing industry, with the aims the failures to fix in order of cost. Deisgn FMEA (DFMEA) is FMEA applied at the design or approvals stage -where the aim is to ensure single component failures cannot cause unacceptable system level events. +where the aim is to ensure that single component failures cannot +cause unacceptable system level events. Failure Mode Effect Criticality Analysis (FMECA) is applied to determine the most potentially dangerous or damaging failure modes to fix. @@ -231,10 +235,12 @@ introduce automation into the FMEA process~\cite{appswfmea} and code analysis automation~\cite{modelsfmea}. Although the SFMEA and hardware FMEAs are performed separately some schools of thought aim for FTA~\cite{nasafta}~\cite{nucfta} (top down - deductive) and FMEA (bottom-up inductive) to be performed on the same system to provide insight into the -software hardware/interface~\cite{embedsfmea}, although this +software/hardware interface~\cite{embedsfmea}. 
+% +Although this would give a better picture of the failure mode behaviour, it is by no means a rigorous approach to tracing errors that may occur in hardware -through the top (and therfore untimately controlling) layer of software. +through the top (and therefore ultimately controlling) layer of software. \subsection{Current FMEA techniques are not suitable for software} @@ -254,11 +260,11 @@ FMEA concept of simply mapping a base component failure to a system level event. One strategy would be to modularise FMEA. To break down the failure effect reasoning into small modules. % -If we pre-analyse modules, and then they +If we pre-analyse modules, and they can be combined with others, into larger sub-systems, we can eventually form a hierarchy of failure mode behaviour for the entire system. % -With higher level modules, we can reach the level that the software resides in. +With higher level modules, we can reach the level in which the software resides. % For instance, to read a voltage into software via an ADC we rely on an electronic sub-system that conditions the input signal and then routes it through a multiplexer to the ADC. @@ -323,7 +329,7 @@ In other words we have taken a {\fg}, and analysed how it can fail according to the failure modes of its components, and then determine the {\fg} failure symptoms. We then create a new {\dc} which has as its {\fms} the failure symptoms -of the {\fg} that it was derived from. +of the {\fg} from which it was derived. % \paragraph{Creating a derived component.} % We create a new `{\dc}' which has @@ -390,15 +396,18 @@ of components, {\fgs} and symptoms of failure for a functional group. A programmatic function has similarities with a {\fg} as defined by the FMMD process. % An FMMD {\fg} is placed into a hierarchy. -A Software function is placed into a hierarchy, that of its call-tree. +A software function is placed into a hierarchy, that of its call-tree. 
A software function typically calls other functions and uses data sources via hardware interaction, which could be viewed as its `components'. It has outputs, i.e. it can perform actions on data or hardware -which will be used by functions that may call it. +which will be used by functions that may call upon it. -We can map a software function to a {\fg} in FMMD. Its failure modes +We can map a software function to a {\fg} in FMMD. +% +Its failure modes are the failure modes of the software components (other functions it calls) -and the hardware its reads values from. +and the hardware from which it reads values.% from. +% Its outputs are the data it changes, or the hardware actions it performs. When we have analysed a software function---using failure conditions @@ -423,14 +432,18 @@ and the subsequent hierarchy. With software already written, that hierarchy is f Software written for safety critical systems is usually constrained to be modular~\cite{en61508}[3] and non recursive~\cite{misra}[15.2]. %{iec61511}. -Because of this we can assume a direct call tree. Functions call functions +Because of this we can assume a direct call tree. +% +Functions call functions from the top down and eventually call the lowest level library or IO functions that interact with hardware/electronics. What is potentially difficult with a software function, is deciding what -are failure modes, and later what a failure symptoms. +are its `failure~modes', and later what are its `failure~symptoms'. +% With electronic components, we can use literature to point us to suitable sets of {\fms}~\cite{fmd91}~\cite{mil1991}~\cite{en298}.%~\cite{en61508}~\cite{en298}. +% With software, only some library functions are well known and rigorously documented enough to have the equivalent of known failure modes. Most software is `bespoke'. 
We need a different strategy to @@ -466,7 +479,7 @@ Invariants in contract programming may apply to inputs to the function (where th and to outputs (where they can be considered {failure symptoms} in FMMD terminology). -\subsection{Software FMEA} +\subsection{Software FMMD} For the purpose of example, we chose a simple common safety critical industrial circuit that is nearly always used in conjunction with a programmatic element. @@ -583,7 +596,7 @@ We now look at the function called by \textbf{read\_4\_20\_input}, \textbf{read\ voltage for a given ADC channel. % This function -deals directly with the hardware in the micro-controller that we are running the software on. +deals directly with the hardware in the micro-controller on which we are running the software. % Its job is to select the correct channel (ADC multiplexer) and then to initiate a conversion by setting an ADC 'go' bit (see code sample in figure~\ref{fig:code_read_ADC}). @@ -747,7 +760,7 @@ We now create a {\dc} to represent this called $CMATV$. We can express this using the `$\bowtie$' function thus: $$ CMATV = \; \bowtie (G_1) .$$ -As its failure modes, are the symptoms of failure from the functional group we can now state: +As its failure modes are the symptoms of failure from the functional group we can now state: $$fm ( CMATV ) = \{ HIGH , LOW, V\_ERR \} .$$ @@ -897,7 +910,7 @@ The postcondition for the function $read\_4\_20\_input$, {\em /* ensure: value i % \paragraph{Final Functional Group} For single failures these are the two ways in which this function can fail. An $OUT\_OF\_RANGE$ will be flagged by the error flag variable. -The $VAL\_ERR$ will simply mean that the value read is simply wrong. +The $VAL\_ERR$ will mean that the value read is simply wrong. We can finally make a {\dc} to represent a failure mode model for our function $read\_4\_20\_input$ thus: