%%% OUTLINE
% Software FMEA
%
%
% Glaring hole in approvals FMEA is performed on hardware
% and electronics, but with software we only get guidelines (which mostly consist of constraints!)
%
% No known method of software failure mode effects analysis; some work has been done on
% Software FTA, a top down approach.
% Bottom up approach means all known failure modes must be modelled.
% SIL does not have metric or tools to analyse software for safety,
% it instead applies best practices and constraints on computer language features (i.e.
% in C limited use of pointers, no recursion etc).
%
%
% Introduce concept of FMEA
% * bottom up
% * all failure modes for all components
%
% Concept of FMMD
%
% Look at the structure of software
% * a natural hierarchy
%
% Software written for a controlled
% Contract programming
% * describe concept
% * describe how this fits in with failure modes and failure symptoms concepts
%
% Describe how contract programming represents the failure modes of software
%
% Now describe how this fits in with the structure of FMMD

\documentclass[twocolumn]{article}
%\documentclass[twocolumn,10pt]{report}
\usepackage{graphicx}
\usepackage{fancyhdr}
\usepackage{tikz}
\usepackage{amsfonts,amsmath,amsthm}
\usetikzlibrary{shapes.gates.logic.US,trees,positioning,arrows}
%\input{../style}
\usepackage{ifthen}
\usepackage{lastpage}
\usetikzlibrary{shapes,snakes}
\newcommand{\tickYES}{\checkmark}
\newcommand{\fc}{fault~scenario}
\newcommand{\fcs}{fault~scenarios}
\date{}
%\renewcommand{\encodingdefault}{T1}
%\renewcommand{\rmdefault}{tnr}
%\newboolean{paper}
%\setboolean{paper}{true} % boolvar=true or false
\newcommand{\ft}{\ensuremath{4\!\!\rightarrow\!\!20mA} }
\newcommand{\oc}{\ensuremath{^{o}{C}}}
\newcommand{\adctw}{{${\mathcal{ADC}}_{12}$}}
\newcommand{\adcten}{{${\mathcal{ADC}}_{10}$}}
\newcommand{\ohms}[1]{\ensuremath{#1\Omega}}
\newcommand{\fm}{failure~mode}
\newcommand{\fms}{failure~modes}
\newcommand{\fg}{functional~group}
\newcommand{\fgs}{functional~groups}
\newcommand{\dc}{derived~component}
\newcommand{\dcs}{derived~components}
\newcommand{\bc}{base~component}
\newcommand{\bcs}{base~components}
\newcommand{\irl}{in real life}
\newcommand{\enc}{\ensuremath{\stackrel{enc}{\longrightarrow}}}
\newcommand{\pin}{\ensuremath{\stackrel{pi}{\longleftrightarrow}}}
%\newcommand{\pic}{\em pure~intersection~chain}
\newcommand{\pic}{\em pair-wise~intersection~chain}
\newcommand{\wrt}{\em with~respect~to}
\newcommand{\abslevel}{\ensuremath{\Psi}}
\newcommand{\fmmdgloss}{\glossary{name={FMMD},description={Failure Mode Modular De-Composition, a bottom-up methodology for incrementally building failure mode models. Functional groups of components are analysed to create derived components representing them; the derived components are in turn used to create higher level functional groups, and so on, until a failure mode model of the system has been built}}}
\newcommand{\fmodegloss}{\glossary{name={failure mode},description={The way in which a failure occurs.
A component or sub-system may fail in a number of ways, and each of these is a failure mode of the component or sub-system}}}
\newcommand{\fmeagloss}{\glossary{name={FMEA}, description={Failure Mode and Effects Analysis (FMEA) is a process where each potential failure mode within a system is analysed to determine system level failure modes, and to then classify them {\wrt} perceived severity}}}
\newcommand{\frategloss}{\glossary{name={failure rate}, description={The number of failures within a population (of size N), divided by N over a given time interval}}}
\newcommand{\pecgloss}{\glossary{name={PEC},description={A Programmable Electronic Controller will typically consist of sensors and actuators interfaced electronically, with some firmware/software component in overall control}}}
\newcommand{\bcfm}{base~component~failure~mode}
\def\layersep{1.8cm}
\newboolean{pld}
\setboolean{pld}{false} % boolvar=true or false : draw analysis using propositional logic diagrams
\newboolean{dag}
\setboolean{dag}{true} % boolvar=true or false : draw analysis using directed acyclic graphs
\setlength{\topmargin}{0in}
\setlength{\headheight}{0in}
\setlength{\headsep}{0in}
\setlength{\textheight}{22cm}
\setlength{\textwidth}{18cm}
\setlength{\oddsidemargin}{0in}
\setlength{\evensidemargin}{0in}
\setlength{\parindent}{0.0in}
\setlength{\parskip}{6pt}

\begin{document}
%\pagestyle{fancy}
%\fancyhf{}
%\fancyhead[LO]{}
%\fancyhead[RE]{\leftmark}
%\cfoot{Page \thepage\ of \pageref{LastPage}}
%\rfoot{\today}
%\lhead{Developing a rigorous bottom-up modular static failure mode modelling methodology}
%\lhead{Developing a rigorous bottom-up modular static failure modelling methodology}
% numbers at outer edges
\pagenumbering{arabic} % Arabic page numbers hereafter
\author{R.Clark$^\star$ \\ % , A.~Fish$^\dagger$ , C.~Garrett$^\dagger$, J.~Howse$^\dagger$ \\
$^\star${\em Energy Technology Control, UK.
r.clark@energytechnologycontrol.com}
\and
$^\dagger${\em University of Brighton, UK}
}
%\title{Developing a rigorous bottom-up modular static failure mode modelling methodology}
\title{Applying FMEA to Software}
%\nodate
\maketitle

\paragraph{Keywords:} static failure mode modelling, safety-critical, software, FMEA
%\small

\abstract{ \em
%The certification process of safety critical products for European and
%other international standards often demand environmental stress,
%endurance and Electro Magnetic Compatibility (EMC) testing. Theoretical, or 'static testing',
%is often also required.
%
Failure Mode Effects Analysis (FMEA) is a bottom-up technique that aims to assess the effect of all component failure modes on a system.
It is used both as a design tool (to determine weaknesses) and as a requirement for the certification of safety critical products.
FMEA has been successfully applied to mechanical, electrical and hybrid electro-mechanical systems.

At present, no technique for software FMEA is known to the authors.
Software generally sits on top of a safety critical control system and defines its most important system-wide behaviour and communications.
Standards~\cite{en298}~\cite{en61508} that use FMEA do not specify it for software, but do specify good practice, review processes and language feature constraints.

This is a weakness: where FMEA scientifically traces component {\fms} to resultant system failures, software has been left in a non-analytical limbo of best practices and constraints.
If software FMEA were possible, electro-mechanical-software hybrids could be modelled, giving a {\em complete} failure mode model.
%Failure modes in components in say a sensor, could be traced
%up through the electronics and then through the controlling software.
Present FMEA stops at the glass ceiling of the computer program.
This paper presents an FMEA methodology which can be applied to software, and which is compatible with, and can be integrated with, FMEA performed on mechanical and electronic systems.
}

\section{Introduction}
{
This paper describes a modular FMEA process that can be applied to software.
This modular variant of FMEA is called Failure Mode Modular De-Composition (FMMD).
Because this process is based on the failure modes of components, it can be applied to electrical and/or mechanical systems.
The hierarchical structure of software is then examined, and definitions from contract programming are used to define failure modes and failure symptoms in software functions.
With these definitions we can apply FMEA to existing software\footnote{Existing software excluding recursive code, and unstructured non-functional languages.}.
}

\section{FMEA Process}
%What FMEA is, briefly variants...
Failure Mode Effects Analysis is the process of taking each component failure mode and, by reasoning, tracing its effects through a system to determine what system level failure modes it could cause.

Several variants of FMEA exist.
Traditional FMEA is associated with the manufacturing industry, and aims to prioritise the failures to fix in order of cost.
Design FMEA (DFMEA) is FMEA applied at the design or approvals stage, where the aim is to ensure that single component failures cannot cause unacceptable system level events.
Failure Mode Effects and Criticality Analysis (FMECA) is applied to determine the most potentially dangerous or damaging failure modes to fix.
Failure Mode Effects and Diagnostics Analysis (FMEDA) is FMEA performed to determine a statistical level of safety.
This is associated with SIL classification levels~\cite{en61508}~\cite{en61511}.

FMMD is a modularisation of FMEA and can produce failure~mode models that can be used in all the above variants of FMEA.

\section{Modularising FMEA}
In outline, in order to modularise FMEA, we must create small modules from the bottom-up.
We can do this by taking collections of base~components that perform (ideally) a simple and well defined task.
We call these {\fgs}.
We can then analyse the failure mode behaviour of a {\fg} using all the failure modes of its components.
When we have its failure mode behaviour, that is, the symptoms of failure from the perspective of the {\fg}, we can treat the {\fg} as a {\dc}, where the failure modes of the {\dc} are the symptoms of failure of the {\fg}.
We can now use {\dcs} to build higher level {\fgs} until we have a complete hierarchical model of the failure mode behaviour of a system.
An example of this process, applied to an inverting op-amp configuration, is given in~\cite{syssafe2011}.

\paragraph{Modularising FMEA: Creating a fault hierarchy.}
The main concept of Failure Mode Modular De-Composition (FMMD) is to build a hierarchy of failure behaviour from the {\bc} level up to the top, or system, level, with an analysis stage at each transition to a higher level in the hierarchy.

The first stage is to choose {\bcs} that interact and naturally form {\fgs}.
The initial {\fgs} are collections of base components.
%These parts all have associated fault modes. A module is a set fault~modes.
From the point of view of fault analysis, we are not interested in the components themselves, but in the ways in which they can fail.

A {\fg} is a collection of components that perform some simple task or function.
%
In order to determine how a {\fg} can fail, we need to consider all failure modes of its components.
%
By analysing the fault behaviour of a `{\fg}' with respect to all its components' failure modes, we can determine its symptoms of failure.
%In fact we can call these
%the symptoms of failure for the {\fg}.

With these symptoms (a set of derived faults from the perspective of the {\fg}) we can now state that the {\fg} (as an entity in its own right) can fail in a number of well defined ways.
%
In other words we have taken a {\fg}, analysed how \textbf{it} can fail according to the failure modes of its components, and then determined the {\fg} failure modes.

\paragraph{Creating a derived component.}
We create a new `{\dc}' which has the failure symptoms of the {\fg} from which it was derived as its set of failure modes.
This new {\dc} is at a higher `failure~mode~abstraction~level' than {\bcs}.
%
\paragraph{An example of a {\dc}.}
To give an example of this, we could look at the components that form, say, an amplifier.
We look at how all the components within it could fail and how that would affect the amplifier.
%
The ways in which the amplifier can be affected are its symptoms.
%
When we have determined the symptoms, we can create a {\dc} which has a {\em known set of failure modes} (i.e. its symptoms).
We can call this {\dc} $AMP1$ and treat it as a pre-analysed, higher level component.
The amplifier is an abstract concept in terms of the components.
To make an `amplifier' we have to connect a group of components in a specific configuration.
This specific configuration corresponds to a {\fg}.
Our use of it as a building block corresponds to a {\dc}.

We can use the symbol $\bowtie$ to represent the creation of a derived component from a {\fg}.

We show an FMMD hierarchy in figure~\ref{fig:fmmdh}.
Using this diagram we can follow the creation of the hierarchy in a theoretical system.
There are three functional groups comprised of {\bcs}.
These are analysed individually using FMEA.
That is to say, their component failure modes are examined, and the ways in which each {\fg} can fail, i.e. its symptoms of failure, are determined.
The `$\bowtie$' function is now applied to create {\dcs}.
These are shown in figure~\ref{fig:fmmdh} above the {\fgs}.
Now that we have {\dcs} we can use them to form a higher level functional group.
We apply the same FMEA process to this and can derive a top level derived component (which has the system, or top, level failure modes).
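
The step from a {\fg} to a {\dc} can be sketched in C. This is only an illustration, not the authors' tooling: the types {\em component} and {\em fmea\_row}, the function {\em derive()}, the limit {\em MAX\_MODES} and the table {\em example\_rows} are all hypothetical names, and the rows are made-up placeholders rather than the result of a real analysis. The sketch assumes the analyst has already decided which symptom each component failure mode causes; the code only collects the distinct symptoms, which is the book-keeping part of the `$\bowtie$' operation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_MODES 8   /* arbitrary limit chosen for this sketch */

/* A base or derived component, described only by its failure modes. */
typedef struct {
    const char *name;
    const char *modes[MAX_MODES];
    int n_modes;
} component;

/* One row of an analyst's FMEA table for a functional group:
   a component failure mode and the symptom it produces. */
typedef struct {
    const char *source;   /* component within the functional group */
    const char *mode;     /* failure mode of that component */
    const char *symptom;  /* resulting symptom of the functional group */
} fmea_row;

/* Hypothetical stand-in for the bowtie operation: the distinct symptoms
   of the functional group become the failure modes of the derived
   component. Symptoms beyond MAX_MODES are silently dropped. */
component derive(const char *name, const fmea_row *rows, int n_rows)
{
    component dc = { .name = name, .n_modes = 0 };
    assert(rows != NULL && n_rows >= 0);
    for (int i = 0; i < n_rows; i++) {
        int seen = 0;
        for (int j = 0; j < dc.n_modes; j++) {
            if (strcmp(dc.modes[j], rows[i].symptom) == 0) {
                seen = 1;
            }
        }
        if (!seen && dc.n_modes < MAX_MODES) {
            dc.modes[dc.n_modes++] = rows[i].symptom;
        }
    }
    return dc;
}

/* Made-up analysis of a two-component functional group. */
const fmea_row example_rows[] = {
    { "C1", "mode_a", "SYMPTOM_X" },
    { "C1", "mode_b", "SYMPTOM_Y" },
    { "C2", "mode_a", "SYMPTOM_Y" },
    { "C2", "mode_b", "SYMPTOM_X" },
};
```

Calling {\em derive("DC1", example\_rows, 4)} on the placeholder table collapses four component failure modes into a {\dc} with two failure modes. The resulting {\dc} can then appear as a row source in the table of a higher level {\fg}, which is how the hierarchy grows.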
\begin{figure}
\centering
\includegraphics[width=200pt]{./fmmdh.png}
% fmmdh.png: 365x405 pixel, 72dpi, 12.88x14.29 cm, bb=0 0 365 405
\caption{FMMD Hierarchy}
\label{fig:fmmdh}
\end{figure}

Note that the diagram of the FMMD hierarchy is very similar to a simple non-recursive function call tree.

\section{Software: How can we apply FMEA?}

If FMEA can be applied to software, we can build complete failure models of typical modern safety critical systems.
With modular FMEA (FMMD) we have the concepts of failure~modes of components, {\fgs} and symptoms of failure for a functional group.

A programmatic function is very similar to a functional group.
It calls other functions and uses data sources, which could be viewed as its `components'.
It has outputs which will be used by the functions that call it.

However, we need to define a clear concept of the failure modes of a function in order to map FMMD to software.

\subsection{Software, a natural hierarchy}

Software written for safety critical systems is usually constrained to be modular~\cite{en61508}[3]~\cite{misra}[cc] and non-recursive~\cite{misra}[aa]~\cite{iec61511}.
Because of this we can assume a direct call tree, which has the same shape as an FMMD hierarchy.
Functions call functions from the top down and eventually call the lowest level library or IO functions that interact with hardware/electronics.

\subsection{Contract programming description}

Contract programming is a discipline~\cite{dbcbe} for building software functions in a controlled and traceable way.
Each function is subject to pre-conditions (constraints on its inputs), post-conditions (constraints on its outputs) and function-wide invariants (rules).

\subsubsection{Mapping contract pre-condition violations to failure modes}

A pre-condition, or requirement, of a contract-programmed software function defines the ranges of input conditions within which the function can operate successfully.
For a software function, a violation of a pre-condition is in effect a failure mode of one of its `components', that is, of a function it calls or a data source it reads.
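
As a minimal sketch of this mapping, a range pre-condition can be written as a classifier whose non-zero results name the ways in which the input can violate it. The enumeration {\em input\_failure\_mode} and the function {\em classify\_input()} are hypothetical names introduced for illustration only, and the sketch assumes a simple closed-range pre-condition $lo \leq v \leq hi$.

```c
#include <assert.h>

/* Hypothetical failure modes of an input, as seen by the function
   that uses it. Each non-zero value is one way the pre-condition
   lo <= v <= hi can be violated. */
typedef enum {
    INPUT_OK = 0,        /* pre-condition holds */
    INPUT_BELOW_RANGE,   /* violated: v is under the lower bound */
    INPUT_ABOVE_RANGE    /* violated: v is over the upper bound */
} input_failure_mode;

/* Classify an input value against a closed range pre-condition. */
input_failure_mode classify_input(double v, double lo, double hi)
{
    assert(lo <= hi);  /* contract on the checker's own arguments */
    if (v < lo) {
        return INPUT_BELOW_RANGE;
    }
    if (v > hi) {
        return INPUT_ABOVE_RANGE;
    }
    return INPUT_OK;
}
```

Naming each violation separately, rather than returning a single boolean, keeps the failure modes of the `component' distinct, so that an analysis of the calling function can consider each of them in turn.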
\subsubsection{Mapping contract post-condition violations to symptoms}

A post-condition is a definition of correct behaviour by a function.
This could be an action performed or an output value.
A violated post-condition is a symptom of failure of a function.

\subsection{Software FMEA}

\subsection{Simple Software Example}

Consider a function that reads a {\ft} input and yields a value between 0 and 999 representing the current detected, together with an error indication flag.
Let us assume the {\ft} detection is via a \ohms{220} resistor, and that we read a voltage from an ADC into the software.
Let us define any value outside the 4mA to 20mA range as an error condition.
To express the current limits as voltages, we use Ohm's law~\cite{aoe}, $V=IR$:
$0.004A \times \ohms{220} = 0.88V$ and $0.020A \times \ohms{220} = 4.4V$.
Our acceptable voltage range is therefore
$V \geq 0.88 \wedge V \leq 4.4$.
This voltage range forms our input requirement, i.e. the pre-condition of the function.

We can now examine the software function.
For the purpose of example the `C' programming language is used.
We assume a function {\em read\_ADC()} which returns a double precision value holding the voltage read.

{\vbox{
\footnotesize
\begin{verbatim}
/* Software function to read 4mA to 20mA input     */
/* Writes a value from 0-999, proportional to the  */
/* current input, to *value.                       */
/* Returns 0 if the input is in range, 1 on error. */
/* read_ADC() and INPUT_4_20_mA are assumed to be  */
/* declared elsewhere.                             */
int read_4_20_input ( int * value ) {

  double input_volts;
  int error_flag;

  /* require: input from ADC to be
     between 0.88 and 4.4 volts */
  input_volts = read_ADC(INPUT_4_20_mA);

  if ( input_volts < 0.88 || input_volts > 4.4 ) {
    error_flag = 1; /* Error flag set to TRUE */
  }
  else {
    /* scale 0.88V..4.4V linearly onto 0..999 */
    *value = (int)( (input_volts - 0.88)
                    / ( 4.4 - 0.88 ) * 999.0 );
    error_flag = 0; /* input in range */
  }

  /* ensure: value is proportional (0-999)
     to the 4 to 20mA input */
  return error_flag;
}
\end{verbatim}
}
}

%\clearpage
\section{Conclusion}

This paper has outlined how FMMD, a modular form of FMEA, can be extended to software.
A software function is treated as a {\fg}: violations of its contract pre-conditions are the {\fms} of its `components', and violations of its post-conditions are its symptoms of failure.
Because safety critical software is constrained to a non-recursive call tree, its functions can be placed in the same failure mode hierarchy as the electronic and mechanical components beneath them.
\paragraph{Future work}
\begin{itemize}
\item
\item
\item
\end{itemize}
%\today
%
{
%\tiny
%\footnotesize
\bibliographystyle{plain}
\bibliography{vmgbibliography,mybib}
}
%\today
\end{document}