


%%% OUTLINE

% Software FMEA
%
%
% Glaring hole in approvals FMEA is performed on hardware
% and electronics, but with software we only get guidlines ( which mostly consist of constraints!)
%
% No known method of software failure mode effects analysis--- some work has been done on
% Sofware FTA a top down approach---
% Bottom up approach means all known failure modes must be modelled.
% SIL does not have metric or tools to analyse software for safety,
% it instead applies best practises and constraints on computer language features (i.e.
% in C limited use of pointers  no recursion etc).
%
%
% Introduce concept of FMEA
% * bottom up
% * all failure modes for all componnts
%
% Concept of FMMD
%
% Look at the structure of software
% * a natural hierarchy
%
% Software written for  a controlled
% Contract programming
% * describe concept
% * describe how this fits in with failure modes and failure symptoms concepts
%
% Describe how contract programming represents the failure modes of software
%
% Now describe how this fits in with the structure of FMMD


\documentclass[twocolumn]{article}
%\documentclass[twocolumn,10pt]{report}
\usepackage{graphicx}
\usepackage{fancyhdr}
%\usepackage{wassysym}
\usepackage{tikz}
\usepackage{amsfonts,amsmath,amsthm}
\usetikzlibrary{shapes.gates.logic.US,trees,positioning,arrows}
%\input{../style}
\usepackage{ifthen}
\usepackage{lastpage}
\usetikzlibrary{shapes,snakes}
\newcommand{\tickYES}{\checkmark}
\newcommand{\fc}{fault~scenario}
\newcommand{\fcs}{fault~scenarios}
\date{}
%\renewcommand{\encodingdefault}{T1}
%\renewcommand{\rmdefault}{tnr}
%\newboolean{paper}
%\setboolean{paper}{true} % boolvar=true or false
\newcommand{\derivec}{{D}}
\newcommand{\ft}{\ensuremath{4\!\!\rightarrow\!\!20mA} }
\newcommand{\permil}{\ensuremath{{ }^0/_{00}}}
\newcommand{\oc}{\ensuremath{^{o}{C}}}
\newcommand{\adctw}{{${\mathcal{ADC}}_{12}$}}
\newcommand{\adcten}{{${\mathcal{ADC}}_{10}$}}
\newcommand{\ohms}[1]{\ensuremath{#1\Omega}}
\newcommand{\fm}{failure~mode}
\newcommand{\fms}{failure~modes}
\newcommand{\fg}{functional~group}
\newcommand{\FG}{\mathcal{G}}
\newcommand{\DC}{\mathcal{DC}}
\newcommand{\fgs}{functional~groups}
\newcommand{\dc}{derived~component}
\newcommand{\dcs}{derived~components}
\newcommand{\bc}{base~component}
\newcommand{\FMMD}{ModularFMEA}
\newcommand{\bcs}{base~components}
\newcommand{\irl}{in real life}
\newcommand{\enc}{\ensuremath{\stackrel{enc}{\longrightarrow}}}
\newcommand{\pin}{\ensuremath{\stackrel{pi}{\longleftrightarrow}}}
%\newcommand{\pic}{\em pure~intersection~chain}
\newcommand{\pic}{\em pair-wise~intersection~chain}
\newcommand{\wrt}{\em with~respect~to}
\newcommand{\abslevel}{\ensuremath{\Psi}}
\newcommand{\fmmdgloss}{\glossary{name={FMMD},description={Failure Mode Modular De-Composition, a bottom-up methodology for incrementally building failure mode models, using a procedure taking functional groups of components and creating derived components representing them, and in turn using the derived components to create higher level functional groups, and so on, that are used to build a failure mode model of a system}}}
\newcommand{\fmodegloss}{\glossary{name={failure mode},description={The way in which a failure occurs. A component or sub-system may fail in a number of ways, and each of these is a
failure mode of the component or sub-system}}}
\newcommand{\fmeagloss}{\glossary{name={FMEA}, description={Failure Mode and Effects analysis (FMEA) is a process where each potential failure mode within a system, is analysed to determine system level failure modes, and  to then classify them {\wrt} perceived severity}}}
\newcommand{\frategloss}{\glossary{name={failure rate}, description={The number of failures within a population (of size N), divided by N over a given time interval}}}
\newcommand{\pecgloss}{\glossary{name={PEC},description={A Programmable Electronic controller, will typically consist of sensors and actuators interfaced electronically, with some firmware/software component in overall control}}}
\newcommand{\bcfm}{base~component~failure~mode}
\def\layersep{1.8cm}

\newboolean{pld}
\setboolean{pld}{false} % boolvar=true or false : draw analysis using propositional logic diagrams

\newboolean{dag}
\setboolean{dag}{true} % boolvar=true or false : draw analysis using directed acylic graphs

\setlength{\topmargin}{0in}
\setlength{\headheight}{0in}
\setlength{\headsep}{0in}
\setlength{\textheight}{22cm}
\setlength{\textwidth}{18cm}
\setlength{\oddsidemargin}{0in}
\setlength{\evensidemargin}{0in}
\setlength{\parindent}{0.0in}
\setlength{\parskip}{6pt}

\begin{document}
%\pagestyle{fancy}
%\fancyhf{}
%\fancyhead[LO]{}
%\fancyhead[RE]{\leftmark}

%\cfoot{Page \thepage\ of \pageref{LastPage}}
%\rfoot{\today}
%\lhead{Developing a rigorous bottom-up modular static failure mode modelling methodology}
%\lhead{Developing a rigorous bottom-up modular static failure modelling methodology}
                                   % numbers at outer edges
\pagenumbering{arabic}                        % Arabic page numbers hereafter
\author{R.Clark$^\star$ \\ % , A.~Fish$^\dagger$ , C.~Garrett$^\dagger$, J.~Howse$^\dagger$  \\
         $^\star${\em Energy Technology Control, UK. r.clark@energytechnologycontrol.com} \and $^\dagger${\em University of Brighton, UK}
}

%\title{Developing a rigorous bottom-up modular static failure mode modelling methodology}
\title{Applying FMMD across the Software/Hardware Interface}
%\nodate
\maketitle


\paragraph{Keywords:} static failure mode modelling, safety-critical, software FMEA
%\small

\abstract{ \em
%The certification process of safety critical products for European and
%other international standards often demand environmental stress,
%endurance and Electro Magnetic Compatibility (EMC) testing. Theoretical, or 'static testing',
%is often also required.
%
Failure Mode Effects Analysis (FMEA) is a bottom-up technique that aims to assess the effect of all
component failure modes on a system.
It is used both as a design tool (to determine weaknesses) and as a requirement for the certification of safety critical products.
FMEA has been successfully applied to mechanical, electrical and hybrid electro-mechanical systems.

Work on software FMEA (SFMEA) is beginning, but
at present no technique for SFMEA that
integrates hardware and software models % known to the authors
exists.
%
Software generally sits on top of most modern safety critical control systems
and defines their most important system-wide behaviour and communications.
Currently, standards that demand FMEA for hardware (e.g. EN298, EN61508)
do not specify it for software, but instead specify good practice,
review processes and language feature constraints.

%This is a weakness; w
Where FMEA % scientifically
traces component {\fms}
to resultant system failures, software has been left in a non-analytical
limbo of best practices and constraints.
%
If software and hardware integrated FMEA were possible, electro-mechanical-software hybrids could
be modelled, and the resulting failure mode models could thus be `complete'.
%Failure modes in components in say a sensor, could be traced
%up through the electronics and then through the controlling software.
Presently, FMEA stops at the glass ceiling of the computer program.

This paper presents a modular variant of FMEA, Failure Mode Modular De-Composition (FMMD), a methodology which
can be applied to software, and which is compatible with,
and can be integrated with, FMMD performed on mechanical and electronic systems.
}

\today
\nocite{en298}
\nocite{en61508}

\section{Introduction}
{
This paper describes a modular FMEA process that can be applied to software.
This modular variant of FMEA is called Failure Mode Modular De-Composition (FMMD).
%
Because this process is based on failure modes of components,
it can be applied to electrical and/or mechanical systems.
%
The hierarchical structure of software is then examined,
and definitions from contract programming are used
to define failure modes and failure symptoms for
software functions.
%
With these definitions we can apply the FMMD modular form of FMEA
to existing software\footnote{Existing software excluding recursive code~\cite[16.2]{misra} and unstructured non-functional languages.}.
}

\section{FMEA Background}

%What FMEA is, briefly variants...

Failure Mode Effects Analysis is the process of taking
component failure modes and, by reasoning, tracing their effects through a system
to determine what system level failure modes could be caused.
FMEA dates from the 1940s, when simple electro-mechanical systems were the norm.
Modern control systems nearly always have a significant software/firmware element,
and not being able to model software with current FMEA methodologies
is a cause for criticism~\cite{easw}~\cite{safeware}~\cite{bfmea}.


Several variants of FMEA exist.
Traditional FMEA is associated with the manufacturing industry, with the aim of prioritising
the failures to fix in order of cost.

Design FMEA (DFMEA) is FMEA applied at the design or approvals stage
where the aim is to ensure that single component failures cannot
cause unacceptable system level events.

Failure Mode Effects and Criticality Analysis (FMECA) is applied to determine the most potentially dangerous or damaging
failure modes to fix.


Failure Mode Effects and Diagnostic Analysis (FMEDA) is FMEA performed to
determine a statistical level of safety.
This is associated with Safety Integrity Level (SIL)~\cite{en61508}~\cite{en61511} classification.

FMMD is a modularisation of FMEA and can produce failure~mode models that can be used in
all the above variants of FMEA.

\subsection{Current work on Software FMEA}

Work on  SFMEA usually does not seek to integrate
hardware and software models, but to perform
FMEA on the software in isolation~\cite{procsfmea}.
Some work has been performed using databases
to track the relationships between variables
and system failure modes~\cite{procsfmeadb}, and work has been performed to
introduce automation into the FMEA process~\cite{appswfmea} and code analysis
automation~\cite{modelsfmea}. Although SFMEA and hardware FMEA are performed separately,
some schools of thought aim for FTA~\cite{nasafta}~\cite{nucfta} (top-down, deductive) and FMEA (bottom-up, inductive)
to be performed on the same system to provide insight into the
software/hardware interface~\cite{embedsfmea}.
%
Although this
would give a better picture of the failure mode behaviour, it
is by no means a rigorous approach to tracing errors that may occur in hardware
through the top (and therefore ultimately controlling) layer of software.

\subsection{Current FMEA techniques are not suitable for software}

The main FMEA methodologies are all based on the concept of taking
base component {\fms}, and translating them into system level events/failures~\cite{sfmea}~\cite{sfmeaa}.
In a complicated system, mapping a component failure mode to a system level failure
will mean a long reasoning distance; that is to say the actions of the failed component will have to be traced through
several sub-systems and the effects of other components on the way.
%
With software at the higher levels of these sub-systems
we have yet another layer of complication.

In order to integrate software in a meaningful way, we need to re-think the
FMEA concept of simply mapping a base component failure to a system level event.


One strategy would be to modularise FMEA, that is, to break down the failure effect
reasoning into small modules.
%
If we pre-analyse modules, and these
can be combined with others into
larger sub-systems, we can eventually form a hierarchy of failure mode behaviour for the entire system.
%
With higher level modules, we can reach the level in which the software resides.
%
For instance, to read a voltage into software via an ADC we rely on an electronic sub-system
that conditions the input signal and then routes it through a multiplexer to the ADC.
%
We could easily consider this electronics a `module', and with a
failure mode model for it,  modelling the software to hardware interface
becomes far simpler.
%
The failure mode model would give us the ways in which the signal conditioning
and multiplexer could fail. We can use this to work out how our software
could fail, and with this create a modular FMEA model of the software.


\section{Modularising FMEA: The FMMD process in more detail.}

In outline, in order to modularise FMEA, we must create small modules from the bottom-up.
We can do this by taking collections of base~components that
perform (ideally) a simple and well defined task.
%
We can call these {\fgs}. We can then analyse the failure mode behaviour of a {\fg}
using all the failure modes of all its components.
%
When we have its failure mode behaviour, or the symptoms of failure from the perspective of the {\fg},
we now treat the {\fg} as a {\dc}, where the failure modes of the {\dc} are the symptoms of failure of the {\fg}.
%
%
We can now use {\dcs} to build higher level {\fgs} until we have a complete hierarchical model
of the failure mode behaviour of a system. An example of this process, applied to an inverting op-amp configuration
is given in~\cite{syssafe2011}.

\paragraph{FMMD, the process.}

The main aim of Failure Mode Modular De-Composition (FMMD) is to build a hierarchy of failure behaviour from the {\bc}
level up to the top, or system level, with analysis stages ({\fgs}) %and corresponding {\dcs}
between each
transition to a higher level in the hierarchy.


The first stage is to choose
{\bcs} that interact and naturally form {\fgs}. The initial {\fgs} are   collections of base components.
%These parts all have associated fault modes. A module is a set fault~modes.
From the point of view of fault analysis, we are not interested in the components themselves, but in the ways in which they can fail.

A {\fg} is a collection of components that perform some simple task or function.
%
In order to determine how a  {\fg} can fail,
we need to consider all the failure modes of its components.
%
By analysing the fault behaviour of a `{\fg}' with respect to all its components' failure modes,
we can determine its symptoms of failure.
%In fact we can call these
%the symptoms of failure for the {\fg}.

With these symptoms (a set of derived faults from the perspective of the {\fg})
we can now state that the {\fg}
% (as an entity in its own right)
can fail in a number of well defined ways.
%
In other words we have taken a {\fg}, and analysed how
%\textbf{it}
it can fail according to the failure modes of its components, and thereby
determined the {\fg} failure symptoms.
We then create a new {\dc} which has as its {\fms} the failure symptoms
of the {\fg} from which it was derived.

% \paragraph{Creating a derived component.}
% We create a new `{\dc}' which has
% the failure symptoms of the {\fg} from which it was derived, as its set of failure modes.
% This new {\dc} is at a higher `failure~mode~abstraction~level' than {\bcs}.
% %
% \paragraph{An example of a {\dc}.}
% To give an example of this, we could look at the components that
% form, say an amplifier. We look at how all the components within it
% could fail and how that would affect the amplifier.
% %
% The ways in which the amplifier can be affected are its symptoms.
% %
% When we have determined the symptoms, we can
% create a {\dc}  which has a {\em known set of failure modes} (i.e. its symptoms).
% We can now treat $AMP1$ as a pre-analysed, higher level component.
% The amplifier is an abstract concept, in terms of the components.
%
% To a make an `amplifier' we have to connect a a group of components
% in a specific configuration. This specific configuration corresponds to
% a {\fg}. Our use of it as a building block corresponds to a {\dc}.

We can use the symbol `$\derivec$' to represent the creation of a derived component
from a {\fg}. This symbol is convenient for drawing hierarchy diagrams. % (see figure~\ref{fmmdh}).
We define the $\derivec$ function, where $\FG$ is the set of all {\fgs} and $\DC$ is the set of all {\dcs}, as

$$ \derivec : {\FG} \rightarrow {\DC} .$$

We show an FMMD hierarchy in figure~\ref{fig:fmmdh}.
Using this diagram, we can follow the creation of the hierarchy in
a theoretical system.
%
There are three functional groups composed of
{\bcs}. These are analysed individually using FMEA.
That is to say, their component failure modes are examined to determine
the ways in which each {\fg} can fail. The ways in which a
{\fg} can fail can be viewed as symptoms of failure for the {\fg}.
%
The `$\derivec$' function is now applied to create {\dcs}.
These are shown in  figure~\ref{fig:fmmdh} above the {\fgs}.
Now that we have {\dcs}, we can use them to form a higher level functional group.
We apply the same FMEA process to this and can derive a top level
derived component (which has the system---or top---level failure modes).

\begin{figure}
 \centering
 \includegraphics[width=200pt]{./fmmdh.png}
 % fmmdh.png: 365x405 pixel, 72dpi, 12.88x14.29 cm, bb=0 0 365 405
 \caption{FMMD Hierarchy}
 \label{fig:fmmdh}
\end{figure}

Note the diagram of the FMMD hierarchy is very similar to a simple non-recursive
programmatic function call tree.

\section{Software: How can we apply FMEA}

If FMEA can be applied to software we can build complete failure models
of typical modern safety critical systems.
With modular FMEA i.e. FMMD %(FMMD)
we have the concepts of failure~modes
of components, {\fgs} and symptoms of failure for a functional group.

A  programmatic function has similarities with a {\fg} as defined by the FMMD process.
%
An FMMD {\fg} is placed into a hierarchy.
A software function is placed into a hierarchy, that of its call-tree.
A software function typically calls other functions and uses data sources via hardware interaction, which could be viewed as its `components'.
It has outputs, i.e. it can perform actions
on data or hardware
which will be used by functions that may call upon it.

We can map a software function to a {\fg} in FMMD.
%
Its failure modes
are the failure modes of the software components (other functions it calls)
and the hardware from which it reads values.% from.
%
Its outputs are the data it changes, or the hardware actions it performs.

When we have analysed a software function---using failure conditions
of its inputs as failure modes---we can
determine its symptoms of failure (i.e. how calling functions will see its failure mode behaviour).

We can thus apply the $\derivec$ function to software functions, by viewing them in terms of their failure
mode behaviour. Conveniently, software already fits into a hierarchy.
For electronic and mechanical systems, although we may be guided by the original designers'
concepts of modularity and sub-systems in design, applying FMMD means deciding on the members for {\fgs}
and the subsequent hierarchy. With software already written, that hierarchy is fixed.

%                    map   the FMMD concepts of {\fms}, {\fgs} and {\dcs}
%to software functions.
%
%However, we need to map a the FMMD concepts of {\fms}, {\fgs} and {\dcs}
%to software functions.
% failure modes of a function in order to
%map FMMD to software.

\subsection{Software, a natural hierarchy}

Software written for safety critical systems is usually constrained to
be modular~\cite[3]{en61508} and non-recursive~\cite[16.2]{misra}. %{iec61511}.
Because of this we can assume a direct call tree.
%
Functions call functions
from the top down and eventually call the lowest level library or IO
functions that interact with hardware/electronics.

What is potentially difficult with a software function is deciding what
its `failure~modes' are, and later what its `failure~symptoms' are.
%
With electronic components, we can use literature to point us to suitable sets of
{\fms}~\cite{fmd91}~\cite{mil1991}~\cite{en298}.%~\cite{en61508}~\cite{en298}.
%
With software, only some library functions are well known and rigorously documented
enough to have the equivalent of known failure modes.
Most software is `bespoke'. We need a different strategy to
describe the failure mode behaviour of software functions.
We can use definitions from contract programming to assist here.

\subsection{Contract programming description}

Contract programming is a discipline~\cite{dbcbe} for building software functions in a controlled
and traceable way. Each function is subject to pre-conditions (constraints on its inputs),
post-conditions (constraints on its outputs) and function wide invariants (rules).


\paragraph{Mapping contract `pre-condition' violations to failure modes}

A pre-condition, or requirement, for a contract-programmed software function
defines the correct ranges of input conditions for the function
to operate successfully.

For a software function, a violation of a pre-condition is
in effect a failure mode of `one of its components'.


\paragraph{Mapping contract `post-condition' violations to symptoms}

A post-condition is a definition of correct behaviour by a function.
A violated post-condition is a symptom of failure of a function.
Post-conditions could be either actions performed (i.e. the state of hardware changed) or an output value of a function.

\paragraph{Mapping contract `invariant' violations to symptoms and failure modes}

Invariants in contract programming may apply to inputs to the function (where they can be considered {\fms} in FMMD terminology),
and to outputs (where they can be considered {failure symptoms} in FMMD terminology).
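
As a minimal sketch of this mapping (not part of the analysed example), the following C fragment records contract violations instead of aborting. The macro names \texttt{REQUIRE} and \texttt{ENSURE}, the two counters and the function \texttt{scale\_percent} are hypothetical names introduced here purely for illustration. A failed pre-condition is counted as a {\fm} presented to the function; a failed post-condition is counted as a failure symptom visible to its callers.

{\footnotesize
\begin{verbatim}
/* Hypothetical bookkeeping: counts of contract
   violations seen so far (illustrative only).  */
static int fm_count = 0;      /* pre-conditions  */
static int symptom_count = 0; /* post-conditions */

/* A violated pre-condition is treated as a
   failure mode of one of the function's inputs. */
#define REQUIRE(cond) \
  do { if (!(cond)) fm_count++; } while (0)

/* A violated post-condition is treated as a
   symptom of failure of the function itself.   */
#define ENSURE(cond) \
  do { if (!(cond)) symptom_count++; } while (0)

/* Hypothetical function: scales a raw reading
   (0..1023) to a percentage (0..100).          */
int scale_percent(int raw) {
  int result;

  REQUIRE(raw >= 0 && raw <= 1023);

  result = (raw * 100) / 1023;

  ENSURE(result >= 0 && result <= 100);
  return result;
}
\end{verbatim}
}

In this sketch, calling \texttt{scale\_percent(2000)} increments both counters: the out-of-range input is the {\fm}, and the out-of-range result is the symptom that the calling function would see.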


\subsection{Software FMMD}

For the purpose of example, we choose a simple, common, safety critical industrial circuit
that is nearly always used in conjunction with a programmatic element.
A common method for delivering a quantitative value in analogue electronics is
to supply a current signal to represent the value to be sent~\cite[p.934]{aoe}.
Usually, $4mA$ represents a zero or starting value and $20mA$ represents the full scale,
and this is referred to as {\ft} signalling.
%
{\ft} has an electrical advantage as well: because the current in a loop is constant~\cite[p.20]{aoe},
resistance in the wires between the source and the receiving end
does not alter the accuracy of the signal.
%
This circuit has many advantages for safety. If the signal becomes disconnected
it reads an out of range $0mA$ at the receiving end. This is outside the {\ft} range,
and is therefore easy to detect as an error rather than an incorrect value.
%
Should the driving electronics go wrong at the source end, it will usually
supply far too little or far too much current, making an error condition easy to detect.
%
At the receiving end, we only require one simple component to convert the
current signal into a voltage that we can read with an ADC: the humble resistor!


%BLOCK DIAGRAM HERE WITH FT CIRCUIT LOOP

\begin{figure}[h]
 \centering
 \includegraphics[width=230pt]{./ftcontext.png}
 % ftcontext.png: 767x385 pixel, 72dpi, 27.06x13.58 cm, bb=0 0 767 385
 \caption{Context Diagram for {\ft} loop}
 \label{fig:ftcontext}
\end{figure}


The diagram in figure~\ref{fig:ftcontext} shows some equipment which is sending a {\ft}
signal to a micro-controller system.
The signal is locally driven over a load resistor, and then read into the micro-controller via
an ADC and its multiplexer.
With the voltage detected at the ADC, the micro-controller can determine the intended quantitative
value from the external equipment.

\subsection{Simple Software Example}


Consider a software function that reads a {\ft} input, and returns a value between 0 and 999 (i.e. per mil $\permil$)
representing the current detected, with an additional error indication flag.
%
Let us assume the {\ft} detection is via a \ohms{220} resistor, and that we read a voltage
from an ADC into the software.
Let us define any value outside the $4mA$ to $20mA$ range as an error condition.
%
We use Ohm's law~\cite{aoe}, $V=IR$, to determine the corresponding voltage range: $0.004A \times \ohms{220} = 0.88V$
and $0.020A \times \ohms{220} = 4.4V$.
%
Our acceptable voltage range is therefore

$$(V \ge  0.88) \wedge (V \le 4.4) \; .$$

This voltage range forms our input requirement.
%
We can now examine a software function that performs a conversion from the voltage read to
a per~mil representation of the {\ft} input current.
%
For the purpose of example the `C' programming language~\cite{kandr} is used.
We initially  assume a function \textbf{read\_ADC} which returns a floating point %double precision
value which represents the voltage read (see code sample in figure~\ref{fig:code_read_4_20_input}).
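
Written as a worked equation, the conversion the function has to perform is a linear map from the acceptable voltage range onto the per~mil scale. The symbol $P$ for the per~mil result is introduced here only for illustration:

$$ P = \frac{V - 0.88}{4.4 - 0.88} \times 999 \; , $$

so that $V = 0.88$ gives $P = 0$ and $V = 4.4$ gives $P = 999$. A mid-range input of $12mA$ ($V = 0.012A \times \ohms{220} = 2.64V$) gives $P = 0.5 \times 999 = 499.5$, which truncates to $499$ when stored in a C integer.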


%%{\vbox{
\begin{figure}[h]

\footnotesize
\begin{verbatim}
/***********************************************/
/* read_4_20_input()                           */
/***********************************************/
/* Software function to read 4mA to 20mA input */
/* Writes a value from 0-999, proportional to  */
/* the current input, to *value.               */
/* Returns an error flag: 0 = OK, 1 = error.   */
/***********************************************/

/* ADC channel carrying the 4-20mA signal;     */
/* the channel number is assumed here.         */
#define INPUT_4_20_mA 0

double read_ADC ( int channel );

int  read_4_20_input ( int * value ) {
  double input_volts;
  int error_flag;

  /* require: input from ADC to be
              between 0.88 and 4.4 volts */


  input_volts = read_ADC(INPUT_4_20_mA);

  if ( input_volts < 0.88 || input_volts > 4.4 ) {
    error_flag = 1; /* Error flag set to TRUE */
  }
  else {
    /* scale 0.88V..4.4V linearly onto 0..999 */
    *value = (int) ( (input_volts - 0.88)
                     / ( 4.4 - 0.88 ) * 999.0 );
    error_flag = 0; /* indicate current input in range */
  }

  /* ensure: value is proportional (0-999) to the
             4 to 20mA input                      */

  return error_flag;
}
\end{verbatim}
%}
%}

\caption{Software Function:  \textbf{read\_4\_20\_input}}
\label{fig:code_read_4_20_input}
%\label{fig:420i}
\end{figure}
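
The contract of \textbf{read\_4\_20\_input} can be exercised by injecting fault conditions at its input. The sketch below is an illustration under stated assumptions rather than the tested implementation: \texttt{stub\_volts}, the stubbed \texttt{read\_ADC}, the helper \texttt{inject} and the channel number given to \texttt{INPUT\_4\_20\_mA} are all hypothetical, and the function body is restated compactly with the scaling written as a division by the voltage span.

{\footnotesize
\begin{verbatim}
/* Hypothetical test stub: the voltage the fake
   ADC will report on the next read.            */
static double stub_volts = 0.0;

#define INPUT_4_20_mA 0 /* assumed channel no. */

/* Stub standing in for the real read_ADC(). */
double read_ADC(int channel) {
  (void)channel; /* channel ignored by the stub */
  return stub_volts;
}

/* Compact restatement of read_4_20_input(). */
int read_4_20_input(int *value) {
  double v = read_ADC(INPUT_4_20_mA);

  if (v < 0.88 || v > 4.4)
    return 1; /* pre-condition violated */

  *value = (int)((v - 0.88) / (4.4 - 0.88)
                 * 999.0);
  return 0;   /* input current in range */
}

/* Inject one voltage; return the error flag. */
int inject(double volts, int *value) {
  stub_volts = volts;
  return read_4_20_input(value);
}
\end{verbatim}
}

With this stub, injecting $0V$ (a disconnected loop) or $5V$ (an over-driven loop) returns the error flag and leaves the caller's value untouched, while $2.64V$ (that is, $12mA$) returns no error and a value of 499.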

We now look at the function called by \textbf{read\_4\_20\_input}, \textbf{read\_ADC}, which returns a
voltage for a given ADC channel.
%
This function
deals directly with the hardware in the micro-controller on which we are running the software.
%
Its job is to select the correct channel (ADC multiplexer) and then to initiate a
conversion by setting an ADC `go' bit (see code sample in figure~\ref{fig:code_read_ADC}).
%
It takes the raw ADC reading and converts it into a
floating point\footnote{the type, `double' or `double precision', is a standard C language floating point type~\cite{kandr}.}
voltage value.


%{\vbox{
\begin{figure}[h]

\footnotesize
\begin{verbatim}
/***********************************************/
/* read_ADC()                                  */
/***********************************************/
/* Software function to read voltage from a    */
/* specified ADC MUX channel                   */
/* Assume 10 ADC MUX channels 0..9             */
/* ADC_CHAN_RANGE = 9                          */
/* Assume ADC is 12 bit and ADCRANGE = 4096    */
/* ADCMUX, ADCGO and ADCOUT are hardware       */
/* registers, assumed to be declared in the    */
/* micro-controller's device header            */
/* returns voltage read as double precision    */
/***********************************************/
double  read_ADC( int  channel ) {
  int timeout = 0;
  double dval;
  /* require: a) input channel from ADC to be
              in valid ADC range
              b) voltage ref is 0.1% of 5V     */

  /* return out of range result  */
  /* if invalid channel selected */
  if ( channel < 0 || channel > ADC_CHAN_RANGE )
     return -2.0;

  /* set the multiplexer to the desired channel */
  ADCMUX = channel;

  ADCGO = 1; /* initiate ADC conversion hardware */

  /* wait for ADC conversion with timeout */
  while ( ADCGO == 1 && timeout < 100 )
     timeout++;

  if ( timeout < 100 )
       dval = (double) ADCOUT * 5.0 / ADCRANGE;
  else
       dval = -1.0; /* indicate invalid reading */

  /* return voltage as a floating point value */

  /* ensure: value is voltage input to within 0.1% */

  return dval;
}
\end{verbatim}
\caption{Software Function: \textbf{read\_ADC}}
\label{fig:code_read_ADC}
\end{figure}
%}
%}


We now have a very simple software structure, a call tree, shown in figure~\ref{fig:ct1}.

\begin{figure}[h]
 \centering
 \includegraphics[width=100pt]{./ct1.png}
 % ct1.png: 151x224 pixel, 72dpi, 5.33x7.90 cm, bb=0 0 151 224
 \caption{Call tree for software example}
 \label{fig:ct1}
\end{figure}

This software sits above the hardware in the conceptual call tree: from a programmatic perspective, the
software is reading values from the `lower~level' electronics.
%
FMEA is always a bottom-up process and so we must begin with this hardware.
%
The hardware is simply a load resistor, connected between an ADC input
pin on the micro-controller and ground.
%
We can identify the resistor and the ADC module of the micro-controller as
the base components in this design.
%
We now apply FMMD starting with the hardware.


\subsection{FMMD Process}

\paragraph{Functional Group - Convert mA to Voltage - CMATV}

This functional group contains the load resistor
and the physical Analogue to Digital Converter (ADC).
Our functional group, $G_1$ is thus the set of base components: $G_1 = \{R, ADC\}$.
We now determine the {\fms} of all the components in $G_1$.
For the resistor we can use a failure mode set from the literature~\cite{en298}.
Where the function $fm$ returns a set of failure modes for a given component we can state:

$$ fm(R) = \{OPEN,SHORT\}. $$
\vbox{
For the ADC we can determine the following failure modes:

\begin{itemize}
 \item STUCKAT --- The ADC outputs a constant value,
 \item MUXFAIL --- The ADC cannot select its input channel correctly,
 \item LOW --- The ADC output is always LOW, or zero ADC counts,
 \item HIGH --- The ADC output is always HIGH, or max ADC counts.
\end{itemize}
}
We can use the function $fm$ to define the {\fms} of an ADC thus:
$$ fm(ADC) = \{ STUCKAT, MUXFAIL,LOW, HIGH \}. $$

With these failure modes, we can analyse our first functional group, see table~\ref{tbl:cmatv}.

{
\tiny
\begin{table}[h]
\caption{$G_1$: Failure Mode Effects Analysis} % title of Table
\label{tbl:cmatv}

\begin{tabular}{|| l   | c |   l ||} \hline
 \textbf{Failure}   &  \textbf{failure}     & \textbf{Symptom}          \\
 \textbf{Scenario}  &  \textbf{effect}      &     \textbf{CMATV }                     \\ \hline
               \hline
    1: $R_{OPEN}$           &  resistor open,               &      $HIGH$       \\
                                &   voltage on pin high         &                 \\ \hline

     2: $R_{SHORT}$               &  resistor shorted,         &    $LOW$          \\
                                   &   voltage on pin low         &                   \\   \hline \hline


     3: $ADC_{STUCKAT}$            &  ADC reads out         &      $V\_ERR$          \\
                                      &  fixed value          &                 \\ \hline


     4: $ADC_{MUXFAIL}$        & ADC may read      &         $V\_ERR$           \\
                                  &   wrong channel   &                  \\ \hline

     5: $ADC_{LOW}$        &  output low   &       $LOW$              \\
     6: $ADC_{HIGH}$      &  output high   &       $HIGH$                \\  \hline


\hline


\hline

\end{tabular}
\end{table}
}


We now collect the symptoms for the hardware functional group, $\{ HIGH , LOW, V\_ERR \} $.
We now create a {\dc} to represent this called $CMATV$.

We can express this using the `$\derivec$' function thus:
$$ CMATV = \; \derivec (G_1) .$$

As its failure modes are the symptoms of failure from the functional group we can now state:
$$fm ( CMATV ) =  \{ HIGH , LOW, V\_ERR \} .$$
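
The step from the analysis table to the failure modes of the {\dc} is a collection of the distinct symptoms in the table. A minimal C sketch of that step follows; the type and function names (\texttt{symptom\_t}, \texttt{scenario\_t}, \texttt{derive}) are hypothetical and introduced only for illustration, and the rows simply restate table~\ref{tbl:cmatv}.

{\footnotesize
\begin{verbatim}
#include <stddef.h>

/* Symptoms observed for functional group G1. */
typedef enum {
  HIGH, LOW, V_ERR, N_SYMPTOMS
} symptom_t;

/* One row of the analysis table: a component
   failure mode and the symptom it causes.    */
typedef struct {
  const char *failure_mode;
  symptom_t   symptom;
} scenario_t;

/* Rows restating the G1 analysis table. */
static const scenario_t g1[] = {
  { "R_OPEN",      HIGH  },
  { "R_SHORT",     LOW   },
  { "ADC_STUCKAT", V_ERR },
  { "ADC_MUXFAIL", V_ERR },
  { "ADC_LOW",     LOW   },
  { "ADC_HIGH",    HIGH  },
};

/* Collect the distinct symptoms of a functional
   group; these become the failure modes of the
   derived component. Returns the number of
   distinct symptoms; seen[s] is set to 1 for
   each symptom s that occurs.                  */
size_t derive(const scenario_t *rows, size_t n,
              int seen[N_SYMPTOMS]) {
  size_t i, count = 0;

  for (i = 0; i < N_SYMPTOMS; i++)
    seen[i] = 0;

  for (i = 0; i < n; i++) {
    if (!seen[rows[i].symptom]) {
      seen[rows[i].symptom] = 1;
      count++;
    }
  }
  return count;
}
\end{verbatim}
}

Applied to the six rows above, \texttt{derive} reports three distinct symptoms, matching $fm(CMATV) = \{ HIGH, LOW, V\_ERR \}$.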


\paragraph{Functional Group - Software -  Read\_ADC - RADC}

The software function $Read\_ADC$ uses the ADC hardware analysed
as the {\dc} CMATV above.


The code fragment in figure~\ref{fig:code_read_ADC} states pre-conditions, as
{\em/* require: a) input channel from ADC to be
              in valid ADC range
              b) voltage ref is 0.1\% of 5V     */}.
%
From the above contract programming requirements, we see that
the function must be sent the correct channel number.
%
A violation  of this  can be considered a {\fm} of the function,
which we can call $ CHAN\_NO $.
%
The reference voltage for the ADC has a 0.1\% accuracy requirement.
%
If the reference value is outside of this, it is also a {\fm}
of this function, which we can call $V\_REF$.

Taken as a component for use in FMEA/FMMD our function has
two failure modes. We can therefore treat it as a generic component, $Read\_ADC$,
by stating:

$$ fm(Read\_ADC) = \{ CHAN\_NO, VREF \} $$

As we have a failure mode model for our function, we can now use it in conjunction
with the ADC hardware {\dc} $CMATV$, to form a {\fg} $G_2$, where $G_2 =\{ CMATV, Read\_ADC \}$.

We now analyse this hardware/software combined {\fg}.


{
\tiny
\begin{table}[h]
\caption{$G_2$: Failure Mode Effects Analysis} % title of Table
\label{tbl:radc}

\begin{tabular}{|| l   | c |   l ||} \hline
 \textbf{Failure}   &  \textbf{failure}     & \textbf{Symptom}          \\
 \textbf{Scenario}  &  \textbf{effect}      &     \textbf{RADC }                     \\ \hline
               \hline
    1: ${CHAN\_NO}$           &  wrong voltage               &      $VV\_ERR$       \\
                                &   read         &                 \\ \hline

     2: ${VREF}$          &  ADC volt-ref         &    $VV\_ERR$          \\
                             &  incorrect         &                   \\    \hline  \hline


     3: $CMATV_{V\_ERR}$      &  voltage value     &      $VV\_ERR$          \\
                             &  incorrect          &                 \\ \hline


     4: $CMATV_{HIGH}$        & output high      &         $HIGH$           \\ \hline

     5: $CMATV_{LOW}$        &  output low   &       $LOW$              \\  \hline

\hline


\hline

\end{tabular}
\end{table}
}


We now collect the symptoms of failure for the {\fg} analysed (see table~\ref{tbl:radc})
as $\{ VV\_ERR, HIGH, LOW \}$. We must also consider a violation of the postcondition
of the function.
This postcondition, {\em /* ensure: value is voltage input to within 0.1\% */ },
corresponds to $VV\_ERR$, which is already in the symptom set for this {\fg}.

We can now create a {\dc} called $RADC$ thus: $$RADC = \; \derivec(G_2)$$ which has the following
{\fms}:

$$ fm(RADC) = \{ VV\_ERR, HIGH, LOW \} .$$
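
Written out from table~\ref{tbl:radc}, the collection step for $G_2$ shows software and hardware
failure modes merging into a common symptom (the arrow notation is ours, used only for illustration):
\begin{eqnarray*}
\{ CHAN\_NO, VREF, CMATV_{V\_ERR} \} &\mapsto& VV\_ERR \\
\{ CMATV_{HIGH} \} &\mapsto& HIGH \\
\{ CMATV_{LOW} \} &\mapsto& LOW
\end{eqnarray*}
At the level of $RADC$, the two contract violations of $Read\_ADC$ and the hardware symptom $V\_ERR$
therefore appear as the same failure mode, an incorrect voltage value.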


\paragraph{Functional Group - Software - voltage to per mil - R420I }

This function sits on top of the $RADC$ {\dc} determined above.
We look at the pre-conditions of the function $read\_4\_20\_input$
(abbreviated to $RI$ in table~\ref{tbl:r420i}) to determine its {\fms}.
Its pre-condition is {\em  /* require: input from ADC to be between 0.88 and 4.4 volts */}.
We map a violation of this pre-condition to the {\fm} $VRNGE$.
As this is the only pre-condition of the function, we can state,

$$ fm(read\_4\_20\_input) = \{ VRNGE \} .$$
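
The limits in this pre-condition are consistent with the loop current being sensed across a single resistor.
As a sketch, assuming a sense resistance of \ohms{220} (a value inferred here from the two voltage limits,
not taken from the code), Ohm's law gives
\begin{eqnarray*}
4\,\mathrm{mA} \times 220\,\Omega &=& 0.88\,\mathrm{V} \\
20\,\mathrm{mA} \times 220\,\Omega &=& 4.4\,\mathrm{V}
\end{eqnarray*}
If we further assume that the function scales linearly between these limits
(the symbols $V$ and $p$ are ours), the 0--999 result would be
$$ p(V) = 999 \times \frac{V - 0.88}{4.4 - 0.88} , \qquad 0.88 \le V \le 4.4 , $$
so that, for example, a mid-scale input of $12\,\mathrm{mA}$ ($2.64\,\mathrm{V}$) gives $p = 499.5$
before any integer rounding.
A voltage outside these limits cannot correspond to a healthy {\ft} signal,
which is why its occurrence is treated as the {\fm} $VRNGE$.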

We can now form a functional group with the {\dc} $RADC$ and the
software component $read\_4\_20\_input$, i.e. $G_3 = \{read\_4\_20\_input, RADC\} $.


{
\tiny
\begin{table}[h]
\caption{$G_3$: Read\_4\_20: Failure Mode Effects Analysis} % title of Table
\label{tbl:r420i}

\begin{tabular}{|| l   | c |   l ||} \hline
 \textbf{Failure}   &  \textbf{failure}     & \textbf{Symptom}          \\
 \textbf{Scenario}  &  \textbf{effect}      &     \textbf{R420I }                     \\ \hline
               \hline
    1: $RI_{VRNGE}$           &  voltage     &      $OUT\_OF\_$       \\
                                & outside range    &    $RANGE$             \\ \hline

     2: $RADC_{VV\_ERR}$          &  voltage           &    $VAL\_ERR$          \\
                             &  incorrect         &                   \\    \hline  \hline


     3: $RADC_{HIGH}$      &  voltage value     &      $VAL\_ERR$          \\
                             &  incorrect          &                 \\ \hline


     4: $RADC_{LOW}$        & voltage      &      $OUT\_OF\_$           \\
                               &   outside range  &  $RANGE$                 \\ \hline

\hline


\hline

\end{tabular}
\end{table}
}

The failure symptoms for the {\fg} are $\{OUT\_OF\_RANGE, VAL\_ERR\}$.
The postcondition for the function $read\_4\_20\_input$, {\em /* ensure: value is proportional (0-999) to the
             4 to 20mA input                      */}, corresponds to $VAL\_ERR$, which is already in the set of failure symptoms.
% \paragraph{Final Functional Group}
For single failures, these are the two ways in which this function
can fail. An $OUT\_OF\_RANGE$ failure is flagged by the error flag variable.
A $VAL\_ERR$ failure means that the value read is simply wrong.

We can finally make a {\dc} to represent a failure mode model for our function $read\_4\_20\_input$ thus:

$$ R420I = \; \derivec(G_3) .$$

This new {\dc} has the following {\fms}:
$$fm(R420I) = \{OUT\_OF\_RANGE, VAL\_ERR\} .$$

%
% Using the derived components, CMATV and VTPM we create
% a new functional group. This
% integrates FMEA's from software and eletronics
% into the same failure mode model.


We can now represent the software/hardware FMMD analysis
as a hierarchical diagram, see figure~\ref{fig:hd}.

\begin{figure}[h]
 \centering
 \includegraphics[width=200pt]{./hd.png}
 % hd.png: 363x520 pixel, 72dpi, 12.81x18.34 cm, bb=0 0 363 520
 \caption{FMMD hierarchy with hardware and software elements}
 \label{fig:hd}
\end{figure}


We can represent the hierarchy in figure~\ref{fig:hd} algebraically, using the `$\derivec$' function
with the functional groups as intermediate stages:
\begin{eqnarray*}
G_1 &=&  \{R,ADC\} \\
CMATV &=&  \;\derivec (G_1) \\
G_2 &=& \{CMATV, Read\_ADC \} \\
RADC &=&  \; \derivec (G_2) \\
G_3 &=& \{ RADC, read\_4\_20\_input \} \\
R420I &=&  \; \derivec (G_3)
\end{eqnarray*}
or, as a nested definition,
$$ R420I = \derivec \Big( \derivec \big( \derivec(R,ADC), Read\_ADC \big), read\_4\_20\_input \Big). $$


This nested structure means that we have multiple traceable
stages of failure mode reasoning in our analysis. Traditional FMEA would have only one stage
of reasoning for each component failure mode.
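
As an illustration, individual failure modes can be traced through these stages using the symptom
assignments in the three FMEA tables above (the arrow notation is introduced here for illustration only):
{\footnotesize
\begin{eqnarray*}
R_{SHORT} \rightarrow CMATV_{LOW} \rightarrow RADC_{LOW} &\rightarrow& OUT\_OF\_RANGE \\
ADC_{STUCKAT} \rightarrow CMATV_{V\_ERR} \rightarrow RADC_{VV\_ERR} &\rightarrow& VAL\_ERR \\
CHAN\_NO \rightarrow RADC_{VV\_ERR} &\rightarrow& VAL\_ERR
\end{eqnarray*}
}
Each arrow corresponds to one row of one FMEA table, so every step of the reasoning can be checked in isolation.
The software {\fm} $CHAN\_NO$ enters the hierarchy at the second stage and from there follows the same path as
the hardware failure modes.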

%\clearpage
\section{Conclusion}

The {\dc} representing the {\ft} reader
in software shows that, by taking a modular approach to FMEA, we can integrate
software and electro-mechanical FMEA models.
With this analysis
we have a complete `reasoning~path' linking the failure modes of the
electronics to those of the software.
Each transition from a functional group to a {\dc} represents one
reasoning stage.
With traditional FMEA methods the reasoning~distance is large, because
it stretches from the component failure mode to the top (or system) level failure.
Applying traditional FMEA to software therefore stretches
the reasoning distance even further.

We now have a {\dc} for a {\ft} input in software.
Typically, more than one such input could be present in a real-world system.
Not only have we integrated electronics and software in an FMEA, we can also
re-use the analysis for each {\ft} input in the system.

The unsolved symptoms, or unobservable errors, such as $VAL\_ERR$, could be addressed
by another software function that reads other known signals
via the MUX (e.g. voltage references). This strategy would
detect the $ADC_{STUCKAT}$ and $ADC_{MUXFAIL}$ failure modes.
%
Detailing this, however, is beyond the scope %and page-count
of this paper.


%Its solved. Hoooo-ray !!!!!!!!!!!!!!!!!!!!!!!!


%\paragraph{Future work}
%\begin{itemize}
% %\item  A complete software/electrical/mechanical system analysed
% \item
% \item
% \end{itemize}
% %\today
% %
{ %\tiny %
\footnotesize
\bibliographystyle{plain}
\bibliography{vmgbibliography,mybib}
}

%\today
\end{document}