added a context/circuit diagram for ft loop
parent
470f9e52d9
commit
e5cad5d363
@ -1,5 +1,5 @@
|
||||
|
||||
PNG = fmmdh.png ct1.png hd.png
|
||||
PNG = fmmdh.png ct1.png hd.png ftcontext.png
|
||||
|
||||
%.png:%.dia
|
||||
dia -t png $<
|
||||
|
BIN
papers/software_fmea/ftcontext.dia
Normal file
Binary file not shown.
@ -138,9 +138,10 @@ component failure modes on a system.
|
||||
It is used as a design tool (to determine weaknesses), and is a requirement for certification of safety critical products.
|
||||
FMEA has been successfully applied to mechanical, electrical and hybrid electro-mechanical systems.
|
||||
|
||||
Work on software FMEA is begining~\cite{sfmea}~\cite{sfmeaa}, but
|
||||
at present no technique for Software FMEA that
|
||||
Work on software FMEA is beginning~\cite{sfmea}~\cite{sfmeaa}, but
|
||||
at present no technique for software FMEA that
|
||||
integrates hardware and software models known to the authors exists.
|
||||
%
|
||||
Software generally sits on top of most modern safety critical control systems
|
||||
and defines their most important system wide behaviour and communications.
|
||||
Standards~\cite{en298}~\cite{en61508} that use FMEA
|
||||
@ -148,9 +149,10 @@ do not specify it for Software, but do specify, good practise,
|
||||
review processes and language feature constraints.
|
||||
|
||||
This is a weakness; where FMEA scientifically traces component {\fms}
|
||||
to resultant system failures; software has been left in a non-analytical
|
||||
to resultant system failures, software has been left in a non-analytical
|
||||
limbo of best practises and constraints.
|
||||
If software FMEA were possible electro-mechanical-software hybrids could
|
||||
%
|
||||
If software FMEA were possible, electro-mechanical-software hybrids could
|
||||
be modelled; and could thus be `complete' failure mode models.
|
||||
%Failure modes in components in say a sensor, could be traced
|
||||
%up through the electronics and then through the controlling software.
|
||||
@ -164,13 +166,16 @@ and integrate-able with FMEA performed on mechanical and electronic systems.
|
||||
{
|
||||
This paper describes a modular FMEA process that can be applied to software.
|
||||
This modular variant of FMEA is called Failure Mode Modular de-composition (FMMD).
|
||||
Because this process is based on failure modes of components
|
||||
%
|
||||
Because this process is based on failure modes of components,
|
||||
it can be applied to electrical and/or mechanical systems.
|
||||
%
|
||||
The hierarchical structure of software is then examined,
|
||||
and then definitions from contract programming are used
|
||||
and definitions from contract programming are used
|
||||
to define failure modes and failure symptoms in
|
||||
software functions.
|
||||
With these definitions we can apply FMEA
|
||||
%
|
||||
With these definitions we can apply a modular form of FMEA
|
||||
to existing software\footnote{Existing software excluding recursive~\cite{misra}[16.2] code, and unstructured non-functional languages}.
|
||||
}
|
||||
|
||||
@ -195,7 +200,7 @@ the failures to fix in order of cost.
|
||||
Design FMEA (DFMEA) is FMEA applied at the design or approvals stage
|
||||
where the aim is to ensure single component failures cannot cause unacceptable system level events.
|
||||
|
||||
Failure Mode effect Criticality Analysis (FMECA) is applied to determine the most potentially dangerous or damaging
|
||||
Failure Mode Effect Criticality Analysis (FMECA) is applied to determine the most potentially dangerous or damaging
|
||||
failure modes to fix.
|
||||
|
||||
|
||||
@ -207,6 +212,40 @@ FMMD is a modularisation of FMEA and can produce failure~mode models that can be
|
||||
all the above variants of FMEA.
|
||||
|
||||
|
||||
\subsection{Current FMEA techniques are not suitable for software}
|
||||
|
||||
The main FMEA methodologies are all based on the concept of taking
|
||||
base component {\fms}, and translating them into system level events/failures.
|
||||
In a complicated system, mapping a component failure mode to a system level failure
|
||||
will mean a long reasoning distance; that is to say the actions of the failed component will have to be traced through
|
||||
several sub-systems and the effects of other components on the way.
|
||||
With software at the higher levels of these sub-systems
|
||||
we have another layer of complication.
|
||||
|
||||
In order to integrate software in a meaningful way, we need to re-think the
|
||||
FMEA concept of mapping a base component failure to a system level event.
|
||||
|
||||
|
||||
One strategy would be to modularise FMEA: to break down the failure effect
|
||||
reasoning into small modules.
|
||||
%
|
||||
If we pre-analyse modules, they
|
||||
can be combined with others into
|
||||
larger sub-systems, and eventually form a hierarchy of failure mode behaviour for the entire system.
|
||||
%
|
||||
With higher level modules, we can approach the level that the software resides in.
|
||||
%
|
||||
For instance, to read a voltage into software via an ADC we rely on an electronic sub-system
|
||||
that conditions the input signal and then routes it through a multiplexer to the ADC.
|
||||
%
|
||||
We could easily consider this electronics a module, and with a
|
||||
failure mode model for it, we can make modelling the software to hardware interface
|
||||
far simpler.
|
||||
%
|
||||
The failure mode model would give us the ways in which the signal conditioning
|
||||
and multiplexer could fail. We can use this to work out how our software
|
||||
could fail, and with this create a modular FMEA model of the software.
|
||||
|
||||
|
||||
|
||||
\section{Modularising FMEA}
|
||||
@ -219,7 +258,7 @@ We can call these {\fgs}. We can then analyse the failure mode behaviour of a {\
|
||||
using all the failure modes of all its components.
|
||||
%
|
||||
When we have its failure mode behaviour, or the symptoms of failure from the perspective of the {\fg},
|
||||
we now treat the {\fg} as a {\dc}; where the failure modes of the {\dc} are the symptoms of failure of the {\fg}.
|
||||
we now treat the {\fg} as a {\dc}, where the failure modes of the {\dc} are the symptoms of failure of the {\fg}.
|
||||
%
|
||||
%
|
||||
We can now use {\dcs} to build higher level {\fgs} until we have a complete hierarchical model
|
||||
@ -229,8 +268,8 @@ is given in~\cite{syssafe2011}.
|
||||
\paragraph{FMMD, the process.}
|
||||
|
||||
The main aim of Failure Mode Modular De-composition (FMMD) is to build a hierarchy of failure behaviour from the {\bc}
|
||||
level up to the top, or system level, with analysis stages, {\fgs} %and corresponding {\dcs}
|
||||
, between each
|
||||
level up to the top, or system level, with analysis stages ({\fgs}) %and corresponding {\dcs}
|
||||
between each
|
||||
transition to a higher level in the hierarchy.
|
||||
|
||||
|
||||
@ -242,7 +281,7 @@ From the point of view of fault analysis, we are not interested in the component
|
||||
A {\fg} is a collection of components that perform some simple task or function.
|
||||
%
|
||||
In order to determine how a {\fg} can fail,
|
||||
we need to consider all failure modes of its components.
|
||||
we need to consider all the failure modes of its components.
|
||||
%
|
||||
By analysing the fault behaviour of a `{\fg}' with respect to all its components' failure modes,
|
||||
we can determine its symptoms of failure.
|
||||
@ -250,11 +289,16 @@ we can determine its symptoms of failure.
|
||||
%the symptoms of failure for the {\fg}.
|
||||
|
||||
With these symptoms (a set of derived faults from the perspective of the {\fg})
|
||||
we can now state that the {\fg} (as an entity in its own right) can fail in a number of well defined ways.
|
||||
we can now state that the {\fg}
|
||||
% (as an entity in its own right)
|
||||
can fail in a number of well defined ways.
|
||||
%
|
||||
In other words we have taken a {\fg}, and analysed how
|
||||
\textbf{it} can fail according to the failure modes of its components, and then
|
||||
determined the {\fg} failure modes.
|
||||
%\textbf{it}
|
||||
it can fail according to the failure modes of its components, and then
|
||||
determined the {\fg} failure symptoms.
|
||||
We then create a new {\dc} which has as its {\fms} the failure symptoms
|
||||
of the {\fg} that it was derived from.
|
||||
|
||||
% \paragraph{Creating a derived component.}
|
||||
% We create a new `{\dc}' which has
|
||||
@ -279,15 +323,17 @@ determined the {\fg} failure modes.
|
||||
|
||||
We can use the symbol $\bowtie$ to represent the creation of a derived component
|
||||
from a {\fg}. We show an FMMD hierarchy in figure~\ref{fig:fmmdh}.
|
||||
Using this diagram we can follow the creation of the hierarcy in
|
||||
Using this diagram, we can follow the creation of the hierarchy in
|
||||
a theoretical system.
|
||||
%
|
||||
There are three functional groups comprised of
|
||||
{\bcs}. These are analysed individually using FMEA.
|
||||
That is to say their component failure modes are examined, and the
|
||||
the ways in which the {\fgs} fail; its symptoms of failure are determined.
|
||||
ways in which each of the {\fgs} fails, i.e.\ its symptoms of failure, are determined.
|
||||
%
|
||||
The `$\bowtie$' function is now applied to create {\dcs}.
|
||||
These are shown in figure~\ref{fig:fmmdh} above the {\fgs}.
|
||||
Now that we have {\dcs} we can use them to form a higher level functional group.
|
||||
Now that we have {\dcs}, we can use them to form a higher level functional group.
|
||||
We apply the same FMEA process to this and can derive a top level
|
||||
derived component (which has the system---or top---level failure modes).
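
The derivation step described above can be sketched in C. This is only an illustrative sketch, not the authors' tooling: the names \texttt{fmea\_row}, \texttt{derived\_component} and \texttt{derive} are hypothetical, and it assumes the analysis of a functional group has already been written down as a table in which every component failure mode is assigned one symptom. The `$\bowtie$' step then amounts to collecting the distinct symptoms, which become the failure modes of the new derived component.

```c
#include <string.h>

#define MAX_SYMPTOMS 8   /* arbitrary illustrative limit */

/* One row of the FMEA table for a functional group (hypothetical layout):
 * a component failure mode and the symptom it causes at group level. */
struct fmea_row {
    const char *component;
    const char *failure_mode;
    const char *symptom;
};

/* A derived component: its failure modes are the group's symptoms. */
struct derived_component {
    const char *name;
    const char *fms[MAX_SYMPTOMS];
    int n_fms;
};

/* Sketch of the 'bowtie' step: collect the distinct symptoms of an
 * analysed functional group into a new derived component. */
struct derived_component derive(const char *name,
                                const struct fmea_row *rows, int n_rows)
{
    struct derived_component dc = { name, {0}, 0 };

    for (int i = 0; i < n_rows; i++) {
        int seen = 0;
        for (int j = 0; j < dc.n_fms; j++) {
            if (strcmp(dc.fms[j], rows[i].symptom) == 0)
                seen = 1;   /* symptom already recorded */
        }
        if (!seen && dc.n_fms < MAX_SYMPTOMS)
            dc.fms[dc.n_fms++] = rows[i].symptom;
    }
    return dc;
}
```

Because a derived component has the same shape as any other component (a name and a set of failure modes), it can itself appear in the rows of a higher level table, which is how the hierarchy is built up.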
|
||||
|
||||
@ -306,17 +352,38 @@ programmatic function call tree.
|
||||
|
||||
If FMEA can be applied to software we can build complete failure models
|
||||
of typical modern safety critical systems.
|
||||
With modular FMEA (FMMD) we have the concepts of failure~modes
|
||||
With modular FMEA, i.e.\ FMMD, %(FMMD)
|
||||
we have the concepts of failure~modes
|
||||
of components, {\fgs} and symptoms of failure for a functional group.
|
||||
|
||||
A programmatic function is very similar to a f via hardware interactionunctional group.
|
||||
It calls other functions, and uses data sources via hardware interaction, which could be viewed as its `components'.
|
||||
It has outputs which will be used by functions that may call it.
|
||||
map the FMMD concepts of {\fms}, {\fgs} and {\dcs}
|
||||
to software functions.
|
||||
A programmatic function has similarities with a {\fg} as defined by the FMMD process.
|
||||
%
|
||||
An FMMD {\fg} is placed into a hierarchy.
|
||||
A software function is placed into a hierarchy, that of its call-tree.
|
||||
A software function typically calls other functions and uses data sources via hardware interaction, which could be viewed as its `components'.
|
||||
It has outputs, i.e. it can perform actions
|
||||
on data or hardware
|
||||
which will be used by functions that may call it.
|
||||
|
||||
We can map a software function to a {\fg} in FMMD. Its failure modes
|
||||
are the failure modes of the software components (other functions it calls)
|
||||
and the hardware it reads values from.
|
||||
Its outputs are the data it changes, or the hardware actions it performs.
|
||||
|
||||
When we have analysed a software function, initially using its input failure modes,
|
||||
we can determine its symptoms of failure (how calling functions will see its failure mode behaviour).
|
||||
|
||||
We can thus apply the $\bowtie$ process to software functions, by viewing them in terms of their failure
|
||||
mode behaviour. To simplify things as well, software already fits into a hierarchy.
|
||||
For electronic and mechanical systems, although we may be guided by the original designers'
|
||||
concepts of modularity and sub-systems in design, applying FMMD means deciding on the members for {\fgs}
|
||||
and the subsequent hierarchy. With software already written, that hierarchy is fixed.
|
||||
|
||||
% map the FMMD concepts of {\fms}, {\fgs} and {\dcs}
|
||||
%to software functions.
|
||||
%
|
||||
%However, we need to map a the FMMD concepts of {\fms}, {\fgs} and {\dcs}
|
||||
to software functions.
|
||||
%to software functions.
|
||||
% failure modes of a function in order to
|
||||
%map FMMD to software.
|
||||
|
||||
@ -328,12 +395,21 @@ Because of this we can assume a direct call tree. Functions call functions
|
||||
from the top down and eventually call the lowest level library or IO
|
||||
functions that interact with hardware/electronics.
|
||||
|
||||
What is potentially difficult with a software function, is deciding what
|
||||
are failure modes, and later what are failure symptoms.
|
||||
With electronic components, we can use literature to point us to suitable sets of
|
||||
{\fms}~\cite{en298}~\cite{fmd91}~\cite{mil1991}~\cite{en61508}.
|
||||
With software, only some library functions are well known and rigorously documented
|
||||
enough to have the equivalent of known failure modes.
|
||||
Most software is `bespoke'. We need a different strategy to
|
||||
describe the failure mode behaviour of software functions.
|
||||
We can use definitions from contract programming to assist here.
|
||||
|
||||
\subsection{Contract programming description}
|
||||
|
||||
Contract programming is a discipline~\cite{dbcbe} for building software functions in a controlled
|
||||
and traceable way. Each function is subject to pre-conditions (constraints on its inputs),
|
||||
post-conditions (constraints` on its outpu'ts) and function wide invariants (rules).
|
||||
post-conditions (constraints on its outputs) and function wide invariants (rules).
|
||||
|
||||
|
||||
\paragraph{Mapping contract `pre-condition' violations to failure modes}
|
||||
@ -343,7 +419,7 @@ defines the correct ranges of input conditions for the function
|
||||
to operate successfully.
|
||||
|
||||
For a software function, a violation of a pre-condition is
|
||||
in effect a failure mode of `one of its com'ponents.
|
||||
in effect a failure mode of `one of its components'.
|
||||
|
||||
|
||||
\paragraph{Mapping contract `post-condition' violations to symptoms}
|
||||
@ -354,19 +430,52 @@ Post conditions could be either actions performed (i.e. the state of hardware
|
||||
|
||||
\paragraph{Mapping contract `invariant' violations to symptoms and failure modes}
|
||||
|
||||
Invariants in contract programming may apply to inputs to the function (where the can be considered {\fms} in FMMD terminology),
|
||||
and to outputs (where the can be considered {failure symptoms} in FMMD terminology).
|
||||
Invariants in contract programming may apply to inputs to the function (where they can be considered {\fms} in FMMD terminology),
|
||||
and to outputs (where they can be considered {failure symptoms} in FMMD terminology).
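
To make the mapping concrete, the following is a minimal C sketch of a function written in a contract style. The function \texttt{scale\_reading}, its input range and its status codes are assumptions made purely for illustration and do not appear in the software analysed later. A violated pre-condition is reported as the failure of one of the function's `components' (here, its input), and a violated post-condition is reported as a symptom visible to the caller.

```c
/* Hypothetical status codes: which part of the contract, if any, was broken. */
typedef enum {
    CONTRACT_OK,     /* pre- and post-conditions both held                  */
    PRE_VIOLATED,    /* input outside contract: a component failure mode    */
    POST_VIOLATED    /* output outside contract: a symptom of the function  */
} contract_status;

/* Hypothetical example: scale a 10-bit raw reading (0..1023) to 0..999. */
contract_status scale_reading(int raw, int *scaled)
{
    /* require: 0 <= raw <= 1023 */
    if (raw < 0 || raw > 1023)
        return PRE_VIOLATED;

    *scaled = (raw * 999) / 1023;

    /* ensure: 0 <= *scaled <= 999 */
    if (*scaled < 0 || *scaled > 999)
        return POST_VIOLATED;

    return CONTRACT_OK;
}
```

Returning a status code rather than aborting is a design choice for this sketch: it lets a calling function treat each kind of violation as a distinct, named failure mode of this function.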
|
||||
|
||||
|
||||
\subsection{Software FMEA}
|
||||
|
||||
For the purpose of example, we chose a simple common safety critical industrial circuit
|
||||
that is nearly always used in conjunction with a programmatic element.
|
||||
A common method for delivering a quantitative value in analogue electronics is
|
||||
to supply a current signal to represent it~\cite{aoe}[p.849].
|
||||
Usually, 4mA represents a zero or starting value and 20mA represents the full scale,
|
||||
and this is referred to as {\ft} signalling.
|
||||
%
|
||||
{\ft} has an electrical advantage as well: because the current in a loop is constant~\cite{aoe}[p.20],
|
||||
resistance in the wires between the source and the receiving end is not an issue
|
||||
that can alter the accuracy of the signal.
|
||||
%
|
||||
This circuit has many advantages for safety. If the signal becomes disconnected,
|
||||
it reads an out of range 0mA at the receiving end. This is outside the {\ft} range,
|
||||
and is therefore easy to detect as an error rather than an incorrect value.
|
||||
%
|
||||
Should the driving electronics go wrong at the source end, it will usually
|
||||
supply far too little or far too much current, making an error condition easy to detect.
|
||||
%
|
||||
At the receiving end, we only require one simple component to convert the
|
||||
current signal into a voltage that we can read with an ADC: the humble resistor!
|
||||
|
||||
|
||||
%BLOCK DIAGRAM HERE WITH FT CIRCUIT LOOP
|
||||
|
||||
\begin{figure}[h]
|
||||
\centering
|
||||
\includegraphics[width=230pt]{./ftcontext.png}
|
||||
% ftcontext.png: 767x385 pixel, 72dpi, 27.06x13.58 cm, bb=0 0 767 385
|
||||
\caption{Context Diagram for {\ft} loop}
|
||||
\label{fig:ftcontext}
|
||||
\end{figure}
|
||||
|
||||
|
||||
\subsection{Simple Software Example}
|
||||
|
||||
|
||||
Consider a function that reads a {\ft} input, and returns a value between 0 and 999 (i.e. per mil $\permil$)
|
||||
representing the current detected, with an additional error indication flag.
|
||||
|
||||
Let us assume the {\ft} detection is via a \ohms{220} resistor., and that we read a voltage
|
||||
Let us assume the {\ft} detection is via a \ohms{220} resistor, and that we read a voltage
|
||||
from an ADC into the software.
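
As a worked check of the voltages involved, assuming an ideal \ohms{220} load and Ohm's law $V = IR$, the two ends of the {\ft} range appear at the ADC input as:

$$ V_{4mA} = 0.004 \times 220 = 0.88V, \qquad V_{20mA} = 0.020 \times 220 = 4.4V $$

An ADC reading below about 0.88V or above about 4.4V therefore lies outside the {\ft} range.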
|
||||
Let us define any value outside the 4mA to 20mA range as an error condition.
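
The range check and scaling just described can be sketched as follows. This is not the paper's \textbf{read\_4\_20\_input} function: the helper name \texttt{volts\_to\_permil}, the constants and the truncating conversion are assumptions chosen for illustration, using the \ohms{220} resistor and the 4mA to 20mA limits stated above.

```c
#include <stdbool.h>

/* Assumed values taken from the text: 220 ohm sense resistor, 4-20 mA loop. */
#define SENSE_OHMS  220.0
#define LOOP_MIN_A  0.004
#define LOOP_MAX_A  0.020

/* Hypothetical helper: convert the sensed voltage to a value in 0..999.
 * Sets *error and returns 0 when the voltage is outside the 4-20 mA range. */
int volts_to_permil(double volts, bool *error)
{
    const double v_min = LOOP_MIN_A * SENSE_OHMS;  /* about 0.88 V */
    const double v_max = LOOP_MAX_A * SENSE_OHMS;  /* about 4.4 V  */

    if (volts < v_min || volts > v_max) {
        *error = true;    /* out of range: flag it, value is meaningless */
        return 0;
    }

    *error = false;
    /* Linear scaling of the valid range onto 0..999, truncated to int. */
    return (int)((volts - v_min) / (v_max - v_min) * 999.0);
}
```

Note that the error flag only catches out of range readings. A voltage that is wrong but still within the valid range is scaled and returned as if it were correct.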
|
||||
%
|
||||
@ -423,15 +532,20 @@ int read_4_20_input ( int * value ) {
|
||||
%}
|
||||
\label{fig:code_read_4_20_input}
|
||||
\caption{Software Function: \textbf{read\_4\_20\_input}}
|
||||
\label{fig:420i}
|
||||
%\label{fig:420i}
|
||||
\end{figure}
|
||||
|
||||
We now look at the function called by \textbf{read\_4\_20\_input}, \textbf{read\_ADC}, which returns a
|
||||
voltage for a given ADC channel. This function
|
||||
deals directly with the hardware in the micro-controller we are running the software on.
|
||||
voltage for a given ADC channel.
|
||||
%
|
||||
This function
|
||||
deals directly with the hardware in the micro-controller that we are running the software on.
|
||||
%
|
||||
Its job is to select the correct channel (ADC multiplexer) and then to initiate a
|
||||
conversion by setting an ADC 'go' bit (see code sample in figure~\ref{code_read_ADC}).
|
||||
It takes the raw ADC reading and converts it into a floating point\footnote{the type, `double' or `double precision', is a standard C language floating point type~\cite{kandr}.}
|
||||
%
|
||||
It takes the raw ADC reading and converts it into a
|
||||
floating point\footnote{the type, `double' or `double precision', is a standard C language floating point type~\cite{kandr}.}
|
||||
voltage value.
|
||||
|
||||
|
||||
@ -497,12 +611,17 @@ We now have a very simple software structure, a call tree, shown in figure~\ref{
|
||||
\label{fig:ct1}
|
||||
\end{figure}
|
||||
|
||||
This software is above the hardware in the call tree.
|
||||
FMEA is always a bottom-up process and so we must being with the hardware.
|
||||
This software is above the hardware in the conceptual call tree; that is, in software terms, the
|
||||
software is reading values from the `lower~level' electronics.
|
||||
%
|
||||
FMEA is always a bottom-up process and so we must begin with this hardware.
|
||||
%
|
||||
The hardware is simply a load resistor, connected across an ADC input
|
||||
pin on the micro-controller and ground.
|
||||
%
|
||||
We can identify the resistor and the ADC module of the micro-controller as
|
||||
the base components in this design.
|
||||
%
|
||||
We now apply FMMD starting with the hardware.
|
||||
|
||||
|
||||
@ -573,7 +692,7 @@ With these failure modes, we can analyse our first functional group, see table~r
|
||||
|
||||
|
||||
We now have the symptoms for the hardware functional group, $\{ HIGH , LOW, V\_ERR \} $.
|
||||
We can now create a {\dc} to represent this called $CMATV$.
|
||||
We now create a {\dc} to represent this called $CMATV$.
|
||||
As its failure modes are the symptoms of failure from the functional group, we can now state:
|
||||
|
||||
$$fm ( CMATV ) = \{ HIGH , LOW, V\_ERR \} $$
|
||||
@ -604,7 +723,7 @@ $$ fm(RA) = \{ CHAN\_NO, VREF \} $$
|
||||
As we have a failure mode model for our function, we can now use it in conjunction with
|
||||
the ADC hardware {\dc} $CMATV$, to form a {\fg}, where $G=\{ CMATV, Read\_ADC \}$.
|
||||
|
||||
We can now analyse this hardware/software combined {\fg}.
|
||||
We now analyse this hardware/software combined {\fg}.
|
||||
|
||||
|
||||
|
||||
@ -647,7 +766,7 @@ We can now analyse this hardware/software combined {\fg}.
|
||||
|
||||
|
||||
|
||||
We can now see that the symptoms of failure for the {\fg} analysed
|
||||
We now have the symptoms of failure for the {\fg} analysed (see table~\ref{tbl:radc})
|
||||
as $\{ VV\_ERR, HIGH, LOW \}$. We can add as well the violation of the postcondition
|
||||
for the function.
|
||||
This postcondition, {\em /* ensure: value is voltage input to within 0.1\% */ },
|
||||
@ -720,7 +839,7 @@ For single failures these are the two ways in which this function
|
||||
can fail. An $OUT\_OF\_RANGE$ will be flagged by the error flag variable.
|
||||
The $VAL\_ERR$ will mean that the value read is simply wrong.
|
||||
|
||||
We can now finally make a {\dc} to represent a failure mode model for our function $read\_4\_20\_input$ thus:
|
||||
We can finally make a {\dc} to represent a failure mode model for our function $read\_4\_20\_input$ thus:
|
||||
|
||||
$$fm(R420I) = \{OUT\_OF\_RANGE, VAL\_ERR\}$$
|
||||
|
||||
|