PUT some DAGS in CH5 and tried to shorten the

software FMMD paper
This commit is contained in:
Robin Clark 2012-06-16 20:45:47 +01:00
parent 443732764e
commit 5b322c0695
2 changed files with 342 additions and 134 deletions


@ -101,10 +101,21 @@ failure mode of the component or sub-system}}}
\setlength{\headsep}{0in} \setlength{\headsep}{0in}
\setlength{\textheight}{22cm} \setlength{\textheight}{22cm}
\setlength{\textwidth}{18cm} \setlength{\textwidth}{18cm}
\setlength{\textheight}{24cm}
%\setlength{\textwidth}{20cm}
\setlength{\oddsidemargin}{0in} \setlength{\oddsidemargin}{0in}
\setlength{\evensidemargin}{0in} \setlength{\evensidemargin}{0in}
\setlength{\parindent}{0.0in} \setlength{\parindent}{0.0in}
\setlength{\parskip}{6pt} %\setlength{\parskip}{6pt}
% \setlength{\parskip}{1cm plus4mm minus3mm}
\setlength{\parskip}{0pt}
\setlength{\parsep}{0pt}
\setlength{\headsep}{0pt}
\setlength{\topskip}{0pt}
\setlength{\topmargin}{0pt}
\setlength{\topsep}{0pt}
\setlength{\partopsep}{0pt}
%\linespread{0.5}
\begin{document} \begin{document}
%\pagestyle{fancy} %\pagestyle{fancy}
@ -123,7 +134,7 @@ failure mode of the component or sub-system}}}
} }
%\title{Developing a rigorous bottom-up modular static failure mode modelling methodology} %\title{Developing a rigorous bottom-up modular static failure mode modelling methodology}
\title{Applying FMMD across the Software/Hardware Interface} \title{Applying Failure Mode Modular De-Composition (FMMD) across the Software/Hardware Interface}
%\nodate %\nodate
\maketitle \maketitle
@ -131,44 +142,43 @@ failure mode of the component or sub-system}}}
\paragraph{Keywords:} static failure mode modelling safety-critical software fmea \paragraph{Keywords:} static failure mode modelling safety-critical software fmea
%\small %\small
\abstract{ \em \abstract{ % \em
\input{abs} %\input{abs}
%The certification process of safety critical products for European and %The certification process of safety critical products for European and
%other international standards often demand environmental stress, %other international standards often demand environmental stress,
%endurance and Electro Magnetic Compatibility (EMC) testing. Theoretical, or 'static testing', %endurance and Electro Magnetic Compatibility (EMC) testing. Theoretical, or 'static testing',
%is often also required. %is often also required.
% %
% Failure Mode Effects Analysis (FMEA) is a bottom-up technique that aims to assess the effect of all Failure Mode Effects Analysis (FMEA) is a bottom-up technique that aims to assess the effect of all
% component failure modes on a system. component failure modes on a system.
% It is used both as a design tool (to determine weaknesses), and is a requirement of certification of safety critical products. It is used both as a design tool (to determine weaknesses), and is a requirement of certification of safety critical products.
% FMEA has been successfully applied to mechanical, electrical and hybrid electro-mechanical systems. FMEA has been successfully applied to mechanical, electrical and hybrid electro-mechanical systems.
% %
% Work on software FMEA (SFMEA) is beginning, but Work on software FMEA (SFMEA) is beginning, but
% at present no technique for SFMEA that at present no technique for SFMEA that
% integrates hardware and software models % known to the authors integrates hardware and software models % known to the authors
% exists. exists.
% % % %
% Software generally sits on top of most modern safety critical control systems Software generally sits on top of most modern safety critical control systems
% and defines its most important system wide behaviour and communications. and defines its most important system wide behaviour and communications.
% Currently standards that demand FMEA for hardware (e.g. EN298, EN61508), Currently standards that demand FMEA for hardware (e.g. EN298, EN61508),
% do not specify it for software, but instead specify good practice, do not specify it for software, but instead specify good practice,
% review processes and language feature constraints. review processes and language feature constraints.
% %
% %This is a weakness; w This is a weakness
% Where FMEA % scientifically Where FMEA % scientifically
% traces component {\fms} traces component {\fms}
% to resultant system failures, software has been left in a non-analytical to resultant system failures, software has been left in a non-analytical
% limbo of best practices and constraints. limbo of best practices and constraints.
% % % %
% If software and hardware integrated FMEA were possible, electro-mechanical-software hybrids could If software and hardware integrated FMEA were possible, electro-mechanical-software hybrids could
% be modelled; and could thus be `complete' failure mode models. be modelled; and could thus be `complete' failure mode models.
% %Failure modes in components in say a sensor, could be traced Failure modes in components in say a sensor, could be traced
% %up through the electronics and then through the controlling software. up through the electronics and then through the controlling software.
% Presently, FMEA stops at the glass ceiling of the computer program. Presently, FMEA stops at the glass ceiling of the computer program.
% This paper presents a modular variant of FMEA, Failure Mode Modular De-Composition (FMMD), a methodology which
% This paper presents a modular variant of FMEA, Failure Mode Modular De-Composition (FMMD), a methodology which can be applied to software, and is compatible
% can be applied to software, and is compatible and integrate-able with FMMD performed on mechanical and electronic systems.
% and integrate-able with FMMD performed on mechanical and electronic systems.
} }
\today \today
@ -204,26 +214,24 @@ Modern control systems nearly always have a significant software/firmware elemen
and not being able to model software with current FMEA methodologies and not being able to model software with current FMEA methodologies
is a cause for criticism~\cite{easw}~\cite{safeware}~\cite{bfmea}. is a cause for criticism~\cite{easw}~\cite{safeware}~\cite{bfmea}.
%Several variants of FMEA exist,
% traditional FMEA being associated with the manufacturing industry, with the aims of prioritising
Several variants of FMEA exist, % the failures to fix in order of cost.
traditional FMEA being associated with the manufacturing industry, with the aims of prioritising %
the failures to fix in order of cost. % Design FMEA (DFMEA) is FMEA applied at the design or approvals stage
% where the aim is to ensure that single component failures cannot
Design FMEA (DFMEA) is FMEA applied at the design or approvals stage % cause unacceptable system level events.
where the aim is to ensure that single component failures cannot %
cause unacceptable system level events. % Failure Mode Effect Criticality Analysis (FMECA) is applied to determine the most potentially dangerous or damaging
% failure modes to fix.
Failure Mode Effect Criticality Analysis (FMECA) is applied to determine the most potentially dangerous or damaging %
failure modes to fix. %
% Failure Mode Effects and Diagnostics Analysis is FMEA performed to
% determine a statistical level of safety.
Failure Mode Effects and Diagnostics Analysis is FMEA performed to % This is associated with Safety Integrity Levels (SIL)~\cite{en61508}~\cite{en61511} classification.
determine a statistical level of safety. %
This is associated with Safety Integrity Levels (SIL)~\cite{en61508}~\cite{en61511} classification. %FMMD is a modularisation of FMEA and can produce failure~mode models that can be used in
%all the above variants of FMEA.
FMMD is a modularisation of FMEA and can produce failure~mode models that can be used in
all the above variants of FMEA.
\subsection{Current work on Software FMEA} \subsection{Current work on Software FMEA}
@ -257,14 +265,14 @@ we have yet another layer of complication.
In order to integrate software in a meaningful way, we need to re-think the In order to integrate software in a meaningful way, we need to re-think the
FMEA concept of simply mapping a base component failure to a system level event. FMEA concept of simply mapping a base component failure to a system level event.
%
One strategy would be to modularise FMEA. To break down the failure effect One strategy would be to modularise FMEA. To break down the failure effect
reasoning into small modules. reasoning into small modules from the bottom-up.
% %
If we pre-analyse modules, and they If we pre-analyse modules, and they
can be combined with others, into can be combined with others, into
larger sub-systems, we can eventually form a hierarchy of failure mode behaviour for the entire system. larger sub-systems, we eventually form a hierarchy
of failure mode behaviour for the entire system.
% %
With higher level modules, we can reach the level in which the software resides. With higher level modules, we can reach the level in which the software resides.
% %
@ -297,20 +305,22 @@ we now treat the {\fg} as a {\dc}, where the failure modes of the {\dc} are the
We can now use {\dcs} to build higher level {\fgs} until we have a complete hierarchical model We can now use {\dcs} to build higher level {\fgs} until we have a complete hierarchical model
of the failure mode behaviour of a system. An example of this process, applied to an inverting op-amp configuration of the failure mode behaviour of a system. An example of this process, applied to an inverting op-amp configuration
is given in~\cite{syssafe2011}. is given in~\cite{syssafe2011}.
%This methodology is called Failure Mode Modular De-Composition (FMMD).
\paragraph{FMMD, the process.} \paragraph{FMMD, the process.}
The main aim of Failure Mode Modular Discrimination (FMMD) is to build a hierarchy of failure behaviour from the {\bc} The main aim of %Failure Mode Modular Discrimination (FMMD)
FMMD is to build a hierarchy of failure behaviour from the {\bc}
level up to the top, or system level, with analysis stages ({\fgs}) %and corresponding {\dcs} level up to the top, or system level, with analysis stages ({\fgs}) %and corresponding {\dcs}
between each between each
transition to a higher level in the hierarchy. transition to a higher level in the hierarchy.
%
The first stage is to choose The first stage is to choose
{\bcs} that interact and naturally form {\fgs}. The initial {\fgs} are collections of base components. {\bcs} that interact and naturally form {\fgs}. The initial {\fgs} are collections of base components.
%These parts all have associated fault modes. A module is a set fault~modes. %These parts all have associated fault modes. A module is a set fault~modes.
From the point of view of fault analysis, we are not interested in the components themselves, but in the ways in which they can fail. From the point of view of fault analysis,
we are not interested in the components themselves, but in the ways in which they can fail.
%
A {\fg} is a collection of components that perform some simple task or function. A {\fg} is a collection of components that perform some simple task or function.
% %
In order to determine how a {\fg} can fail, In order to determine how a {\fg} can fail,
@ -320,7 +330,7 @@ By analysing the fault behaviour of a `{\fg}' with respect to all its components
we can determine its symptoms of failure. we can determine its symptoms of failure.
%In fact we can call these %In fact we can call these
%the symptoms of failure for the {\fg}. %the symptoms of failure for the {\fg}.
%
With these symptoms (a set of derived faults from the perspective of the {\fg}) With these symptoms (a set of derived faults from the perspective of the {\fg})
we can now state that the {\fg} we can now state that the {\fg}
% (as an entity in its own right) % (as an entity in its own right)
@ -357,9 +367,7 @@ of the {\fg} from which it was derived.
We can use the symbol `$\derivec$' to represent the creation of a derived component We can use the symbol `$\derivec$' to represent the creation of a derived component
from a {\fg}. This symbol is convenient for drawn hierarchy diagrams. % (see figure~\ref{fmmdh}). from a {\fg}. This symbol is convenient for drawn hierarchy diagrams. % (see figure~\ref{fmmdh}).
We define the $\derivec$ function, where $\FG$ is the set of all {\fgs} and $\DC$ is the set of all {\dcs}, We define the $\derivec$ function, where $\FG$ is the set of all {\fgs} and $\DC$ is the set of all {\dcs},
$ \derivec ( {\FG} ) \mapsto {\DC} .$
$$ \derivec ( {\FG} ) \mapsto {\DC} .$$
We show an FMMD hierarchy in figure~\ref{fig:fmmdh}. We show an FMMD hierarchy in figure~\ref{fig:fmmdh}.
Using this diagram, we can follow the creation of the hierarchy in Using this diagram, we can follow the creation of the hierarchy in
a theoretical system. a theoretical system.
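The $\derivec$ mapping defined above can be sketched in C. This is a minimal illustration under our own assumptions, not a definitive implementation: the type names \texttt{component} and \texttt{analysis\_row}, the function \texttt{derive} and the capacity \texttt{MAX\_MODES} are hypothetical and do not appear in the FMMD literature. The sketch shows only the final step of an analysis stage, where the distinct symptoms of an analysed {\fg} are collected as the failure modes of a new {\dc}.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_MODES 8  /* arbitrary capacity chosen for this sketch */

/* A component, base or derived, viewed only as a named set of failure modes. */
typedef struct {
    const char *name;
    const char *modes[MAX_MODES];
    size_t n_modes;
} component;

/* One line of a functional group's analysis: a failure mode of a member
   component and the symptom it produces at the functional group level. */
typedef struct {
    const char *cause;
    const char *symptom;
} analysis_row;

/* Return 1 if the component already lists the failure mode m, else 0. */
static int has_mode(const component *c, const char *m) {
    for (size_t i = 0; i < c->n_modes; i++)
        if (strcmp(c->modes[i], m) == 0)
            return 1;
    return 0;
}

/* Sketch of the derive mapping: collect the distinct symptoms of an analysed
   functional group as the failure modes of a new derived component. */
component derive(const char *name, const analysis_row *rows, size_t n_rows) {
    component dc = { name, { NULL }, 0 };
    assert(rows != NULL || n_rows == 0);  /* require: an analysis to collect from */
    for (size_t i = 0; i < n_rows; i++) {
        if (!has_mode(&dc, rows[i].symptom)) {
            assert(dc.n_modes < MAX_MODES);  /* require: capacity not exceeded */
            dc.modes[dc.n_modes++] = rows[i].symptom;
        }
    }
    return dc;
}
```

Several component failure modes may map to the same symptom, so the {\dc} will usually have no more failure modes than the sum of those of its members. The reasoning that fills in each \texttt{analysis\_row} remains the analyst's task; the sketch only records its result.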
@ -394,7 +402,7 @@ of typical modern safety critical systems.
With modular FMEA i.e. FMMD %(FMMD) With modular FMEA i.e. FMMD %(FMMD)
we have the concepts of failure~modes we have the concepts of failure~modes
of components, {\fgs} and symptoms of failure for a functional group. of components, {\fgs} and symptoms of failure for a functional group.
%
A programmatic function has similarities with a {\fg} as defined by the FMMD process. A programmatic function has similarities with a {\fg} as defined by the FMMD process.
% %
An FMMD {\fg} is placed into a hierarchy. An FMMD {\fg} is placed into a hierarchy.
@ -403,7 +411,7 @@ A software function typically calls other functions and uses data sources via ha
It has outputs, i.e. it can perform actions It has outputs, i.e. it can perform actions
on data or hardware on data or hardware
which will be used by functions that may call upon it. which will be used by functions that may call upon it.
%
We can map a software function to a {\fg} in FMMD. We can map a software function to a {\fg} in FMMD.
% %
Its failure modes Its failure modes
@ -415,7 +423,7 @@ Its outputs are the data it changes, or the hardware actions it performs.
When we have analysed a software function---using failure conditions When we have analysed a software function---using failure conditions
of its inputs as failure modes---we can of its inputs as failure modes---we can
determine its symptoms of failure (i.e. how calling functions will see its failure mode behaviour). determine its symptoms of failure (i.e. how calling functions will see its failure mode behaviour).
%
We can thus apply the $\derivec$ function to software functions, by viewing them in terms of their failure We can thus apply the $\derivec$ function to software functions, by viewing them in terms of their failure
mode behaviour. Conveniently, software already fits into a hierarchy. mode behaviour. Conveniently, software already fits into a hierarchy.
For Electronics and Mechanical systems, although we may be guided by the original designers For Electronics and Mechanical systems, although we may be guided by the original designers
@ -464,7 +472,7 @@ post-conditions (constraints on its outputs) and function wide invariants (rules
A precondition, or requirement for a contract software function A precondition, or requirement for a contract software function
defines the correct ranges of input conditions for the function defines the correct ranges of input conditions for the function
to operate successfully. to operate successfully.
%
For a software function, a violation of a pre-condition is For a software function, a violation of a pre-condition is
in effect a failure mode of `one of its components'. in effect a failure mode of `one of its components'.
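As an illustration of treating a pre-condition violation as a failure mode, the following C sketch checks a contract on its input and reports a violation as a named failure mode rather than as a bare error flag. It is a sketch under stated assumptions: the names \texttt{failure\_mode}, \texttt{FM\_VRNGE} and \texttt{scale\_4\_20} are hypothetical, the 0.88 to 4.4 volt range is borrowed from the {\ft} example used later in this paper, and the linear scaling onto 0 to 999 is our reading of that example's post-condition.

```c
#include <assert.h>
#include <stddef.h>

#define V_MIN 0.88  /* volts: lower bound of the require clause */
#define V_MAX 4.4   /* volts: upper bound of the require clause */

/* Failure modes of the function, seen as a component in a functional group. */
typedef enum { FM_NONE = 0, FM_VRNGE = 1 } failure_mode;

/* require: input_volts between V_MIN and V_MAX
   ensure:  *value is proportional (0-999) to the input
   A violated pre-condition is returned as the failure mode FM_VRNGE
   and *value is left untouched. */
failure_mode scale_4_20(double input_volts, int *value) {
    assert(value != NULL);
    if (input_volts < V_MIN || input_volts > V_MAX)
        return FM_VRNGE;
    /* linear map of [V_MIN, V_MAX] onto [0, 999], rounded to nearest */
    *value = (int)((input_volts - V_MIN) / (V_MAX - V_MIN) * 999.0 + 0.5);
    return FM_NONE;
}
```

Returning the failure mode explicitly lets a calling function treat it as an input failure condition in its own analysis, which is how the symptoms of one stage become the failure modes of the next.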
@ -553,7 +561,7 @@ value which represents the voltage read (see code sample in figure~\ref{fig:code
%%{\vbox{ %%{\vbox{
\begin{figure}[h+] \begin{figure}[h+]
\footnotesize \tiny
\begin{verbatim} \begin{verbatim}
/***********************************************/ /***********************************************/
/* read_4_20_input() */ /* read_4_20_input() */
@ -565,11 +573,8 @@ value which represents the voltage read (see code sample in figure~\ref{fig:code
int read_4_20_input ( int * value ) { int read_4_20_input ( int * value ) {
double input_volts; double input_volts;
int error_flag; int error_flag;
/* require: input from ADC to be /* require: input from ADC to be
between 0.88 and 4.4 volts */ between 0.88 and 4.4 volts */
input_volts = read_ADC(INPUT_4_20_mA); input_volts = read_ADC(INPUT_4_20_mA);
if ( input_volts < 0.88 || input_volts > 4.4 ) { if ( input_volts < 0.88 || input_volts > 4.4 ) {
@ -579,10 +584,8 @@ int read_4_20_input ( int * value ) {
*value = (input_volts - 0.88) / ( 4.4 - 0.88 ) * 999.0; *value = (input_volts - 0.88) / ( 4.4 - 0.88 ) * 999.0;
error_flag = 0; /* indicate current input in range */ error_flag = 0; /* indicate current input in range */
} }
/* ensure: value is proportional (0-999) to the /* ensure: value is proportional (0-999) to the
4 to 20mA input */ 4 to 20mA input */
return error_flag; return error_flag;
} }
\end{verbatim} \end{verbatim}
@ -614,7 +617,7 @@ voltage value.
%{\vbox{ %{\vbox{
\begin{figure}[h+] \begin{figure}[h+]
\footnotesize \tiny
\begin{verbatim} \begin{verbatim}
/***********************************************/ /***********************************************/
/* read_ADC() */ /* read_ADC() */
@ -636,25 +639,18 @@ double read_ADC( int channel ) {
/* if invalid channel selected */ /* if invalid channel selected */
if ( channel > ADC_CHAN_RANGE ) if ( channel > ADC_CHAN_RANGE )
return -2.0; return -2.0;
/* set the multiplexer to the desired channel */ /* set the multiplexer to the desired channel */
ADCMUX = channel; ADCMUX = channel;
ADCGO = 1; /* initiate ADC conversion hardware */ ADCGO = 1; /* initiate ADC conversion hardware */
/* wait for ADC conversion with timeout */ /* wait for ADC conversion with timeout */
while ( ADCGO == 1 && timeout < 100 ) while ( ADCGO == 1 && timeout < 100 )
timeout++; timeout++;
if ( timeout < 100 ) if ( timeout < 100 )
dval = (double) ADCOUT * 5.0 / ADCRANGE; dval = (double) ADCOUT * 5.0 / ADCRANGE;
else else
dval = -1.0; /* indicate invalid reading */ dval = -1.0; /* indicate invalid reading */
/* return voltage as a floating point value */ /* return voltage as a floating point value */
/* ensure: value is voltage input to within 0.1% */ /* ensure: value is voltage input to within 0.1% */
return dval; return dval;
} }
\end{verbatim} \end{verbatim}
@ -695,7 +691,9 @@ We now apply FMMD starting with the hardware.
This functional group contains the load resistor This functional group contains the load resistor
and the physical Analogue to Digital Converter (ADC). and the physical Analogue to Digital Converter (ADC).
%
Our functional group, $G_1$ is thus the set of base components: $G_1 = \{R, ADC\}$. Our functional group, $G_1$ is thus the set of base components: $G_1 = \{R, ADC\}$.
%
We now determine the {\fms} of all the components in $G_1$. We now determine the {\fms} of all the components in $G_1$.
For the resistor we can use a failure mode set from the literature~\cite{en298}. For the resistor we can use a failure mode set from the literature~\cite{en298}.
Where the function $fm$ returns a set of failure modes for a given component we can state: Where the function $fm$ returns a set of failure modes for a given component we can state:
@ -758,10 +756,10 @@ With these failure modes, we can analyse our first functional group, see table~\
We now collect the symptoms for the hardware functional group, $\{ HIGH , LOW, V\_ERR \} $. We now collect the symptoms for the hardware functional group, $\{ HIGH , LOW, V\_ERR \} $.
We now create a {\dc} to represent this called $CMATV$. We now create a {\dc} to represent this called $CMATV$.
%
We can express this using the `$\derivec$' function thus: We can express this using the `$\derivec$' function thus:
$$ CMATV = \; \derivec (G_1) .$$ $$ CMATV = \; \derivec (G_1) .$$
%
As its failure modes are the symptoms of failure from the functional group we can now state: As its failure modes are the symptoms of failure from the functional group we can now state:
$$fm ( CMATV ) = \{ HIGH , LOW, V\_ERR \} .$$ $$fm ( CMATV ) = \{ HIGH , LOW, V\_ERR \} .$$
@ -770,8 +768,7 @@ $$fm ( CMATV ) = \{ HIGH , LOW, V\_ERR \} .$$
The software function $Read\_ADC$ uses the ADC hardware analysed The software function $Read\_ADC$ uses the ADC hardware analysed
as the {\dc} CMATV above. as the {\dc} CMATV above.
%
The code fragment in figure~\ref{fig:code_read_ADC} states pre-conditions, as The code fragment in figure~\ref{fig:code_read_ADC} states pre-conditions, as
{\em/* require: a) input channel from ADC to be {\em/* require: a) input channel from ADC to be
in valid ADC range in valid ADC range
@ -787,7 +784,7 @@ The reference voltage for the ADC has a 0.1\% accuracy requirement.
% %
If the reference value is outside of this, it is also a {\fm} If the reference value is outside of this, it is also a {\fm}
of this function, which we can call $V\_REF$. of this function, which we can call $V\_REF$.
%
Taken as a component for use in FMEA/FMMD our function has Taken as a component for use in FMEA/FMMD our function has
two failure modes. We can therefore treat it as a generic component, $Read\_ADC$, two failure modes. We can therefore treat it as a generic component, $Read\_ADC$,
by stating: by stating:
@ -796,7 +793,7 @@ $$ fm(Read\_ADC) = \{ CHAN\_NO, VREF \} $$
As we have a failure mode model for our function, we can now use it in conjunction with As we have a failure mode model for our function, we can now use it in conjunction with
the ADC hardware {\dc} CMATV, to form a {\fg} $G_2$, where $G_2 =\{ CMATV, Read\_ADC \}$. the ADC hardware {\dc} CMATV, to form a {\fg} $G_2$, where $G_2 =\{ CMATV, Read\_ADC \}$.
%
We now analyse this hardware/software combined {\fg}. We now analyse this hardware/software combined {\fg}.
@ -845,7 +842,7 @@ as $\{ VV\_ERR, HIGH, LOW \}$. We can add as well the violation of the postcondi
for the function. for the function.
This postcondition, {\em /* ensure: value is voltage input to within 0.1\% */ }, This postcondition, {\em /* ensure: value is voltage input to within 0.1\% */ },
corresponds to $VV\_ERR$, and is already in the {\fm} set for this {\fg}. corresponds to $VV\_ERR$, and is already in the {\fm} set for this {\fg}.
%
We can now create a {\dc} called $RADC$ thus: $$RADC = \; \derivec(G_2)$$ which has the following We can now create a {\dc} called $RADC$ thus: $$RADC = \; \derivec(G_2)$$ which has the following
{\fms}: {\fms}:
@ -860,6 +857,7 @@ $$ fm(RADC) = \{ VV\_ERR, HIGH, LOW \} .$$
This function sits on top of the $RADC$ {\dc} determined above. This function sits on top of the $RADC$ {\dc} determined above.
We look at the pre-conditions for the function $read\_4\_20\_input$ , % which we can call $RI$ We look at the pre-conditions for the function $read\_4\_20\_input$ , % which we can call $RI$
to determine its {\fms}. to determine its {\fms}.
%
Its pre-condition is, {\em /* require: input from ADC to be between 0.88 and 4.4 volts */}. Its pre-condition is, {\em /* require: input from ADC to be between 0.88 and 4.4 volts */}.
We can map this violation of the pre-condition, to the {\fm} VRNGE; %As this function has one pre-condition We can map this violation of the pre-condition, to the {\fm} VRNGE; %As this function has one pre-condition
we can state, we can state,
@ -913,7 +911,7 @@ The postcondition for the function $read\_4\_20\_input$, {\em /* ensure: value i
For single failures these are the two ways in which this function For single failures these are the two ways in which this function
can fail. An $OUT\_OF\_RANGE$ will be flagged by the error flag variable. can fail. An $OUT\_OF\_RANGE$ will be flagged by the error flag variable.
The $VAL\_ERR$ will mean that the value read is simply wrong. The $VAL\_ERR$ will mean that the value read is simply wrong.
%
We can finally make a {\dc} to represent a failure mode model for our function $read\_4\_20\_input$ thus: We can finally make a {\dc} to represent a failure mode model for our function $read\_4\_20\_input$ thus:
$$ R420I = \; \derivec(G_3) .$$ $$ R420I = \; \derivec(G_3) .$$
@ -944,21 +942,18 @@ as a hierarchical diagram, see figure~\ref{fig:hd}.
We can represent the hierarchy in figure~\ref{fig:hd} algebraically, using the `$\derivec$' function We can represent the hierarchy in figure~\ref{fig:hd} algebraically, using the `$\derivec$' function
using the groups as intermediate stages: using the groups as intermediate stages:
\begin{eqnarray*} % \begin{eqnarray*}
G_1 &=& \{R,ADC\} \\ % G_1 &=& \{R,ADC\} \\
CMATV &=& \;\derivec (G_1) \\ % CMATV &=& \;\derivec (G_1) \\
G_2 &=& \{CMATV, read\_ADC \} \\ % G_2 &=& \{CMATV, read\_ADC \} \\
RADC &=& \; \derivec (G_2) \\ % RADC &=& \; \derivec (G_2) \\
G_3 &=& \{ RADC, read\_4\_20\_input \} \\ % G_3 &=& \{ RADC, read\_4\_20\_input \} \\
R420I &=& \; \derivec (G_3) \\ % R420I &=& \; \derivec (G_3) \\
\end{eqnarray*} % \end{eqnarray*}
or, a nested definition, %or,
with a nested definition,
$$ \derivec \Big( \derivec \big( \derivec(R,ADC), read\_ADC \big), read\_4\_20\_input \Big). $$ $$ \derivec \Big( \derivec \big( \derivec(R,ADC), read\_ADC \big), read\_4\_20\_input \Big). $$
This nested structure means that we have multiple traceable This nested structure means that we have multiple traceable
stages of failure mode reasoning in our analysis. Traditional FMEA would have only one stage stages of failure mode reasoning in our analysis. Traditional FMEA would have only one stage
of reasoning for each component failure mode. of reasoning for each component failure mode.
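The traceable stages in this hierarchy can be counted mechanically. The C sketch below records which members each {\dc} was derived from and computes the number of analysis stages beneath it. The \texttt{node} type and the \texttt{fmmd\_level} function are hypothetical names introduced for illustration, and placing the two software functions at level 0 alongside the base components is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_MEMBERS 4  /* arbitrary capacity chosen for this sketch */

/* A component and the functional group members it was derived from.
   Base components have no members. */
typedef struct node {
    const char *name;
    const struct node *members[MAX_MEMBERS];
    size_t n_members;
} node;

/* Number of analysis stages between a component and its deepest base
   component: 0 for a base component, 1 + the deepest member otherwise. */
int fmmd_level(const node *c) {
    int deepest = -1;
    assert(c != NULL);
    for (size_t i = 0; i < c->n_members; i++) {
        int level = fmmd_level(c->members[i]);
        if (level > deepest)
            deepest = level;
    }
    return deepest + 1;
}

/* The hierarchy of the worked example, using the names from the text. */
static const node R            = { "R",               { NULL }, 0 };
static const node ADC          = { "ADC",             { NULL }, 0 };
static const node read_ADC_fn  = { "read_ADC",        { NULL }, 0 };
static const node read_4_20_fn = { "read_4_20_input", { NULL }, 0 };
static const node CMATV = { "CMATV", { &R, &ADC },             2 };  /* from G_1 */
static const node RADC  = { "RADC",  { &CMATV, &read_ADC_fn }, 2 };  /* from G_2 */
static const node R420I = { "R420I", { &RADC, &read_4_20_fn }, 2 };  /* from G_3 */
```

For this example \texttt{fmmd\_level(\&R420I)} evaluates to 3, one stage for each of $G_1$, $G_2$ and $G_3$. A traditional FMEA of the same system would map each base component failure mode to a system level failure in a single step.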
@ -971,19 +966,18 @@ if anything goes wrong, we should be able to detect it.
In fact unless all electrical elements in the loop In fact unless all electrical elements in the loop
are in working order we will detect a failure in are in working order we will detect a failure in
the majority of cases. the majority of cases.
\subsection{Sending side of a {\ft} loop} \paragraph{Sending side of a {\ft} loop}
A current loop has to be actively maintained. If the sending side loses power, A current loop has to be actively maintained. If the sending side loses power,
the current will drop to zero, and thus be detectable as an error because it is below 4mA. the current will drop to zero, and thus be detectable as an error because it is below 4mA.
Should the sending circuitry fail, it is far more likely to drive too high or too low, rather than supply Should the sending circuitry fail, it is far more likely to drive too high or too low, rather than supply
an erroneous but in-bounds ($\ge 4mA \wedge \le 20mA$) value. an erroneous but in-bounds ($\ge 4mA \wedge \le 20mA$) value.
\subsection{Receiving side of a {\ft} loop} \paragraph{Receiving side of a {\ft} loop}
The most common fault is disconnection, and this is easily detected ($0mA\; \le \; 4mA$--out of bounds). The most common fault is disconnection, and this is easily detected ($0mA\; \le \; 4mA$--out of bounds).
Other failure modes, such as the resistor going open or shorted Other failure modes, such as the resistor going open or shorted
also immediately push the voltage signal out of bounds. also immediately push the voltage signal out of bounds.
The software side of the interface is easy to test, either as software modules The software side of the interface is easy to test, either as software modules
or as an integrated system (hand-held precision current sources are cheaply available). or as an integrated system (hand-held precision current sources are cheaply available).
\paragraph{What could go wrong---Production}
\subsection{What could go wrong---Production}
PCB construction contractors are well known for random polarity placement of diodes. PCB construction contractors are well known for random polarity placement of diodes.
Less likely is that the resistor fitted will be an incorrect value, which could Less likely is that the resistor fitted will be an incorrect value, which could
lead to the range being incorrect. Were this the case, we would have to be very unlucky lead to the range being incorrect. Were this the case, we would have to be very unlucky
@ -996,7 +990,7 @@ erroneously chosen (this would be a cheaper component), and could contribute sma
%\clearpage %\clearpage
\section{Conclusion} \section{Conclusion}
%
The {\dc} representing the {\ft} reader The {\dc} representing the {\ft} reader
in software shows that by taking a modular approach for FMEA, we can integrate in software shows that by taking a modular approach for FMEA, we can integrate
software and electro-mechanical FMEA models. software and electro-mechanical FMEA models.
@ -1009,12 +1003,12 @@ With traditional FMEA methods the reasoning~distance is large, because
it stretches from the component failure mode to the top---or---system level failure. it stretches from the component failure mode to the top---or---system level failure.
For this reason applying traditional FMEA to software stretches For this reason applying traditional FMEA to software stretches
the reasoning distance even further. the reasoning distance even further.
%
We now have a {\dc} for a {\ft} input in software. We now have a {\dc} for a {\ft} input in software.
Typically, more than one such input could be present in a real-world system. Typically, more than one such input could be present in a real-world system.
Not only have we integrated electronics and software in an FMEA, we can also Not only have we integrated electronics and software in an FMEA, we can also
re-use the analysis for each {\ft} input in the system. re-use the analysis for each {\ft} input in the system.
%
The unsolved symptoms, or unobservable errors, i.e. $VAL\_ERR$ could be addressed The unsolved symptoms, or unobservable errors, i.e. $VAL\_ERR$ could be addressed
by another software function to read other known signals by another software function to read other known signals
via the MUX (i.e. voltage references). This strategy would via the MUX (i.e. voltage references). This strategy would
@ -1022,7 +1016,7 @@ detect ADC\_STUCK\_AT and MUX\_FAIL failure modes.
% %
Detailing this however, is beyond the scope %and page-count Detailing this however, is beyond the scope %and page-count
of this paper. of this paper.
%
A software specification for a hardware interface will concentrate on A software specification for a hardware interface will concentrate on
how to interpret raw readings, or what signals to apply for actuators. how to interpret raw readings, or what signals to apply for actuators.
Using FMMD we can determine an accurate failure model for the interface as well. Using FMMD we can determine an accurate failure model for the interface as well.


@ -12,13 +12,14 @@
\label{sec:chap5} \label{sec:chap5}
This chapter demonstrates FMMD applied to This chapter demonstrates FMMD applied to
a variety of common electronic circuits including analogue/digital and electronics/software hybrids. a variety of typical embedded system components including analogue/digital and electronics/software hybrids.
%In order to implement FMMD in practise, we review the basic concepts and processes of the methodology.% %In order to implement FMMD in practise, we review the basic concepts and processes of the methodology.%
%Each example has been chosen to demonstrate %Each example has been chosen to demonstrate
%FMMD applied to %FMMD applied to
% %
The first section~\ref{sec:determine_fms} looks at how we determine failure mode sets for {\bcs} in the context of the safety standards The first section~\ref{sec:determine_fms} looks at how we determine failure mode sets for {\bcs}
we are conforming to for our particular project. (in the context of the safety standards
we are conforming to for our particular project).
% %
This is followed by several example FMMD analyses, the first analysing a common configuration of This is followed by several example FMMD analyses, the first analysing a common configuration of
the inverting amplifier (see section~\ref{sec:invamp}) using the inverting amplifier (see section~\ref{sec:invamp}) using
@ -32,16 +33,17 @@ Re-use of the potential divider model is discussed in the context of this circui
not in the second. not in the second.
% %
Section~\ref{sec:fivepolelp} analyses a Sallen-Key based five pole low pass filter. Section~\ref{sec:fivepolelp} analyses a Sallen-Key based five pole low pass filter.
This demonstrates FMMD being able to re-use the first Sallen-Key {\dc}, thus This demonstrates FMMD being able to re-use the first Sallen-Key encountered as a {\dc}, thus
saving time and effort for the analyst. saving time and effort for the analyst.
% %
Section~\ref{sec:bubba} shows FMMD tackling a circuit with a circular signal path---the `Bubba' oscillator---which uses Section~\ref{sec:bubba} shows FMMD tackling a circuit with a circular signal path---the `Bubba' oscillator---which uses
four op-amp stages with supporting components. four op-amp stages with supporting components.
% %
Section~\ref{sec:sigmadelta} shows FMMD analysing the sigma delta analogue to digital converter---which operates on mixed Section~\ref{sec:sigmadelta} shows FMMD analysing the sigma delta analogue to digital converter---which operates on both
analogue and digital signals. analogue and digital signals.
% %
Finally section~\ref{sec:Pt100} demonstrates both statistical analysis for top level events traced back to {\bc} failure modes Finally section~\ref{sec:Pt100} demonstrates both statistical
failure mode classification % analysis for top level events traced back to {\bc} failure modes
and the analysis of double simultaneous failure modes. and the analysis of double simultaneous failure modes.
% \section{Basic Concepts Of FMMD} % \section{Basic Concepts Of FMMD}
@ -105,11 +107,14 @@ and the analysis of double simultaneous failure modes.
\section{Determining the failure modes of components} \section{Determining the failure modes of components}
\label{sec:determine_fms} \label{sec:determine_fms}
In order to apply any form of Failure Mode Effects Analysis (FMEA) we need to know the ways in which the components we are using can fail. In order to apply any form of Failure Mode Effects Analysis (FMEA) we need to know the ways in which
the components we are using can fail.
%
A good introduction to hardware and software failure modes may be found in~\cite{sccs}[pp.114-124]. A good introduction to hardware and software failure modes may be found in~\cite{sccs}[pp.114-124].
%
Typically when choosing components for a design, we look at manufacturers' data sheets Typically when choosing components for a design, we look at manufacturers' data sheets
which describe functionality and dimensions which describe functionality, physical dimensions,
and also describe environmental ranges and tolerances, and can indicate how a component may fail/misbehave environmental ranges and tolerances, and can indicate how a component may fail/misbehave
under given conditions. under given conditions.
% %
How base components could fail internally is not of interest to an FMEA investigation. How base components could fail internally is not of interest to an FMEA investigation.
@ -165,7 +170,7 @@ We look at the reasons why some known failure modes % are omitted, or presented
%specific but unintuitive ways. %specific but unintuitive ways.
%We compare the US. military published failure mode specifications wi %We compare the US. military published failure mode specifications wi
can be found in one source but not in the others and vice versa. can be found in one source but not in the others and vice versa.
%
Finally we compare and contrast the failure modes determined for these components Finally we compare and contrast the failure modes determined for these components
from the FMD-91 reference source and from the guidelines of the from the FMD-91 reference source and from the guidelines of the
European burner standard EN298. European burner standard EN298.
@ -265,7 +270,8 @@ $$ fm(R) = \{ OPEN, SHORT \} . $$
\label{fig:lm258} \label{fig:lm258}
\end{figure} \end{figure}
The operational amplifier (op-amp) is a differential amplifier and is very widely used in nearly all fields of modern analogue electronics. The operational amplifier (op-amp) %is a differential amplifier and
is very widely used in nearly all fields of modern analogue electronics.
They are typically packaged in dual or quad configurations---meaning They are typically packaged in dual or quad configurations---meaning
that a chip will typically contain two or four amplifiers. that a chip will typically contain two or four amplifiers.
For the purpose of example, we look at For the purpose of example, we look at
@ -286,7 +292,8 @@ For OP-AMP failure modes, FMD-91\cite{fmd91}[3-116] states,
Again these are mostly internal causes of failure, more of interest to the component manufacturer Again these are mostly internal causes of failure, more of interest to the component manufacturer
than a designer looking for the symptoms of failure. than a designer looking for the symptoms of failure.
We need to translate these failure causes within the OP-AMP into {\fms}. We need to translate these failure causes within the OP-AMP into {\fms}.
We can look at each failure cause in turn, and map it to potential {\fms}. We can look at each failure cause in turn, and map it to potential {\fms} suitable for use in FMEA
investigations.
\paragraph{OP-AMP failure cause: Poor Die attach} \paragraph{OP-AMP failure cause: Poor Die attach}
The symptom for this is given as a low slew rate. This means that the op-amp The symptom for this is given as a low slew rate. This means that the op-amp
@ -314,7 +321,11 @@ We map this failure cause to $HIGH$ or $LOW$.
\paragraph{Collecting OP-AMP failure modes from FMD-91} \paragraph{Collecting OP-AMP failure modes from FMD-91}
Under FMD-91 definitions, we can define an OP-AMP to have the following {\fms}. Under FMD-91 definitions, we can define an OP-AMP to have the following {\fms}.
$$fm(OP-AMP) = \{ HIGH, LOW, NOOP, LOW_{slew} \} $$ \begin{equation}
\label{eqn:opampfms}
fm(OP-AMP) = \{ HIGH, LOW, NOOP, LOW_{slew} \}
\end{equation}
\paragraph{Failure Modes of an OP-AMP according to EN298} \paragraph{Failure Modes of an OP-AMP according to EN298}
@ -324,18 +335,19 @@ annex~A~\cite{en298}[A.1 note e].
This demands that all open connections, and shorts between adjacent pins be considered as failure scenarios. This demands that all open connections, and shorts between adjacent pins be considered as failure scenarios.
We examine these failure scenarios on the dual packaged $LM358$~\cite{lm358}%\mu741$ We examine these failure scenarios on the dual packaged $LM358$~\cite{lm358}%\mu741$
and determine its {\fms} in table~\ref{tbl:lm358}. and determine its {\fms} in table~\ref{tbl:lm358}.
Collecting the op-amp failure modes from table~\ref{tbl:lm358} we obtain the same {\fms}
that we got from FMD-91, listed in equation~\ref{eqn:opampfms}.
%\paragraph{EN298: Open and shorted pin failure symptom determination technique}
\paragraph{EN298: Open and shorted pin failure symptom determination technique}
\begin{table}[h] \begin{table}[h]
\caption{LM358: EN298 Single failure symptom extraction} \caption{LM358: EN298 Open and shorted pin failure symptom determination technique}
\begin{tabular}{|| l | l | c | c | l ||} \hline \begin{tabular}{|| l | l | c | c | l ||} \hline
\textbf{Failure Scenario} & & \textbf{Amplifier Effect} & & \textbf{Symptom(s)} \\ \textbf{Failure Scenario} & & \textbf{Amplifier Effect} & & \textbf{Symptom(s)} \\
\hline \hline
@ -662,7 +674,7 @@ component {\fms} in FMEA or FMMD and require interpretation.
There are two obvious ways in which we can model this circuit: There are two obvious ways in which we can model this circuit:
One is to do this in two stages, by considering the gain resistors to be an inverted potential divider One is to do this in two stages, by considering the gain resistors to be an inverted potential divider
and then combining it with the OPAMP failure mode model. and then combining it with the OPAMP failure mode model.
The second is to place all three components in a {\fg}. The second is to place all three components in one {\fg}.
Both approaches are followed in the next two sub-sections. Both approaches are followed in the next two sub-sections.
\subsection{First Approach: Inverting OPAMP using a Potential Divider {\dc}} \subsection{First Approach: Inverting OPAMP using a Potential Divider {\dc}}
@ -687,6 +699,49 @@ If we consider the input will only be positive, we can invert the potential divi
\label{tbl:pdneg} \label{tbl:pdneg}
\end{table} \end{table}
\begin{figure}[h]
\centering
\begin{tikzpicture}[shorten >=1pt,->,draw=black!50, node distance=\layersep]
\tikzstyle{every pin edge}=[<-,shorten <=1pt]
\tikzstyle{fmmde}=[circle,fill=black!25,minimum size=30pt,inner sep=0pt]
\tikzstyle{component}=[fmmde, fill=green!50];
\tikzstyle{failure}=[fmmde, fill=red!50];
\tikzstyle{symptom}=[fmmde, fill=blue!50];
\tikzstyle{annot} = [text width=4em, text centered]
\node[component] (R1) at (0,-0.7) {$R_1$};
\node[component] (R2) at (0,-1.9) {$R_2$};
\node[failure] (R1SHORT) at (\layersep,-0) {$R1_{Sh}$};
\node[failure] (R1OPEN) at (\layersep,-1.1) {$R1_{Op}$};
\node[failure] (R2SHORT) at (\layersep,-2.4) {$R2_{Sh}$};
\node[failure] (R2OPEN) at (\layersep,-3.7) {$R2_{Op}$};
\path (R1) edge (R1SHORT);
\path (R1) edge (R1OPEN);
\path (R2) edge (R2SHORT);
\path (R2) edge (R2OPEN);
% Potential divider failure modes
%
\node[symptom] (PDHIGH) at (\layersep*2,-0.7) {$PD_{HIGH}$};
\node[symptom] (PDLOW) at (\layersep*2,-2.2) {$PD_{LOW}$};
\path (R1OPEN) edge (PDLOW);
\path (R2SHORT) edge (PDLOW);
\path (R2OPEN) edge (PDHIGH);
\path (R1SHORT) edge (PDHIGH);
\end{tikzpicture}
\caption{Failure symptoms of the `Inverted Potential Divider' $INVPD$}
\label{fig:pdneg}
\end{figure}
We can form a {\dc} from this, and call it an inverted potential divider $INVPD$. We can form a {\dc} from this, and call it an inverted potential divider $INVPD$.
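% Editorial sketch: a summary read off the edges of figure~\ref{fig:pdneg}.
% The arrow is informal shorthand for `leads to', and is an assumption of this
% sketch rather than part of the FMMD notation.
As a sketch of what figure~\ref{fig:pdneg} encodes, each resistor failure mode leads to exactly one symptom:
$$ \{ R1_{Op}, R2_{Sh} \} \rightarrow PD_{LOW} , \quad \{ R1_{Sh}, R2_{Op} \} \rightarrow PD_{HIGH} . $$
Collecting these symptoms, we would expect the {\dc} to have the failure modes
$$ fm(INVPD) = \{ PD_{HIGH}, PD_{LOW} \} . $$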
We can now form a {\fg} from the OP-AMP and the $INVPD$ We can now form a {\fg} from the OP-AMP and the $INVPD$
@ -715,6 +770,101 @@ We can now form a {\fg} from the OP-AMP and the $INVPD$
%%This gives the same results as the analysis from figure~\ref{fig:invampanalysis}. %%This gives the same results as the analysis from figure~\ref{fig:invampanalysis}.
\begin{figure}[h]
\centering
\begin{tikzpicture}[shorten >=1pt,->,draw=black!50, node distance=\layersep]
\tikzstyle{every pin edge}=[<-,shorten <=1pt]
\tikzstyle{fmmde}=[circle,fill=black!25,minimum size=30pt,inner sep=0pt]
\tikzstyle{component}=[fmmde, fill=green!50];
\tikzstyle{failure}=[fmmde, fill=red!50];
\tikzstyle{symptom}=[fmmde, fill=blue!50];
\tikzstyle{annot} = [text width=4em, text centered]
% Draw the input layer nodes
%\foreach \name / \y in {1,...,4}
% This is the same as writing \foreach \name / \y in {1/1,2/2,3/3,4/4}
% \node[component, pin=left:Input \#\y] (I-\name) at (0,-\y) {};
\node[component] (OPAMP) at (0,-1.8) {$OPAMP$};
\node[component] (R1) at (0,-6) {$R_1$};
\node[component] (R2) at (0,-7.6) {$R_2$};
%\node[component] (C-3) at (0,-5) {$C^0_3$};
%\node[component] (K-4) at (0,-8) {$K^0_4$};
%\node[component] (C-5) at (0,-10) {$C^0_5$};
%\node[component] (C-6) at (0,-12) {$C^0_6$};
%\node[component] (K-7) at (0,-15) {$K^0_7$};
% Draw the hidden layer nodes
%\foreach \name / \y in {1,...,5}
% \path[yshift=0.5cm]
\node[failure] (OPAMPLU) at (\layersep,-0) {l-up};
\node[failure] (OPAMPLD) at (\layersep,-1.2) {l-dn};
\node[failure] (OPAMPNP) at (\layersep,-2.5) {noop};
\node[failure] (OPAMPLS) at (\layersep,-3.8) {lowslew};
\node[failure] (R1SHORT) at (\layersep,-5.1) {$R1_{Sh}$};
\node[failure] (R1OPEN) at (\layersep,-6.4) {$R1_{Op}$};
\node[failure] (R2SHORT) at (\layersep,-7.7) {$R2_{Sh}$};
\node[failure] (R2OPEN) at (\layersep,-9.0) {$R2_{Op}$};
% Draw the output layer node
% % Connect every node in the input layer with every node in the
% % hidden layer.
% %\foreach \source in {1,...,4}
% % \foreach \dest in {1,...,5}
\path (OPAMP) edge (OPAMPLU);
\path (OPAMP) edge (OPAMPLD);
\path (OPAMP) edge (OPAMPNP);
\path (OPAMP) edge (OPAMPLS);
\path (R1) edge (R1SHORT);
\path (R1) edge (R1OPEN);
\path (R2) edge (R2SHORT);
\path (R2) edge (R2OPEN);
% Potential divider failure modes
%
\node[symptom] (PDHIGH) at (\layersep*2,-6) {$PD_{HIGH}$};
\node[symptom] (PDLOW) at (\layersep*2,-7.6) {$PD_{LOW}$};
\path (R1OPEN) edge (PDLOW);
\path (R2SHORT) edge (PDLOW);
\path (R2OPEN) edge (PDHIGH);
\path (R1SHORT) edge (PDHIGH);
\node[symptom] (AMPHIGH) at (\layersep*3.4,-3) {$AMP_{HIGH}$};
\node[symptom] (AMPLOW) at (\layersep*3.4,-5) {$AMP_{LOW}$};
\node[symptom] (AMPLP) at (\layersep*3.4,-7) {$LOWPASS$};
\path (PDLOW) edge (AMPHIGH);
\path (OPAMPLU) edge (AMPHIGH);
\path (PDHIGH) edge (AMPLOW);
\path (OPAMPNP) edge (AMPLOW);
\path (OPAMPLD) edge (AMPLOW);
\path (OPAMPLS) edge (AMPLP);
\end{tikzpicture}
% End of code
\caption{Full DAG representing failure modes and symptoms of the Inverting Op-amp Circuit}
\label{fig:invdag1}
\end{figure}
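% Editorial sketch: a summary read off the edges of figure~\ref{fig:invdag1}.
% The arrow is informal shorthand for `is caused by', and is an assumption of
% this sketch rather than part of the FMMD notation.
Reading the edges of figure~\ref{fig:invdag1} from right to left, the three symptoms of the circuit appear to collect the following causes:
$$ AMP_{HIGH} \leftarrow \{ PD_{LOW}, \mbox{l-up} \} $$
$$ AMP_{LOW} \leftarrow \{ PD_{HIGH}, \mbox{l-dn}, \mbox{noop} \} $$
$$ LOWPASS \leftarrow \{ \mbox{lowslew} \} $$
Each of the six failure modes entering the final stage (four from the OP-AMP and two from the potential divider) is drawn with exactly one outgoing edge, so in this diagram every failure mode is accounted for by a single symptom.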
%The differences are the root causes or component failure modes that %The differences are the root causes or component failure modes that
%lead to the symptoms (i.e. the symptoms are the same but causation tree will be different). %lead to the symptoms (i.e. the symptoms are the same but causation tree will be different).
@ -778,7 +928,7 @@ $$ fm(INVAMP) = \{ HIGH, LOW, LOW PASS \} $$
%\clearpage \clearpage
\subsection{Comparison between the two approaches} \subsection{Comparison between the two approaches}
\label{sec:invampcc} \label{sec:invampcc}
@ -819,9 +969,10 @@ and for the second analysis a CC of $8.(3-1)=16$.
The circuit in figure~\ref{fig:circuit1} amplifies the difference between The circuit in figure~\ref{fig:circuit1} amplifies the difference between
the input voltages $+V1$ and $+V2$. the input voltages $+V1$ and $+V2$.
The circuit is configured so that both inputs use the non-inverting, and thus high impedance inputs, meaning that they will not The circuit is configured so that both inputs use the non-inverting,
electrically over-load/influence and thus high impedance inputs, meaning that they will not
the sensors used for measurement. electrically over-load and/or unduly influence
the sensors supplying the voltage signals used for measurement.
It would be desirable to represent this circuit as a {\dc} called, say, $DiffAMP$. It would be desirable to represent this circuit as a {\dc} called, say, $DiffAMP$.
We begin by identifying functional groups from the components in the circuit. We begin by identifying functional groups from the components in the circuit.
@ -908,11 +1059,74 @@ a functional group we can analyse its failure mode behaviour.
Collecting the symptoms we can see that this amplifier fails Collecting the symptoms we can see that this amplifier fails
in 3 ways $\{ AMPHigh, AMPLow, LowPass \}$. in 3 ways $\{ AMPHigh, AMPLow, LowPass \}$.
We can now create a derived component, $NI\_AMP$, to represent it. We can now create a derived component, $NI\_AMP$, to represent it.
The FMMD reasoning process is represented in the DAG in figure~\ref{fig:noninvdag11}.
$$ fm(NI\_AMP) = \{ AMPHigh, AMPLow, LowPass \} $$ $$ fm(NI\_AMP) = \{ AMPHigh, AMPLow, LowPass \} $$
\begin{figure}[h]
\centering
\begin{tikzpicture}[shorten >=1pt,->,draw=black!50, node distance=\layersep]
\tikzstyle{every pin edge}=[<-,shorten <=1pt]
\tikzstyle{fmmde}=[circle,fill=black!25,minimum size=30pt,inner sep=0pt]
\tikzstyle{component}=[fmmde, fill=green!50];
\tikzstyle{failure}=[fmmde, fill=red!50];
\tikzstyle{symptom}=[fmmde, fill=blue!50];
\tikzstyle{annot} = [text width=4em, text centered]
\node[component] (OPAMP) at (0,-1.8) {$OPAMP$};
\node[component] (R1) at (0,-6) {$R_1$};
\node[component] (R2) at (0,-7.6) {$R_2$};
\node[failure] (OPAMPLU) at (\layersep,-0) {l-up};
\node[failure] (OPAMPLD) at (\layersep,-1.2) {l-dn};
\node[failure] (OPAMPNP) at (\layersep,-2.5) {noop};
\node[failure] (OPAMPLS) at (\layersep,-3.8) {lowslew};
\node[failure] (R1SHORT) at (\layersep,-5.1) {$R1_{Sh}$};
\node[failure] (R1OPEN) at (\layersep,-6.4) {$R1_{Op}$};
\node[failure] (R2SHORT) at (\layersep,-7.7) {$R2_{Sh}$};
\node[failure] (R2OPEN) at (\layersep,-9.0) {$R2_{Op}$};
\path (OPAMP) edge (OPAMPLU);
\path (OPAMP) edge (OPAMPLD);
\path (OPAMP) edge (OPAMPNP);
\path (OPAMP) edge (OPAMPLS);
\path (R1) edge (R1SHORT);
\path (R1) edge (R1OPEN);
\path (R2) edge (R2SHORT);
\path (R2) edge (R2OPEN);
% Potential divider failure modes
%
\node[symptom] (PDHIGH) at (\layersep*2,-6) {$PD_{HIGH}$};
\node[symptom] (PDLOW) at (\layersep*2,-7.6) {$PD_{LOW}$};
\path (R1OPEN) edge (PDHIGH);
\path (R2SHORT) edge (PDHIGH);
\path (R2OPEN) edge (PDLOW);
\path (R1SHORT) edge (PDLOW);
\node[symptom] (AMPHIGH) at (\layersep*3.4,-3) {$AMP_{HIGH}$};
\node[symptom] (AMPLOW) at (\layersep*3.4,-5) {$AMP_{LOW}$};
\node[symptom] (AMPLP) at (\layersep*3.4,-7) {$LOWPASS$};
\path (PDLOW) edge (AMPHIGH);
\path (OPAMPLU) edge (AMPHIGH);
\path (PDHIGH) edge (AMPLOW);
\path (OPAMPNP) edge (AMPLOW);
\path (OPAMPLD) edge (AMPLOW);
\path (OPAMPLS) edge (AMPLP);
\end{tikzpicture}
% End of code
\caption{Full DAG representing failure modes and symptoms of the Non-Inverting Op-amp Circuit}
\label{fig:noninvdag11}
\end{figure}
\subsection{The second Stage of the amplifier} \subsection{The second Stage of the amplifier}