%%% CHAPTER 6
|
|
\label{sec:chap6}
|
|
|
|
\section{Software and Hardware Failure Mode Concepts}
|
|
\label{sec:elecsw}
|
|
In this chapter we show that FMMD can be applied to software, enabling us to build complete failure models
of typical modern safety critical systems.
|
|
With modular FMEA i.e. FMMD %(FMMD)
|
|
we have the concepts of failure~modes
|
|
of components, {\fgs} and symptoms of failure for a functional group.
|
|
%
|
|
A programmatic function has similarities with a {\fg} as defined by the FMMD process.
|
|
%
|
|
An FMMD {\fg} is placed into a hierarchy.
|
|
A software function is typically placed into the hierarchy of its call-tree.
|
|
A software function calls other functions and uses data sources via hardware interaction, which could be viewed as its `components':
|
|
it has outputs, i.e. it can perform actions on data or hardware.
|
|
%which will be used by other functions that may call it.
|
|
%
|
|
We show that we can map a software function to a {\fg} in FMMD: its failure modes
|
|
are the failure modes of the software components (other functions it calls)
|
|
and the hardware from which it reads values.
|
|
Its outputs are the data it changes, or the hardware actions it performs.
|
|
%%
|
|
%% Talk about how software specification will often say how hardware
|
|
%% will react and how to interpret readings---but they do not
|
|
%% always cover the failure modes of the hardware being interfaced too.
|
|
%
|
|
When we have analysed a software function---using failure conditions
|
|
of its inputs as failure modes---we can
|
|
determine its symptoms of failure (i.e. how calling functions will see its failure mode behaviour).
|
|
|
|
%
|
|
We apply the FMMD process to software functions by viewing them in terms of their failure mode behaviour.
|
|
%
|
|
As software already fits into a hierarchy we have one less analysis decision to make, compared
|
|
to analysing electronics.
|
|
%
|
|
For electronic and mechanical systems, although we may be guided by the original designers'
concepts of modularity and sub-systems in design, applying FMMD means deciding on the members for {\fgs}
|
|
and the subsequent hierarchy.
|
|
%
|
|
With software already written, the hierarchies are given.
|
|
%
|
|
To apply FMMD to software, we collect the elements used by a software function, along with the function itself,
|
|
to form a {\fg}. When we have analysed the failure mode behaviour of this {\fg}
|
|
and have its failure mode symptoms, we can create a {\dc}. That {\dc} can be
|
|
used by functions that call the function we have just analysed, until
|
|
we form a complete failure mode hierarchy of the system under investigation.
|
|
% map the FMMD concepts of {\fms}, {\fgs} and {\dcs}
|
|
%to software functions.
|
|
%
|
|
%However, we need to map a the FMMD concepts of {\fms}, {\fgs} and {\dcs}
|
|
%to software functions.
|
|
% failure modes of a function in order to
|
|
%map FMMD to software.
|
|
|
|
|
|
|
|
|
|
|
|
\subsection{Software, a natural hierarchy}
|
|
|
|
Software written for safety critical systems is usually constrained to
|
|
be modular~\cite{en61508}[3] and non-recursive~\cite{misra}[15.2]. %{iec61511}.
Because of this we can assume direct call trees\footnote{A typical embedded system
will have a run time call tree, and (possibly multiple) interrupt sourced call trees.}. Functions call functions
|
|
from the top down and eventually call the lowest level library or IO
|
|
functions that interact with hardware/electronics.
|
|
|
|
What is potentially difficult with a software function is deciding what
its failure modes and symptoms are.
|
|
With electronic components, we can use literature to point us to suitable sets of
|
|
{\fms}~\cite{fmd91}~\cite{mil1991}~\cite{en298}. %~\cite{en61508}~\cite{en298}.
|
|
With software, only some library functions are well known and rigorously documented
|
|
enough to have the equivalent of known failure modes.
|
|
Most software is `bespoke'.
|
|
%
|
|
We need a different strategy to
|
|
describe the failure mode behaviour of software functions.
|
|
We can use definitions from contract programming to assist here.
|
|
|
|
\subsection{Contract programming description}
|
|
|
|
Contract programming is a discipline~\cite{dbcbe} for building software functions in a controlled
|
|
and traceable way. Each function is subject to pre-conditions (constraints on its inputs),
|
|
post-conditions (constraints on its outputs) and function wide invariants (rules).
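
A minimal sketch of how pre- and post-conditions might be written in C, using the standard \texttt{assert} macro.
The function \texttt{scale\_to\_permil} and its limits are hypothetical, introduced here only for illustration;
they are not part of the system analysed later in this chapter.

```c
#include <assert.h>

/* Hypothetical example: scale a fraction in [0.0, 1.0] to 0..999. */
int scale_to_permil(double fraction)
{
    int result;

    /* require (pre-condition): input must lie in [0.0, 1.0] */
    assert(fraction >= 0.0 && fraction <= 1.0);

    /* conversion to int truncates towards zero */
    result = (int)(fraction * 999.0);

    /* ensure (post-condition): output must lie in 0..999 */
    assert(result >= 0 && result <= 999);

    return result;
}
```

In this style a violated \texttt{require} stops the function before its body runs, while a violated \texttt{ensure} reports that the function itself has misbehaved.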
|
|
|
|
|
|
\paragraph{Mapping contract `pre-condition' violations to component failure modes.}
|
|
|
|
A pre-condition, or requirement, for a contract software function
|
|
defines the correct ranges of input conditions for the function
|
|
to operate successfully.
|
|
%
|
|
% C Garret said this was unclear so I have added the following two sentences.
|
|
%
|
|
%If we consider a software function to be a {\fg} in the FMMD sense, i.e.
|
|
We can consider a software function to be
|
|
a collection of code, functions called and values/variables used.
|
|
In this way it is similar to an electronic circuit, which is a collection
|
|
of components connected in a specific way.
|
|
Using this analogy for software, the connections are the function's code, and the
|
|
called functions and variables are the components.
|
|
%
|
|
Erroneous behaviour from called functions and variables/inputs has the same effect as component failure modes
|
|
on an electronic {\fg}.
|
|
%
|
|
%
|
|
If we consider the
|
|
called functions and variables/inputs to be components of a function,
|
|
we can build a modular and hierarchical failure mode model
|
|
from existing software.
|
|
%
|
|
Thus for FMMD applied to software, we consider a violation of a pre-condition to be
equivalent to a failure mode of `one of its components'.
|
|
|
|
|
|
\paragraph{Mapping contract `post-condition' violations to symptoms.}
|
|
|
|
A post condition is a definition of correct behaviour of a function.
|
|
%
|
|
A violated post condition is a symptom of failure, or derived failure mode, from a function.
|
|
%
|
|
Post conditions could be either actions performed (i.e. the state of hardware changed) or an output value of a function.
|
|
In pure contract programming, a violation of a pre-condition would not cause the function to
|
|
be executed.
|
|
%
|
|
In implementation code, a pre-condition violation should cause
|
|
an error to be generated, and thus a post condition to fail.
|
|
%
|
|
A function can fail for reasons other than the
failure of one of the variables/inputs or functions that it calls.
|
|
Variables can become corrupted, by radiation affecting RAM or
|
|
by another software function erroneously overwriting variables.
|
|
Current work on software FMEA generally focuses on mapping
|
|
variable corruption to failure modes~\cite{procsfmea,procsfmeadb,sfmeaauto,sfmea}.
|
|
However, errors other than variable corruption can occur,
|
|
for instance a microprocessor may have subtle bugs in its instruction set or
|
|
incorrectly handled
|
|
interrupt contention could cause side effects in software.
|
|
For the failure mode model of any software function
|
|
we must consider all failure modes of post condition
|
|
violations as well as those caused by `components'.
|
|
|
|
|
|
\paragraph{Mapping contract `invariant' violations to symptoms and failure modes.}
|
|
|
|
Invariants in contract programming may apply to inputs to the function (where violations can be considered {\fms} in FMMD terminology),
|
|
and to outputs (where violations can be considered {failure symptoms} in FMMD terminology).
|
|
|
|
|
|
\subsection{Combined Hardware/Software FMMD}
|
|
|
|
For the purpose of example, we chose a simple common safety critical industrial circuit
|
|
that is nearly always used in conjunction with a programmatic element.
|
|
A common method for delivering a quantitative value in analogue electronics is
|
|
to supply a current signal to represent the value to be sent~\cite{aoe}[p.934].
|
|
Usually, $4mA$ represents a zero or starting value and $20mA$ represents the full scale,
|
|
and this is referred to as {\ft} signalling.
|
|
%
|
|
{\ft} signalling has intrinsic electrical safety advantages.
|
|
%
|
|
Because the current in a loop is constant~\cite{aoe}[p.20]
|
|
resistance in the wires between the source and receiving end is not an issue
|
|
that can alter the accuracy of the signal.
|
|
%
|
|
%This circuit has many advantages for safety.
|
|
If the signal becomes disconnected
|
|
it reads $0mA$ at the receiving end: as this is outside the {\ft} range,
|
|
it is easily detectable as an error condition rather than an incorrect value.
|
|
%
|
|
Should the driving electronics go wrong at the source end, it will usually
|
|
supply far too little or far too much current, also making error conditions easy to detect.
|
|
%
|
|
At the receiving end, we only require one simple component to convert the
current signal into a voltage that we can read with an ADC: a resistor, whose
behaviour is defined by Ohm's law. % the humble resistor!
|
|
|
|
|
|
%BLOCK DIAGRAM HERE WITH FT CIRCUIT LOOP
|
|
|
|
\begin{figure}[h]
|
|
\centering
|
|
\includegraphics[width=230pt]{./CH5_Examples/ftcontext.png}
|
|
% ftcontext.png: 767x385 pixel, 72dpi, 27.06x13.58 cm, bb=0 0 767 385
|
|
\caption{Context Diagram for {\ft} loop}
|
|
\label{fig:ftcontext}
|
|
\end{figure}
|
|
|
|
|
|
The diagram in figure~\ref{fig:ftcontext} shows some equipment which is sending a {\ft}
|
|
signal to a micro-controller system.
|
|
The signal is locally driven over a load resistor, and then read into the micro-controller via
|
|
an ADC and its multiplexer.
|
|
With the voltage determined at the ADC we read the intended quantitative
|
|
value from the external equipment.
|
|
|
|
\section{Simple Software Example: Reading a {\ft} input into software}
|
|
|
|
|
|
Consider a software function that reads a {\ft} input, and returns a value between 0 and 999 (i.e. per mil, $\permil$)
representing the current detected, with an additional error indication flag.
|
|
%
|
|
Let us assume the {\ft} detection is via a \ohms{220} resistor, and that we read a voltage
|
|
from an ADC into the software.
|
|
Let us define any value outside the 4mA to 20mA range as an error condition.
|
|
%
|
|
As we read a voltage, we use Ohm's law~\cite{aoe} to determine the mA current detected: $V=IR$, $0.004A \times \ohms{220} = 0.88V$
and $0.020A \times \ohms{220} = 4.4V$.
|
|
%
|
|
Our acceptable voltage range is therefore
|
|
|
|
$$(V \ge 0.88) \wedge (V \le 4.4) \; .$$
|
|
|
|
This voltage range forms our input requirement and can be considered as an invariant condition.
|
|
%
|
|
We can now examine a software function that performs a conversion from the voltage read to
|
|
a per~mil representation of the {\ft} input current.
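
As a sketch of the arithmetic involved (the symbol $p$ for the per~mil result is introduced here for illustration only),
the conversion is a linear scaling of the acceptable voltage range onto 0 to 999:

$$ p = \frac{V - 0.88}{4.4 - 0.88} \times 999 \; . $$

For example, a mid-scale current of $12mA$ gives $V = 0.012A \times \ohms{220} = 2.64V$ and hence
$p = (1.76 / 3.52) \times 999 = 499.5$, which is truncated to 499 when stored as an integer in `C'.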
|
|
%
|
|
For the purpose of example the `C' programming language~\cite{DBLP:books/ph/KernighanR88} is used.
|
|
We initially assume a function \textbf{read\_ADC} which returns a floating point %double precision
|
|
value which represents the voltage read (see code sample in figure~\ref{fig:code_read_4_20_input}).
|
|
|
|
|
|
%%{\vbox{
|
|
\begin{figure}[h+]
|
|
|
|
\footnotesize
|
|
\begin{verbatim}
|
|
/***********************************************/
/* read_4_20_input()                           */
/***********************************************/
/* Software function to read 4mA to 20mA input */
/* returns a value from 0-999 proportional     */
/* to the current input.                       */
/***********************************************/
int read_4_20_input ( int * value ) {
   double input_volts;
   int error_flag;

   /* require: input from ADC to be
      between 0.88 and 4.4 volts */

   input_volts = read_ADC(INPUT_4_20_mA);

   if ( input_volts < 0.88 || input_volts > 4.4 ) {
       error_flag = 1; /* Error flag set to TRUE */
   }
   else {
       /* scale 0.88V..4.4V linearly onto 0..999 */
       *value = (int) ((input_volts - 0.88) / ( 4.4 - 0.88 ) * 999.0);
       error_flag = 0; /* indicate current input in range */
   }

   /* ensure: value is proportional (0-999) to the
      4 to 20mA input */

   return error_flag;
}
|
|
\end{verbatim}
|
|
%}
|
|
%}\clearpage
|
|
|
|
\caption{Software Function: \textbf{read\_4\_20\_input}}
|
|
\label{fig:code_read_4_20_input}
|
|
%\label{fig:420i}
|
|
\end{figure}
|
|
\clearpage
|
|
We now look at the function called by \textbf{read\_4\_20\_input}, \textbf{read\_ADC}, which returns a
|
|
voltage for a given ADC channel.
|
|
%
|
|
This function
|
|
deals directly with the hardware in the micro-controller on which the software is running. %software on.
|
|
%
|
|
The software's job is to select the correct channel (ADC multiplexer) and then to initiate a
|
|
conversion by setting an ADC 'go' bit (see code sample in figure~\ref{fig:code_read_ADC}).
|
|
%
|
|
It takes the raw ADC reading and converts it into a
|
|
floating point\footnote{the type, `double' or `double precision', is a
|
|
standard C language floating point type~\cite{DBLP:books/ph/KernighanR88}.}
|
|
voltage value.
|
|
|
|
|
|
|
|
|
|
|
|
%{\vbox{
|
|
\begin{figure}[h+]
|
|
|
|
\footnotesize
|
|
\begin{verbatim}
|
|
/***********************************************/
/* read_ADC()                                  */
/***********************************************/
/* Software function to read voltage from a    */
/* specified ADC MUX channel                   */
/* Assume 10 ADC MUX channels 0..9             */
/* ADC_CHAN_RANGE = 9                          */
/* Assume ADC is 12 bit and ADCRANGE = 4096    */
/* returns voltage read as double precision    */
/***********************************************/
double read_ADC( int channel ) {
   int timeout = 0;
   double dval;    /* voltage to return */

   /* require: a) input channel from ADC to be
                  in valid ADC range
               b) voltage ref is 0.1% of 5V */

   /* return out of range result */
   /* if invalid channel selected */
   if ( channel < 0 || channel > ADC_CHAN_RANGE )
      return -2.0;

   /* set the multiplexer to the desired channel */
   ADCMUX = channel;

   ADCGO = 1; /* initiate ADC conversion hardware */

   /* wait for ADC conversion with timeout: */
   /* stop when the go bit clears or we time out */
   while ( ADCGO == 1 && timeout < 100 )
      timeout++;

   if ( timeout < 100 )
      dval = (double) ADCOUT * 5.0 / ADCRANGE;
   else
      dval = -1.0; /* indicate invalid reading */

   /* return voltage as a floating point value */

   /* ensure: value is voltage input to within 0.1% */

   return dval;
}
|
|
\end{verbatim}
|
|
\caption{Software Function: \textbf{read\_ADC}}
|
|
\label{fig:code_read_ADC}
|
|
\end{figure}
|
|
%}
|
|
%}
|
|
\clearpage
|
|
|
|
We now have a very simple software structure, a call tree, shown in figure~\ref{fig:ct1}.
|
|
|
|
\begin{figure}[h]
|
|
\centering
|
|
\includegraphics[width=100pt]{./CH5_Examples/ct1.png}
|
|
% ct1.png: 151x224 pixel, 72dpi, 5.33x7.90 cm, bb=0 0 151 224
|
|
\caption{Call tree for software example}
|
|
\label{fig:ct1}
|
|
\end{figure}
|
|
|
|
This software is above the hardware in the conceptual call tree---from a programmatic perspective---%in software terms---the
|
|
the software is reading values from the `lower~level' electronics.
|
|
%
|
|
FMEA is always a bottom-up process and so we must begin with this hardware.
|
|
%
|
|
The hardware is simply a load resistor, connected across an ADC input
|
|
pin on the micro-controller and ground.
|
|
%
|
|
We can identify the resistor and the ADC module of the micro-controller as
|
|
the base components in this design.
|
|
%
|
|
We now apply FMMD starting with the hardware.
|
|
|
|
|
|
\subsection{FMMD Process}
|
|
|
|
\paragraph{Functional Group - Convert mA to Voltage - CMATV}
|
|
|
|
This functional group contains the load resistor
|
|
and the physical Analogue to Digital Converter (ADC).
|
|
Our functional group, $G_1$ is thus the set of base components: $G_1 = \{R, ADC\}$.
|
|
We now determine the {\fms} of all the components in $G_1$.
|
|
For the resistor we can use a failure mode set from the literature~\cite{en298}.
|
|
Where the function $fm$ returns a set of failure modes for a given component we can state:
|
|
|
|
$$ fm(R) = \{OPEN,SHORT\}. $$
|
|
\vbox{
|
|
For the ADC we can determine the following failure modes:
|
|
|
|
\begin{itemize}
|
|
\item STUCKAT --- The ADC outputs a constant value,
|
|
\item MUXFAIL --- The ADC cannot select its input channel correctly,
|
|
\item LOW --- The ADC output is always LOW, or zero ADC counts,
|
|
\item HIGH --- The ADC output is always HIGH, or max ADC counts.
|
|
\end{itemize}
|
|
}
|
|
We can use the function $fm$ to define the {\fms} of an ADC thus:
|
|
$$ fm(ADC) = \{ STUCKAT, MUXFAIL,LOW, HIGH \}. $$
|
|
|
|
With these failure modes, we can analyse our first functional group, see table~\ref{tbl:cmatv}.
|
|
|
|
{
|
|
\tiny
|
|
\begin{table}[h+]
|
|
\centering
|
|
\caption{$G_1$: Failure Mode Effects Analysis} % title of Table
|
|
\label{tbl:cmatv}
|
|
|
|
\begin{tabular}{|| l | c | l ||} \hline
|
|
%\textbf{Failure} & \textbf{failure} & \textbf{Symptom} \\
|
|
%\textbf{Scenario} & \textbf{effect} & \textbf{ADC } \\ \hline
|
|
% & & & & \\
|
|
|
|
\textbf{Failure} & \textbf{Failure } & \textbf{Derived Component} \\
|
|
\textbf{cause} & \textbf{Effect} & \textbf{Failure Mode} \\
|
|
|
|
|
|
\hline \hline
|
|
1: $R_{OPEN}$ & resistor open, & $HIGH$ \\
|
|
& voltage on pin high & \\ \hline
|
|
|
|
2: $R_{SHORT}$ & resistor shorted, & $LOW$ \\
|
|
& voltage on pin low & \\ \hline \hline
|
|
|
|
|
|
|
|
3: $ADC_{STUCKAT}$ & ADC reads out & $V\_ERR$ \\
|
|
& fixed value & \\ \hline
|
|
|
|
|
|
|
|
4: $ADC_{MUXFAIL}$ & ADC may read & $V\_ERR$ \\
|
|
& wrong channel & \\ \hline
|
|
|
|
5: $ADC_{LOW}$ & output low & $LOW$ \\
|
|
6: $ADC_{HIGH}$ & output high & $HIGH$ \\ \hline
|
|
7: post condition fails & software fails & $V\_ERR$ \\ \hline
|
|
|
|
|
|
\hline
|
|
|
|
|
|
\hline
|
|
|
|
\end{tabular}
|
|
\end{table}
|
|
}
|
|
|
|
|
|
We now collect the symptoms for the hardware functional group, $\{ HIGH , LOW, V\_ERR \} $.
|
|
We now create a {\dc} to represent this called $CMATV$.
|
|
|
|
%We can express this using the `$\derivec$' function thus:
|
|
%$$ CMATV = \; \derivec (G_1) .$$
|
|
|
|
As its failure modes are the symptoms of failure from the functional group, we can now state:
|
|
$$fm ( CMATV ) = \{ HIGH , LOW, V\_ERR \} .$$
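
The symptom collection step above can be sketched in C.
This is only an illustration: the bit-flag encoding, the table \texttt{g1\_analysis} and the function
\texttt{collect\_symptoms} are assumptions made here, not part of the FMMD definition.
Each row pairs a component failure mode of $G_1$ with the symptom assigned to it in table~\ref{tbl:cmatv};
the {\fms} of the {\dc} are the union of those symptoms.

```c
#include <stddef.h>

/* Symptoms of the functional group, encoded as bit flags (assumed encoding). */
enum symptom { SYM_HIGH = 1, SYM_LOW = 2, SYM_V_ERR = 4 };

/* One row of the analysis: a component failure mode and its symptom. */
struct fmea_row {
    const char *cause;    /* component failure mode, for reporting */
    unsigned    symptom;  /* symptom it was mapped to in the analysis */
};

/* Component failure modes of G1 = {R, ADC} and their symptoms. */
static const struct fmea_row g1_analysis[] = {
    { "R_OPEN",      SYM_HIGH  },
    { "R_SHORT",     SYM_LOW   },
    { "ADC_STUCKAT", SYM_V_ERR },
    { "ADC_MUXFAIL", SYM_V_ERR },
    { "ADC_LOW",     SYM_LOW   },
    { "ADC_HIGH",    SYM_HIGH  },
};

/* The derived component's failure modes are the union of the symptoms. */
unsigned collect_symptoms(const struct fmea_row *rows, size_t n)
{
    unsigned set = 0;
    size_t i;

    for (i = 0; i < n; i++)
        set |= rows[i].symptom;
    return set;
}
```

Several causes map to the same symptom, so the derived component has fewer failure modes than the sum of its components' failure modes.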
|
|
|
|
|
|
\paragraph{Functional Group - Software - Read\_ADC - RADC}
|
|
\label{readADC}
|
|
The software function $Read\_ADC$ uses the ADC hardware analysed
|
|
as the {\dc} CMATV above.
|
|
|
|
|
|
The code fragment in figure~\ref{fig:code_read_ADC} states pre-conditions, as
|
|
{\em/* require: a) input channel from ADC to be
|
|
in valid ADC range
|
|
b) voltage ref is 0.1\% of 5V */}.
|
|
%
|
|
From the above contractual programming requirements, we see that
|
|
the function must be sent the correct channel number.
|
|
%
|
|
A violation of this can be considered a {\fm} of the function,
|
|
which we can call $ CHAN\_NO $.
|
|
%
|
|
The reference voltage for the ADC has a 0.1\% accuracy requirement.
|
|
%
|
|
If the reference value is outside of this, it is also a {\fm}
|
|
of this function, which we can call $VREF$.
|
|
|
|
Taken as a component for use in FMEA/FMMD our function has
|
|
two failure modes. We can therefore treat it as a generic component, $Read\_ADC$,
|
|
by stating:
|
|
|
|
$$ fm(Read\_ADC) = \{ CHAN\_NO, VREF \} $$
|
|
|
|
As we have a failure mode model for our function, we can now use it in conjunction
with the ADC hardware {\dc} CMATV, to form a {\fg} $G_2$, where $G_2 =\{ CMATV, Read\_ADC \}$.
|
|
|
|
We now analyse this hardware/software combined {\fg}.
|
|
|
|
|
|
|
|
{
|
|
\tiny
|
|
\begin{table}[h+]
|
|
\caption{$G_2$: Failure Mode Effects Analysis} % title of Table
|
|
\label{tbl:radc}
|
|
|
|
\begin{tabular}{|| l | c | l ||} \hline
|
|
% \textbf{Failure} & \textbf{failure} & \textbf{Symptom} \\
|
|
% \textbf{Scenario} & \textbf{effect} & \textbf{RADC } \\ \hline
|
|
|
|
\textbf{Failure} & \textbf{Failure } & \textbf{Derived Component} \\
|
|
\textbf{cause} & \textbf{Effect} & \textbf{Failure Mode} \\
|
|
|
|
|
|
\hline
|
|
1: ${CHAN\_NO}$ & wrong voltage & $VV\_ERR$ \\
|
|
& read & \\ \hline
|
|
|
|
2: ${VREF}$ & ADC volt-ref & $VV\_ERR$ \\
|
|
& incorrect & \\ \hline \hline
|
|
|
|
|
|
|
|
3: $CMATV_{V\_ERR}$ & voltage value & $VV\_ERR$ \\
|
|
& incorrect & \\ \hline
|
|
|
|
|
|
|
|
4: $CMATV_{HIGH}$ & output high & $HIGH$ \\ \hline
|
|
|
|
5: $CMATV_{LOW}$ & output low & $LOW$ \\ \hline
|
|
|
|
6: post condition fails & software fails & $VV\_ERR$ \\ \hline
|
|
|
|
\hline
|
|
|
|
|
|
\hline
|
|
|
|
\end{tabular}
|
|
\end{table}
|
|
}
|
|
|
|
|
|
|
|
We now collect the symptoms of failure for the {\fg} analysed (see table~\ref{tbl:radc})
|
|
as $\{ VV\_ERR, HIGH, LOW \}$. We can also add the violation of the postcondition
|
|
for the function.
|
|
This postcondition, {\em /* ensure: value is voltage input to within 0.1\% */ },
|
|
corresponds to $VV\_ERR$, and is already in the {\fm} set for this {\fg}.
|
|
|
|
%We can now create a {\dc} called $RADC$ thus: $$RADC = \; \derivec(G_2)$$ which has the following
|
|
%{\fms}:
|
|
We can now create a {\dc} called $RADC$ thus:
|
|
$$ fm(RADC) = \{ VV\_ERR, HIGH, LOW \} .$$
|
|
|
|
|
|
|
|
|
|
|
|
\paragraph{Functional Group - Software - voltage to per mil - VTPM }
|
|
|
|
This function sits on top of the $RADC$ {\dc} determined above.
|
|
We look at the pre-conditions for the function $read\_4\_20\_input$ , % which we can call $RI$
|
|
to determine its {\fms}.
|
|
Its pre-condition is, {\em /* require: input from ADC to be between 0.88 and 4.4 volts */}.
|
|
We can map a violation of this pre-condition to the {\fm} $VRNGE$; %As this function has one pre-condition
|
|
we can state,
|
|
|
|
$$ fm(read\_4\_20\_input) = \{ VRNGE \} .$$
|
|
|
|
We can now form a functional group with the {\dc} $RADC$ and the
|
|
software component $read\_4\_20\_input$, i.e. $G_3 = \{read\_4\_20\_input, RADC\} $.
|
|
|
|
|
|
|
|
{
|
|
\tiny
|
|
\begin{table}[h+]
|
|
\caption{$G_3$: Read\_4\_20: Failure Mode Effects Analysis} % title of Table
|
|
\label{tbl:r420i}
|
|
|
|
\begin{tabular}{|| l | c | l ||} \hline
|
|
% \textbf{Failure} & \textbf{failure} & \textbf{Symptom} \\
|
|
% \textbf{Scenario} & \textbf{effect} & \textbf{RADC } \\ \hline
|
|
\hline
|
|
\textbf{Failure} & \textbf{Failure } & \textbf{Derived Component} \\
|
|
\textbf{cause} & \textbf{Effect} & \textbf{Failure Mode} \\
|
|
|
|
|
|
\hline
|
|
1: $RI_{VRNGE}$ & voltage & $OUT\_OF\_$ \\
& outside range & $RANGE$ \\ \hline

2: $RADC_{VV\_ERR}$ & voltage & $VAL\_ERR$ \\
& incorrect & \\ \hline \hline

3: $RADC_{HIGH}$ & voltage value & $VAL\_ERR$ \\
& incorrect & \\ \hline

4: $RADC_{LOW}$ & ADC low voltage & $OUT\_OF\_$ \\
& so out of range & $RANGE$ \\
& i.e. $< 0.88$V & \\
|
|
\hline
|
|
|
|
5: post condition fails & software fails & $VAL\_ERR$ \\ \hline
|
|
|
|
\hline
|
|
\hline
|
|
|
|
\end{tabular}
|
|
\end{table}
|
|
}
|
|
|
|
The failure symptoms for the {\fg} are $\{OUT\_OF\_RANGE, VAL\_ERR\}$.
|
|
The postcondition for the function $read\_4\_20\_input$, {\em /* ensure: value is proportional (0-999) to the
|
|
4 to 20mA input */} corresponds to the $VAL\_ERR$ and is already in the set of failure modes.
|
|
% \paragraph{Final Functional Group}
|
|
For single failures these are the two ways in which this function
|
|
can fail. An $OUT\_OF\_RANGE$ will be flagged by the error flag variable.
|
|
The $VAL\_ERR$ will simply mean that the value read is incorrect.
|
|
|
|
We can finally make a {\dc} to represent a failure mode model for our function $read\_4\_20\_input$. %thus:
|
|
|
|
% $$ R420I = \; \derivec(G_3) .$$
|
|
|
|
This new {\dc} has the following {\fms}:
|
|
$$fm(R420I) = \{OUT\_OF\_RANGE, VAL\_ERR\} .$$
|
|
|
|
%
|
|
% Using the derived components, CMATV and VTPM we create
|
|
% a new functional group. This
|
|
% integrates FMEA's from software and eletronics
|
|
% into the same failure mode model.
|
|
|
|
|
|
|
|
We can now represent the software/hardware FMMD analysis
|
|
as a hierarchical diagram, see figure~\ref{fig:eulerswhw}. % see figure~\ref{fig:hd}.
|
|
|
|
% HTR 27OCT2012 % \begin{figure}[h]
|
|
% HTR 27OCT2012 % \centering
|
|
% HTR 27OCT2012 % \includegraphics[width=200pt]{./CH5_Examples/hd.png}
|
|
% HTR 27OCT2012 % % hd.png: 363x520 pixel, 72dpi, 12.81x18.34 cm, bb=0 0 363 520
|
|
% HTR 27OCT2012 % \caption{FMMD hierarchy with hardware and software elements}
|
|
% HTR 27OCT2012 % \label{fig:hd}
|
|
% HTR 27OCT2012 % \end{figure}
|
|
|
|
\begin{figure}[h]
|
|
\centering
|
|
\includegraphics[width=300pt]{./CH5_Examples/eulerswhw.png}
|
|
% eulerswhw.png: 510x344 pixel, 72dpi, 17.99x12.14 cm, bb=0 0 510 344
|
|
\caption{Electronics and Software shown in an integrated failure mode
|
|
model---an Euler diagram showing relationship between {\dcs} determined from electronics and software---the two outermost contours are software functions,
|
|
and the inner two are electronic {\dcs}.}
|
|
\label{fig:eulerswhw}
|
|
\end{figure}
|
|
|
|
\subsection{Conclusion: {\ft} Reader Software/Hardware FMMD Model}
|
|
|
|
The {\dc} representing the {\ft} reader
|
|
in software shows that by FMMD, we can integrate
|
|
software and electro-mechanical FMMD models.
|
|
With this analysis
|
|
we have a complete `reasoning~path' linking the failure modes from the
|
|
electronics to those in the software.
|
|
Each functional group to {\dc} transition represents a
|
|
reasoning stage.
|
|
%
|
|
Each reasoning stage will have an associated analysis report.
|
|
%
|
|
With traditional FMEA methods the reasoning~distance is large, because
|
|
it stretches from the component failure mode to the top, or system level, failure.
For this reason applying traditional FMEA to software stretches
the reasoning distance even further. This is exacerbated by the fact that traditional SFMEA is
performed separately from HFMEA~\cite{sfmea,sfmeaa}. Additionally, even the software/hardware
interfacing is treated as a separate FMEA task~\cite{sfmeainterface,embedsfmea,procsfmea}.
|
|
|
|
|
|
We now have a {\dc} for a {\ft} input in software.
|
|
Typically, more than one such input could be present in a real-world system.
|
|
Not only have we integrated electronics and software in an FMEA, we can also
|
|
re-use the analysis for each {\ft} input in the system.
|
|
|
|
The unsolved symptoms, or unobservable errors, i.e. $VAL\_ERR$, could be addressed
by another software function that reads other known signals
via the MUX (i.e. voltage references). This strategy would
detect the $ADC_{STUCKAT}$ and $ADC_{MUXFAIL}$ failure modes.
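
One way such a check might look is sketched below.
It is a hedged illustration only: the two reference voltages (1.0V and 4.0V), the tolerance and the function name
\texttt{adc\_refs\_plausible} are assumptions, and a real design would take these values from its hardware specification.
The function is given the voltages read back from two reference channels.
A stuck ADC returns the same value for both, and a failed multiplexer is likely to return a value belonging to some other channel.

```c
/* Assumed reference voltages and tolerance, for illustration only. */
#define REF_LO_VOLTS  1.0
#define REF_HI_VOLTS  4.0
#define REF_TOL_VOLTS 0.05

/* Returns 1 if v lies within tol of target, otherwise 0. */
static int within(double v, double target, double tol)
{
    return v >= target - tol && v <= target + tol;
}

/* Returns 1 if both reference channels read back as expected, otherwise 0. */
int adc_refs_plausible(double read_lo, double read_hi)
{
    if (!within(read_lo, REF_LO_VOLTS, REF_TOL_VOLTS))
        return 0;   /* low reference wrong: ADC or MUX suspect */
    if (!within(read_hi, REF_HI_VOLTS, REF_TOL_VOLTS))
        return 0;   /* high reference wrong: ADC or MUX suspect */
    return 1;
}
```

Two distinct references are used because a single one could not reveal an ADC stuck at exactly that reference value.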
|
|
|
|
A software specification for a hardware interface will concentrate on
|
|
how to interpret raw readings, or what signals to apply for actuators.
|
|
Using FMMD we can determine an accurate failure model for the interface as well~\cite{sfmeainterface}.
|
|
|
|
|
|
% HTR == HATE TO REMOVE
|
|
%HTR 18NOV2012 We can represent %the hierarchy in figure~\ref{fig:hd} algebraically,
|
|
%HTR 18NOV2012 the analysis hierarchy algebraically using the `$\derivec$' function:
|
|
%HTR 18NOV2012 %using the groups as intermediate stages:
|
|
%HTR 18NOV2012 \begin{eqnarray*}
|
|
%HTR 18NOV2012 G_1 &=& \{R,ADC\} \\
|
|
%HTR 18NOV2012 CMATV &=& \;\derivec (G_1) \\
|
|
%HTR 18NOV2012 G_2 &=& \{CMATV, read\_ADC \} \\
|
|
%HTR 18NOV2012 RADC &=& \; \derivec (G_2) \\
|
|
%HTR 18NOV2012 G_3 &=& \{ RADC, read\_4\_20\_input \} \\
|
|
%HTR 18NOV2012 R420I &=& \; \derivec (G_3) \\
|
|
%HTR 18NOV2012 \end{eqnarray*}
|
|
%HTR 18NOV2012 or, a nested definition,
|
|
%HTR 18NOV2012 $$ \derivec \Big( \derivec \big( \derivec(R,ADC), read\_4\_20\_input \big), read\_4\_20\_input \Big). $$
|
|
|
|
|
|
%\section
|
|
|
|
|
|
%HTR 18NOV2012 This nested structure means that we have multiple traceable
|
|
%HTR 18NOV2012 stages of failure mode reasoning in our analysis. Traditional FMEA would have only one stage
|
|
%HTR 18NOV2012 of reasoning for each component failure mode.
|
|
|
|
|
|
|
|
\section{Closed Loop Control Hardware/Software Hybrid Example}
|
|
|
|
It is desirable to model a complete standalone system with FMMD,
ideally one that is a hybrid of software and hardware.
Temperature control is a first order differential problem, and is often
addressed using the Proportional Integral Derivative (PID) algorithm~\cite{dcods}[p.66].
Traditionally this was performed in analogue electronics
with trimmer potentiometers providing the P and I parameters.
Since the introduction of micro-processors, it has been possible to
implement PID programmatically.
An FMMD analysis of a PID temperature controller gives us an
analysis of a standalone system without it being unwieldy.
|
|
\paragraph{PID Temperature Control.}
|
|
PID control starts with a setpoint, or desired value for a process
|
|
(here the temperature). It reads the process value and determines an error value for it.
|
|
The aim of the PID controller is to minimise this error term, by setting an output value,
|
|
which is fed back into the process (in this example the amount of power to supply the heater).
|
|
The error value is integrated and multiplied by an I constant.
|
|
A differential of the error value is calculated and multiplied by a D constant.
|
|
The error value itself is multiplied by a P constant, and all three of these are added
|
|
to obtain the output required.
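
The calculation described above can be sketched as a single update step in C.
This is a textbook-style discrete approximation written for illustration; the structure \texttt{pid\_state},
the function \texttt{pid\_step} and the time step \texttt{dt} are assumptions and are not taken from the
controller analysed in this chapter.

```c
/* Controller constants and the state carried between calls. */
struct pid_state {
    double kp, ki, kd;   /* P, I and D constants */
    double integral;     /* running integral of the error */
    double prev_error;   /* error seen on the previous call */
};

/* One PID update; dt is the time since the previous call and must be > 0. */
double pid_step(struct pid_state *s, double setpoint,
                double process_value, double dt)
{
    double error      = setpoint - process_value;
    double derivative = (error - s->prev_error) / dt;

    s->integral  += error * dt;   /* accumulate the integral term */
    s->prev_error = error;        /* remember for the next derivative */

    /* sum of the proportional, integral and derivative contributions */
    return s->kp * error + s->ki * s->integral + s->kd * derivative;
}
```

Both the integral and the derivative depend on \texttt{dt}, which is why the function must be called at a known, regular interval.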
|
|
\subsection{Design Stage: Implementation on a micro-controller.}
|
|
When designing a computer program it is often useful to
|
|
produce a structured analysis `Yourdon' context diagram~\cite{Yourdon:1989:MSA:62004}, see figure~\ref{fig:context_diagram_PID}.
|
|
|
|
\begin{figure}[h]
|
|
\centering
|
|
\includegraphics[width=300pt]{./CH5_Examples/context_diagram_PID.png}
|
|
% context_diagram_PID.png: 818x324 pixel, 72dpi, 28.86x11.43 cm, bb=0 0 818 324
|
|
\caption{Yourdon Context Diagram for PID Temperature Controller.}
|
|
\label{fig:context_diagram_PID}
|
|
\end{figure}
|
|
We have two voltage inputs (see section~\ref{sec:Pt100}) from the Pt100 temperature sensor.
|
|
For the Pt100 sensor, we will need to read the voltages it outputs and for this
|
|
we will need an ADC and MUX.
|
|
%
|
|
For the output, we can use a Pulse Width Modulator (PWM) (this is a common module found on micro-controllers
|
|
allowing a variable power output~\cite{pwm}). PWMs, ADCs and MUXes are commonly built into cheap micro-controllers~\cite{pic18f2523}.
|
|
We can now build more detail into the Yourdon diagram, with the afferent data flow coming through the MUX and ADC on the micro-controller, and the efferent
|
|
channelled through a PWM module, %again built into the micro-controller,
|
|
%
|
|
see figure~\ref{fig:context_diagram2_PID}.
|
|
\begin{figure}[h]
|
|
\centering
|
|
\includegraphics[width=300pt]{./CH5_Examples/context_diagram2_PID.png}
|
|
% context_diagram_PID.png: 818x324 pixel, 72dpi, 28.86x11.43 cm, bb=0 0 818 324
|
|
\caption{Yourdon Context Diagram for PID Temperature Controller, showing the MUX, ADC and PWM modules.}
|
|
\label{fig:context_diagram2_PID}
|
|
\end{figure}
|
|
The Yourdon methodology allows us to zoom into data transform bubbles and analyse them in more
|
|
detail.
|
|
%
|
|
We define the controlling software, by looking at or zooming into its transform bubble.
|
|
We have the inputs and outputs from the software.
|
|
We refine the data flow within the software and thus define software functions.
|
|
%, and
|
|
%this in terms of software functions.
|
|
%
|
|
We follow the data streams through the process, creating transform bubbles as required.
|
|
In a `bare~metal' software architecture we need a rudimentary operating system, often referred to as the monitor.
%
We bear in mind that the PID algorithm, because it depends heavily on integral calculus, is time sensitive,
and we therefore need to call it at precise intervals matched to its integral and differential coefficients.
|
|
%
|
|
Most micro-controllers feature several general purpose timers~\cite{pic18f2523}.
|
|
We can use an internal timer in conjunction with the monitor function
|
|
to call the PID algorithm at a specified interval.
|
|
%
|
|
\paragraph{Data flow model to programmatic call tree.}
|
|
The Yourdon methodology also gives us a guide as to which software
function should control the process, or in `C' terms be the main function.
|
|
%
|
|
\begin{figure}[h]
|
|
\centering
|
|
\includegraphics[width=300pt]{./CH5_Examples/context_software.png}
|
|
% context_software.png: 1023x500 pixel, 72dpi, 36.09x17.64 cm, bb=0 0 1023 500
|
|
\caption{Context diagram of the software in the PID temperature controller}
|
|
\label{fig:contextsoftware}
|
|
\end{figure}
|
|
Using figure~\ref{fig:contextsoftware} we can now pick the transform bubble we
|
|
want to be the `main' or controlling function in the software.
|
|
This can be thought of as picking one bubble and holding it up. The other bubbles hang underneath
|
|
forming the software call tree hierarchy, see figure~\ref{fig:context_calltree}.
|
|
In this design the bubble we pick is clearly the monitor function.
|
|
\begin{figure}[h]
|
|
\centering
|
|
\includegraphics[width=300pt]{./CH5_Examples/context_calltree.png}
|
|
% context_calltree.png: 800x783 pixel, 72dpi, 28.22x27.62 cm, bb=0 0 800 783
|
|
\caption{Software Yourdon diagram converted to programmatic call tree.}
|
|
\label{fig:context_calltree}
|
|
\end{figure}
|
|
%
|
|
|
|
\paragraph{Software Algorithm.}
|
|
The monitor function will orchestrate the control process.
|
|
Firstly it will examine the timer value and, when appropriate, call the PID function. The PID function first calls
determine\_set\_point\_error, which calls convert\_ADC\_to\_T,
which in turn calls Read\_ADC (the function developed in the earlier example).
%
Using the set point error value, the PID function calculates an
output control value (the PID demand) and returns it to its calling function, the monitor.
%
The monitor function then passes the PID demand value to the output\_control function,
which applies it via the PWM.
|
|
We now have a rudimentary closed loop control system incorporating both hardware and software.
|
|
%
|
|
Using the Yourdon methodology we have the system design: we have all the components, i.e. hardware elements and software functions
|
|
that will be used in the temperature controller.
|
|
We list these, and begin, from the bottom-up, to apply FMMD analysis.
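The control flow described in the software algorithm paragraph can be sketched in C as follows.
This is a minimal sketch under stated assumptions, not the implementation used in the example:
the name \texttt{monitor\_step}, the constant \texttt{PID\_INTERVAL\_TICKS} and the stub bodies are hypothetical,
and the 16-bit timer width is assumed for illustration only.

```c
#include <stdint.h>

#define PID_INTERVAL_TICKS 1000u /* assumed PID period, in timer ticks */

/* Stand-ins for the real functions in the call tree (illustrative only). */
uint16_t pid_calls = 0;   /* how many times PID() has run */
int16_t  last_demand = 0; /* last demand passed to output_control() */

int16_t PID(void) { pid_calls++; return 42; } /* returns a fixed dummy demand */
void output_control(int16_t demand) { last_demand = demand; }
void setLEDs(void) { /* would flash the status LEDs */ }

/* One pass of the monitor loop. The PID is run only when the interval has
 * elapsed, and its demand is then applied via output_control. Returns the
 * (possibly updated) time stamp of the last PID call. */
uint16_t monitor_step(uint16_t now, uint16_t last_run)
{
    if ((uint16_t)(now - last_run) >= PID_INTERVAL_TICKS) {
        output_control(PID());
        last_run = now;
    }
    setLEDs();
    return last_run;
}
```

In a real monitor this step would sit inside an endless loop, with \texttt{now} read from the hardware timer on each pass.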
|
|
|
|
\clearpage
|
|
\subsection{FMMD Analysis of PID temperature Controller}
|
|
|
|
To summarise from the design stage, the identified electronic components are:
|
|
\begin{itemize}
|
|
\item ADCMUX --- Electronics, analysed in previous example.
|
|
\item TIMER --- Internal micro controller timer
|
|
\item HEATER --- Heating element, essentially a resistor.
|
|
\item Pt100 --- Pt100 Temperature sensor, as analysed in section~\ref{sec:Pt100}.
|
|
\item PWM --- Internal micro controller pulse width modulation module
|
|
\item General Purpose I/O (GPIO) --- digital output pins used to drive the indication LEDs
|
|
\item LEDs --- Indication LEDs via GPIO
|
|
\item micro-controller --- the medium for running the software
|
|
\end{itemize}
|
|
|
|
|
|
|
|
|
|
|
|
|
|
\subsection{Temperature Controller Hardware Elements FMMD}
|
|
|
|
\paragraph{ADCMUX and Read\_ADC}
|
|
We re-use this derived component from section~\ref{readADC}.
|
|
$$ fm(RADC) = \{ VV\_ERR, HIGH, LOW \} .$$
|
|
|
|
|
|
\paragraph{TIMER}
|
|
The internal timer in use is a register which, when read,
returns a continuously incrementing time value.
Using two's complement arithmetic, we subtract
the value we last read from the current value to calculate the interval
between readings (assuming the timer has not completed a full wrap-around since the last reading).
A timer can fail by
incrementing its value at an incorrect rate, or by ceasing to increment.
|
|
$$ fm(TIMER) = \{ STOPPED, INCORRECT\_INTERVAL \}$$
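The interval calculation described above can be sketched in C as follows.
This is a minimal sketch, assuming a 16-bit free-running timer; the helper name \texttt{timer\_elapsed} is hypothetical
and the register width would have to be checked against the data sheet of the device in use.

```c
#include <stdint.h>

/* Hypothetical helper: ticks elapsed between two reads of a free-running
 * 16-bit timer register. The subtraction is reduced modulo 2^16 by the
 * cast, so the result is correct as long as fewer than 65536 ticks have
 * passed since the previous read, even if the counter has rolled over. */
uint16_t timer_elapsed(uint16_t now, uint16_t last)
{
    return (uint16_t)(now - last);
}
```

Note that under the $STOPPED$ failure mode this helper would always return zero, and under $INCORRECT\_INTERVAL$ it would return plausible but wrong values, which is why the two are modelled as separate failure modes.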
|
|
|
|
\paragraph{HEATER}
|
|
A heating element is typically some configuration of resistive wire.
|
|
It therefore has the same failure modes as a resistor and we can state
|
|
$$fm(HEATER) = \{ OPEN, SHORT \}$$
|
|
|
|
\paragraph{Pt100 Platinum Temperature Sensor}
|
|
The Pt100 four wire configuration is analysed in section~\ref{sec:Pt100}:
|
|
$$ fm(Pt100) = \{ OUT\_OF\_RANGE \} $$
|
|
|
|
|
|
\paragraph{PWM}
|
|
The PWM, in use, is a hardware register written to with an integer value.
|
|
It then applies a mark space ratio proportional to that value providing
|
|
a means of applying varying amounts of power. When the PWM
|
|
action is halted the digital output pin associated with it will typically be held in a high or low state.
|
|
We therefore state:
|
|
$$ fm(PWM) = \{ HIGH, LOW \}.$$
|
|
|
|
\paragraph{Micro-Controller}
|
|
The micro-controller is a complex piece of highly integrated electronics.
Typically, along with a micro-processor with PROM and RAM, a micro-controller has many I/O modules, including UARTs, PWM, ADCMUX, CAN,
general purpose I/O and interrupt lines, to name but a few.
In this project we are using the ADCMUX, TIMER, PWM and the general purpose computing facilities.
We therefore have to consider the general~computing, CLOCK, PROM and RAM failure modes:
$$fm(\mbox{micro-controller}) = \{ PROM\_FAULT, RAM\_FAULT, CPU\_FAULT, ALU\_FAULT, CLOCK\_STOPPED \}.$$
|
|
|
|
\subsection{Temperature Controller Software Elements FMMD}
|
|
Identified Software Components:
|
|
\begin{itemize}
|
|
\item --- Monitor (which calls the PID algorithm and output\_control, and sets the status LEDs)
\item --- PID (which calls determine\_set\_point\_error)
|
|
\item --- determine\_set\_point\_error (which calls convert\_ADC\_to\_T)
|
|
\item --- convert\_ADC\_to\_T (which calls read\_ADC which we can re-use from the last example)
|
|
\item --- read\_ADC
|
|
\item --- output\_control (which sets the PWM hardware according to the PID demand value)
|
|
\end{itemize}
|
|
With the call tree structure defined (see figure~\ref{fig:context_calltree}), we can now analyse these
|
|
components from the bottom-up, starting with the afferent flow, the reading in of the temperature and its conversion
|
|
to a PID calculated heater output demand.
|
|
|
|
\subsubsection{Afferent flow FMMD analysis: Pt100, temperature, set point error, PID output demand.}
|
|
We start with the afferent flow from the Pt100.
|
|
%with the software, and consider the hardware elements
|
|
%used (if any) by each software function.
|
|
Starting at the bottom we form a {\fg} with
|
|
the function read\_ADC and the Pt100.
|
|
This gives us a {\dc} we shall call Read\_Pt100.
|
|
|
|
{
|
|
\tiny
|
|
\begin{table}[h]
|
|
\caption{ Read\_Pt100: Failure Mode Effects Analysis} % title of Table
|
|
\label{tbl:readPt100}
|
|
|
|
\begin{tabular}{|| l | c | l ||} \hline
|
|
% \textbf{Failure} & \textbf{failure} & \textbf{Symptom} \\
|
|
% \textbf{Scenario} & \textbf{effect} & \textbf{RADC } \\ \hline
|
|
\hline
|
|
\textbf{Failure} & \textbf{Failure } & \textbf{Derived Component} \\
|
|
\textbf{cause} & \textbf{Effect} & \textbf{Failure Mode} \\
|
|
|
|
|
|
\hline
|
|
FC1: $RI_{VRGE}$ & voltage & $VOLTAGE\_HIGH$ \\
|
|
& outside range & \\ \hline
|
|
|
|
     FC2: $RADC_{VV\_ERR}$   & voltage        & $VAL\_ERR$ \\
|
|
& incorrect & \\ \hline \hline
|
|
|
|
|
|
|
|
FC3: $RADC_{HIGH}$ & voltage value & $VAL\_ERR$ \\
|
|
& incorrect & \\ \hline
|
|
|
|
|
|
|
|
FC4: $RADC_{LOW}$ & ADC may read & $VOLTAGE\_LOW$ \\ \hline
|
|
\end{tabular}
|
|
\end{table}
|
|
}
|
|
|
|
The {\dc} Read\_Pt100 is a failure mode model of the Read\_ADC function and the Pt100
|
|
hardware, and has the following failure modes:
|
|
|
|
$$ fm (Read\_Pt100) = \{ VOLTAGE\_HIGH, VAL\_ERR, VOLTAGE\_LOW \}. $$
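The step of collecting symptoms into the failure modes of a {\dc} can be sketched in C as follows.
This is an illustrative sketch only: the type and function names (\texttt{failure\_case}, \texttt{collect\_symptoms}, \texttt{count\_symptoms})
are hypothetical, and the failure causes are identified simply by their row labels from table~\ref{tbl:readPt100}.

```c
#include <stddef.h>

/* Symptoms of the Read_Pt100 functional group, one bit each. */
enum symptom { VOLTAGE_HIGH = 1, VAL_ERR = 2, VOLTAGE_LOW = 4 };

struct failure_case {
    const char  *cause;   /* row label of the failure cause */
    enum symptom symptom; /* symptom it produces in the functional group */
};

/* Rows of the Read_Pt100 analysis table. */
const struct failure_case read_pt100_cases[] = {
    { "FC1", VOLTAGE_HIGH },
    { "FC2", VAL_ERR      },
    { "FC3", VAL_ERR      },
    { "FC4", VOLTAGE_LOW  },
};

/* Union of all symptoms: duplicates collapse into a single set member. */
unsigned collect_symptoms(const struct failure_case *cases, size_t n)
{
    unsigned set = 0;
    for (size_t i = 0; i < n; i++)
        set |= (unsigned)cases[i].symptom;
    return set;
}

/* Number of distinct symptoms in the set. */
unsigned count_symptoms(unsigned set)
{
    unsigned count = 0;
    for (; set != 0; set >>= 1)
        count += set & 1u;
    return count;
}
```

Four failure causes reduce to three distinct symptoms here, which mirrors the way the {\dc} has fewer failure modes than its {\fg} has failure causes.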
|
|
|
|
|
|
We can now move along in the afferent flow, and we come to the convert\_ADC\_to\_T function.
|
|
This will call Read\_ADC twice, once for the higher Pt100 value and once for the lower. % and once for to read a current sense.
We then calculate the resistance of the Pt100 element and, from this, use a
polynomial or a lookup table~\cite{eutothermtables} to calculate the temperature.
|
|
The pre-conditions for the function are that:
|
|
\begin{itemize}
|
|
% \item The current calculated is within pre-defined bounds i.e. Pt100\_current,
|
|
\item The lower Pt100 value is within an acceptable voltage range i.e. Pt100\_lower\_voltage,
|
|
\item The higher Pt100 value is within an acceptable voltage range i.e. Pt100\_higher\_voltage,
|
|
\item The lower and higher values agree to within a given tolerance i.e. Pt100\_high\_low\_mismatch.
|
|
\end{itemize}
|
|
Any violation of these pre-conditions is equivalent to a failure mode.
|
|
Note that a temperature outside the pre-defined range will also cause these errors.
|
|
The post-condition is that it returns a temperature within a given tolerance of the temperature at the sensor.
|
|
A failure of this post-condition can be termed temp\_incorrect.
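The pre-condition checks listed above can be sketched in C as follows.
This is a sketch under stated assumptions: the voltage limits and tolerance are placeholder values, not figures from the Pt100 analysis,
and the function name \texttt{check\_pt100\_preconditions} is hypothetical. It also assumes that a temperature estimate
has already been derived from each of the two readings, so that the mismatch check can compare them.

```c
/* Placeholder limits: illustrative values only, not taken from the
 * Pt100 analysis. */
#define LOWER_MIN_MV   1000
#define LOWER_MAX_MV   2000
#define HIGHER_MIN_MV  2500
#define HIGHER_MAX_MV  3500
#define MISMATCH_TOL   5 /* allowed disagreement between the two estimates */

enum pt100_status {
    PT100_OK,
    PT100_LOWER_VOLTAGE,     /* lower reading outside its range */
    PT100_HIGHER_VOLTAGE,    /* higher reading outside its range */
    PT100_HIGH_LOW_MISMATCH  /* the two readings do not correlate */
};

/* Returns the first violated pre-condition, or PT100_OK if none is. */
enum pt100_status check_pt100_preconditions(int lower_mv, int higher_mv,
                                            int temp_from_lower,
                                            int temp_from_higher)
{
    int diff = temp_from_lower - temp_from_higher;
    if (diff < 0)
        diff = -diff;

    if (lower_mv < LOWER_MIN_MV || lower_mv > LOWER_MAX_MV)
        return PT100_LOWER_VOLTAGE;
    if (higher_mv < HIGHER_MIN_MV || higher_mv > HIGHER_MAX_MV)
        return PT100_HIGHER_VOLTAGE;
    if (diff > MISMATCH_TOL)
        return PT100_HIGH_LOW_MISMATCH;
    return PT100_OK;
}
```

Each non-OK return value corresponds to one violated pre-condition. The temp\_incorrect post-condition failure cannot be detected by such checks, since the readings appear valid.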
|
|
\clearpage
|
|
We now apply FMMD to the {\fg} formed by Read\_Pt100 and the function convert\_ADC\_to\_T.
|
|
We can call the resulting {\dc} Get\_Temperature.
|
|
|
|
{
|
|
\tiny
|
|
\begin{table}[h]
|
|
\caption{ Get\_Temperature: Failure Mode Effects Analysis} % title of Table
|
|
\label{tbl:gettemperature}
|
|
|
|
\begin{tabular}{|| l | c | l ||} \hline
|
|
% \textbf{Failure} & \textbf{failure} & \textbf{Symptom} \\
|
|
% \textbf{Scenario} & \textbf{effect} & \textbf{RADC } \\ \hline
|
|
\hline
|
|
\textbf{Failure} & \textbf{Failure } & \textbf{Derived Component} \\
|
|
\textbf{cause} & \textbf{Effect} & \textbf{Failure Mode} \\
|
|
|
|
|
|
\hline
|
|
FC1: $Pt100:Voltage\_High$ & Pt100 voltage too high & Pt100\_out\_of\_range \\
|
|
& Pt100\_higher\_voltage & \\
|
|
& OR Pt100\_current & \\ \hline
|
|
|
|
FC2: $Pt100:Voltage\_Low$ & Pt100 voltage too low & Pt100\_out\_of\_range \\
|
|
& Pt100\_lower\_voltage & \\
|
|
& OR Pt100\_current & \\ \hline
|
|
|
|
|
|
|
|
|
|
FC3: $Pt100\_high\_low\_mismatch$ & temperature can be calculated & Pt100\_out\_of\_range \\
|
|
& from either high or low & \\
|
|
& reading, but should correlate & \\ \hline
|
|
|
|
|
|
% FC4: $Pt100\_current$ & the current applied is & Pt100\_out\_of\_range \\
|
|
% & necessary to calculate resistance, & \\
|
|
% & but should be within given bounds & \\ \hline
|
|
%
|
|
%
|
|
|
|
FC4: $Pt100:VAL\_ERR$ & could cause an out of & temp\_incorrect\\
|
|
& range error, but may also & \\
|
|
& cause us to read an & \\
|
|
& incorrect temperature & \\ \hline
|
|
\hline
|
|
|
|
\end{tabular}
|
|
\end{table}
|
|
}
|
|
|
|
|
|
We now collect the failure symptoms for the {\dc} Get\_Temperature and can state:
|
|
|
|
$$fm(Get\_Temperature) = \{ Pt100\_out\_of\_range, temp\_incorrect \}$$
|
|
\clearpage
|
|
|
|
Following the afferent flow further, we come to a function to determine the control error value.
|
|
This is simply the target temperature subtracted from the measured temperature.
We thus form a {\fg} with our new {\dc} Get\_Temperature
|
|
and the function determine\_set\_point\_error.
|
|
|
|
The pre-condition for determine\_set\_point\_error is that the temperature read by it
|
|
is accurate, and its post condition is to return the correct control error value.
|
|
Most failure modes from a Pt100 are observable.
We can therefore divide failure of the post-condition into two variants: KnownIncorrectErrorValue,
where we can detect that the Pt100 value is suspect, and IncorrectErrorValue, where we simply have
an incorrect error value.
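The distinction between the two variants can be sketched in C as follows.
This is a minimal sketch, assuming the caller passes a flag saying whether the temperature reading was flagged as out of range;
the structure \texttt{set\_point\_error} and the status names are hypothetical.

```c
enum error_status {
    ERROR_VALUE_OK,             /* no detected problem with the reading */
    KNOWN_INCORRECT_ERROR_VALUE /* reading was flagged as out of range */
};

struct set_point_error {
    enum error_status status;
    int error; /* measured temperature minus target temperature */
};

/* temperature_valid is zero when Get_Temperature reported
 * Pt100_out_of_range, non-zero otherwise. */
struct set_point_error determine_set_point_error(int measured, int target,
                                                 int temperature_valid)
{
    struct set_point_error result;
    result.error = measured - target;
    result.status = temperature_valid ? ERROR_VALUE_OK
                                      : KNOWN_INCORRECT_ERROR_VALUE;
    return result;
}
```

Only KnownIncorrectErrorValue can be represented as a status here. IncorrectErrorValue is by definition undetectable, so the software would report \texttt{ERROR\_VALUE\_OK} while returning a wrong value.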
|
|
|
|
|
|
{
|
|
\tiny
|
|
\begin{table}[h]
|
|
\caption{ GetError: Failure Mode Effects Analysis} % title of Table
|
|
\label{tbl:geterror}
|
|
|
|
\begin{tabular}{|| l | c | l ||} \hline
|
|
% \textbf{Failure} & \textbf{failure} & \textbf{Symptom} \\
|
|
% \textbf{Scenario} & \textbf{effect} & \textbf{RADC } \\ \hline
|
|
\hline
|
|
\textbf{Failure} & \textbf{Failure } & \textbf{Derived Component} \\
|
|
\textbf{cause} & \textbf{Effect} & \textbf{Failure Mode} \\
|
|
|
|
|
|
\hline
|
|
FC1: $ Pt100\_out\_of\_range $ & pre-condition violated & KnownIncorrectErrorValue \\
|
|
& observable/detectable & \\
|
|
& failure mode & \\ \hline
|
|
|
|
FC2: $temp\_incorrect$ & pre-condition violated & IncorrectErrorValue \\
|
|
& unobservable & \\
|
|
& undetectable failure mode & \\ \hline
|
|
|
|
|
|
|
|
\end{tabular}
|
|
\end{table}
|
|
}
|
|
|
|
|
|
We collect failure mode symptoms, and can create a new {\dc} GetError
|
|
where
|
|
$$fm(GetError) = \{ KnownIncorrectErrorValue, IncorrectErrorValue \}.$$
|
|
|
|
|
|
We now follow the afferent path to the PID algorithm.
|
|
Here we assume that the PID constants are fixed (i.e. are not parameters).
|
|
We use the $GetError$ {\dc} and the PID function to form a {\fg}.
|
|
The pre-condition for the PID function is that % are that it is called
|
|
%iat the correct frequency and that
|
|
it receives the correct error value.
|
|
The post-condition is that it outputs correct control values.
|
|
% RESP FOR TIMEING IS ON CALLING FUNCTION AND IS A SEPARATE ERROR- TGHINK ABOUT JITTER.....
|
|
% and controll values..... Jitter might not matter, wrong int times would
|
|
% controlling function provdes context of use.
|
|
Those familiar with the PID algorithm may here raise the point of calling frequency.
Were this function to be called at an incorrect rate, its output
would be wrong (the differential and integral parameters would effectively have been changed).
However, this problem is a failure mode of the function calling it.
The calling function sets the context for the PID algorithm (i.e. what it is used for).
If this PID were to be used, say, as some form of low pass filter, we might need to consider jitter.
In a control environment, jitter in the PID calling interval would not be a significant factor.
This harks back to the discussion of context of use (see section~\ref{sec:subjectiveobjective}): the subjective
being the context the {\dc} is used for/in, and the objective
being the logic and process of the failure mode analysis.
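The dependence of the PID output on the calling interval can be illustrated with a short C sketch.
This uses a common textbook discrete form of the PID algorithm and is not necessarily the form used in the example;
the names \texttt{pid\_state} and \texttt{pid\_step} are hypothetical, and the coefficients are held in a structure purely for illustration.

```c
/* State for a textbook discrete PID (illustrative only). */
struct pid_state {
    double kp, ki, kd;  /* fixed proportional, integral, derivative gains */
    double dt;          /* interval the algorithm assumes between calls */
    double integral;    /* accumulated integral of the error */
    double prev_error;  /* error seen on the previous call */
};

/* One PID update. The integral and derivative terms both use dt, so the
 * output is only correct if the function really is called every dt. */
double pid_step(struct pid_state *s, double error)
{
    double derivative;

    s->integral += error * s->dt;
    derivative = (error - s->prev_error) / s->dt;
    s->prev_error = error;
    return s->kp * error + s->ki * s->integral + s->kd * derivative;
}
```

With identical gains and the same error input, changing only \texttt{dt} changes the result, which is the effect described above: calling at the wrong rate is equivalent to altering the integral and differential parameters.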
|
|
|
|
{
|
|
\tiny
|
|
\begin{table}[h]
|
|
\caption{ PID: Failure Mode Effects Analysis} % title of Table
|
|
\label{tbl:pidfunction}
|
|
|
|
\begin{tabular}{|| l | c | l ||} \hline
|
|
% \textbf{Failure} & \textbf{failure} & \textbf{Symptom} \\
|
|
% \textbf{Scenario} & \textbf{effect} & \textbf{RADC } \\ \hline
|
|
\hline
|
|
\textbf{Failure} & \textbf{Failure } & \textbf{Derived Component} \\
|
|
\textbf{cause} & \textbf{Effect} & \textbf{Failure Mode} \\
|
|
|
|
|
|
\hline
|
|
FC1: $ KnownIncorrectErrorValue $ & pre-condition violated & KnownControlValueErrorV \\
|
|
& observable/detectable & \\
|
|
& failure mode & \\ \hline
|
|
|
|
FC2: $ IncorrectErrorValue $ & pre-condition violated & IncorrectControlErrorV \\
|
|
& unobservable & \\
|
|
& undetectable failure mode & \\ \hline
|
|
|
|
|
|
|
|
\end{tabular}
|
|
\end{table}
|
|
}
|
|
|
|
We now create a PID {\dc}, with the following failure modes:
|
|
|
|
$$ fm(PID) = \{ KnownControlValueErrorV, IncorrectControlErrorV \} .$$
|
|
|
|
\begin{figure}[h]
|
|
\centering
|
|
\includegraphics[width=400pt]{./CH5_Examples/euler_afferent_PID.png}
|
|
% euler_afferent_PID.png: 1002x342 pixel, 72dpi, 35.35x12.06 cm, bb=0 0 1002 342
|
|
\caption{Euler diagram representing the hierarchy of FMMD analysis applied to the afferent branch of call tree for the PID temperature controller example.}
|
|
\label{fig:euler_afferent_PID}
|
|
\end{figure}
|
|
|
|
|
|
|
|
We have now modelled the software call tree for the afferent flow; we represent this as an Euler diagram in figure~\ref{fig:euler_afferent_PID}.
Two call tree branches remain: the LED indication branch and the
PWM/heater output.
|
|
|
|
\subsubsection{Efferent flow, PID demand value to PWM output}
|
|
|
|
The monitor function calls the output\_control function with the PID demand.
|
|
The output\_control function then sets the PWM hardware register, which causes the mark space output of the PWM module to
|
|
apply the demanded power. We form a {\fg} with the heating element, the PWM module and the output\_control function to model this branch
|
|
of the efferent flow. We apply FMMD analysis to this {\fg} in table~\ref{tbl:heateroutput}.
|
|
For the output\_control function, we have a pre-condition that the PWM module is
|
|
configured and working, and has the correct clock frequency.
|
|
A second pre-condition is that the heating element is connected and working.
|
|
The post-condition is that it sets the correct value into the PWM register
to implement the PID demand.
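A possible shape for the output\_control function is sketched below in C.
This is a sketch under stated assumptions: a 10-bit duty register is assumed, and the variable \texttt{pwm\_duty\_register}
is a plain variable standing in for the memory-mapped hardware register of a real device.

```c
#include <stdint.h>

#define PWM_MAX 1023 /* assumed 10-bit duty register */

/* Stand-in for the memory-mapped PWM duty register (hypothetical). */
volatile uint16_t pwm_duty_register;

/* Clamp the PID demand to the range the register can hold, then write it. */
void output_control(int32_t demand)
{
    if (demand < 0)
        demand = 0;
    if (demand > PWM_MAX)
        demand = PWM_MAX;
    pwm_duty_register = (uint16_t)demand;
}
```

Clamping is a design choice made for this sketch: it guards against writing an out of range value, but it cannot guard against the PWM module itself being stuck $HIGH$ or $LOW$, which is why those appear as separate failure causes in the analysis.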
|
|
|
|
{
|
|
\tiny
|
|
\begin{table}[h]
|
|
\caption{ HeaterOutput: Failure Mode Effects Analysis} % title of Table
|
|
\label{tbl:heateroutput}
|
|
|
|
\begin{tabular}{|| l | c | l ||} \hline
|
|
% \textbf{Failure} & \textbf{failure} & \textbf{Symptom} \\
|
|
% \textbf{Scenario} & \textbf{effect} & \textbf{RADC } \\ \hline
|
|
\hline
|
|
\textbf{Failure} & \textbf{Failure } & \textbf{Derived Component} \\
|
|
\textbf{cause} & \textbf{Effect} & \textbf{Failure Mode} \\
|
|
|
|
|
|
\hline
|
|
     FC1: PWM stuck $HIGH$      & pre-condition violated  & HeaterOnFull \\
                                & PWM module not working  & \\ \hline
     FC2: PWM stuck $LOW$       & pre-condition violated  & HeaterOff \\
|
|
& PWM module not working & \\ \hline
|
|
|
|
FC3: $ output\_control$ wrong value & The software supplies the wrong & HeaterOutputIncorrect \\
|
|
& value to the PWM register & \\ \hline
|
|
|
|
|
|
FC4: HEATER $SHORT$ & heating element resistor & HeaterOff \\
|
|
& SHORT no heating effect & \\ \hline
|
|
|
|
|
|
FC5: HEATER $OPEN $ & heating element resistor & HeaterOff \\
|
|
& OPEN no heating effect & \\ \hline
|
|
|
|
\end{tabular}
|
|
\end{table}
|
|
}
|
|
|
|
We now create a {\dc} called HeaterOutput
|
|
with the following failure modes:
|
|
$$fm(HeaterOutput) = \{ HeaterOnFull, HeaterOff, HeaterOutputIncorrect \}$$
|
|
|
|
|
|
|
|
\begin{figure}[h]
|
|
\centering
|
|
\includegraphics[width=300pt]{./CH5_Examples/euler_heater_output.png}
|
|
% euler_heater_output.png: 392x141 pixel, 72dpi, 13.83x4.97 cm, bb=0 0 392 141
|
|
\caption{Euler diagram showing HeaterOutput with its two hardware components, PWM and HEATER, and its software component output\_control.}
|
|
\label{fig:eulerheateroutput}
|
|
\end{figure}
|
|
|
|
\subsubsection{Efferent flow: status LEDs}
|
|
|
|
The status LEDs will be controlled by general purpose I/O (GPIO) pins.
We could have, say, three LEDs: one flashing with a human readable mark
space ratio representing the heater output, one flashing at a regular interval to
indicate that the processor is alive, and another flashing at an interval related to the temperature
(to indicate whether the temperature readings are within expected ranges).
Each LED should flash in normal operation, and any LED being permanently on or off
would indicate to the operator that an error had occurred.
The pre-condition for this function is that the GPIO
is connected to working LEDs.
The post-condition is that the function setLEDs will supply correct indication by flashing the LEDs.
We form a {\fg} from the GPIO, the LEDs and the software function setLEDs.
|
|
We apply FMMD analysis to this {\fg} in table~\ref{tbl:ledoutput}.
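The flashing behaviour of setLEDs can be sketched in C as follows.
This is a minimal sketch only: the pin assignments are assumptions, \texttt{gpio\_port} is a plain variable standing in for the real GPIO output latch,
and the decision of which LEDs are due to toggle is left to the caller.

```c
#include <stdint.h>

/* Assumed pin assignments (hypothetical). */
#define LED_HEARTBEAT 0x01u /* processor alive */
#define LED_HEATER    0x02u /* heater output indication */
#define LED_TEMP      0x04u /* temperature reading indication */
#define LED_ALL (LED_HEARTBEAT | LED_HEATER | LED_TEMP)

/* Stand-in for the GPIO output latch (hypothetical). */
uint8_t gpio_port;

/* Toggle each LED named in toggle_mask; bits that are not LEDs are ignored. */
void setLEDs(uint8_t toggle_mask)
{
    gpio_port ^= (uint8_t)(toggle_mask & LED_ALL);
}
```

Because every LED is expected to change state periodically, a pin stuck high or low, or a failed LED, shows up to the operator as a steady indication.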
|
|
|
|
{
|
|
\tiny
|
|
\begin{table}[h]
|
|
\caption{ LEDOutput: Failure Mode Effects Analysis} % title of Table
|
|
\label{tbl:ledoutput}
|
|
|
|
\begin{tabular}{|| l | c | l ||} \hline
|
|
% \textbf{Failure} & \textbf{failure} & \textbf{Symptom} \\
|
|
% \textbf{Scenario} & \textbf{effect} & \textbf{RADC } \\ \hline
|
|
\hline
|
|
\textbf{Failure} & \textbf{Failure } & \textbf{Derived Component} \\
|
|
\textbf{cause} & \textbf{Effect} & \textbf{Failure Mode} \\
|
|
|
|
|
|
\hline
|
|
     FC1: Temp LED fails        & LED will not light    & FailureIndicated \\
                                &                       & \\ \hline
     FC2: Processor LED fails   & LED will not light    & FailureIndicated \\
                                &                       & \\ \hline
     FC3: PWM LED fails         & LED will not light    & FailureIndicated \\
                                &                       & \\ \hline
|
|
|
|
FC4: GPIO stuck HIGH & LED permanently OFF & FailureIndicated \\ \hline
|
|
|
|
|
|
FC5: GPIO stuck Low & LED permanently ON & FailureIndicated \\ \hline
|
|
|
|
|
|
FC6: Software SetLEDs & Incorrect Indication & IndicationError \\
|
|
fails to set outputs correctly & Post condition failure & \\ \hline
|
|
|
|
|
|
|
|
\end{tabular}
|
|
\end{table}
|
|
}
We now create a {\dc} called LEDOutput
with the following failure modes:
$$fm(LEDOutput) = \{ FailureIndicated, IndicationError \}$$
|
|
|
|
|
|
|
|
|
|
\begin{figure}[h]
|
|
\centering
|
|
\includegraphics[width=300pt]{./CH5_Examples/euler_led_output.png}
|
|
% euler_heater_output.png: 392x141 pixel, 72dpi, 13.83x4.97 cm, bb=0 0 392 141
|
|
\caption{Euler diagram showing LEDOutput with its three LEDs and GPIO hardware elements, and its software component setLEDs.}
\label{fig:eulerledoutput}
|
|
\end{figure}
|
|
|
|
|
|
|
|
|
|
\subsubsection{Final Analysis Stage: PID Temperature Controller}
|
|
|
|
The possibility of each software function failing its post condition without a direct
|
|
underlying cause from one of its components has been included in each analysis stage
|
|
involving software. This is because software introduces many additional ways in which
a system can go wrong. Common causes of software failure are:
|
|
\begin{itemize}
|
|
\item Value/RAM corruption typically from interrupt contention problems or accidental over writing~\cite{swseatbelt},
|
|
but can be from external sources such as radiation changing bits/values at runtime~\cite{5963919, 5488118};
|
|
\item Address bus errors leading to program errors (program sequence);
|
|
\item ROM memory failures;
|
|
\item Unintended behaviour of software.
|
|
\end{itemize}
|
|
Because the software runs on a medium, that of the processor or micro-controller,
our design at the final or highest level (see table~\ref{tbl:pid}) must include all possible failure modes of this medium, i.e.
$$fm(\mbox{micro-controller}) = \{ PROM\_FAULT, RAM\_FAULT, CPU\_FAULT, ALU\_FAULT, CLOCK\_STOPPED \}.$$
|
|
We perform the final FMMD stage by forming a functional group with the {\dcs}
|
|
determined previously:
|
|
%
|
|
\begin{itemize}
|
|
\item PID
|
|
\item HeaterOutput
|
|
\item LEDOutput
|
|
\item and the function `monitor'.
|
|
\end{itemize}
|
|
|
|
The post condition for the monitor function is that it implements the PID control task correctly.
|
|
|
|
|
|
{
|
|
\tiny
|
|
\begin{table}[h]
|
|
\caption{ Standalone temperature controller: Failure Mode Effects Analysis} % title of Table
|
|
\label{tbl:pid}
|
|
|
|
\begin{tabular}{|| l | c | l ||} \hline
|
|
% \textbf{Failure} & \textbf{failure} & \textbf{Symptom} \\
|
|
% \textbf{Scenario} & \textbf{effect} & \textbf{RADC } \\ \hline
|
|
\hline
|
|
\textbf{Failure} & \textbf{Failure } & \textbf{Derived Component} \\
|
|
\textbf{cause} & \textbf{Effect} & \textbf{Failure Mode} \\
|
|
|
|
|
|
\hline
|
|
     FC1: PID KnownControlValueErrorV   & As error is detectable/               & ControlFailureIndicated \\
                                        & observable error can be indicated     & \\ \hline
     FC2: PID IncorrectControlErrorV    & undetectable/unobservable             & ControlFailure \\
                                        & failure PID will not control properly & \\ \hline
     FC3: HeaterOutput                  & Heater will constantly                & ControlFailureIndicated \\
          HeaterOnFull                  & apply maximum power                   & \\ \hline
     FC4: HeaterOutput                  & heater will supply                    & ControlFailureIndicated \\
          HeaterOff                     & no power                              & \\ \hline
     FC5: HeaterOutput                  & with incorrect power applied          & ControlFailure \\
          HeaterOutputIncorrect         & control will not be effective         & \\ \hline
|
|
|
|
FC6: LEDOutput & failure of LED system & KnownIndicationError \\
|
|
FailureIndicated & where failure is observable & \\ \hline
|
|
|
|
FC7: LEDOutput & failure of LED system & UnknownIndicationError \\
|
|
IndicationError & where failure is unobservable & \\ \hline
|
|
|
|
|
|
%% PROM\_FAULT, RAM\_FAULT, CPU\_FAULT, ALU\_FAULT, CLOCK\_STOPPED
|
|
|
|
|
|
FC8: micro-controller & un-defined behaviour & ControlFailure \\
|
|
PROM\_FAULT & & \\ \hline
|
|
|
|
FC9: micro-controller & un-defined behaviour & ControlFailure \\
|
|
RAM\_FAULT & & \\ \hline
|
|
|
|
FC10: micro-controller & un-defined behaviour & ControlFailure \\
|
|
CPU\_FAULT & & \\ \hline
|
|
|
|
FC11: micro-controller & incorrect arithmetic & ControlFailure \\
|
|
ALU\_FAULT & performed in processing & \\ \hline
|
|
|
|
FC12: micro-controller & processor will not run & ControlFailureIndicated \\
|
|
CLOCK\_STOPPED & indicator leds will not flash & \\ \hline
|
|
|
|
FC13: monitor: & postcondition fails & ControlFailure \\
|
|
software fails & & \\ \hline
|
|
|
|
|
|
\hline
|
|
|
|
|
|
\end{tabular}
|
|
\end{table}
|
|
}
|
|
|
|
We can now create a {\dc} for the standalone temperature controller, and give it the name TempController.
|
|
It will have the following failure modes:
|
|
|
|
$$fm ( TempController ) = \{ ControlFailureIndicated, ControlFailure, KnownIndicationError, UnknownIndicationError \}$$
|
|
|
|
|
|
We can now represent this failure mode analysis as an Euler diagram, see figure~\ref{fig:euler_temp_controller}.
|
|
|
|
|
|
\begin{figure}[h]
|
|
\centering
|
|
\includegraphics[width=300pt]{./CH5_Examples/euler_temp_controller.png}
|
|
% euler_temp_controller.png: 714x251 pixel, 72dpi, 25.19x8.85 cm, bb=0 0 714 251
|
|
\caption{Euler diagram of the temperature controller final analysis stage, showing the hybrid software/hardware {\dcs} and the function at the head of the call tree `monitor'.}
|
|
\label{fig:euler_temp_controller}
|
|
\end{figure}
|
|
|
|
|
%OK STOP AT PID and follow the other data flows until we are ready to bring them to the top: i.e.
|
|
%
|
|
%the monitor program.......
|
|
|
|
|
|
|
|
%\clearpage
|
|
|