night night edit, but have not eaten yet (GEDDIT)

Robin P. Clark 2012-07-30 20:27:13 +01:00
parent 5db87d00d5
commit ec8d020d4e


@ -65,32 +65,32 @@ failure mode of the component or sub-system}}}
\newboolean{dag} \newboolean{dag}
\setboolean{dag}{true} % boolvar=true or false : draw analysis using directed acyclic graphs \setboolean{dag}{true} % boolvar=true or false : draw analysis using directed acyclic graphs
\setlength{\topmargin}{0in} % \setlength{\topmargin}{0in}
\setlength{\headheight}{0in} % \setlength{\headheight}{0in}
\setlength{\headsep}{0in} % \setlength{\headsep}{0in}
\setlength{\textheight}{22cm} % \setlength{\textheight}{22cm}
\setlength{\textwidth}{18cm} % \setlength{\textwidth}{18cm}
%\setlength{\textheight}{24.35cm} % %\setlength{\textheight}{24.35cm}
%\setlength{\textwidth}{20cm} % %\setlength{\textwidth}{20cm}
\setlength{\oddsidemargin}{0in} % \setlength{\oddsidemargin}{0in}
\setlength{\evensidemargin}{0in} % \setlength{\evensidemargin}{0in}
\setlength{\parindent}{0.0in} % \setlength{\parindent}{0.0in}
%\setlength{\parskip}{6pt} % %\setlength{\parskip}{6pt}
% \setlength{\parskip}{1cm plus4mm minus3mm} % % \setlength{\parskip}{1cm plus4mm minus3mm}
\setlength{\parskip}{0pt} % \setlength{\parskip}{0pt}
\setlength{\parsep}{0pt} % \setlength{\parsep}{0pt}
\setlength{\headsep}{0pt} % \setlength{\headsep}{0pt}
\setlength{\topskip}{0pt} % \setlength{\topskip}{0pt}
\setlength{\topmargin}{0pt} % \setlength{\topmargin}{0pt}
\setlength{\topsep}{0pt} % \setlength{\topsep}{0pt}
\setlength{\partopsep}{0pt} % \setlength{\partopsep}{0pt}
\setlength{\itemsep}{1pt} % \setlength{\itemsep}{1pt}
% \renewcommand\subsection{\@startsection % \renewcommand\subsection{\@startsection
% {subsection}{2}{0mm}% % {subsection}{2}{0mm}%
% {-\baselineskip} % {-\baselineskip}
% {0.5\baselineskip} % {0.5\baselineskip}
% {\normalfont\normalsize\itshape}}% % {\normalfont\normalsize\itshape}}%
\linespread{0.953} \linespread{1.0}
\begin{document} \begin{document}
%\pagestyle{fancy} %\pagestyle{fancy}
@ -144,13 +144,16 @@ failure mode of the component or sub-system}}}
This paper presents a worked example of FMEA applied to an This paper presents a worked example of FMEA applied to an
integrated electronics/software system. integrated electronics/software system.
% %
FMEA methodologies trace from the 1940's and were designed to %FMEA methodologies trace from the 1940's and were designed to
%model simple electro-mechanical systems.
%
FMEA methodologies were originally designed in the 1940s to
model simple electro-mechanical systems. model simple electro-mechanical systems.
% %
Software generally sits on top of most modern safety critical control systems Software generally sits on top of most modern safety critical control systems
and defines its most important system wide behaviour and communications. and defines its most important system wide behaviour and communications.
% %
Currently standards that demand FMEA for hardware (HFMEA) (e.g. EN298, EN61508), Currently standards that demand FMEA investigations for hardware (HFMEA) (e.g. EN298, EN61508),
do not specify it for software, but instead specify good practice, do not specify it for software, but instead specify good practice,
review processes and language feature constraints. review processes and language feature constraints.
% %
@ -161,12 +164,22 @@ traces component {\fms}
to resultant system failures, software, until recently, has been left in a non-analytical to resultant system failures, software, until recently, has been left in a non-analytical
limbo of best practices and constraints. limbo of best practices and constraints.
Software FMEA has been proposed Software FMEA has been proposed
in several forms. SFMEA is always performed separately from HFMEA. in several forms.
%
However, SFMEA is always performed separately from HFMEA.
% %
This paper seeks to examine the effectiveness of current and proposed SFMEA This paper seeks to examine the effectiveness of current and proposed SFMEA
techniques, by using a analysing the chosen example, which is well known and understood techniques, by analysing a simple hybrid hardware/software system,
from years of field experience, and determining how well the HFMEA and SFMEA which is in common use and has mature field experience. %
analysis reports model the failure mode behaviour. %analysing the chosen example, which is well known and understood
%
Because the chosen example is well understood it is
%, this example is
useful
to compare the results from these FMEA methodologies with
the known failure mode behaviour.
%from years of field experience, and determining how well the HFMEA and SFMEA
%analysis reports model the failure mode behaviour.
% % % %
%If software and hardware integrated FMEA were possible, electro-mechanical-software hybrids could %If software and hardware integrated FMEA were possible, electro-mechanical-software hybrids could
%be modelled, and so we could consider `complete' failure mode models. %be modelled, and so we could consider `complete' failure mode models.
@ -205,7 +218,7 @@ component failure modes, %and by reasoning,
tracing their effects through a system tracing their effects through a system
and determining what system level failure modes could be caused. and determining what system level failure modes could be caused.
% %
FMEA dates from the 1940s where simple electro-mechanical systems were the norm. FMEA has its roots in the previous century where simple electro-mechanical systems were the norm.
Modern control systems nearly always have a significant software/firmware element, Modern control systems nearly always have a significant software/firmware element,
and not being able to model software with current FMEA methodologies and not being able to model software with current FMEA methodologies
is a cause for criticism~\cite{safeware}[Ch.12]. is a cause for criticism~\cite{safeware}[Ch.12].
@ -260,19 +273,20 @@ base component {\fms}, and translating them into system level events/failures~\c
In a complicated system, mapping a component failure mode to a system level failure In a complicated system, mapping a component failure mode to a system level failure
will mean a long reasoning distance; that is to say the actions of the will mean a long reasoning distance; that is to say the actions of the
failed component will have to be traced through failed component will have to be traced through
several sub-systems, gauging its effects with other components. several sub-systems, gauging its effects with and on other components.
% %
With software at the higher levels of these sub-systems, With software at the higher levels of these sub-systems,
we have yet another layer of complication. we have yet another layer of complication.
% %
In order to integrate software, %in a meaningful way %In order to integrate software, %in a meaningful way
we need to re-think the %we need to re-think the
FMEA concept of simply mapping a base component failure to a system level event. %FMEA concept of simply mapping a base component failure to a system level event.
% %
SFMEA regards the components to be the variables used by the programs. SFMEA regards the variables used by the programs as the equivalent of hardware components~\cite{procsfmea}.
These variables could become erroneously over-written, The failure modes of these variables are that they could be erroneously over-written,
by calculated incorrectly (due to a mistake by the programmer, or a fault in the micro-processor it is running on, or calculated incorrectly (due to a mistake by the programmer, or a fault in the micro-processor the program is running on), or
by radiation causing bits to be erroneously altered. corrupted by external influences such as
ionising radiation causing bits to be erroneously altered.
\paragraph{A more-complete Failure Mode Model} \paragraph{A more-complete Failure Mode Model}
@ -287,8 +301,8 @@ by radiation causing bits to be erroneously altered.
% %
In order to obtain a more complete failure mode model of In order to obtain a more complete failure mode model of
a hybrid electronic/software system we need to analyse a hybrid electronic/software system we need to analyse
the hardware, the software, the hardware the software runs on, the hardware, the software, the hardware the software runs on (i.e. the software's medium),
and the software hardware interface. and the software/hardware interface.
% %
HFMEA is a well established technique and needs no further description in this paper. HFMEA is a well established technique and needs no further description in this paper.
@ -301,7 +315,7 @@ to supply a current signal to represent the value to be sent~\cite{aoe}[p.934].
Usually, $4mA$ represents a zero or starting value and $20mA$ represents the full scale, Usually, $4mA$ represents a zero or starting value and $20mA$ represents the full scale,
and this is referred to as {\ft} signalling. and this is referred to as {\ft} signalling.
% %
{\ft} has an electrical advantage as well because the current in a loop is constant~\cite{aoe}[p.20]. {\ft} has an electrical advantage as well because the current in an electronic loop is constant~\cite{aoe}[p.20].
Thus resistance in the wires between the source and the receiving end is not an issue Thus resistance in the wires between the source and the receiving end is not an issue
that can alter the accuracy of the signal. that can alter the accuracy of the signal.
% %
@ -332,25 +346,26 @@ The diagram in figure~\ref{fig:ftcontext} shows some equipment which is sending
signal to a micro-controller system. signal to a micro-controller system.
The signal is locally driven over a load resistor, and then read into the micro-controller via The signal is locally driven over a load resistor, and then read into the micro-controller via
an ADC and its multiplexer. an ADC and its multiplexer.
With the voltage detected at the ADC the multiplexer can read the intended quantitative With the voltage detected at the ADC, we can read the intended quantitative
value from the external equipment. value from the external equipment.
\subsection{Simple Software Example} \subsection{Simple Software Example}
Consider a software function that reads a {\ft} input, and returns a value between 0 and 999 (i.e. per mil $\permil$) Consider a software function that reads a {\ft} input, and returns a value between 0 and 999 (i.e. per mil $\permil$)
representing the current detected with an additional error indication flag. representing the value intended by the current detected, with an additional error indication flag to indicate the validity
of the value returned.
% %
Let us assume the {\ft} detection is via a \ohms{220} resistor, and that we read a voltage Let us assume the {\ft} detection is via a \ohms{220} resistor, and that we read a voltage
from an ADC into the software. from an ADC into the software.
Let us define any value outside the 4mA to 20mA range as an error condition. Let us define any value outside the 4mA to 20mA range as an error condition.
% %
As a voltage, we use Ohm's law~\cite{aoe} to determine the voltage ranges: $V=IR$, $0.004A * \ohms{220} = 0.88V$ As a voltage, we use Ohm's law~\cite{aoe} to determine the voltage ranges: $V=IR$, $0.004A * \ohms{220} = 0.88V$
and $0.020A * \ohms{220} = 4.4V$. and $$0.020A * \ohms{220} = 4.4V \;.$$
% %
Our acceptable voltage range is therefore Our acceptable voltage range is therefore
% %
$(V \ge 0.88) \wedge (V \le 4.4) \; .$ $$(V \ge 0.88) \wedge (V \le 4.4) \; .$$
This voltage range forms our input requirement. This voltage range forms our input requirement.
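
The range check and scaling described above could be sketched in C roughly as follows. This is an illustrative sketch only and is not the listing analysed in this paper: the helper name $volts\_to\_permil()$, the use of integer millivolts, and the linear scaling of the 0.88V to 4.4V range onto 0 to 999 are all assumptions made for the example.

```c
#include <stdbool.h>

/* Assumed limits, derived from 4mA and 20mA across a 220 ohm load resistor. */
#define MV_MIN 880L    /* 0.004A * 220 ohms = 0.88V */
#define MV_MAX 4400L   /* 0.020A * 220 ohms = 4.4V  */

/* Hypothetical helper: converts a reading in millivolts to a per-mil value
 * (0..999). Sets *error_flag and returns 0 when the reading lies outside
 * the acceptable voltage range. */
int volts_to_permil(long input_mv, bool *error_flag)
{
    if (input_mv < MV_MIN || input_mv > MV_MAX) {
        *error_flag = true;
        return 0;
    }
    *error_flag = false;
    /* Linear scaling with integer truncation; long avoids 16-bit overflow. */
    return (int)(((input_mv - MV_MIN) * 999L) / (MV_MAX - MV_MIN));
}
```

Under these assumptions a reading of 2640mV (12mA through the load resistor) yields 499, and any reading below 880mV or above 4400mV sets the error flag.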
% %
@ -479,7 +494,7 @@ calls {\em read\_ADC}, which in turn interacts with the hardware/electronics.
This software is above the hardware in the conceptual call tree---from a programmatic perspective---%in software terms---the This software is above the hardware in the conceptual call tree---from a programmatic perspective---%in software terms---the
software is reading values from the `lower~level' electronics. software is reading values from the `lower~level' electronics.
% %
FMEA is always a bottom-up process and so we must begin with this hardware. %FMEA is always a bottom-up process and so we must begin with this hardware.
% %
The hardware is simply a load resistor, connected across an ADC input The hardware is simply a load resistor, connected across an ADC input
pin on the micro-controller and ground. pin on the micro-controller and ground.
@ -504,8 +519,8 @@ the multiplexer and the analogue to digital converter.
\label{tbl:r420i} \label{tbl:r420i}
\begin{tabular}{|| l | c | l ||} \hline \begin{tabular}{|| l | c | l ||} \hline
\textbf{Failure} & \textbf{failure} & \textbf{System Failure} \\ \textbf{Failure} & \textbf{failure} & \textbf{System} \\
\textbf{Scenario} & \textbf{effect} & \\ \hline \textbf{Scenario} & \textbf{effect} & \textbf{Failure} \\ \hline
\hline \hline
$R$ & OPEN~\cite{en298}[Ann.A] & $LOW$ \\ $R$ & OPEN~\cite{en298}[Ann.A] & $LOW$ \\
& & $READING$ \\ \hline & & $READING$ \\ \hline
@ -537,7 +552,7 @@ For software FMEA we take the variables used by the system,
and examine what could happen if they are corrupted in various ways~\cite{procsfmea, embedsfmea}. and examine what could happen if they are corrupted in various ways~\cite{procsfmea, embedsfmea}.
From the function $read\_4\_20\_input()$ we have the variables $error\_flag$, From the function $read\_4\_20\_input()$ we have the variables $error\_flag$,
$input\_volts$ and $value$: from the function $read\_ADC()$, $timeout$, $ADCMUX$, $ADCGO$, $dval$. $input\_volts$ and $value$: from the function $read\_ADC()$, $timeout$, $ADCMUX$, $ADCGO$, $dval$.
We must now determine putative system failure modes for these variables becoming corrupted. We must now determine putative system failure modes for these variables becoming corrupted; this is performed in table~\ref{tbl:sfmea}.
{ {
@ -547,8 +562,8 @@ We must now determine putative system failure modes for these variables becoming
\label{tbl:sfmea} \label{tbl:sfmea}
\begin{tabular}{|| l | c | l ||} \hline \begin{tabular}{|| l | c | l ||} \hline
\textbf{Failure} & \textbf{failure} & \textbf{System Failure} \\ \textbf{Failure} & \textbf{failure} & \textbf{System} \\
\textbf{Scenario} & \textbf{effect} & \\ \hline \textbf{Scenario} & \textbf{effect} & \textbf{Failure} \\ \hline
\hline \hline
$error\_flag$ & set FALSE & $VAL\_ERROR$ \\ $error\_flag$ & set FALSE & $VAL\_ERROR$ \\
& & \\ \hline & & \\ \hline
@ -592,7 +607,7 @@ We must now determine putative system failure modes for these variables becoming
Microprocessors/Microcontrollers have sets of known failure modes; these include RAM, ROM, Microprocessors/Microcontrollers have sets of known failure modes; these include RAM, ROM,
EEPROM failure\footnote{EEPROM failure is not applicable for this example.} and EEPROM failure\footnote{EEPROM failure is not applicable for this example.} and
oscillator clock timing~\cite{sfmeaauto}. oscillator clock timing
@ -603,11 +618,11 @@ oscillator clock timing~\cite{sfmeaauto}.
\label{tbl:sfmeaup} \label{tbl:sfmeaup}
\begin{tabular}{|| l | c | l ||} \hline \begin{tabular}{|| l | c | l ||} \hline
\textbf{Failure} & \textbf{failure} & \textbf{System Failure} \\ \textbf{Failure} & \textbf{failure} & \textbf{System} \\
\textbf{Scenario} & \textbf{effect} & \\ \hline \textbf{Scenario} & \textbf{effect} & \textbf{Failure} \\ \hline
\hline \hline
$RAM$ & variable corruption & All errors \\ $RAM$ & variable & All errors \\
& & from table~\ref{tbl:sfmea} \\ \hline & corruption & from table~\ref{tbl:sfmea} \\ \hline
$RAM$ & program flow & process \\ $RAM$ & program flow & process \\
& & halts / crashes \\ \hline & & halts / crashes \\ \hline
@ -632,51 +647,93 @@ oscillator clock timing~\cite{sfmeaauto}.
\end{table} \end{table}
} }
\clearpage
\section{Software FMEA - The software hardware interface} \section{Software FMEA - The software/hardware interface}
As FMEA is applied separately to software and hardware As FMEA is applied separately to software and hardware
the interface between them is an undefined factor. the interface between them is an undefined factor.
Ozarin~\cite{sfmeainterface} recommends that an FMEA report be written Ozarin~\cite{sfmeainterface,procsfmea} recommends that an FMEA report be written
to focus on the software/hardware interface. to focus on the software/hardware interface.
The software/hardware interface has
specific problems common to many systems and configurations
and these are described in~\cite{sfmeainterface}.
%An interface FMEA is performed in table~\ref{hwswinterface}.
%
The hardware to software interface for the {\ft} example is handled The hardware to software interface for the {\ft} example is handled
by the 'C' function $read\_ADC()$. by the 'C' function $read\_ADC()$.
~\cite{sfmeaauto}.
%
% An FMEA of the `software~medium' is given in table~\ref{tbl:sfmeaup}.
\paragraph{Timing and Synchronisation.}
The $ADCOUT$ register, from which the raw ADC value is read,
is an internal register used by the ADC and presented
as a readable memory location when the ADC
has finished updating it.
Reading it at the wrong time would
cause an invalid value to be read.
The synchronisation is performed by polling an $ADCGO$
bit, a flag mapped to memory by which the ADC indicates that the data is ready.
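
A polling loop of the kind described above might look like the following sketch. It is not the $read\_ADC()$ function from this paper. The registers are modelled as ordinary volatile variables so that the fragment can be compiled on a host, the polarity of $ADCGO$ (1 while converting, 0 when the data is ready) is an assumption, and the name $read\_adc\_when\_ready()$ and the poll limit are hypothetical. Starting the conversion is deliberately left out.

```c
#include <stdint.h>

/* Stand-ins for memory-mapped registers (assumption: on a real target
 * these would be fixed addresses supplied by the vendor header). */
volatile uint8_t  ADCGO  = 0;  /* assumed: 1 = converting, 0 = data ready */
volatile uint16_t ADCOUT = 0;  /* raw conversion result */

#define ADC_POLL_LIMIT 1000u   /* hypothetical upper bound on polls */

/* Waits until ADCGO indicates that the data is ready, then copies ADCOUT
 * into *dval. Returns 0 on success, or -1 if the poll limit is reached,
 * in which case *dval is left untouched. */
int read_adc_when_ready(uint16_t *dval)
{
    uint16_t timeout = 0;

    while (ADCGO != 0) {
        if (++timeout >= ADC_POLL_LIMIT) {
            return -1;         /* ADC never signalled completion */
        }
    }
    *dval = ADCOUT;            /* only read once the ADC has finished */
    return 0;
}
```

The timeout branch makes a stuck $ADCGO$ bit a detectable condition rather than an endless loop.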
\paragraph{Interrupt Contention.}
Were an interrupt to also attempt to read from the ADC,
the $ADCMUX$ could be altered, causing the non-interrupt
routine to read from the wrong channel.
\paragraph{Data Formatting.}
The ADC may use a big-endian or little-endian integer
format. It may also right or left justify the bits in its value.
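
As an illustration of the justification issue, the sketch below normalises a result into a plain integer. It assumes a 10-bit ADC whose result is presented in two 8-bit registers; neither the resolution nor the register layout is stated in this paper, and the function name $adc\_normalise()$ is hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical helper: combines the high and low result bytes of an
 * assumed 10-bit ADC into a value in the range 0..1023. */
uint16_t adc_normalise(uint8_t high, uint8_t low, bool left_justified)
{
    if (left_justified) {
        /* Bits 9..2 are in 'high'; bits 1..0 are the top two bits of 'low'. */
        return (uint16_t)(((uint16_t)high << 2) | (low >> 6));
    }
    /* Bits 9..8 are the low two bits of 'high'; bits 7..0 are in 'low'. */
    return (uint16_t)((((uint16_t)high & 0x03u) << 8) | low);
}
```

If the software assumes one justification while the hardware is configured for the other, the value is silently mis-scaled, which is the kind of mismatch an interface FMEA is intended to catch.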
\section{Conclusion} \section{Conclusion}
% %
The FMMD method has been demonstrated using an the industry standard {\ft} This paper has picked a very simple example (the industry standard {\ft}
input circuit and software. input circuit and software) to demonstrate
SFMEA and HFMEA methodologies used to describe a failure mode model.
%Even a modest system would be far too large to analyse in conference paper
%and this
% %
The {\dc} representing the {\ft} reader %The {\dc} representing the {\ft} reader
shows that by taking a %shows that by taking a
%modular approach for FMEA, i.e. FMMD, we can integrate %modular approach for FMEA, i.e. FMMD, we can integrate
four FMEA reports we can model the failure mode behaviour from Our model is described by four FMEA reports; and these % we can model the failure mode behaviour from
several perspectives, for model the system from four several perspectives.
software and electrical systems% models.
%
With this analysis
we have stages along the `reasoning~path' linking the failure modes from the
electronics to those in the software.
Each {\fg} to {\dc} transition represents a
reasoning stage.
%
% %
With traditional FMEA methods the reasoning~distance is large, because With traditional FMEA methods the reasoning~distance is large, because
it stretches from the component failure mode to the top---or---system level failure. it stretches from the component failure mode to the top---or---system level failure.
%
With these four analysis reports
we do not have stages along the `reasoning~path' linking the failure modes from the
electronics to those in the software.
%Software is often written `defensively' but t
%Each {\fg} to {\dc} transition represents a
%reasoning stage.
%
%
%For this reason applying traditional FMEA to software stretches %For this reason applying traditional FMEA to software stretches
%the reasoning distance even further. %the reasoning distance even further.
% %
In fact these reasoning paths overlap ---or even by-pass one another--- In fact many of these reasoning paths overlap---or even by-pass one another---
it is very difficult to gauge cause and effect. For instance it is very difficult to gauge cause and effect.
were the ADC to have a small value error, say adding For instance, hardware failures are not analysed in the context of how they will
a small percentage onto the value, we would be unable to be handled (or missed) by the software.
detect this under the analysis conditions for this model, or %
be able to pinpoint it. System outputs commanded from may not take into account particular
hardware limitations etc.
The interface FMEA does serve to provide a useful
checklist to ensure conventions used by the hardware
and software are not mismatched.
However, while these techniques ensure that the software and hardware are
viewed and analysed from several perspectives, the result cannot be termed a homogeneous
failure mode model.
% For instance
% were the ADC to have a small value error, say adding
% a small percentage onto the value, we would be unable to
% detect this under the analysis conditions for this model, or
% be able to pinpoint it.
%
{ {