night night edit, but have not eaten yet (GEDDIT)
This commit is contained in:
parent
5db87d00d5
commit
ec8d020d4e
@ -65,32 +65,32 @@ failure mode of the component or sub-system}}}
|
||||
\newboolean{dag}
|
||||
\setboolean{dag}{true} % boolvar=true or false : draw analysis using directed acyclic graphs
|
||||
|
||||
\setlength{\topmargin}{0in}
|
||||
\setlength{\headheight}{0in}
|
||||
\setlength{\headsep}{0in}
|
||||
\setlength{\textheight}{22cm}
|
||||
\setlength{\textwidth}{18cm}
|
||||
%\setlength{\textheight}{24.35cm}
|
||||
%\setlength{\textwidth}{20cm}
|
||||
\setlength{\oddsidemargin}{0in}
|
||||
\setlength{\evensidemargin}{0in}
|
||||
\setlength{\parindent}{0.0in}
|
||||
%\setlength{\parskip}{6pt}
|
||||
% \setlength{\parskip}{1cm plus4mm minus3mm}
|
||||
\setlength{\parskip}{0pt}
|
||||
\setlength{\parsep}{0pt}
|
||||
\setlength{\headsep}{0pt}
|
||||
\setlength{\topskip}{0pt}
|
||||
\setlength{\topmargin}{0pt}
|
||||
\setlength{\topsep}{0pt}
|
||||
\setlength{\partopsep}{0pt}
|
||||
\setlength{\itemsep}{1pt}
|
||||
% \setlength{\topmargin}{0in}
|
||||
% \setlength{\headheight}{0in}
|
||||
% \setlength{\headsep}{0in}
|
||||
% \setlength{\textheight}{22cm}
|
||||
% \setlength{\textwidth}{18cm}
|
||||
% %\setlength{\textheight}{24.35cm}
|
||||
% %\setlength{\textwidth}{20cm}
|
||||
% \setlength{\oddsidemargin}{0in}
|
||||
% \setlength{\evensidemargin}{0in}
|
||||
% \setlength{\parindent}{0.0in}
|
||||
% %\setlength{\parskip}{6pt}
|
||||
% % \setlength{\parskip}{1cm plus4mm minus3mm}
|
||||
% \setlength{\parskip}{0pt}
|
||||
% \setlength{\parsep}{0pt}
|
||||
% \setlength{\headsep}{0pt}
|
||||
% \setlength{\topskip}{0pt}
|
||||
% \setlength{\topmargin}{0pt}
|
||||
% \setlength{\topsep}{0pt}
|
||||
% \setlength{\partopsep}{0pt}
|
||||
% \setlength{\itemsep}{1pt}
|
||||
% \renewcommand\subsection{\@startsection
|
||||
% {subsection}{2}{0mm}%
|
||||
% {-\baselineskip}
|
||||
% {0.5\baselineskip}
|
||||
% {\normalfont\normalsize\itshape}}%
|
||||
\linespread{0.953}
|
||||
\linespread{1.0}
|
||||
|
||||
\begin{document}
|
||||
%\pagestyle{fancy}
|
||||
@ -144,13 +144,16 @@ failure mode of the component or sub-system}}}
|
||||
This paper presents a worked example of FMEA applied to an
|
||||
integrated electronics/software system.
|
||||
%
|
||||
FMEA methodologies trace from the 1940's and were designed to
|
||||
%FMEA methodologies trace from the 1940's and were designed to
|
||||
%model simple electro-mechanical systems.
|
||||
%
|
||||
FMEA methodologies were originally designed in the 1940s to
|
||||
model simple electro-mechanical systems.
|
||||
%
|
||||
Software generally sits on top of most modern safety-critical control systems
|
||||
and defines their most important system-wide behaviour and communications.
|
||||
%
|
||||
Currently standards that demand FMEA for hardware(HFMEA) (e.g. EN298, EN61508),
|
||||
Currently, standards that demand FMEA investigations for hardware (HFMEA), e.g. EN298 and EN61508,
|
||||
do not specify it for software, but instead specify good practice,
|
||||
review processes and language feature constraints.
|
||||
%
|
||||
@ -161,12 +164,22 @@ traces component {\fms}
|
||||
to resultant system failures, software has, until recently, been left in a non-analytical
|
||||
limbo of best practices and constraints.
|
||||
Software FMEA has been proposed
|
||||
in several forms. SFMEA is always performed separately from HFMEA.
|
||||
in several forms.
|
||||
%
|
||||
However, SFMEA is always performed separately from HFMEA.
|
||||
%
|
||||
This paper seeks to examine the effectiveness of current and proposed SFMEA
|
||||
techniques, by using a analysing the chosen example, which is well known and understood
|
||||
from years of field experience, and determining how well the HFMEA and SFMEA
|
||||
analysis reports model the failure mode behaviour.
|
||||
techniques, by analysing a simple hybrid hardware/software system,
|
||||
which is in common use and has mature field experience. %
|
||||
%analysing the chosen example, which is well known and understood
|
||||
%
|
||||
Because the chosen example is well understood it is
|
||||
%, this example is
|
||||
useful
|
||||
to compare the results from these FMEA methodologies with
|
||||
the known failure mode behaviour.
|
||||
%from years of field experience, and determining how well the HFMEA and SFMEA
|
||||
%analysis reports model the failure mode behaviour.
|
||||
% %
|
||||
%If software and hardware integrated FMEA were possible, electro-mechanical-software hybrids could
|
||||
%be modelled, and so we could consider `complete' failure mode models.
|
||||
@ -205,7 +218,7 @@ component failure modes, %and by reasoning,
|
||||
tracing their effects through a system
|
||||
and determining what system level failure modes could be caused.
|
||||
%
|
||||
FMEA dates from the 1940s where simple electro-mechanical systems were the norm.
|
||||
FMEA has its roots in the previous century where simple electro-mechanical systems were the norm.
|
||||
Modern control systems nearly always have a significant software/firmware element,
|
||||
and not being able to model software with current FMEA methodologies
|
||||
is a cause for criticism~\cite{safeware}[Ch.12].
|
||||
@ -260,19 +273,20 @@ base component {\fms}, and translating them into system level events/failures~\c
|
||||
In a complicated system, mapping a component failure mode to a system level failure
|
||||
will mean a long reasoning distance; that is to say the actions of the
|
||||
failed component will have to be traced through
|
||||
several sub-systems, gauging its effects with other components.
|
||||
several sub-systems, gauging its effects with and on other components.
|
||||
%
|
||||
With software at the higher levels of these sub-systems,
|
||||
we have yet another layer of complication.
|
||||
%
|
||||
In order to integrate software, %in a meaningful way
|
||||
we need to re-think the
|
||||
FMEA concept of simply mapping a base component failure to a system level event.
|
||||
%In order to integrate software, %in a meaningful way
|
||||
%we need to re-think the
|
||||
%FMEA concept of simply mapping a base component failure to a system level event.
|
||||
%
|
||||
SFMEA regards the components to be the variables used by the programs.
|
||||
These variables could become erroneously over-written,
|
||||
by calculated incorrectly (due to a mistake by the programmer, or a fault in the micro-processor it is running on, or
|
||||
by radiation causing bits to be erroneously altered.
|
||||
SFMEA regards the variables used by the programs as the equivalent of hardware components~\cite{procsfmea}.
|
||||
The failure modes of these variables are that they could become erroneously over-written,
|
||||
calculated incorrectly (due to a mistake by the programmer, or a fault in the micro-processor it is running on), or
|
||||
corrupted by external influences such as
|
||||
ionising radiation causing bits to be erroneously altered.
|
||||
|
||||
|
||||
\paragraph{A more-complete Failure Mode Model}
|
||||
@ -287,8 +301,8 @@ by radiation causing bits to be erroneously altered.
|
||||
%
|
||||
In order to obtain a more complete failure mode model of
|
||||
a hybrid electronic/software system we need to analyse
|
||||
the hardware, the software, the hardware the software runs on,
|
||||
and the software hardware interface.
|
||||
the hardware, the software, the hardware the software runs on (i.e. the software's medium),
|
||||
and the software/hardware interface.
|
||||
%
|
||||
HFMEA is a well established technique and needs no further description in this paper.
|
||||
|
||||
@ -301,7 +315,7 @@ to supply a current signal to represent the value to be sent~\cite{aoe}[p.934].
|
||||
Usually, $4mA$ represents a zero or starting value and $20mA$ represents the full scale,
|
||||
and this is referred to as {\ft} signalling.
|
||||
%
|
||||
{\ft} has an electrical advantage as well because the current in a loop is constant~\cite{aoe}[p.20].
|
||||
{\ft} has an electrical advantage as well because the current in an electronic loop is constant~\cite{aoe}[p.20].
|
||||
Thus resistance in the wires between the source and the receiving end is not an issue
|
||||
that can alter the accuracy of the signal.
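%
As a brief illustration (the symbols $R_{wire}$, $R_{load}$ and $V_{source}$ are introduced here for this sketch only),
the same current $I$ flows through every element of the series loop, so the voltage across the load resistor is
\[ V_{load} = I \, R_{load} \; , \]
which contains no $R_{wire}$ term.
The wire resistance only raises the voltage that the transmitter must supply,
\[ V_{source} = I \, (R_{wire} + R_{load}) \; , \]
assuming that the transmitter can provide sufficient voltage to maintain $I$.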
|
||||
%
|
||||
@ -332,25 +346,26 @@ The diagram in figure~\ref{fig:ftcontext} shows some equipment which is sending
|
||||
signal to a micro-controller system.
|
||||
The signal is locally driven over a load resistor, and then read into the micro-controller via
|
||||
an ADC and its multiplexer.
|
||||
With the voltage detected at the ADC the multiplexer can read the intended quantitative
|
||||
From the voltage detected at the ADC, via its multiplexer, we can read the intended quantitative
|
||||
value from the external equipment.
|
||||
|
||||
\subsection{Simple Software Example}
|
||||
|
||||
|
||||
Consider a software function that reads a {\ft} input, and returns a value between 0 and 999 (i.e. per mil $\permil$)
|
||||
representing the current detected with an additional error indication flag.
|
||||
representing the value intended by the current detected, with an additional error indication flag to indicate the validity
|
||||
of the value returned.
|
||||
%
|
||||
Let us assume the {\ft} detection is via a \ohms{220} resistor, and that we read a voltage
|
||||
from an ADC into the software.
|
||||
Let us define any value outside the 4mA to 20mA range as an error condition.
|
||||
%
|
||||
Using Ohm's law~\cite{aoe}, $V=IR$, we determine the voltage range: $0.004A * \ohms{220} = 0.88V$
|
||||
and $0.020A * \ohms{220} = 4.4V$.
|
||||
and $$0.020A * \ohms{220} = 4.4V \;.$$
|
||||
%
|
||||
Our acceptable voltage range is therefore
|
||||
%
|
||||
$(V \ge 0.88) \wedge (V \le 4.4) \; .$
|
||||
$$(V \ge 0.88) \wedge (V \le 4.4) \; .$$
|
||||
|
||||
This voltage range forms our input requirement.
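%
The input requirement above can be sketched in `C' as follows.
This is a minimal sketch and not the function analysed in this paper:
the helper name $volts\_to\_permil$, the constants $V\_MIN$ and $V\_MAX$, and the linear scaling onto 0 to 999
are assumptions made for illustration; only the variable names $input\_volts$ and $error\_flag$ follow the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define V_MIN 0.88  /* 4 mA across the 220 ohm load resistor  */
#define V_MAX 4.40  /* 20 mA across the 220 ohm load resistor */

/* Hypothetical helper: map a voltage in the accepted range onto 0..999
 * (per mil). Assumes a linear scale; sets *error_flag when the voltage
 * lies outside the 4 mA to 20 mA range and returns 0 in that case. */
static int volts_to_permil(double input_volts, bool *error_flag)
{
    assert(error_flag != NULL);

    if (input_volts < V_MIN || input_volts > V_MAX) {
        *error_flag = true;   /* out of range: value is not valid */
        return 0;
    }
    *error_flag = false;
    return (int)(((input_volts - V_MIN) / (V_MAX - V_MIN)) * 999.0);
}
```

Under these assumptions an input of $0.88V$ maps to 0 and $4.4V$ maps to 999, while any voltage outside the range only sets the error flag.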
|
||||
%
|
||||
@ -479,7 +494,7 @@ calls {\em read\_ADC}, which in turn interacts with the hardware/electronics.
|
||||
This software is above the hardware in the conceptual call tree---from a programmatic perspective---%in software terms---the
|
||||
software is reading values from the `lower~level' electronics.
|
||||
%
|
||||
FMEA is always a bottom-up process and so we must begin with this hardware.
|
||||
%FMEA is always a bottom-up process and so we must begin with this hardware.
|
||||
%
|
||||
The hardware is simply a load resistor, connected across an ADC input
|
||||
pin on the micro-controller and ground.
|
||||
@ -504,8 +519,8 @@ the multiplexer and the analogue to digital converter.
|
||||
\label{tbl:r420i}
|
||||
|
||||
\begin{tabular}{|| l | c | l ||} \hline
|
||||
\textbf{Failure} & \textbf{failure} & \textbf{System Failure} \\
|
||||
\textbf{Scenario} & \textbf{effect} & \\ \hline
|
||||
\textbf{Failure} & \textbf{Failure} & \textbf{System} \\
|
||||
\textbf{Scenario} & \textbf{Effect} & \textbf{Failure} \\ \hline
|
||||
\hline
|
||||
$R$ & OPEN~\cite{en298}[Ann.A] & $LOW$ \\
|
||||
& & $READING$ \\ \hline
|
||||
@ -537,7 +552,7 @@ For software FMEA we take the variables used by the system,
|
||||
and examine what could happen if they are corrupted in various ways~\cite{procsfmea, embedsfmea}.
|
||||
From the function $read\_4\_20\_input()$ we have the variables $error\_flag$,
|
||||
$input\_volts$ and $value$: from the function $read\_ADC()$, $timeout$, $ADCMUX$, $ADCGO$, $dval$.
|
||||
We must now determine putative system failure modes for these variables becoming corrupted.
|
||||
We must now determine putative system failure modes for these variables becoming corrupted; this is performed in table~\ref{tbl:sfmea}.
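%
To make the idea of a corrupted variable concrete, the following `C' fragment is a small sketch
of a single-bit corruption applied to a per-mil result.
The helpers $flip\_bit$ and $permil\_in\_range$ are hypothetical names introduced here,
and the 16-bit width of the variable is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flip one bit (0..15) of a 16-bit variable, modelling a single-bit
 * corruption such as might be caused by a RAM fault. */
static uint16_t flip_bit(uint16_t variable, unsigned bit)
{
    return (uint16_t)(variable ^ (uint16_t)(1u << bit));
}

/* Range check for the per-mil result (0..999) described in the text. */
static bool permil_in_range(uint16_t value)
{
    return value <= 999u;
}
```

Under these assumptions, flipping bit 9 of the value 500 gives 1012, which a range check would catch,
whereas flipping bit 0 gives 501, which remains a plausible reading and would pass unnoticed.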
|
||||
|
||||
|
||||
{
|
||||
@ -547,8 +562,8 @@ We must now determine putative system failure modes for these variables becoming
|
||||
\label{tbl:sfmea}
|
||||
|
||||
\begin{tabular}{|| l | c | l ||} \hline
|
||||
\textbf{Failure} & \textbf{failure} & \textbf{System Failure} \\
|
||||
\textbf{Scenario} & \textbf{effect} & \\ \hline
|
||||
\textbf{Failure} & \textbf{Failure} & \textbf{System} \\
|
||||
\textbf{Scenario} & \textbf{Effect} & \textbf{Failure} \\ \hline
|
||||
\hline
|
||||
$error\_flag$ & set FALSE & $VAL\_ERROR$ \\
|
||||
& & \\ \hline
|
||||
@ -592,7 +607,7 @@ We must now determine putative system failure modes for these variables becoming
|
||||
|
||||
Microprocessors/Microcontrollers have sets of known failure modes; these include RAM, ROM,
|
||||
EEPROM failure\footnote{EEPROM failure is not applicable for this example.} and
|
||||
oscillator clock timing~\cite{sfmeaauto}.
|
||||
oscillator clock timing
|
||||
|
||||
|
||||
|
||||
@ -603,11 +618,11 @@ oscillator clock timing~\cite{sfmeaauto}.
|
||||
\label{tbl:sfmeaup}
|
||||
|
||||
\begin{tabular}{|| l | c | l ||} \hline
|
||||
\textbf{Failure} & \textbf{failure} & \textbf{System Failure} \\
|
||||
\textbf{Scenario} & \textbf{effect} & \\ \hline
|
||||
\textbf{Failure} & \textbf{Failure} & \textbf{System} \\
|
||||
\textbf{Scenario} & \textbf{Effect} & \textbf{Failure} \\ \hline
|
||||
\hline
|
||||
$RAM$ & variable corruption & All errors \\
|
||||
& & from table~\ref{tbl:sfmea} \\ \hline
|
||||
$RAM$ & variable & All errors \\
|
||||
& corruption & from table~\ref{tbl:sfmea} \\ \hline
|
||||
|
||||
$RAM$ & program flow & process \\
|
||||
& & halts / crashes \\ \hline
|
||||
@ -632,51 +647,93 @@ oscillator clock timing~\cite{sfmeaauto}.
|
||||
\end{table}
|
||||
}
|
||||
|
||||
|
||||
\section{Software FMEA - The software hardware interface}
|
||||
\clearpage
|
||||
\section{Software FMEA - The software/hardware interface}
|
||||
|
||||
As FMEA is applied separately to software and hardware,
|
||||
the interface between them is an undefined factor.
|
||||
Ozarin~\cite{sfmeainterface} recommends that an FMEA report be written
|
||||
Ozarin~\cite{sfmeainterface,procsfmea} recommends that an FMEA report be written
|
||||
to focus on the software/hardware interface.
|
||||
|
||||
The software/hardware interface has
|
||||
specific problems common to many systems and configurations
|
||||
and these are described in~\cite{sfmeainterface}.
|
||||
%An interface FMEA is performed in table~\ref{hwswinterface}.
|
||||
%
|
||||
The hardware to software interface for the {\ft} example is handled
|
||||
by the `C' function $read\_ADC()$.
|
||||
~\cite{sfmeaauto}.
|
||||
%
|
||||
% An FMEA of the `software~medium' is given in table~\ref{tbl:sfmeaup}.
|
||||
\paragraph{Timing and Synchronisation.}
|
||||
The $ADCOUT$ register, where the raw ADC value is read,
|
||||
is an internal register used by the ADC and presented
|
||||
as a readable memory location when the ADC
|
||||
has finished updating it.
|
||||
Reading it at the wrong time would
|
||||
cause an invalid value to be read.
|
||||
The synchronisation is performed by polling an $ADCGO$
|
||||
bit, a flag mapped to memory by which the ADC indicates that the data is ready.
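
The polling scheme can be sketched as below.
This is a host-side illustration under stated assumptions, not the $read\_ADC()$ function analysed in this paper:
the registers are simulated as ordinary variables, $ADCGO$ is assumed to read 1 while a conversion is in progress and 0 when the data is ready,
and $sim\_adc\_tick()$, $ADC\_TIMEOUT$ and $read\_ADC\_sketch()$ are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simulated registers (assumption: memory-mapped on a real target). */
static volatile uint8_t  ADCMUX;  /* channel select                      */
static volatile uint8_t  ADCGO;   /* assumed: 1 = converting, 0 = ready  */
static volatile uint16_t ADCOUT;  /* raw conversion result               */

/* Hypothetical stand-in for the ADC hardware: it finishes the
 * conversion only after several polls have been made. */
static int sim_polls_until_ready = 3;
static void sim_adc_tick(void)
{
    if (ADCGO && sim_polls_until_ready-- <= 0) {
        ADCOUT = 512;
        ADCGO = 0;
    }
}

#define ADC_TIMEOUT 100u  /* assumed maximum number of polls */

/* Start a conversion, poll ADCGO until the data is ready or the
 * timeout expires, and only then read ADCOUT. */
static bool read_ADC_sketch(uint8_t channel, uint16_t *dval)
{
    assert(dval != NULL);
    uint16_t timeout = ADC_TIMEOUT;

    ADCMUX = channel;
    ADCGO = 1;
    while (ADCGO && timeout > 0) {
        sim_adc_tick();  /* on a real target the hardware clears ADCGO */
        timeout--;
    }
    if (ADCGO) {
        return false;    /* timed out: ADCOUT must not be trusted */
    }
    *dval = ADCOUT;
    return true;
}
```

The design point is that $ADCOUT$ is never read unless $ADCGO$ has indicated completion; a timeout is reported to the caller instead of returning a stale value.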
|
||||
|
||||
\paragraph{Interrupt Contention.}
|
||||
Were an interrupt to also attempt to read from the ADC,
|
||||
the $ADCMUX$ could be altered, causing the non-interrupt
|
||||
routine to read from the wrong channel.
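
One common mitigation, sketched here only as an illustration, is to treat the channel selection and the sampling as a critical section.
The functions $disable\_interrupts()$, $enable\_interrupts()$ and $simulated\_isr()$ are hypothetical stand-ins for target-specific mechanisms,
and the eight-channel multiplexer is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static volatile uint8_t ADCMUX;                  /* simulated channel select */
static volatile bool interrupts_enabled = true;  /* simulated interrupt mask */

/* Hypothetical stand-ins for target-specific interrupt control. */
static void disable_interrupts(void) { interrupts_enabled = false; }
static void enable_interrupts(void)  { interrupts_enabled = true; }

/* Simulated interrupt routine that wants ADC channel 7; it can only
 * run while interrupts are enabled. */
static void simulated_isr(void)
{
    if (interrupts_enabled) {
        ADCMUX = 7;
    }
}

/* Select a channel and return the channel that is actually selected at
 * sampling time. `guarded` chooses whether the critical section is used. */
static uint8_t select_and_sample_channel(uint8_t channel, bool guarded)
{
    assert(channel < 8);  /* assumption: an 8-channel multiplexer */

    if (guarded) {
        disable_interrupts();
    }
    ADCMUX = channel;
    simulated_isr();           /* worst case: interrupt arrives right here */
    uint8_t sampled = ADCMUX;  /* channel the ADC would convert */
    if (guarded) {
        enable_interrupts();
    }
    return sampled;
}
```

In this simulation the unguarded call for channel 2 samples channel 7, while the guarded call samples channel 2.
On real hardware a pending interrupt would typically be serviced once interrupts are re-enabled, after the sample has been taken.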
|
||||
|
||||
\paragraph{Data Formatting.}
|
||||
The ADC may use a big-endian or little-endian integer
|
||||
format. It may also right or left justify the bits in its value.
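
As a sketch of what the interface code must get right, the fragment below assembles a result from two byte-wide registers.
The 10-bit resolution, the split into high and low bytes, and the name $adc\_result\_from\_bytes()$ are assumptions for illustration
and are not taken from the example system.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: a 10-bit ADC whose result is split across two 8-bit
 * registers, either right- or left-justified within 16 bits. */
#define ADC_BITS 10u

static uint16_t adc_result_from_bytes(uint8_t high, uint8_t low,
                                      bool left_justified)
{
    /* Combine explicitly so the result does not depend on the
     * byte order of the processor running the code. */
    uint16_t raw = (uint16_t)(((uint16_t)high << 8) | low);

    if (left_justified) {
        raw = (uint16_t)(raw >> (16u - ADC_BITS));  /* drop padding bits */
    }
    return (uint16_t)(raw & ((1u << ADC_BITS) - 1u));
}
```

With these assumptions the byte pair (0x02, 0x01) right-justified and the pair (0x80, 0x40) left-justified both decode to 513;
choosing the wrong justification would instead yield a value that is wrong by a factor of 64 or truncated by the mask.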
|
||||
|
||||
|
||||
|
||||
\section{Conclusion}
|
||||
%
|
||||
The FMMD method has been demonstrated using an the industry standard {\ft}
|
||||
input circuit and software.
|
||||
This paper has picked a very simple example (the industry standard {\ft}
|
||||
input circuit and software) to demonstrate
|
||||
SFMEA and HFMEA methodologies used to describe a failure mode model.
|
||||
%Even a modest system would be far too large to analyse in conference paper
|
||||
%and this
|
||||
%
|
||||
The {\dc} representing the {\ft} reader
|
||||
shows that by taking a
|
||||
%The {\dc} representing the {\ft} reader
|
||||
%shows that by taking a
|
||||
%modular approach for FMEA, i.e. FMMD, we can integrate
|
||||
four FMEA reports we can model the failure mode behaviour from
|
||||
several perspectives, for
|
||||
software and electrical systems% models.
|
||||
%
|
||||
With this analysis
|
||||
we have stages along the `reasoning~path' linking the failure modes from the
|
||||
electronics to those in the software.
|
||||
Each {\fg} to {\dc} transition represents a
|
||||
reasoning stage.
|
||||
%
|
||||
Our model is described by four FMEA reports, and these % we can model the failure mode behaviour from
|
||||
model the system from several perspectives.
|
||||
%
|
||||
With traditional FMEA methods the reasoning~distance is large, because
|
||||
it stretches from the component failure mode to the top (or system) level failure.
|
||||
%
|
||||
With these four analysis reports
|
||||
we do not have stages along the `reasoning~path' linking the failure modes from the
|
||||
electronics to those in the software.
|
||||
%Software is often written `defensively' but t
|
||||
%Each {\fg} to {\dc} transition represents a
|
||||
%reasoning stage.
|
||||
%
|
||||
%
|
||||
%For this reason applying traditional FMEA to software stretches
|
||||
%the reasoning distance even further.
|
||||
%
|
||||
In fact these reasoning paths overlap ---or even by-pass one another---
|
||||
it is very difficult to gauge cause and effect. For instance
|
||||
were the ADC to have a small value error, say adding
|
||||
a small percentage onto the value, we would be unable to
|
||||
detect this under the analysis conditions for this model, or
|
||||
be able to pinpoint it.
|
||||
Because many of these reasoning paths overlap, or even by-pass one another,
|
||||
it is very difficult to gauge cause and effect.
|
||||
For instance, hardware failures are not analysed in the context of how they will
|
||||
be handled (or missed) by the software.
|
||||
%
|
||||
System outputs commanded from the software may not take into account particular
|
||||
hardware limitations etc.
|
||||
|
||||
The interface FMEA does serve to provide a useful
|
||||
checklist to ensure conventions used by the hardware
|
||||
and software are not mismatched.
|
||||
|
||||
However, while these techniques ensure that the software and hardware are
|
||||
viewed and analysed from several perspectives, the result cannot be termed a homogeneous
|
||||
failure mode model.
|
||||
% For instance
|
||||
% were the ADC to have a small value error, say adding
|
||||
% a small percentage onto the value, we would be unable to
|
||||
% detect this under the analysis conditions for this model, or
|
||||
% be able to pinpoint it.
|
||||
%
|
||||
|
||||
|
||||
{
|
||||
|