Literature review started
Using the paper "Practical Assessment Research & Evaluation" as a guide to structure
parent
8d39f0c310
commit
f0b4ecb0fc
@ -380,6 +380,13 @@ year = {2012},
|
|||||||
YEAR = "2005"
|
YEAR = "2005"
|
||||||
}
|
}
|
||||||
|
|
||||||
|
@BOOK{easw,
|
||||||
|
AUTHOR = "Nancy Leveson",
|
||||||
|
TITLE = "Engineering a Safer World ISBN: 978-0-262-01662-9",
|
||||||
|
PUBLISHER = "MIT Press",
|
||||||
|
YEAR = "2005"
|
||||||
|
}
|
||||||
|
|
||||||
@BOOK{scse,
|
@BOOK{scse,
|
||||||
AUTHOR = "Fortescue, Swinerd, Stark",
|
AUTHOR = "Fortescue, Swinerd, Stark",
|
||||||
TITLE = "Spacecraft Systems Engineering ISBN:978-0-470-75012-4",
|
TITLE = "Spacecraft Systems Engineering ISBN:978-0-470-75012-4",
|
||||||
|
@ -1,26 +1,20 @@
|
|||||||
\section{Copy dot tex}
|
\section{Introduction}
|
||||||
|
|
||||||
|
MSc project Euler/Spider Diagram editor --- Euler/Spider Diagrams
|
||||||
|
could be used to model failure modes in components.
|
||||||
|
--- 2005 paper --- need for static analysis because of
|
||||||
|
high reliability of modern safety critical systems.
|
||||||
|
|
||||||
|
\section{Practical Experience: Safety Critical Product Approvals}
|
||||||
|
|
||||||
|
FMEA performed on selected areas perceived as critical
|
||||||
|
by test house.
|
||||||
|
Blanket measures, RAM ROM checks, EMC, electrical and environmental stress testing
|
||||||
|
|
||||||
|
\subsection{Practical limitations of testing for certification vs. rigorous approach}
|
||||||
|
|
||||||
|
State explosion problem considering a failure mode of a given component against
|
||||||
|
all other components in the system.
|
||||||
|
|
||||||
|
Impossible to perform double simultaneous failure analysis (as demanded by EN298~\cite{en298}).
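The scale of the state explosion can be illustrated with a rough count. The following is a minimal sketch, not a method taken from EN298: it assumes a system of $n$ components, each with the same number $f$ of failure modes, and it assumes that every failure scenario has to be examined against each remaining component. The function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: number of checks for single failure analysis.
 * Assumption: n components, f failure modes each, and every failure
 * mode is examined against each of the other (n - 1) components. */
uint64_t single_failure_checks(uint64_t n, uint64_t f)
{
    if (n < 2u) {
        return 0u; /* nothing to check a failure mode against */
    }
    return n * f * (n - 1u);
}

/* Hypothetical helper: number of checks for double simultaneous failures.
 * Assumption: every unordered pair of components, in every combination
 * of their failure modes (f * f), is examined against each of the
 * remaining (n - 2) components. */
uint64_t double_failure_checks(uint64_t n, uint64_t f)
{
    if (n < 3u) {
        return 0u; /* no remaining component to examine */
    }
    return (n * (n - 1u) / 2u) * f * f * (n - 2u);
}
```

Under these assumptions a system of 100 components with 3 failure modes each needs 29,700 checks for single failures, but 4,365,900 checks for double simultaneous failures.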
|
||||||
|
|
||||||
sample text
|
|
||||||
sample text
|
|
||||||
sample text
|
|
||||||
sample text
|
|
||||||
sample text
|
|
||||||
sample text
|
|
||||||
sample text
|
|
||||||
sample text
|
|
||||||
sample text
|
|
||||||
sample text
|
|
||||||
sample text
|
|
||||||
sample text
|
|
||||||
sample text
|
|
||||||
sample text
|
|
||||||
sample text
|
|
||||||
sample text
|
|
||||||
sample text
|
|
||||||
sample text
|
|
||||||
sample text
|
|
||||||
sample text
|
|
||||||
sample text
|
|
||||||
sample text
|
|
||||||
sample text
|
|
||||||
sample text
|
|
||||||
|
@ -1,7 +1,7 @@
|
|||||||
|
|
||||||
|
|
||||||
EN61508:6\cite{en61508}[B.6.6]
|
The generic and statistical European Safety Standard, EN61508:6\cite{en61508}[B.6.6]
|
||||||
describes FMEA as:
|
describes Failure Mode and Effects Analysis (FMEA) as:
|
||||||
\begin{quotation}
|
\begin{quotation}
|
||||||
"To analyse a system design, by examining all possible sources of failure
|
"To analyse a system design, by examining all possible sources of failure
|
||||||
of a system's components and determining the effects of these failures
|
of a system's components and determining the effects of these failures
|
||||||
@ -10,26 +10,34 @@ on the behaviour and safety of the system."
|
|||||||
|
|
||||||
\section{Concepts}
|
\section{Concepts}
|
||||||
|
|
||||||
Forward and backward searching...
|
\paragraph{Forward and backward searches}
|
||||||
forward search starts with possible failure causes
|
|
||||||
and works out what could happen
|
|
||||||
|
|
||||||
backward search uses possible failures and works back down (and not necessarily to
|
A forward search starts with possible failure causes
|
||||||
base components in a system)
|
and uses logic and reasoning to determine system level outcomes.
|
||||||
Reasoning distance .... general concept... simple ideas about how complex a
|
A backward search starts with system level events and
|
||||||
failure analysis is the more modules and components are involved
|
works back down (and not necessarily to
|
||||||
|
base components in a system) using decomposition
|
||||||
|
of the system and logic.
|
||||||
|
FMEA based methodologies are forward searches\cite{Lutz:1997:RAU:590564.590572} and top down
|
||||||
|
methodologies such as FTA~\cite{nucfta,nasafta} are backward searches.
|
||||||
|
|
||||||
|
\paragraph{Reasoning distance}
|
||||||
|
A reasoning distance is the number of stages of logic and reasoning
|
||||||
|
required to map a failure cause to its potential outcomes.
|
||||||
|
%.... general concept... simple ideas about how complex a
|
||||||
|
%failure analysis is the more modules and components are involved
|
||||||
% cite for forward and backward search related to safety critical software
|
% cite for forward and backward search related to safety critical software
|
||||||
\cite{Lutz:1997:RAU:590564.590572} %{sfmeaforwardbackward}
|
%{sfmeaforwardbackward}
|
||||||
|
|
||||||
\section{F.M.E.A.}
|
\section{FMEA}
|
||||||
|
|
||||||
\subsection{FMEA}
|
%\subsection{FMEA}
|
||||||
%\tableofcontents[currentsection]
|
%\tableofcontents[currentsection]
|
||||||
|
|
||||||
|
|
||||||
FMEA is a broad term; it could mean anything from an informal check on how
|
FMEA is a broad term; it could mean anything from an informal check on how
|
||||||
failures could affect some equipment in an initial brain-storming session
|
failures could affect some equipment in an initial brain-storming session
|
||||||
in product design, to formal submissions as part of safety critical certification.
|
in product design, to formal submission as part of safety critical certification.
|
||||||
%
|
%
|
||||||
This chapter describes basic concepts of FMEA, uses a simple example to
|
This chapter describes basic concepts of FMEA, uses a simple example to
|
||||||
demonstrate a single FMEA analysis stage, describes the four main variants of FMEA in use today
|
demonstrate a single FMEA analysis stage, describes the four main variants of FMEA in use today
|
||||||
@ -143,9 +151,9 @@ approach in looking for system failures.
|
|||||||
\subsection{The unacceptability of a single component failure causing a catastrophe}
|
\subsection{The unacceptability of a single component failure causing a catastrophe}
|
||||||
|
|
||||||
FMEA, due to its inductive bottom-up approach, is very good
|
FMEA, due to its inductive bottom-up approach, is very good
|
||||||
at finding potential component failures that could have catastrophic implications.
|
at finding potential single component failures that could have catastrophic implications.
|
||||||
Used in the design phase of a project FMEA is an invaluable tool
|
Used in the design phase of a project FMEA is an invaluable tool
|
||||||
for unearthing these type of failure scenario.
|
for unearthing these failure scenarios.
|
||||||
It is less useful for determining catastrophic events for multiple
|
It is less useful for determining catastrophic events for multiple
|
||||||
simultaneous\footnote{Multiple simultaneous failures are taken to mean failures that occur within the same detection period.} failures.
|
simultaneous\footnote{Multiple simultaneous failures are taken to mean failures that occur within the same detection period.} failures.
|
||||||
|
|
||||||
@ -154,18 +162,18 @@ simultaneous\footnote{Multiple simultaneous failures are taken to mean failure t
|
|||||||
Modern electronic components are generally very reliable, and the systems built from them
|
Modern electronic components are generally very reliable, and the systems built from them
|
||||||
are thus very reliable too. Reliable field data on failures will therefore be sparse.
|
are thus very reliable too. Reliable field data on failures will therefore be sparse.
|
||||||
Should we wish to prove a continuous demand system for say ${10}^{-7}$ failures\footnote{${10}^{-7}$ failures per hour of operation is the
|
Should we wish to prove a continuous demand system for say ${10}^{-7}$ failures\footnote{${10}^{-7}$ failures per hour of operation is the
|
||||||
threshold for S.I.L. 3 reliability~\cite{en61508}.}
|
threshold for S.I.L. 3 reliability~\cite{en61508}. Failure rates are normally measured per $10^9$ hours of operation
|
||||||
|
and are known as Failures in Time (FIT) values. The maximum FIT value for a SIL 3 system is therefore 100.}
|
||||||
per hour of operation, even with 1000 correctly monitored units in the field
|
per hour of operation, even with 1000 correctly monitored units in the field
|
||||||
we could only expect one failure per ten thousand hours (a little over a year of operation).
|
we could only expect one failure per ten thousand hours (a little over a year of operation).
|
||||||
It would be utterly impractical to get statistically significant data for equipment
|
It would be utterly impractical to get statistically significant data for equipment
|
||||||
at these reliability levels.
|
at these reliability levels.
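
As a worked check of these figures, taking the ${10}^{-7}$ failures per hour threshold and the 1000 monitored units stated above, the expected failure rate of the whole population is
$$ 1000 \times {10}^{-7} \; \mathrm{h}^{-1} = {10}^{-4} \; \mathrm{h}^{-1} \; , $$
that is, one expected failure per ${10}^{4}$ hours.
With $8760$ hours in a year, this is roughly one failure every $1.14$ years across the entire population.
Expressed as a FIT value (failures per ${10}^{9}$ hours of operation), the same threshold is
$$ {10}^{-7} \times {10}^{9} = 100 \; . $$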
|
||||||
However, we can use FMEA (more specifically the FMEDA variant, see section~\ref{sec:FMEDA}), working from known component failure rates, to obtain
|
However, we can use FMEA (more specifically the FMEDA variant, see section~\ref{sec:FMEDA}),
|
||||||
|
working from known component failure rates, to obtain
|
||||||
statistical estimates of the equipment reliability.
|
statistical estimates of the equipment reliability.
|
||||||
|
|
||||||
|
|
||||||
\subsection{Rigorous FMEA --- State Explosion Problem}
|
\subsection{FMEA and the State Explosion Problem}
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
\paragraph{Rigorous Single Failure FMEA}
|
\paragraph{Rigorous Single Failure FMEA}
|
||||||
|
|
||||||
@ -251,14 +259,8 @@ number.
|
|||||||
Fixing problems with the highest RPN number
|
Fixing problems with the highest RPN number
|
||||||
will return most cost benefit.
|
will return most cost benefit.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
% benign example of PFMEA in CARS - make something up.
|
% benign example of PFMEA in CARS - make something up.
|
||||||
\subsection{PFMEA Example}
|
\subsection{PFMEA Example}
|
||||||
|
|
||||||
|
|
||||||
\begin{table}[ht]
|
\begin{table}[ht]
|
||||||
\caption{FMEA Calculations} % title of Table
|
\caption{FMEA Calculations} % title of Table
|
||||||
%\centering % used for centering table
|
%\centering % used for centering table
|
||||||
@ -268,97 +270,22 @@ will return most cost benefit.
|
|||||||
relay 2 n/c & $1*10^{-5}$ & 98.0 & doorlocks fail & 0.00098 \\ \hline
|
relay 2 n/c & $1*10^{-5}$ & 98.0 & doorlocks fail & 0.00098 \\ \hline
|
||||||
% rear end crash & $14.4*10^{-6}$ & 267,700 & fatal fire & 3.855 \\
|
% rear end crash & $14.4*10^{-6}$ & 267,700 & fatal fire & 3.855 \\
|
||||||
% ruptured f.tank & & & & \\ \hline
|
% ruptured f.tank & & & & \\ \hline
|
||||||
|
|
||||||
|
|
||||||
\hline
|
\hline
|
||||||
\end{tabular}
|
\end{tabular}
|
||||||
\end{table}
|
\end{table}
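
The RPN column can be read as the product of the probability and the cost columns. This relationship is inferred from the figures in the table and is stated here as an assumption:
$$ RPN = P \times Cost \; . $$
For the row `relay 2 n/c' this gives
$$ 1 \times {10}^{-5} \times 98.0 = 0.00098 \; , $$
which agrees with the value tabulated.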
|
||||||
|
|
||||||
|
|
||||||
%Savings: 180 burn deaths, 180 serious burn injuries, 2,100 burned vehicles. Unit Cost: $200,000 per death, $67,000 per injury, $700 per vehicle.
|
|
||||||
%Total Benefit: 180 X ($200,000) + 180 X ($67,000) + $2,100 X ($700) = $49.5 million.
|
|
||||||
%COSTS
|
|
||||||
%Sales: 11 million cars, 1.5 million light trucks.
|
|
||||||
%Unit Cost: $11 per car, $11 per truck.
|
|
||||||
%Total Cost: 11,000,000 X ($11) + 1,500,000 X ($11) = $137 million.
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
%\subsection{Production FMEA : Example Ford Pinto : 1975}
|
|
||||||
|
|
||||||
\subsection{PFMEA Example: Ford Pinto: 1975}
|
|
||||||
|
|
||||||
\begin{figure}[h]
|
|
||||||
\centering
|
|
||||||
\includegraphics[width=300pt]{./CH2_FMEA/ad_ford_pinto_mpg_red_3_1975.jpg}
|
|
||||||
% ad_ford_pinto_mpg_red_3_1975.jpg: 720x933 pixel, 96dpi, 19.05x24.69 cm, bb=0 0 540 700
|
|
||||||
\caption{Ford Pinto Advert}
|
|
||||||
\label{fig:fordpintoad}
|
|
||||||
\end{figure}
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
\begin{figure}[h]
|
|
||||||
\centering
|
|
||||||
\includegraphics[width=300pt]{./CH2_FMEA/burntoutpinto.png}
|
|
||||||
% burntoutpinto.png: 376x250 pixel, 72dpi, 13.26x8.82 cm, bb=0 0 376 250
|
|
||||||
\caption{Burnt Out Pinto}
|
|
||||||
\label{fig:burntoutpinto}
|
|
||||||
\end{figure}
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
\begin{table}[ht]
|
|
||||||
\caption{FMEA Calculations} % title of Table
|
|
||||||
%\centering % used for centering table
|
|
||||||
\begin{tabular}{|| l | l | c | c | l ||} \hline
|
|
||||||
\textbf{Failure Mode} & \textbf{P} & \textbf{Cost} & \textbf{Symptom} & \textbf{RPN} \\ \hline \hline
|
|
||||||
relay 1 n/c & $1*10^{-5}$ & 38.0 & indicators fail & 0.00038 \\ \hline
|
|
||||||
relay 2 n/c & $1*10^{-5}$ & 98.0 & doorlocks fail & 0.00098 \\ \hline
|
|
||||||
rear end crash & $14.4*10^{-6}$ & 267,700 & fatal fire & 3.855 \\
|
|
||||||
ruptured f.tank & & & allow & \\ \hline
|
|
||||||
|
|
||||||
rear end crash & $1$ & $11$ & recall & 11.0 \\
|
|
||||||
ruptured f.tank & & & fix tank & \\ \hline
|
|
||||||
|
|
||||||
\hline
|
|
||||||
\end{tabular}
|
|
||||||
\end{table}
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
% don't think this is relevant for the thesis: http://www.youtube.com/watch?v=rcNeorjXMrE
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
\section{FMECA - Failure Modes Effects and Criticality Analysis}
|
\section{FMECA - Failure Modes Effects and Criticality Analysis}
|
||||||
|
|
||||||
|
\subsection{ FMECA - Failure Modes Effects and Criticality Analysis}
|
||||||
|
% \begin{figure}
|
||||||
|
% \centering
|
||||||
\subsection{ FMECA - Failure Modes Effects and Criticallity Analysis}
|
% %\includegraphics[width=100pt]{./military-aircraft-desktop-computer-wallpaper-missile-launch.jpg}
|
||||||
\begin{figure}
|
% \includegraphics[width=300pt]{./CH2_FMEA/A10_thunderbolt.jpg}
|
||||||
\centering
|
% % military-aircraft-desktop-computer-wallpaper-missile-launch.jpg: 1024x768 pixel, 300dpi, 8.67x6.50 cm, bb=0 0 246 184
|
||||||
%\includegraphics[width=100pt]{./military-aircraft-desktop-computer-wallpaper-missile-launch.jpg}
|
% \caption{A10 Thunderbolt}
|
||||||
\includegraphics[width=300pt]{./CH2_FMEA/A10_thunderbolt.jpg}
|
% \label{fig:f16missile}
|
||||||
% military-aircraft-desktop-computer-wallpaper-missile-launch.jpg: 1024x768 pixel, 300dpi, 8.67x6.50 cm, bb=0 0 246 184
|
% \end{figure}
|
||||||
\caption{A10 Thunderbolt}
|
|
||||||
\label{fig:f16missile}
|
|
||||||
\end{figure}
|
|
||||||
Emphasis on determining criticality of failure.
|
Emphasis on determining criticality of failure.
|
||||||
Applies some Bayesian statistics (probabilities of component failures and those thereby causing given system level failures).
|
Applies some Bayesian statistics (probabilities of component failures and those thereby causing given system level failures).
|
||||||
|
|
||||||
@ -538,7 +465,7 @@ by statistically determining how frequently it can fail dangerously.
|
|||||||
|
|
||||||
|
|
||||||
\subsection{ FMEDA - Failure Modes Effects and Diagnostic Analysis}
|
\subsection{ FMEDA - Failure Modes Effects and Diagnostic Analysis}
|
||||||
{
|
|
||||||
\begin{table}[ht]
|
\begin{table}[ht]
|
||||||
\caption{FMEA Calculations} % title of Table
|
\caption{FMEA Calculations} % title of Table
|
||||||
%\centering % used for centering table
|
%\centering % used for centering table
|
||||||
@ -612,7 +539,36 @@ judged to be in critical sections of the product.
|
|||||||
|
|
||||||
|
|
||||||
|
|
||||||
\section{Software FMEA (SFMEA)}
|
\section{Literature Review}
|
||||||
|
|
||||||
|
%% FOCUS
|
||||||
|
The focus of this literature review is to establish the practice and applications
|
||||||
|
of FMEA, and to examine its strengths and weaknesses.
|
||||||
|
%% GOAL
|
||||||
|
Its
|
||||||
|
goal is to identify central issues and to criticise and assess the current
|
||||||
|
FMEA methodologies.
|
||||||
|
%% PERSPECTIVE
|
||||||
|
The perspective of the author is that of a practitioner of static failure mode analysis techniques
|
||||||
|
concerning approval of products
|
||||||
|
to European safety standards, both the prescriptive~\cite{en298,en230} and statistical~\cite{en61508}.
|
||||||
|
A second perspective is that of a software engineer trained to use formal methods.
|
||||||
|
Examining FMEA methodologies for mathematical properties, influenced by
|
||||||
|
formal methods applied to software, should provide an angle not traditionally considered.
|
||||||
|
%% COVERAGE
|
||||||
|
The literature reviewed has been restricted to published books, European safety standards (as examples
|
||||||
|
of current safety measures applied), and traditional research, from journal and conference papers.
|
||||||
|
%% ORGANISATION
|
||||||
|
The review is organised by concept, that is, FMEA can be applied to hardware, software, software~interfacing and
|
||||||
|
to multiple failure scenarios etc. Methodologies related to FMEA are briefly covered for the sake of context.
|
||||||
|
%% AUDIENCE
|
||||||
|
% Well duh! PhD supervisors and examiners....
|
||||||
|
|
||||||
|
\subsection{Related Methodologies}
|
||||||
|
FTA --- HAZOP --- ALARP --- Event Tree Analysis --- bow tie concept
|
||||||
|
\subsection{Hardware FMEA (HFMEA)}
|
||||||
|
\subsection{Multiple Failure scenarios and FMEA}
|
||||||
|
\subsection{Software FMEA (SFMEA)}
|
||||||
|
|
||||||
\paragraph{Current work on Software FMEA}
|
\paragraph{Current work on Software FMEA}
|
||||||
|
|
||||||
@ -635,7 +591,7 @@ would give a better picture of the failure mode behaviour, it
|
|||||||
is by no means a rigorous approach to tracing errors that may occur in hardware
|
is by no means a rigorous approach to tracing errors that may occur in hardware
|
||||||
through to the top (and therefore ultimately controlling) layer of software.
|
through to the top (and therefore ultimately controlling) layer of software.
|
||||||
|
|
||||||
\subsection{Current FMEA techniques are not suitable for software}
|
\paragraph{Current FMEA techniques are not suitable for software}
|
||||||
|
|
||||||
The main FMEA methodologies are all based on the concept of taking
|
The main FMEA methodologies are all based on the concept of taking
|
||||||
base component {\fms}, and translating them into system level events/failures~\cite{sfmea,sfmeaa}.
|
base component {\fms}, and translating them into system level events/failures~\cite{sfmea,sfmeaa}.
|
||||||
@ -659,468 +615,29 @@ external influences such as
|
|||||||
ionising radiation causing bits to be erroneously altered.
|
ionising radiation causing bits to be erroneously altered.
|
||||||
|
|
||||||
|
|
||||||
\paragraph{A more-complete Failure Mode Model}
|
|
||||||
|
|
||||||
% HFMEA
|
|
||||||
% SFMEA
|
|
||||||
% VARIABLE CURRUPTION
|
|
||||||
% MICRO PROCESSOR FAULTS
|
|
||||||
% INTERFACE ANALYSIS
|
|
||||||
%
|
|
||||||
% add them all together --- a load of bollocks, lots of impressive inches of reports that no one will be bothered to read....
|
|
||||||
%
|
|
||||||
In order to obtain a more complete failure mode model of
|
|
||||||
a hybrid electronic/software system we need to analyse
|
|
||||||
the hardware, the software, the hardware the software runs on (i.e. the software's medium),
|
|
||||||
and the software/hardware interface.
|
|
||||||
%
|
|
||||||
HFMEA is a well established technique and needs no further description in this paper.
|
|
||||||
|
|
||||||
\section{Example for analysis} % : How can we apply FMEA}
|
|
||||||
|
|
||||||
For the purpose of example, we chose a simple common safety critical industrial circuit
|
|
||||||
that is nearly always used in conjunction with a programmatic element.
|
|
||||||
A common method for delivering a quantitative value in analogue electronics is
|
|
||||||
to supply a current signal to represent the value to be sent~\cite{aoe}[p.934].
|
|
||||||
Usually, $4mA$ represents a zero or starting value and $20mA$ represents the full scale,
|
|
||||||
and this is referred to as {\ft} signalling.
|
|
||||||
%
|
|
||||||
{\ft} has an electrical advantage as well because the current in an electronic loop is constant~\cite{aoe}[p.20].
|
|
||||||
Thus resistance in the wires between the source and the receiving end is not an issue
|
|
||||||
that can alter the accuracy of the signal.
|
|
||||||
%
|
|
||||||
This circuit has many advantages for safety. If the signal becomes disconnected
|
|
||||||
it reads an out of range $0mA$ at the receiving end. This is outside the {\ft} range,
|
|
||||||
and is therefore easy to detect as an error rather than an incorrect value.
|
|
||||||
%
|
|
||||||
Should the driving electronics go wrong at the source end, it will usually
|
|
||||||
supply far too little or far too much current, making an error condition easy to detect.
|
|
||||||
%
|
|
||||||
At the receiving end, one needs a resistor to convert the
|
|
||||||
current signal into a voltage that we can read with an ADC.%
|
|
||||||
%we only require one simple component to convert the
|
|
||||||
|
|
||||||
|
|
||||||
%BLOCK DIAGRAM HERE WITH FT CIRCUIT LOOP
|
|
||||||
|
|
||||||
\begin{figure}[h]
|
|
||||||
\centering
|
|
||||||
\includegraphics[width=250pt]{./CH2_FMEA/ftcontext.png}
|
|
||||||
% ftcontext.png: 767x385 pixel, 72dpi, 27.06x13.58 cm, bb=0 0 767 385
|
|
||||||
\caption{Context Diagram for {\ft} loop}
|
|
||||||
\label{fig:ftcontext}
|
|
||||||
\end{figure}
|
|
||||||
|
|
||||||
|
|
||||||
The diagram in figure~\ref{fig:ftcontext} shows some equipment which is sending a {\ft}
|
|
||||||
signal to a micro-controller system.
|
|
||||||
The signal is locally driven over a load resistor, and then read into the micro-controller via
|
|
||||||
an ADC and its multiplexer.
|
|
||||||
From the voltage detected at the ADC we read the intended quantitative
|
|
||||||
value from the external equipment.
|
|
||||||
|
|
||||||
\subsection{Simple Software Example}
|
|
||||||
|
|
||||||
|
|
||||||
Consider a software function that reads a {\ft} input, and returns a value between 0 and 999 (i.e. per mil $\permil$)
|
|
||||||
representing the value intended by the current detected, with an additional error indication flag to indicate the validity
|
|
||||||
of the value returned.
|
|
||||||
%
|
|
||||||
Let us assume the {\ft} detection is via a \ohms{220} resistor, and that we read a voltage
|
|
||||||
from an ADC into the software.
|
|
||||||
Let us define any value outside the 4mA to 20mA range as an error condition.
|
|
||||||
%
|
|
||||||
We use Ohm's law~\cite{aoe}, $V=IR$, to determine the voltage range: $$0.004A * \ohms{220} = 0.88V $$
|
|
||||||
and $$0.020A * \ohms{220} = 4.4V \;.$$
|
|
||||||
%
|
|
||||||
Our acceptable voltage range is therefore
|
|
||||||
%
|
|
||||||
$$(V \ge 0.88) \wedge (V \le 4.4) \; .$$
|
|
||||||
|
|
||||||
This voltage range forms our input requirement.
|
|
||||||
%
|
|
||||||
We can now examine a software function that performs a conversion from the voltage read to
|
|
||||||
a per~mil representation of the {\ft} input current.
|
|
||||||
%
|
|
||||||
For the purpose of example the `C' programming language~\cite{DBLP:books/ph/KernighanR88} is
|
|
||||||
used\footnote{C coding examples use the MISRA~\cite{misra} and SIL-3 recommended language constraints~\cite{en61508}.}.
|
|
||||||
We initially assume a function \textbf{read\_ADC} which returns a floating point %double precision
|
|
||||||
value representing the voltage read (see code sample in figure~\ref{fig:code_read_4_20_input}).
|
|
||||||
|
|
||||||
|
|
||||||
%%{\vbox{
|
|
||||||
\begin{figure}[h]
|
|
||||||
|
|
||||||
\footnotesize
|
|
||||||
\begin{verbatim}
|
|
||||||
/***********************************************/
|
|
||||||
/* read_4_20_input() */
|
|
||||||
/***********************************************/
|
|
||||||
/* Software function to read 4mA to 20mA input */
|
|
||||||
/* returns a value from 0-999 proportional */
|
|
||||||
/* to the current input. */
|
|
||||||
/***********************************************/
|
|
||||||
int read_4_20_input ( int * value ) {
|
|
||||||
double input_volts;
|
|
||||||
int error_flag;
|
|
||||||
|
|
||||||
/* set ADC MUX with input to read from */
|
|
||||||
input_volts = read_ADC(INPUT_4_20_mA);
|
|
||||||
|
|
||||||
if ( input_volts < 0.88 || input_volts > 4.4 ) {
|
|
||||||
error_flag = 1; /* Error flag set to TRUE */
|
|
||||||
}
|
|
||||||
else {
|
|
||||||
*value = (input_volts - 0.88) / ( 4.4 - 0.88 ) * 999.0;
|
|
||||||
error_flag = 0; /* indicate current input in range */
|
|
||||||
}
|
|
||||||
/* ensure: value is proportional (0-999) to the
|
|
||||||
4 to 20mA input */
|
|
||||||
return error_flag;
|
|
||||||
}
|
|
||||||
\end{verbatim}
|
|
||||||
%}
|
|
||||||
%}
|
|
||||||
|
|
||||||
\caption{Software Function: \textbf{read\_4\_20\_input}}
|
|
||||||
\label{fig:code_read_4_20_input}
|
|
||||||
%\label{fig:420i}
|
|
||||||
\end{figure}
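
The scaling step can be examined in isolation from the hardware. The following is a minimal sketch, assuming a linear map in which $0.88V$ corresponds to 0 and $4.4V$ corresponds to 999; the function name \textbf{volts\_to\_permil} and the two constants are hypothetical and are introduced only so that the arithmetic can be exercised without an ADC.

```c
#include <stddef.h>

#define FT_VOLTS_MIN 0.88 /* assumed: 4mA across a 220 ohm resistor  */
#define FT_VOLTS_MAX 4.4  /* assumed: 20mA across a 220 ohm resistor */

/* Hypothetical helper: converts a voltage to a per mil value (0-999).
 * Returns 1 (error) if the voltage is outside the 4mA to 20mA range
 * or if no destination is supplied, otherwise returns 0. */
int volts_to_permil(double input_volts, int *value)
{
    int error_flag;

    if (value == NULL) {
        error_flag = 1; /* nowhere to store the result */
    }
    else if (input_volts < FT_VOLTS_MIN || input_volts > FT_VOLTS_MAX) {
        error_flag = 1; /* out of range: treat as an error */
    }
    else {
        /* offset from the bottom of the range, divided by the span,
           scaled to 0-999; the cast truncates towards zero */
        *value = (int)((input_volts - FT_VOLTS_MIN)
                       / (FT_VOLTS_MAX - FT_VOLTS_MIN) * 999.0);
        error_flag = 0;
    }
    return error_flag;
}
```

With these assumptions a mid-scale input of 12mA, which is $2.64V$ across the resistor, gives $499.5$ before truncation and is therefore returned as 499.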
|
|
||||||
|
|
||||||
We now look at the function called by \textbf{read\_4\_20\_input}, \textbf{read\_ADC}, which returns a
|
|
||||||
voltage for a given ADC channel.
|
|
||||||
%
|
|
||||||
This function
|
|
||||||
deals directly with the hardware in the micro-controller on which we are running the software.
|
|
||||||
%
|
|
||||||
Its job is to select the correct channel (ADC multiplexer) and then to initiate a
|
|
||||||
conversion by setting an ADC 'go' bit (see code sample in figure~\ref{fig:code_read_ADC}).
|
|
||||||
%
|
|
||||||
It takes the raw ADC reading and converts it into a
|
|
||||||
floating point\footnote{the type `double' or `double precision' is a
|
|
||||||
standard C language floating point type~\cite{DBLP:books/ph/KernighanR88}.}
|
|
||||||
voltage value.
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
%{\vbox{
|
|
||||||
\begin{figure}[h]
|
|
||||||
|
|
||||||
\footnotesize
|
|
||||||
\begin{verbatim}
|
|
||||||
/***********************************************/
|
|
||||||
/* read_ADC() */
|
|
||||||
/***********************************************/
|
|
||||||
/* Software function to read voltage from a */
|
|
||||||
/* specified ADC MUX channel */
|
|
||||||
/* Assume 10 ADC MUX channels 0..9 */
|
|
||||||
/* ADC_CHAN_RANGE = 9 */
|
|
||||||
/* Assume ADC is 12 bit and ADCRANGE = 4096 */
|
|
||||||
/* returns voltage read as double precision */
|
|
||||||
/***********************************************/
|
|
||||||
double read_ADC( int channel ) {
|
|
||||||
int timeout = 0;
double dval;
|
|
||||||
|
|
||||||
/* return out of range result */
|
|
||||||
/* if invalid channel selected */
|
|
||||||
if ( channel < 0 || channel > ADC_CHAN_RANGE )
|
|
||||||
return -2.0;
|
|
||||||
/* set the multiplexer to the desired channel */
|
|
||||||
ADCMUX = channel;
|
|
||||||
ADCGO = 1; /* initiate ADC conversion hardware */
|
|
||||||
/* wait for ADC conversion with timeout */
|
|
||||||
while ( ADCGO == 1 && timeout < 100 )
|
|
||||||
timeout++;
|
|
||||||
if ( timeout < 100 )
|
|
||||||
dval = (double) ADCOUT * 5.0 / ADCRANGE;
|
|
||||||
else
|
|
||||||
dval = -1.0; /* indicate invalid reading */
|
|
||||||
/* return voltage as a floating point value */
|
|
||||||
/* ensure: value is voltage input to within 0.1% */
|
|
||||||
return dval;
|
|
||||||
}
|
|
||||||
\end{verbatim}
|
|
||||||
\caption{Software Function: \textbf{read\_ADC}}
|
|
||||||
\label{fig:code_read_ADC}
|
|
||||||
\end{figure}
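
As a rough worked example of this conversion, assuming the 12 bit ADC with $ADCRANGE = 4096$ and the $5V$ full scale used in the listing, a raw reading $ADCOUT$ corresponds to the voltage
$$ V = ADCOUT \times \frac{5.0}{4096} \; . $$
One count is then about $1.22mV$, and the $0.88V$ and $4.4V$ limits of the {\ft} range fall at raw readings of approximately $721$ and $3604$ counts.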
|
|
||||||
%}
|
|
||||||
%}
|
|
||||||
|
|
||||||
|
|
||||||
We now have a very simple software structure, a call tree, where {\em read\_4\_20\_input}
|
|
||||||
calls {\em read\_ADC}, which in turn interacts with the hardware/electronics.
|
|
||||||
%shown in figure~\ref{fig:ct1}.
|
|
||||||
%
|
|
||||||
% \begin{figure}[h]
|
|
||||||
% \centering
|
|
||||||
% \includegraphics[width=56pt]{./ct1.png}
|
|
||||||
% % ct1.png: 151x224 pixel, 72dpi, 5.33x7.90 cm, bb=0 0 151 224
|
|
||||||
% \caption{Call tree for software example}
|
|
||||||
% \label{fig:ct1}
|
|
||||||
% \end{figure}
|
|
||||||
%
|
|
||||||
This software is above the hardware in the conceptual call tree---from a programmatic perspective---%in software terms---the
|
|
||||||
software is reading values from the `lower~level' electronics.
|
|
||||||
%
|
|
||||||
%FMEA is always a bottom-up process and so we must begin with this hardware.
|
|
||||||
%
|
|
||||||
The hardware is simply a load resistor, connected across an ADC input
|
|
||||||
pin on the micro-controller and ground.
|
|
||||||
%
|
|
||||||
We can identify the resistor and the ADC module of the micro-controller as
|
|
||||||
the base components in this design.
|
|
||||||
%
|
|
||||||
We now apply FMMD starting with the hardware.
|
|
||||||
|
|
||||||
|
|
||||||
\section{Failure Mode Effects Analysis}
|
|
||||||
|
|
||||||
Four emerging and current techniques are now used to
|
|
||||||
apply FMEA to the hardware, the software, the software medium and the software/hardware interface.
|
|
||||||
|
|
||||||
\subsection{Hardware FMEA}
|
|
||||||
|
|
||||||
The hardware FMEA requires that for each component we consider all failure modes
|
|
||||||
and the putative effect those failure modes would have on the system.
|
|
||||||
The electronic components in our {\ft} system are the load resistor,
|
|
||||||
the multiplexer and the analogue to digital converter.
|
|
||||||
|
|
||||||
{
|
|
||||||
\tiny
|
|
||||||
\begin{table}[h]
|
|
||||||
\caption{Hardware FMEA {\ft}} % title of Table
|
|
||||||
\label{tbl:r420i}
|
|
||||||
|
|
||||||
\begin{tabular}{|| l | c | l ||} \hline
|
|
||||||
\textbf{Failure} & \textbf{failure} & \textbf{System} \\
|
|
||||||
\textbf{Scenario} & \textbf{effect} & \textbf{Failure} \\ \hline
|
|
||||||
\hline
|
|
||||||
$R$ & OPEN~\cite{en298}[Ann.A] & $LOW$ \\
|
|
||||||
& & $READING$ \\ \hline
|
|
||||||
|
|
||||||
$R$ & SHORT~\cite{en298}[Ann.A] & $HIGH$ \\
|
|
||||||
& & $READING$ \\ \hline
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
$MUX$ & read wrong & $VAL\_ERROR$ \\
|
|
||||||
& input ~\cite{fmd91}[3-102] & \\ \hline
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
$ADC$ & ADC output & $VAL\_ERROR$ \\
|
|
||||||
& erroneous ~\cite{fmd91}[3-109] & \\ \hline
|
|
||||||
\hline
|
|
||||||
\end{tabular}
|
|
||||||
\end{table}
|
|
||||||
}
|
|
||||||
|
|
||||||
The last two failures both lead to the system failure of $VAL\_ERROR$ .
|
|
||||||
They could lead to low or high reading as well, but we would only be able to determine this
|
|
||||||
from knowledge of the software system's criteria for these.
|
|
||||||
\clearpage
|
|
||||||
\subsection{Software FMEA - variables in place of components}
|
|
||||||
|
|
||||||
For software FMEA, we take the variables used by the system,
|
|
||||||
and examine what could happen if they are corrupted in various ways~\cite{procsfmea, embedsfmea}.
|
|
||||||
From the function $read\_4\_20\_input()$ we have the variables $error\_flag$,
|
|
||||||
$input\_volts$ and $value$: from the function $read\_ADC()$, $timeout$, $ADCMUX$, $ADCGO$, $dval$.
|
|
||||||
We must now determine putative system failure modes for these variables becoming corrupted; this is performed in table~\ref{tbl:sfmea}.
|
|
||||||
|
|
||||||
|
|
||||||
{
|
|
||||||
\tiny
|
|
||||||
\begin{table}[h]
|
|
||||||
\caption{SFMEA {\ft}} % title of Table
|
|
||||||
\label{tbl:sfmea}
|
|
||||||
|
|
||||||
\begin{tabular}{|| l | c | l ||} \hline
|
|
||||||
\textbf{Failure} & \textbf{failure} & \textbf{System} \\
|
|
||||||
\textbf{Scenario} & \textbf{effect} & \textbf{Failure} \\ \hline
|
|
||||||
\hline
|
|
||||||
$error\_flag$ & set FALSE & $VAL\_ERROR$ \\
|
|
||||||
& & \\ \hline
|
|
||||||
|
|
||||||
$error\_flag$ & set TRUE & invalid \\
|
|
||||||
& & error flag \\ \hline
|
|
||||||
|
|
||||||
$input\_volts$ & corrupted & $VAL\_ERROR$ \\
|
|
||||||
& & \\ \hline
|
|
||||||
|
|
||||||
|
|
||||||
$value $ & corrupted & $VAL\_ERROR$ \\
|
|
||||||
& & \\ \hline
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
$timeout $ & corrupted & $VAL\_ERROR$ \\
|
|
||||||
& & \\ \hline
|
|
||||||
|
|
||||||
|
|
||||||
$ADCMUX $ & corrupted & $VAL\_ERROR$ \\
|
|
||||||
& & \\ \hline
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
$ADCGO $ & corrupted & $VAL\_ERROR$ \\
|
|
||||||
& & \\ \hline
|
|
||||||
|
|
||||||
$dval $ & corrupted & $VAL\_ERROR$ \\
|
|
||||||
& & \\ \hline
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
\hline
|
|
||||||
\end{tabular}
|
|
||||||
\end{table}
|
|
||||||
}
|
|
||||||
\clearpage
|
|
||||||
\subsection{Software FMEA - failure modes of the medium ($\mu P$) of the software}
|
|
||||||
|
|
||||||
Microprocessors and microcontrollers have sets of known failure modes; these include RAM, ROM and
EEPROM failure\footnote{EEPROM failure is not applicable for this example.} and
oscillator clock timing errors.
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
{
|
|
||||||
\tiny
|
|
||||||
\begin{table}[h+]
|
|
||||||
\caption{SFMEA of the software medium ($\mu P$) {\ft}} % title of Table
|
|
||||||
\label{tbl:sfmeaup}
|
|
||||||
|
|
||||||
\begin{tabular}{|| l | c | l ||} \hline
|
|
||||||
\textbf{Failure} & \textbf{Failure} & \textbf{System} \\
|
|
||||||
\textbf{Scenario} & \textbf{effect} & \textbf{Failure} \\ \hline
|
|
||||||
\hline
|
|
||||||
$RAM$ & variable & All errors \\
|
|
||||||
& corruption & from table~\ref{tbl:sfmea} \\ \hline
|
|
||||||
|
|
||||||
$RAM$ & program flow & process \\
|
|
||||||
& & halts / crashes \\ \hline
|
|
||||||
|
|
||||||
$OSC$ & stopped & process \\
|
|
||||||
& & halts \\ \hline
|
|
||||||
|
|
||||||
$OSC$ & too & ADC \\
|
|
||||||
& fast & value errors \\ \hline
|
|
||||||
|
|
||||||
$OSC$ & too & ADC \\
|
|
||||||
& slow & value errors \\ \hline
|
|
||||||
|
|
||||||
$ROM$ & program & All errors \\
|
|
||||||
& corruption & from table~\ref{tbl:sfmea} \\ \hline
|
|
||||||
|
|
||||||
$ROM$ & constant & All errors \\
|
|
||||||
& /data corruption & from table~\ref{tbl:sfmea} \\ \hline
|
|
||||||
|
|
||||||
\hline
|
|
||||||
\end{tabular}
|
|
||||||
\end{table}
|
|
||||||
}
|
|
||||||
|
|
||||||
\clearpage
|
|
||||||
\subsection{Software FMEA - The software/hardware interface}
|
|
||||||
|
|
||||||
As FMEA is applied separately to software and hardware,
the interface between them is an undefined factor.
|
|
||||||
Ozarin~\cite{sfmeainterface,procsfmea} recommends that an FMEA report be written
|
|
||||||
to focus on the software/hardware interface.
|
|
||||||
The software/hardware interface has
|
|
||||||
specific problems common to many systems and configurations
|
|
||||||
and these are described in~\cite{sfmeainterface}.
|
|
||||||
%An interface FMEA is performed in table~\ref{hwswinterface}.
|
|
||||||
%
|
|
||||||
The hardware to software interface for the {\ft} example is handled
|
|
||||||
by the `C' function $read\_ADC()$
|
|
||||||
(see code sample in figure~\ref{fig:code_read_ADC}).
|
|
||||||
%
|
|
||||||
% An FMEA of the `software~medium' is given in table~\ref{tbl:sfmeaup}.
|
|
||||||
\paragraph{Timing and Synchronisation.}
|
|
||||||
The $ADCOUT$ register, from which the raw ADC value is read,
is an internal register used by the ADC and presented
as a readable memory location when the ADC
has finished updating it.
|
|
||||||
Reading it at the wrong time would
|
|
||||||
cause an invalid value to be read.
|
|
||||||
The synchronisation is performed by polling an $ADCGO$
|
|
||||||
bit, a flag mapped to memory by which the ADC indicates that the data is ready.
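
The polling described above can be sketched in `C' as follows.
This is a minimal illustration and not the $read\_ADC()$ listing itself:
the function name $adc\_wait\_and\_read()$, the poll limit $max\_polls$, the
plain variables standing in for the memory mapped registers, and the polarity
of $ADCGO$ (taken here as 1 while converting and 0 when the data is ready)
are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the memory-mapped ADC registers. On a real
 * target these names would come from the device header instead. */
volatile uint8_t  ADCGO  = 0; /* ASSUMED polarity: 1 = busy, 0 = data ready */
volatile uint16_t ADCOUT = 0; /* raw conversion result */

/* Poll ADCGO at most max_polls times. Returns true and stores the raw
 * value only when the ADC reports that ADCOUT is stable; returns false
 * (leaving *raw untouched) if the flag never clears. */
bool adc_wait_and_read(uint16_t max_polls, uint16_t *raw)
{
    assert(raw != NULL);
    while (max_polls-- > 0u) {
        if (ADCGO == 0u) {
            *raw = ADCOUT; /* safe: the ADC has finished updating it */
            return true;
        }
    }
    return false;          /* timed out: the caller should flag an error */
}
```

Bounding the loop means that a stuck $ADCGO$ bit shows up as a reported failure
rather than as a process that halts.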
|
|
||||||
|
|
||||||
\paragraph{Interrupt Contention.}
|
|
||||||
Were an interrupt to also attempt to read from the ADC,
the $ADCMUX$ could be altered, causing the non-interrupt
|
|
||||||
routine to read from the wrong channel.
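
The contention can be illustrated with a small off-target simulation.
Everything in this sketch is hypothetical: the channel readings are made up,
$simulated\_isr()$ stands in for an interrupt handler that re-points the
multiplexer, and masking interrupts around the read is only one common
mitigation, not necessarily the one used in the {\ft} software.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ADC_CHANNELS 4u

volatile uint8_t ADCMUX = 0;      /* hypothetical channel-select register */
static bool irq_enabled = true;   /* models the global interrupt mask */
/* Made-up readings, one per channel, so that a wrong channel is visible. */
static const uint16_t channel_value[ADC_CHANNELS] = {100, 200, 300, 400};

static void disable_interrupts(void) { irq_enabled = false; }
static void enable_interrupts(void)  { irq_enabled = true; }

/* Models an interrupt handler that also uses the ADC: if interrupts are
 * enabled when it fires, it switches the multiplexer to channel 3. */
static void simulated_isr(void)
{
    if (irq_enabled) {
        ADCMUX = 3;
    }
}

/* Unprotected read: the interrupt fires between select and sample. */
uint16_t read_channel_unprotected(uint8_t channel)
{
    assert(channel < ADC_CHANNELS);
    ADCMUX = channel;
    simulated_isr();              /* arrives at the worst possible moment */
    return channel_value[ADCMUX]; /* samples whatever channel is selected */
}

/* Protected read: select and sample form one critical section. A real
 * implementation would restore the previous mask state rather than
 * unconditionally re-enabling interrupts. */
uint16_t read_channel_protected(uint8_t channel)
{
    uint16_t value;
    assert(channel < ADC_CHANNELS);
    disable_interrupts();
    ADCMUX = channel;
    simulated_isr();              /* masked, so ADCMUX is left alone */
    value = channel_value[ADCMUX];
    enable_interrupts();
    return value;
}
```

In this simulation a request for channel 1 returns the channel 3 reading when
unprotected, and the channel 1 reading when the critical section is used.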
|
|
||||||
|
|
||||||
\paragraph{Data Formatting.}
|
|
||||||
The ADC may use a big-endian or little-endian integer
|
|
||||||
format. It may also right or left justify the bits in its value.
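
Both conventions can be made explicit in the code that converts the register
contents into a count. The sketch below assumes a 10 bit converter whose
result is delivered in two bytes; the width, the type $adc\_justify\_t$ and the
function names are illustrative assumptions, not details of the {\ft} design.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum { ADC_RIGHT_JUSTIFIED, ADC_LEFT_JUSTIFIED } adc_justify_t;

/* Assemble a 16-bit word from two bytes in memory order. */
uint16_t adc_word_from_memory(const uint8_t bytes[2], bool big_endian)
{
    return big_endian ? (uint16_t)((bytes[0] << 8) | bytes[1])
                      : (uint16_t)((bytes[1] << 8) | bytes[0]);
}

/* Extract an ASSUMED 10-bit result from the 16-bit word. */
uint16_t adc_counts(uint16_t word, adc_justify_t justify)
{
    assert(justify == ADC_RIGHT_JUSTIFIED || justify == ADC_LEFT_JUSTIFIED);
    if (justify == ADC_LEFT_JUSTIFIED) {
        word >>= 6;                   /* result occupies bits 15..6 */
    }
    return (uint16_t)(word & 0x03FFu); /* keep the low 10 bits */
}
```

Applying the wrong byte order or justification still yields a number in the
valid range, which is why a mismatch here would not necessarily be caught by
a range check.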
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
\subsection{SFMEA Conclusion}
|
|
||||||
%
|
|
||||||
This paper has picked a very simple example (the industry standard {\ft}
|
|
||||||
input circuit and software) to demonstrate
|
|
||||||
SFMEA and HFMEA methodologies used to describe a failure mode model.
|
|
||||||
%Even a modest system would be far too large to analyse in conference paper
|
|
||||||
%and this
|
|
||||||
%
|
|
||||||
%The {\dc} representing the {\ft} reader
|
|
||||||
%shows that by taking a
|
|
||||||
%modular approach for FMEA, i.e. FMMD, we can integrate
|
|
||||||
Our model is described by four FMEA reports, and these % we can model the failure mode behaviour from
|
|
||||||
model the system from several failure mode perspectives.
|
|
||||||
%
|
|
||||||
With traditional FMEA methods the reasoning~distance is large, because
|
|
||||||
it stretches from the component failure mode to the top (or system) level failure.
|
|
||||||
%
|
|
||||||
With these four analysis reports
|
|
||||||
we do not have stages along the `reasoning~path' linking the failure modes from the
|
|
||||||
electronics to those in the software.
|
|
||||||
%Software is often written `defensively' but t
|
|
||||||
%Each {\fg} to {\dc} transition represents a
|
|
||||||
%reasoning stage.
|
|
||||||
%
|
|
||||||
%
|
|
||||||
%For this reason applying traditional FMEA to software stretches
|
|
||||||
%the reasoning distance even further.
|
|
||||||
%
|
|
||||||
In fact, many of these reasoning paths overlap, or even bypass one another, so
it is very difficult to gauge cause and effect.
|
|
||||||
For instance, hardware failures are not analysed in the context of how they will
|
|
||||||
be handled (or missed) by the software.
|
|
||||||
%
|
|
||||||
Similarly, system outputs commanded from software may not take particular
hardware limitations into account.
|
|
||||||
|
|
||||||
The interface FMEA does serve to provide a useful
|
|
||||||
check-list to ensure data and synchronisation conventions used by the hardware
|
|
||||||
and software are not mismatched. However, the fact that it is perceived as required %The fact its required
highlights the mismatches possible between the two types of analysis,
|
|
||||||
which could run deeper than the mere interface level.
|
|
||||||
|
|
||||||
While these techniques ensure that the software and hardware are
viewed and analysed from several perspectives, the result cannot be termed a homogeneous
failure mode model.
|
|
||||||
% For instance
|
|
||||||
% were the ADC to have a small value error, say adding
|
|
||||||
% a small percentage onto the value, we would be unable to
|
|
||||||
% detect this under the analysis conditions for this model, or
|
|
||||||
% be able to pinpoint it.
|
|
||||||
%
|
%
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
\section{Conclusion}
|
||||||
|
|
||||||
|
\paragraph{Where FMEA is now}
|
||||||
FMEA is a useful tool for basic safety: it provides statistics on safety where field data is impractical,
and it is very good with single failure modes linked to top level events.
FMEA has become part of the safety critical and safety certification industries.
|
||||||
|
%
|
||||||
SFMEA is in its infancy, and there is a gap in current
certification for software. EN61508~\cite{en61508} recommends hardware redundancy architectures in conjunction
with FMEDA for hardware; for software it recommends language constraints and quality procedures,
but no inductive fault finding technique.
|
||||||
|
|
||||||
|
FMEA has adapted from a cost saving exercise for mass produced items to incorporating statistical techniques
(FMECA) and allowing for self diagnostic mitigation (FMEDA).
However, it is still based on the single component failure mapped to system level failure.
All these FMEA based methodologies have the following shortcomings:
|
||||||
|
\begin{itemize}
|
||||||
|
\item Impossible to integrate software and hardware models,
\item State explosion problem exacerbated by increasing complexity due to density of modern electronics,
\item Impossible to consider all multiple component failure modes.
|
||||||
|
\end{itemize}
|
||||||
|