\paragraph{Abstract}{
The ability to assess the safety of man-made equipment has been a concern
since the dawn of the industrial age~\cite{indacc01,steamboilers}.
The philosophy behind safety measures has progressed
with time, and by World War Two~\cite{boffin} concepts such as `no single component failure should cause
a dangerous system failure' were beginning to emerge.
The principle that a double failure causing a dangerous condition is unacceptable
can be found in the legally binding European standard EN298~\cite{en298}.
More sophisticated, statistically based standards, such as EN61508~\cite{en61508} and variants thereof,
govern failure conditions and determine the risk levels associated with systems.

All of these risk assessment techniques are variations on the theme of
Failure Mode Effects Analysis (FMEA), which has its roots in the mass production industry of the 1940s
and was designed to save large companies money by fixing the most financially
draining problems in a product first.

This thesis shows that the refinements and additions made to
FMEA to tailor it for military or statistical commercial use have common flaws
which make the resulting variants unsuitable for the higher safety requirements of the 21st century.
Problems with state explosion in failure mode reasoning and the impossibility
of integrating software and hardware failure mode models are the most obvious of these. %flaws.
The methodologies are explained in chapter~\ref{sec:chap2} and the advantages and drawbacks
of each FMEA variant are examined in chapter~\ref{sec:chap3}.
In chapter~\ref{sec:chap4}, a new methodology is then proposed which addresses the state explosion problem
and, using contract programmed software, allows the modelling of integrated
software/electrical systems.
This is followed by two chapters showing examples of the new modular FMEA technique, Failure Mode Modular De-Composition (FMMD),
looking first at electronic circuits and then at electronic/software hybrid systems.
}

\section{Introduction}
The motivation for this study came from two sources, one academic and the other
practical. I had recently completed an
MSc, and my project was to create an Euler/Spider Diagram editor in Java.
This editor allowed the user to draw Euler/Spider diagrams, and could then
represent these as abstract (mathematical) definitions.
At work, writing embedded `C' and assembly language code for safety critical
industrial burners, we were faced with a new and daunting requirement:
conformance to the latest European standard, EN298. It appeared to ask for the impossible.
Not only did it require the usual safety measures (self checking of ROM and RAM, watchdog processors with separate clock sources, EMC,
triple fail safe control of valves), it also had one new clause with far reaching consequences.
It stated that in the event of a failure, where the controller had gone into a `lockout~state' (a state where the controller
applies all possible safety measures to stop fuel entering the burner), it could not become dangerous should another fault occur.
In short, this meant we had to be able to deal with double failures.
Any component that could, in failing, create a dangerous state was already
documented and approved using failure mode effects analysis (FMEA). This new requirement
effectively meant that all combinations of component failures were
now required to be analysed. The state explosion problem alone
meant that this was going to be virtually impossible to perform.
%
My approach was to de-compose the problem and so contain the state explosion, following the thinking behind
the fast Fourier transform (FFT)~\cite[Ch.~8]{fpodsadsp}, which takes a complex intermeshed series of real and imaginary number calculations
and, by de-composing them, simplifies the problem.
My reasoning was that, were I to analyse the problem in small modules, from the bottom up following the FFT example, I could apply
checking for all double failure scenarios.
Once these first modules, which I now call {\fgs}, were analysed, I could determine the symptoms of failure for them.
Using the symptoms of failure, I could then treat these modules as components, now called {\dcs}, and use them to build higher level
modules. Because the number of components in each module/{\fg} was quite small, I could apply double simultaneous failure mode checking
without running into state explosion problems, and I could apply
this double checking all the way up the hierarchy. As a by-product, many multiple failures, as well as double
failures, would be analysed.
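
As a rough illustration of why working in small modules helps, the following count is a sketch under assumptions that are not taken from the analysis itself: every component, original or derived, is given a single failure mode, every module contains exactly three components, and $N = 81$ is a hypothetical system size chosen only because it divides evenly.
Analysing the whole system at once would mean examining every pair of components:
\[
\frac{N(N-1)}{2} = \frac{81 \times 80}{2} = 3240 .
\]
Working from the bottom up, each module of three components contains $\frac{3 \times 2}{2} = 3$ pairs, and the hierarchy has $27$, then $9$, then $3$, then $1$ modules, giving
\[
(27 + 9 + 3 + 1) \times 3 = 120
\]
pairs in total.
This counts only the pairs to be examined and says nothing about how hard each examination is; real components have several failure modes and a module may show several symptoms, so the figures indicate a trend rather than a measured saving.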

On the academic side, Euler/Spider Diagrams
could be used to model failure modes in components.
Contours could represent failure modes, and the spider diagram
`existential~points' could represent instances of failure modes.
By drawing a spider collecting existential points, a common failure symptom could
be determined, and from this a new diagram could be generated automatically to represent the {\dc}.
Each spider represented a derived failure mode.
These concepts were presented at the ``Euler~2004''~\cite{Clark200519} conference at Brighton University.
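
To make the mapping from diagram to derived failure modes concrete, here is a small sketch in set notation. The names $a_1$, $a_2$, $b_1$, $s_1$ and $s_2$ are hypothetical, and the notation is an assumption made for illustration, not the formalism of the conference paper.
Suppose a module has two components, $a$ with failure modes $a_1$ and $a_2$, and $b$ with failure mode $b_1$, each marked by an existential point:
\[
F = \{a_1, a_2, b_1\}, \qquad s_1 = \{a_1, b_1\}, \qquad s_2 = \{a_2\} .
\]
Each spider $s_i$ collects the points that produce the same symptom, so the {\dc} generated from this diagram would have the failure modes
\[
\{s_1, s_2\} .
\]
In this sketch every point is collected by exactly one spider, so the spiders partition $F$.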

2005 paper: the need for static analysis, because of the
high reliability of modern safety critical systems.
\section{Practical Experience: Safety Critical Product Approvals}

FMEA is performed on selected areas perceived as critical
by the test house.
Blanket measures are also applied: RAM and ROM checks, EMC, and electrical and environmental stress testing.

\subsection{Practical limitations of testing for certification vs. rigorous approach}

The state explosion problem arises when considering a failure mode of a given component against
all other components in the system: the processing resource required is of exponential order ($2^N$) rather than polynomial order (e.g.\ $N^2$).

This makes it impossible in practice to perform double simultaneous failure analysis (as demanded by EN298~\cite{en298}).
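
To give a feel for the difference between the two orders of growth, the following sketch tabulates both for a few values of $N$. The values of $N$ are hypothetical and are not drawn from any system analysed here.
\[
\begin{array}{rrr}
N & N^{2} & 2^{N} \\
\hline
10 & 100 & 1\,024 \\
20 & 400 & 1\,048\,576 \\
100 & 10\,000 & \approx 1.27 \times 10^{30}
\end{array}
\]
Even at $N = 20$ the exponential figure is several thousand times the polynomial one, and at $N = 100$ it is far beyond anything that could be checked by hand.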