\label{sec:chap1}

%\paragraph{Abstract} % : The Scope of this study.}{
{
%
Society increasingly relies on automation in everyday life.
%
Many % of the
automated systems have the potential to cause harm or even death should they fail.
%
Safety assessment and certification are now required for %of
almost all potentially dangerous equipment.
%
As part of the assessment and certification process, a battery of tests is typically
applied, examining features such as resistance to environmental extremes and Electromagnetic Compatibility (EMC),
together with endurance regimes and static testing.
%
Static testing is performed at the theoretical, or design, level and involves
examining failure scenarios and trying to predict how systems would react.
%
This thesis deals with one area of static testing, Failure Mode Effects Analysis (FMEA)~\cite{iec60812}, a commonly
used technique that is a legal requirement %mandatory
for the certification of a wide range of equipment.

The ability to assess the safety of machinery %man made equipment
has been a concern
since the dawn of the industrial age~\cite{usefulinfoengineers,steamboilers}.
%
% The philosophy behind safety measures has progressed
% over time and by World War Two we began to see concepts such as `no single component failure should cause
% a dangerous system failure'~\cite{boffin} emerging~\cite{echoesofwar}[Ch.13].
The philosophy behind safety measures has progressed
over time, and by World War Two concepts such as `no single component failure should cause
a dangerous system failure'~\cite{boffin} had emerged~\cite[Ch.13]{echoesofwar}.
%
Concepts such as these provide
objective criteria for safety assessment.
%
The `no~single~failure' concept can be extended
so that double, or even multiple, failures are also
unacceptable as the cause of dangerous states.
%
The prohibition of double failures causing a dangerous condition
can be found in the legally binding European standard EN298\footnote{EN298:2003 became
a legal requirement for all new forced draft industrial burner controllers in 2006 within
the European Union.}, which
came into force
in 2006~\cite{en298}.
%
More sophisticated standards, e.g.\ EN61508~\cite{en61508} and variants thereof,
are based on statistical thresholds for the frequency of dangerous failures.
%
For instance, an acceptable maximum number of
dangerous failures per billion hours of operation could be stated.
%
% We can then broadly categorise orders of failure rates into Safety Integrity Levels (SIL)~\cite{scsh}.
Orders of failure rates can then be broadly categorised into Safety Integrity Levels (SIL)~\cite{scsh}.
%
So for a maximum of 10 potentially dangerous failures per billion hours of operation a SIL of 4 is assigned,
for 100 a SIL of 3, and so on in powers of ten.
%
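As an illustrative restatement of this powers-of-ten scale (the symbols $n$ and $F_{\max}$ are introduced
here for the sketch and are not taken from the cited standards), if each SIL step changes the permitted
number of dangerous failures by a factor of ten, the limit for SIL~$n$ can be written as
\[
F_{\max}(n) = 10^{\,5-n} \quad \mbox{dangerous failures per } 10^{9} \mbox{ hours of operation},
\]
giving $F_{\max}(4)=10$, $F_{\max}(3)=100$, $F_{\max}(2)=1000$ and $F_{\max}(1)=10\,000$.
Expressed as an hourly rate this is $10^{\,5-n}/10^{9} = 10^{-(4+n)}$ per hour, so the SIL~4 limit of
10 failures per billion hours corresponds to $10^{-8}$ dangerous failures per hour.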
If SIL ratings can be determined,
they can be matched against given risks.
%
The more dangerous the consequences of failure,
the higher the SIL rating required. % we can demand for it.
%
A band-saw with one operative may require a SIL rating of 1,
%but something with higher potential for harm to a larger number of people,
but systems
such as nuclear power-stations or air-liners,
with far greater consequences on dangerous failure,
may require a SIL rating of 4.
%
%That is while a low incidence of failure may be tolerable on a band-saw,
%extremely low incidences of failure would be tolerable in a nuclear plant.
%SIL ratings provide another objective yardstick for the measurement of system safety.
%governing failure conditions and determining risk levels associated with systems.

All of these risk assessment techniques are based on variations of %on the theme of
Failure Mode Effects Analysis (FMEA), which has its roots in the mass production industry of the 1940s
and was designed to save large companies money by prioritising the most financially
draining problems in a product. % first.
%
The FMEA of the 1940s has been refined and extended into four main variants.
%
This thesis describes the refinements and additions made to
FMEA to tailor it for military or statistically biased % commercial
use.
It then reveals common flaws
which make these variants unsuitable for the higher safety requirements of the 21st century.
%
\fmmdglossSTATEEX
Problems with state explosion in failure mode reasoning and the current difficulties %impossibility
of integrating software and hardware failure mode models~\cite{1372150} are the most obvious of these. %flaws.
%
The four current methodologies are described in Chapter~\ref{sec:chap2} and %the advantages and drawbacks
%of each FMEA variant are examined
critically assessed in Chapter~\ref{sec:chap3}.
\fmmdglossSTATEEX
In Chapter~\ref{sec:chap4}, a new methodology is proposed which addresses the state explosion problem
and, using contract programmed software, allows the modelling of integrated
software/electrical systems.
%
This is followed by two chapters showing examples of the new modular FMEA technique, Failure Mode Modular De-Composition (FMMD),
looking first at a variety of common electronic circuits and then at electronic/software hybrid systems.
}

\section{Motivation}
The motivation for this study came from two sources, one academic (the author's Software Engineering MSc project) and the other
practical (the author is a practising embedded software engineer working with FMEA on safety critical burner systems).
%
% AF does not think the paragraph below should be included 12JAN2013
\paragraph{MSc Project: Euler/Spider diagram Editor.}
The author had recently completed an
MSc, for which the project was to create an Euler/Spider~Diagram~\cite{howse:spider} editor in Java.
This editor allowed the user to draw Euler/Spider diagrams, and could then
represent these as abstract---i.e.\ mathematical---definitions.
%
The primary motive for writing the Spider diagram editor was to provide an alternative
to formal languages for software specification.
%
An added attraction of spider diagrams was that they could be used to
prove logic and theorems~\cite{theoremflower,Fish200553} in an intuitive way.
%
Because of the author's daily work exposure to FMEA,
%I started thinking
it was natural to think
of ways to apply formal languages and spider diagrams to
failure mode analysis.
%
%
\paragraph{European Safety Requirements increase in scope and complexity.}
At work---which consisted of designing, testing, building and writing embedded `C' and assembly language code for safety critical
industrial burners---the design team was faced with a new and daunting requirement:
conformance to the latest European standard, EN298~\cite{en298}.
%
It appeared to ask for the impossible:
not only did it require the usual safety measures (self-checking of ROM and RAM, watchdog processors with separate clock sources, EMC testing and the
triple fail safe control of valves), it also contained one new clause that had far reaching consequences.
%
It stated that in the event of a failure, where the controller had gone into a `lockout~state'---a state where the controller
applies all possible safety measures to stop fuel entering the burner---it was not permitted to % could not
become dangerous should another fault occur.
%
In short this meant %we had to be able to
dealing with double failures.
%
Any of the components that could, in failing, create a dangerous state were already
documented and approved using FMEA.
%
This new requirement
effectively meant that single and double component failures were
now required to be analysed~\cite[9.1.5]{en298}.
%
The state explosion problem alone
meant that this analysis was going to be virtually impossible to perform.
\fmmdglossSTATEEX
%
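To give a sense of scale, a rough count can be sketched under simplifying assumptions that are introduced
here purely for illustration: a system of $N$ components, each with $f$ failure modes, where a double
failure is taken to be one failure mode from each of two different components.
The number of single failure scenarios is then $Nf$, while the number of double failure scenarios is
\[
\frac{N(N-1)}{2}\,f^{2} .
\]
For a hypothetical $N = 81$ and $f = 3$ this gives $243$ single failure scenarios but
$3240 \times 9 = 29\,160$ double failure scenarios, each of which would require its own entry
in a conventional analysis.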
To compound the problem, %state explosion problem
FMEA suffers from repeated work: each component failure is typically represented
by one line or entry in a spreadsheet~\cite{bfmea}, so repeated sections of
circuitry (for instance repeated {\ft} outputs on a PCB) meant that
the analysis of identical circuitry was performed many times.
%

%
\subsection{Modularising/De-Composing FMEA: Initial concepts.} % and augmenting this with concepts from Euler/Spider Diagrams.}
%
In the field of digital signal processing, the Fast Fourier Transform (FFT)~\cite{fftoriginal} is an algorithm that revolutionised
access to frequency analysis of digital samples.
This took the Discrete Fourier Transform (DFT), and applied de-composition to its
mesh of (often repeated) complex number calculations~\cite[Ch.8]{fpodsadsp}.
%
By doing this it reduced the order of computational complexity for $N$ samples from quadratic, $O(N^2)$, %n exponential
%order
to $O(N \log N)$~\cite[pp.401--3]{ctw}.
%
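As a worked illustration (the sample count is chosen here only as an example, and constant factors are ignored),
for $N = 1024 = 2^{10}$ samples a direct evaluation of the DFT requires of the order of
\[
N^{2} = 1\,048\,576
\]
complex multiplications, whereas a radix-2 FFT requires of the order of
\[
N \log_{2} N = 1024 \times 10 = 10\,240 ,
\]
a reduction by a factor of roughly one hundred.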
The author wondered if this thinking could be applied to the state explosion problems encountered in FMEA.
%
\fmmdglossSTATEEX
%Following the concept of de-composing a problem, and thus simplifying the state explosion---using the thinking behind
%the fast Fourier transform (FFT)~\cite{fpodsadsp}[Ch.8], which takes a complex intermeshed series of real and imaginary number calculations
%and by de-composing them, simplifies the problem.
%
% My reasoning was that if we analysed %were we to analyse
% the problem in small modules, from the bottom-up following the FFT example, we could apply
% checking for all double failure scenarios.
The author's reasoning was that if %were we to analyse
the problem were analysed in small modules, from the bottom up following the FFT example,
checking for all double failure scenarios could be applied.
%
% Once these first modules were analysed---we now call them {\fgs}---we could determine the symptoms of failure for them.
% Using the symptoms of failure, we could now treat these modules as components in their own right---or {\dcs}---and use them to build higher level
% {\fgs}. Higher and higher levels of {\fgs} could be built until we had a hierarchy
% representing a failure mode model for the system.
Once these first modules---now called {\fgs}---were analysed, the symptoms of failure could be determined for them.
%
Using the symptoms of failure, these modules could be treated as components in their own right---or {\dcs}---and used to build higher level
{\fgs}.
%
Higher and higher levels of {\fgs} could be built until a hierarchy
representing a failure mode model for the complete system had been created.
%
%Because this is modular, %we can apply double simultaneous failure mode checking; and as %because
Double simultaneous failure mode checking can be applied because
the number of components
in each {\fg} is typically small; state explosion problems are thus avoided. % for the general case. % AF says `in the general case' here 12JAN2013
\fmmdglossSTATEEX
%
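The saving can be sketched with a hypothetical example, using assumptions introduced here purely for
illustration: $N = 81$ base components, every base or derived component having $f = 3$ failure modes,
and every {\fg} containing $k = 3$ components.
Grouping in threes gives $27 + 9 + 3 + 1 = 40$ {\fgs} in the hierarchy, and within one {\fg} the number of
double failure scenarios (one failure mode from each of two different components) is
\[
\frac{k(k-1)}{2}\,f^{2} = 3 \times 9 = 27 ,
\]
so checking every {\fg} requires $40 \times 27 = 1080$ double failure scenarios.
Treating the same $81$ components as a single flat group would instead require
\[
\frac{N(N-1)}{2}\,f^{2} = 3240 \times 9 = 29\,160 .
\]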
%
% If we apply
% double checking all the way up the hierarchy we can guarantee to have considered
% every double simultaneous failure of all components in a system.
If
double checking is applied all the way up the hierarchy,
%we can guarantee to have considered
all possible
double simultaneous failures in a system are guaranteed to have been considered.
%
As a fortunate by-product, many multiple failures, as well as double
failures, would also be analysed. Moreover, because failure modes are traceable from the base components to the top level---or system---failure modes,
these relationships can be held in a traversable data structure.
%
% If held in a traversable data structure we can apply automated methods to search for all the combinations of multiple failure modes
% within the model that had been analysed.
Once held in a traversable data structure, automated methods can be applied to search for all the combinations of multiple failure modes
throughout the model being analysed.
%
Because of this, it will not always %it may not
be necessary to apply double checking
at all higher levels in the analysis hierarchy to achieve complete double failure coverage.
%
The points at which it is possible to relax double failure checking can be verified automatically by traversing
the failure mode model.
%
\subsection{Initial direction: Application of Spider diagrams to FMEA.}

Because Euler/Spider Diagrams~\cite{howse:spider}
could be used to model failure modes in components,
it was thought that a diagrammatic notation would
be more user friendly than formal logic.
%
In an FMEA Spider diagram, contours represent failure modes, and the Spider diagram
`existential~points' represent instances of failure modes.
%
Overlapping contours represent multiple failure modes.
%
By drawing a spider collecting existential points, a common failure symptom could
be determined and from this a new diagram generated automatically to represent the {\dc}.
%
Each spider represented a derived failure mode.
The act of collecting common symptoms by drawing spiders
meant that the analyst was forced to associate each component failure mode with one symptom (or derived~failure~mode).
%
These concepts were presented at the ``Euler~2004''~\cite{Clark200519} conference held at the University of Brighton. %
%
The presentation defined the concepts for modularising FMEA using the formal visual notations from Spider diagrams,
and led to work on rapidly calculating available zones in Euler diagrams~\cite{Clark_fastzone,Rodgers2013}.
%
The spider diagram notation was useful in defining the concepts and
initial ideas, but a more traditional `spreadsheet' format has been used
for the analysis stages of the new methodology.
%
Euler diagrams are used later in the thesis to describe the containment relationships
of derived components when building hierarchical analysis models with the modularised
variant of FMEA that this thesis proposes and defends.
%
%

\section{Objectives of the thesis.}
The primary objective of the work performed for this thesis is to present a new modularised variant of
FMEA which addresses the issues of:
\begin{itemize}
\item State Explosion,
\item Multiple failure mode modelling,
\item Re-usability of pre-analysed modules,
\item Inclusion of software in failure mode modelling.
\end{itemize}
To support this, worked examples using the new methodology were created, and the work was published and presented at
IET safety conferences. % in 2011~\cite{syssafe2011} and 2012~\cite{syssafe2012}.
%
The development of FMMD, starting with a critique of FMEA and a ``wish-list'' for a better methodology,
was presented at the IET System Safety conference in 2011~\cite{syssafe2011}.
%
FMEA currently cannot integrate software models into its hardware failure mode models~\cite{sfmea,modelsfmea,embedsfmea,sfmeainterface}, but
%
\fmmdglossCONTRACTPROG
FMMD can use the existing structure of functional software, in conjunction
with contract programming, to model software;
%and
this concept was presented at the IET System Safety conference in 2012~\cite{syssafe2012}.

\paragraph{Overview of the thesis.}
Chapter~\ref{sec:chap2} examines the current state of FMEA based methodologies. Chapter~\ref{sec:chap3}
examines the benefits and drawbacks of these methodologies
and proposes a detailed wish list for an ideal FMEA technique.
Chapter~\ref{sec:chap4} proposes Failure Mode Modular De-Composition (FMMD)---a modularised variant
of FMEA designed to address the points in the detailed wish list.
Chapter~\ref{sec:chap5} provides worked examples using selected electronic circuits.
Chapter~\ref{sec:chap6} gives two examples of integrated software and electronic systems analysed using FMMD.
Metrics and evaluation, along with an example showing double simultaneous failure analysis,
are provided in Chapter~\ref{sec:chap7}, with a conclusion and further work in Chapter~\ref{sec:chap8}.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%