\ifthenelse {\boolean{paper}}
{
\abstract{ This paper proposes a methodology for
creating failure mode models of safety critical systems, which
share a common notation that can be integrated
across the mechanical, electronic and software domains.
The proposed methodology is bottom-up and
modular.}
}
{}

\section{Introduction}

There are three methodologies in common use for failure mode modelling.
These are Fault Tree Analysis (FTA), various forms of Failure Mode and Effects Analysis (FMEA)
and statistical analysis.

These methodologies have several drawbacks.
FTA can overlook error conditions, and FMEA and the statistical methods
lack precision in predicting failure modes at the SYSTEM level.

The Failure Mode Modular De-composition
(FMMD) methodology presented here provides a more detailed and analytical
modelling system from which
the data models from FTA, FMEA and the statistical approach can be
derived.
It also applies analysis stages to the failure mode analysis process,
ensuring that all component failure modes are considered in the model.

This
\ifthenelse {\boolean{paper}}
{
paper
}
{
chapter
}
presents a bottom-up modular methodology, an extension and refinement of FMEA, where instead of looking
at individual component failure modes and deciding on their impact on the SYSTEM,
it uses the component failure modes to build modules or derived components.
This methodology has been named Failure Mode Modular De-composition (FMMD)
because it de-composes a SYSTEM into a hierarchy of modules or {\dc}s.
It does this by working from the bottom up, taking small groups
of components, {\fgs}, and then analysing how they can fail.
This analysis is performed using FMEA from a micro rather than a macro perspective.
Thus instead of looking at a component failure mode and determining how
it {\em might} cause a failure at SYSTEM level, we look at how
it will affect the {\fg}.
When we know the failure modes of a {\fg} we can treat it as a `black box'
or {\dc}. With {\dc}s we can build {\fgs}
at higher levels of analysis, until we have a complete
hierarchy representing the failure behaviour of the SYSTEM.
Because all the failure modes of all the components
are held in a computer program, we can determine if the model is complete
(i.e. all component failure modes have been included in the model).

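The completeness check described above can be sketched as a simple set condition.
The notation is illustrative and is an assumption made here, not part of the definition of the methodology:
let $C$ be the set of base level components in the model, let $fm(c)$ be the set of failure modes of a component $c \in C$,
and let $A$ be the set of component failure modes that have been analysed within some {\fg}.
The model would then be considered complete when
\begin{equation}
\bigcup_{c \in C} fm(c) \subseteq A .
\end{equation}
Any failure mode that appears on the left hand side but not in $A$ has not yet been included in the model.
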
%OK need to describe the need for it
\section{The need for a new failure mode modelling methodology}

In summary, each of the three methodologies in common use has drawbacks.

\subsection { FTA }

This, like all top-down methodologies, introduces the very serious problem
of missing component failure modes, or of modelling at
too high a level of failure mode abstraction.

\subsection { FMEA }

This places a burden on the analyst of taking individual component failure modes
and trying to determine what effects these will have at SYSTEM level.
Justifications for this methodology are often statistical, and Bayes' theorem \cite{probstat}
is often cited.
This lacks precision, or in other words the ability to predict outcomes deterministically,
as often the component failure mode cannot be proven to cause a SYSTEM level failure, only to make it more likely.
Also, it can miss combinations of failure modes that will cause SYSTEM level errors.

\subsection { Statistical Analysis }


This uses MTTF and other statistical models to determine the probability of
failures occurring. A component failure mode, given its MTTF,
the probability of detecting the fault and its safety relevant validation time $\tau$,
contributes a simple risk factor that is summed
in to give a final risk result. Thus a statistical
model can be implemented on a spreadsheet, where each component
has a calculated risk and an estimated risk importance,
and these are all summed to give the final assessment figure.

The Statistical Analysis method is used from two perspectives:
Probability of Failure on Demand (PFD), and probability of failure
in continuous operation, expressed as Failure in Time (FIT) and measured in failures per billion ($10^9$) hours of operation.
For instance, with the anti-lock system on an automobile braking
system, we would be interested in PFD.
For a continuously running nuclear power station
we would be interested in its FIT values.

This suffers from the same lack of precision in predicting outcomes as FMEA above.

That is, we may have the MTTF of some critical component failure
modes, but in most cases we can only guess what the safety case outcome
will be if they occur.

This leads to having components within a SYSTEM partitioned into different
safety level zones \cite{en61508}. This is a vague way of determining
safety.

The Statistical Analysis methodology is the core philosophy
of the Safety Integrity Levels (SIL) of EN61508 \cite{en61508}.

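As a worked illustration of the FIT measure described above, and assuming a constant failure rate $\lambda$
(an assumption introduced for this sketch; it is not stated in the text), the MTTF in hours and the FIT value are related by
\begin{equation}
\lambda = \frac{1}{\mathrm{MTTF}}, \qquad \mathrm{FIT} = 10^9 \lambda = \frac{10^9}{\mathrm{MTTF}} .
\end{equation}
For example, a component failure mode with an MTTF of $10^6$ hours would correspond to $10^9 / 10^6 = 1000$ FIT.
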
%AND then how we can solve all these problems

\section{A wish list for a failure mode methodology}
\begin{itemize}
\item All component failure modes must be considered in the model.
\item It should be easy to integrate mechanical, electronic and software models.
\item It should be re-usable, in that commonly used modules can be re-used in other designs/projects.
\item It should have a formal basis, that is to say, it should be able to produce mathematical proofs
for its results.
\item It should be capable of producing reliability and danger evaluation statistics.
\item It should be easy to use.
\end{itemize}


\section{Building blocks of a safety critical system}

This section looks at common features in a safety critical system and
then looks at the building blocks of these systems
and their characteristics.

\subsection{What is a safety critical system?}

DEFINITIONS GET REFS


TYPICALLY HAS MECHANICAL, ELECTRONIC and SOFTWARE
actuators control intelligence

\subsection{An example : industrial burner}

An industrial burner is a good example of a safety critical system.
It has some lethal risks and some environmental ones.
It could, by igniting an explosive mixture, cause an explosion.
By burning incorrect proportions of fuel and air, it could be inefficient and waste
resources, or worse could cause poisonous burning (typically producing carbon monoxide, but also,
where flame temperature is very high, NOX emissions).

To prevent igniting an explosive mixture, air is pumped through the furnace
chamber on start-up, and this is verified with an air pressure switch.


NEED A DIAGRAM HERE


NEED A STATE CHART TOO

It is interesting here to compare how the different methodologies
would analyse a particular sub-system in the burner controller.
The flame scanner is a good example for this.
We shall consider a simple infra-red (IR) flame scanner.
This is in the form of an IR sensitive resistor.
The flame type we will be looking for will have a characteristic
flicker frequency of around 13Hz.
The circuit is then simply a resistor voltage divider connected to
a micro-controller reading the voltage.
The flame scanner is thus a two resistor voltage divider.

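The voltage read by the micro-controller can be sketched with the standard voltage divider relation.
The symbols are assumptions made for illustration: $V_{cc}$ is the supply voltage, $R_1$ is the fixed upper resistor
and $R_2$ is the IR sensitive resistor on the lower side (the text does not say which position the sensor occupies),
and the micro-controller input is taken to be high impedance.
\begin{equation}
V_{out} = V_{cc} \frac{R_2}{R_1 + R_2}
\end{equation}
Under these assumptions, an open circuit $R_2$ drives $V_{out}$ towards $V_{cc}$ and a short circuit $R_2$ drives it to zero,
so either fault gives a steady reading with no flicker component.
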
\subsection{The Flame Scanner}
\subsubsection{Macro FTA perspective}

SHOW ALL TOP LEVEL FAULTS. EXPLOSION, POISONOUS BURNING CO, POISONOUS BURNING NOX, FAILS TO LIGHT etc

Follow the explosion tree down to flame scanner fails ON, and OFF

etc
\subsubsection{Macro FMEA/Statistical perspective}

Each of the resistors is considered critical, in the statistical case, and so the MTTF
is added into the DANGEROUS section.

For FMEA the resistor failures add up to the SYSTEM level, show this is inappropriate
and makes several jumps in applied knowledge, thus Bayes' theorem etc

\subsubsection{Micro FMMD perspective}


Here show how the flame scanner becomes a black box, or component in itself.
How it is now available to be integrated into higher level designs.

%and then an ignition position is checked.
%Initially a pilot flame is started and when this is stable, the main
%flame is fired.
%To check the stability of the flame, a flame scanner is required.
%To mix the fuel and air, motors to position valves are generally used.
%To prevent fuel leakage into the furnace, safety shut-off valves are used \footnote{These generally open slowly under power, and when power is removed `slam shut'. Thus
%in the event of a general power failure, the default to safe behaviour.}




Motors controlling air and fuel flow
safety chain to power for shutdown valves
safety shutdown valves on fuel
flame sensor
air pressure sensor


\section{Base Level Components}

A common factor in all safety critical systems is the use of
base level, or bought-in, components. Be these
electrical, mechanical or firmware, they should all
have known failure modes.

\subsection { Failure modes defining the component}
We can consider each bought-in component as a base level component,
and it should have an associated set of failure modes.



\subsection { Complication of multiple failure modes }
A very complicated component, like an integrated circuit or perhaps a servo motor, has
a set of failure modes, where several things could go wrong with it within the $\tau$ period.
This is a simultaneous failure, or more than one failure mode being active during the same time period.

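To give a feel for the scale of this complication, the symbols $n$ and $k$ are introduced here purely for illustration.
If a component has $n$ failure modes, the number of distinct combinations of exactly $k$ of them being active
within the same $\tau$ period is at most
\begin{equation}
\frac{n!}{k!\,(n-k)!} .
\end{equation}
For example, a component with $n = 4$ failure modes has at most $6$ possible pairs ($k = 2$) of simultaneous failure modes.
This is an upper bound, since some combinations may be mutually exclusive in practice.
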
\section{FMMD Proposed Methodology Outline}

fire away, essentially the elevator pitch

\subsection{Treating a functional group as a component}
\subsection{Using a derived component in designs}
\section{Building a Failure Mode Model Hierarchy}

AND the hierarchy...


Probably about 3 pages