looking at notes written on train to mexico vs england

Robin 2010-05-29 10:25:13 +01:00
parent fd38aa0a07
commit 221cd307cb


@ -2,7 +2,7 @@
\section{Introduction}
%$$ \int_{0\-}^{\infty} f(t).e^{-s.t}.dt \; | \; s \in C$$
$$ \int_{0^{-}}^{\infty} f(t) \, e^{-st} \, dt \; | \; s \in C $$
This thesis describes the application of mathematical (formal) techniques to
the design of safety critical systems.
@ -17,7 +17,7 @@ probablistic approaches.
%and the probability to dangerous fault approach\cite{EN61508}.
The visual notation developed was initially designed for electronic fault modelling.
However, it was realised that could be applied to mechanical and software domains as well.
However, it was realised that it could be applied to mechanical and software domains as well.
This changed the target for the study slightly to encompass these three domains in a common notation.
\section{Background}
@ -36,7 +36,7 @@ The product must be `certified' by an independent
and
`competent body' recognised under European law.
The certification involved stress testing with repeated operation cycles,
within a range of temperatures. Electrical stress testing with high voltage interference, and
over a specified range of temperatures; electrical stress testing with high voltage interference and
power supply voltage surges and dips; electrostatic discharge testing; and
EMC (Electromagnetic Compatibility) testing. A significant part
of this process, however, was `static testing'. This involved looking at the design of the products,
@ -44,24 +44,26 @@ from the perspective of components failing, and the effect on safety this would
Some of the static testing involved checking that the germane `EN' standards had
been complied with. Failure Mode Effects Analysis (FMEA) was also applied. This involved
looking in detail at critical sections of the product and proposing
component failure scenarios. For each failure scenario proposed either a satisfactory
component failure scenarios.
For each failure scenario proposed, either a satisfactory
answer was required, or a counter proposal to change the design to cope with
the component failure eventuality. FMEA was time consuming, and being directed by
a theoretical component failure eventuality.
FMEA was time consuming and, being directed by
experts, undoubtedly ironed out many potential safety faults before the product saw
light of day. However it was quickly apparent that only a small proportion
light of day.
However, it was quickly apparent that only a small proportion
of component~failure modes were considered. Also, there was no formalism.
The component~failure~modes investigated were not analysed within
any rigorous framework.
any rigorous or mathematically proven framework.
\subsection{ Blanket Risk Reduction Approach }
The suite of tests applied for a certified product amounts to a `blanket' approach.
That is to say that, by applying electrical, repeated operation, and environmental
stress testing, it is hoped that the majority of latent faults are discovered.
The FMEA, or static sections, only look at the most obviously safety critical
The FMEA and static testing only looked at the most obviously safety critical
aspects, and a small minority of the total component base for a product.
Systemic faults, or mistakes, will often by-pass this testing.
Systemic faults, or mistakes, are missed by this form of static testing.
\subsection{Possibility of applying mathematical techniques to FMEA}
@ -90,7 +92,11 @@ decide which of these can contribute to a system level fault mode.
Potentially, failure modes, be they from components or the interaction
between modules, can be missed. A disturbing example of this
is the NASA space shuttle in 1986, which missed the fault mode of an O
ring.
ring. This was made even worse by the fact that the `O' ring had a specified temperature
range, and the probability of this fault occurring rose dramatically below
that range. This was a known and documented feature of a safety critical component,
and it was ignored in the safety analysis.
\paragraph{Bottom-up Approach}
A bottom-up approach looks impractical at first due to the sheer number
of component failure modes in a typical system. However
@ -111,6 +117,28 @@ Also a hierarchy is formed when the top level errors are formed
naturally from the lower levels of analysis.
Unlike a top~down analysis, we cannot miss a top level fault condition.
\paragraph{Multi-discipline} Most safety critical systems are composed of mechanical, electrical and
computing elements. A tragic example of the mechanical and electrical elements
interfacing to a computer~controller is found in the Therac-25 radiation therapy machine.
With no common notation to integrate the safety analysis between the electrical/mechanical and computing
domains, synchronisation errors occurred that were in some cases fatal.
\paragraph{Requirements for a rigorous FMEA process}
It was determined that any process to apply
FMEA in a rigorous and complete manner (in terms of complete component coverage) had to be
a bottom~up process to eliminate the possibility of missing component failure modes.
It also had to naturally converge to a failure model of the system.
It had to take potentially thousands of component failure modes and simplify
these into system level errors.
To analyse the large number of component failure modes, and resolve these to perhaps a handful
of system failure modes, would require
a process of modularisation from the bottom~up.
\begin{list}{$*$}{}
\item The analysis process must be `bottom~up'
\item The process must be modular and hierarchical
\item The process must be multi-discipline and must be able to represent hardware, electronics and software
\end{list}
\section{Safety Critical Systems}
@ -120,10 +148,10 @@ A safety critical system is one in which lives may depend upon it or
it has the potential to become dangerous\cite{sccs}.
%(/usr/share/texmf-texlive/tex/latex/amsmath/amstext.sty)
An industrial burner is typical of plant that is potentially dangerous.
An incorrect air/fuel mixture can be explosive.
Medical electronics for automatically dispensing drugs or maintaining
life support are examples of systems that lives depend upon.
%An industrial burner is typical of plant that is potentially dangerous.
%An incorrect air/fuel mixture can be explosive.
%Medical electronics for automatically dispensing drugs or maintaining
%life support are examples of systems that lives depend upon.
\subsection{Two approaches: Probabilistic and Deterministic}
@ -132,8 +160,8 @@ There are two main philosophies applied to safety critical systems certification
One is a general number of acceptable failures per hour\footnote{The common metric is Failure in Time (FIT) values - failures per ${10}^{9}$
hours of operation} of operation or
a given statistical failure on demand.
This is the probabilistic approach and is embodied in the european standard
EN61508 \cite{EN61508}.
This is the probabilistic approach and is embodied in the European Standard
EN61508 \cite{EN61508} (international standard IEC 61508).
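As an illustration of the FIT metric (the symbol $\lambda$ and the figure of 100 FIT are assumed here purely as an example), a FIT value converts to a failure rate per hour as
$$ \lambda = \mathrm{FIT} \times 10^{-9} \; \mathrm{h}^{-1}, \qquad 100 \; \mathrm{FIT} \;\Rightarrow\; \lambda = 10^{-7} \; \mathrm{h}^{-1} $$
that is, on average one failure per $10^{7}$ hours of operation, if the failure rate is taken to be constant.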
\paragraph{Deterministic safety Measures}
The second philosophy, applied to application specific standards, is to investigate
@ -201,22 +229,33 @@ unhandled failures could create dangerous faults.
The methodology takes a bottom up approach to
the design of an integrated system.
%
Each component is assigned a well defined set of failure modes.
The components are formed into modules, or functional groups.
The system under inspection is then searched for functional groups of components that
perform simple, well defined tasks.
These functional groups are analysed with respect to the failure modes of the
components. The `functional group' or module will, after analysis, have a set of derived
failure modes. Thus we can now treat our `functional group' as a component in its own right,
with its own set of failure~modes.
components.
%
The `functional group' will, after analysis, have its own set of derived
failure modes.
%
The number of derived failure modes will be
less than or equal to the sum of the failure modes of all its components.
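As a sketch of this bound, using notation that is assumed here for illustration only (the formal definitions may differ): let a functional group be $FG = \{c_1, \ldots, c_n\}$, let $fm(c_i)$ denote the set of failure modes of component $c_i$, and let $DC$ be the derived component obtained by analysing $FG$. The statement above then reads
$$ | fm(DC) | \; \leq \; \sum_{i=1}^{n} | fm(c_i) | $$
For example, a functional group of two components with three failure modes each would give a derived component with at most six failure modes.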
%
%
A `derived' set of failure modes is at a higher abstraction level.
Derived modules may now be used as building blocks, to model the system at
ever higher levels of abstraction until the top level is reached.
%
Thus we can now treat our `functional group' as a component in its own right,
with its own set of failure~modes. We can create
a `derived component' and assign it the derived failure modes as analysed from the `functional group'.
%
Derived Components may now be used as building blocks, to model the system at
ever higher levels of abstraction, building a hierarchy until the top level is reached.
%
Any unhandled faults will appear at this top level and will be `un-resolved'.
A formal description of this process is dealt with in Chapter \ref{fmmddefinition}.
%
%
%This principally focuses
%on simple control systems for maintaining temperature
%and for industrial burners. It is hoped that a general mathematical
@ -225,15 +264,19 @@ A formal description of this process is dealt with in Chapter \ref{fmmddefinitio
Automated systems, as opposed to manual ones, are now the norm
in the home and in industry.
%
Automated systems have long been recognised as being more efficient and
more accurate than a human operator, and the reason for automating a process
is now more likely to be cost savings due to better efficiency
thatn a human operator \ref{burnereffency}.
than a human operator \ref{burnereffency}.
%
For instance
early automated systems were mechanical, with cams and levers simulating
fuel air mixture profile curves over the firing range.
control functions.
%
A typical control function could be the
fuel air mixture profile curve over the firing range.
%
Because fuels vary slightly in calorific value, and air density changes with the weather, no single tuning can be optimal.
In fact, for aesthetic reasons (not wanting smoke to appear at the flue)
the tuning was often air rich, causing air to be heated and
@ -260,6 +303,7 @@ fault conditions are missed.
% http://en.wikipedia.org/wiki/Autopilot
\paragraph{Importance of self checking}
To take the example of an autopilot: simple early autopilots
prevented the aircraft straying from a compass bearing and kept it flying straight and level.
Were they to fail, the pilot would notice quite quickly
@ -283,7 +327,7 @@ It could also develop an internal fault, and must be able to cope with this.
Milli-Volt Sensor with safety resistor
\label{fig:millivolt}}
\end{figure}
\paragraph{Component added to detect errors}
For example, if the sensor supplies a range of 0 to 40\,mV, and RG1 and RG2 are such that the op-amp supplies a gain of 100,
any signal between 0 and 4 volts on the ADC will be considered in range. Should the sensor become disconnected, the
op-amp will supply its maximum voltage, telling the system the sensor reading is invalid.
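As a worked check of these figures (the symbols $V_{s}$, $V_{ADC}$ and $G$ are introduced here for illustration only), the in-range window at the ADC follows directly from the gain:
$$ V_{ADC} = G \, V_{s}, \qquad G = 100, \qquad 0 \leq V_{s} \leq 40\,\mathrm{mV} \;\Rightarrow\; 0 \leq V_{ADC} \leq 4\,\mathrm{V} $$
Assuming the maximum output of the op-amp exceeds 4\,V, a reading above this window cannot have come from a correctly connected sensor, which is what allows it to be flagged as invalid.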