kile automated spelling blissssss
This commit is contained in:
parent
72741febda
commit
a5dc8ccf13
@ -31,7 +31,7 @@ applicable to industrial burner controllers\footnote{Burner Controllers cover th
|
||||
combustion, high pressure steam and hot water, mechanical control, electronics and embedded software.}.
|
||||
The methodology developed was designed to cope with
|
||||
both the deterministic\footnote{Deterministic failure mode analysis traces failure mode effects at the SYSTEM level to lower level causes in components or sub-systems.} and probabilistic approaches
|
||||
\footnote{Probablistic failure mode analysis tries to determine the probability of given SYSTEM failure modes, and pfrom these
|
||||
\footnote{Probabilistic failure mode analysis tries to determine the probability of given SYSTEM failure modes, and from these
|
||||
can determine an overall failure rate, in terms of probability of failure on demand, or failure in time (or Mean Time to Failure (MTTF)).}.
|
||||
\glossary{name={safety critical},description={A safety critical system is one in which its failure may result in death or serious injury to humans, an environmental catastrophe or severe loss or damage}}
|
||||
\fmodegloss
|
||||
@ -39,10 +39,10 @@ can determine an overall failure rate, in terms of probability of failure on dem
|
||||
|
||||
\paragraph{Safety Critical Controllers, knowledge and culture sub-disciplines}
|
||||
The maturing of the application of the programmable electronic controller (PEC)
|
||||
for a wide range safety critical applications, has led to a fragmentation of subdisiplines
|
||||
for a wide range of safety critical applications has led to a fragmentation of sub-disciplines
|
||||
which speak imperfectly to one another.
|
||||
This is because
|
||||
the main three engineering disiplines, Electrical, Software and Mechanical Engineering
|
||||
the main three engineering disciplines, Electrical, Software and Mechanical Engineering
|
||||
produced equipment that was interfaced at a later time.
|
||||
Just as electronic circuitry becomes more integrated, and sub-domains
|
||||
of electrical engineering (analog and digital for instance) are commonly found along-side on the same chip,
|
||||
@ -50,13 +50,13 @@ so modern PEC's are becoming more and more integrated and now typically encompa
|
||||
input from the three engineering disciplines\footnote{Consider an aircraft, this involves expert knowledge from
|
||||
Software, Electronic and Mechanical Engineering and requires a high degree of safety validation}.
|
||||
|
||||
Additional disiplines are defined by application area of the PEC. All of these sub-displines
|
||||
Additional disciplines are defined by the application area of the PEC. All of these sub-disciplines
|
||||
are in turn split into even finer units.
|
||||
The practicioners of these fields tend to view a PEC in different ways.
|
||||
Discoveries and culture in one field diffuse only slowly into the conciousness of a specialist in another.
|
||||
Too often, one disipline's unproven assumptions or working methods, are treated as firm boundary conditions
|
||||
The practitioners of these fields tend to view a PEC in different ways.
|
||||
Discoveries and culture in one field diffuse only slowly into the consciousness of a specialist in another.
|
||||
Too often, one discipline's unproven assumptions or working methods are treated as firm boundary conditions
|
||||
for an overlapping field.
|
||||
For failure mode analysis a common notation, across disiplines is a very desirable and potentially useful
|
||||
For failure mode analysis, a common notation across disciplines is a very desirable and potentially useful
|
||||
tool.
|
||||
|
||||
\paragraph{Safety Assessment/analysis of PEC's}
|
||||
@ -81,14 +81,15 @@ component failure mode in a model, and to be able to represent
|
||||
mechanical, electrical and software components in a single failure mode model.
|
||||
|
||||
\paragraph{Desirability of a common failure mode notation}
|
||||
Having a common failure mode notation accross all disciplines in a project
|
||||
Having a common failure mode notation across all disciplines in a project
|
||||
would allow all the specialists to prepare failure mode
|
||||
analysis and then bring them together to model the PEC.
|
||||
\paragraph{Visual form of the notation}
|
||||
The visual notation developed was initially designed for electronic fault modelling.
|
||||
The visual notation developed was initially designed for electronic fault modelling.
|
||||
This notation deals with failure modes of components using concepts derived from
|
||||
Euler and Spider diagrams.
|
||||
However, as the notation dealt with generic failure modes, it was realised that it could be applied to mechanical and software domains as well.
|
||||
However, as the notation dealt with generic failure modes, it was realised that it could be applied to
|
||||
mechanical and software domains as well.
|
||||
This changed the target for the study slightly to encompass these three domains in a common notation.
|
||||
|
||||
\paragraph{PEC's: Legal and Insurance Issues}
|
||||
@ -97,10 +98,10 @@ There is also usually a differentiation between the manufacturers
|
||||
and the plant operators.
|
||||
|
||||
The manufacturers have to ensure
|
||||
that the device is adaquately safe for use in its operational context.
|
||||
that the device is adequately safe for use in its operational context.
|
||||
This usually means conforming to device specific standards~\footnote{in Europe, conformance to European Norms (EN) is a legal requirement
|
||||
for specific types of controllers, and in the USA conformance to Underwriters Laboratories (UL) standards
|
||||
are usually a mimimum requirement to take out insurance}, and offering training
|
||||
is usually a minimum requirement to take out insurance}, and offering training
|
||||
of operators.
|
||||
|
||||
Operators of safety critical plant are concerned with maintenance and legal obligations for
|
||||
@ -133,14 +134,14 @@ looking in detail at selected critical sections of the product and proposing
|
||||
component failure scenarios.
|
||||
For each failure scenario proposed either a satisfactory
|
||||
answer was required, or a counter proposal to change the design to cope with
|
||||
a theroretical component failure eventuality.
|
||||
a theoretical component failure eventuality.
|
||||
FMEA was time consuming, and being directed by
|
||||
experts undoubtly ironed out many potential safety faults before the product saw
|
||||
experts undoubtedly ironed out many potential safety faults before the product saw
|
||||
light of day.
|
||||
However it was quickly apparent that only a small proportion
|
||||
of component~failure modes was considered. Also there was no formalism.
|
||||
The component~failure~modes investigated were not analysed within
|
||||
any rigourous or mathematically proven framework.
|
||||
any rigorous or mathematically proven framework.
|
||||
|
||||
\subsection{ Blanket Risk Reduction Approach }
|
||||
|
||||
@ -154,7 +155,7 @@ Systemic faults, or mistakes are missed by this form of static testing.
|
||||
\subsection{Possibility of applying mathematical techniques to FMEA}
|
||||
|
||||
My MSc project was a diagram editor for Constraint diagrams.
|
||||
I wanted to apply constriant diagram techniques to FMEA
|
||||
I wanted to apply constraint diagram techniques to FMEA
|
||||
and began thinking about how this could be done. One
|
||||
obvious factor was that a typical safety critical system could
|
||||
have more than 1000 component parts. Each component
|
||||
@ -173,21 +174,21 @@ to perform some particular low-level function.
|
||||
\paragraph{Top down Approach}
|
||||
A top down approach has several potential problems.
|
||||
By its nature it means that at the start of the process
|
||||
a set of system or top level faults or undesireable outcomes are defined.
|
||||
a set of system or top level faults or undesirable outcomes are defined.
|
||||
It then must break the system down into modules and
|
||||
decide which of these can contribute to a system level fault mode.
|
||||
Potentially failure modes, be they from components or the interaction
|
||||
between modules can be missed. A disturbing example of this
|
||||
is the NASA space shuttle in 1986, which missed the fault mode of an O
|
||||
ring. This was made even worse, by the fact that the `O' ring had a specified temperature
|
||||
range where the probability of this fault occuring was dramatically raised when below
|
||||
range where the probability of this fault occurring was dramatically raised when below
|
||||
the temperature range. This was a known and documented feature of a safety critical component
|
||||
and it was ignored in the safety analysis.
|
||||
|
||||
\paragraph{Bottom-up Approach}
|
||||
A bottom-up approach looked impractical at first due to the sheer number
|
||||
of component failure modes in a typical system.
|
||||
However were this bottom-up approach to be modular, (reducing the order of cross checking), and build a hierachy
|
||||
However, were this bottom-up approach to be modular (reducing the order of cross checking), and build a hierarchy
|
||||
of modules rising up until all components are covered, we
|
||||
can model an entire complex system.
|
||||
This is the core concept behind this study.
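
% Illustrative sketch only: the symbols $N$, $K$, $g$ and $C_{\mathrm{mod}}$ and the
% numbers used below are assumptions introduced for this example, not results of the study.
As a rough, hedged illustration of why modularity reduces the order of cross checking,
assume a system of $N$ components, each with $K$ failure modes, partitioned into
modules of $g$ components each. If every failure mode is checked only against the
other $g-1$ components of its own module, the first level of the hierarchy needs
\[
  C_{\mathrm{mod}} = \frac{N}{g} \, K \, g \, (g-1) = K \, N \, (g-1)
\]
checks. With the assumed values $N = 1000$, $K = 3$ and $g = 5$ this gives
$C_{\mathrm{mod}} = 3 \times 1000 \times 4 = 12\,000$, compared with
$K \, N \, (N-1) = 2\,997\,000$ when each failure mode is checked against every other
component in the system. Higher levels of the hierarchy add further checks, but under
the same assumptions each level contains fewer modules than the one below it.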
|
||||
@ -216,7 +217,7 @@ repeated within a SYSTEM
|
||||
|
||||
In general terms we can describe
|
||||
these circuitry sub-systems
|
||||
as collections of components or smaller sub-systesm, that interact to perform a given function.
|
||||
as collections of components or smaller sub-systems, that interact to perform a given function.
|
||||
We can call these collections {\fg}s.
|
||||
|
||||
|
||||
@ -239,7 +240,7 @@ failures occur, not know and feed incorrect data into our system.
|
||||
%
|
||||
Figure \ref{fig:millivolt} shows a typical industrial
|
||||
circuit to measure and amplify millivolt signals.
|
||||
It will detect a disconneted milli-volt source (the most common
|
||||
It will detect a disconnected millivolt source (the most common
|
||||
failure, and usually due to wiring faults) and some other internal component failures.
|
||||
It can, however, provide an incorrect (slightly low) reading if
|
||||
one of two resistors fails in particular ways.
|
||||
@ -254,12 +255,12 @@ of detecting `undetected failures' in safety critical product design.
|
||||
\paragraph{Multi-discipline} Most safety critical systems are composed of mechanical, electrical and
|
||||
computing elements. A tragic example of the mechanical and electrical elements
|
||||
interfacing to a computer is found in the THERAC-25 x-ray dosage machine.
|
||||
With no common notation to integrate the saftey analyis between the electrical/mechanical and computing
|
||||
With no common notation to integrate the safety analysis between the electrical/mechanical and computing
|
||||
domains, synchronisation errors occurred that were in some cases fatal.
|
||||
The interfacing between the hardware and software for the THERAC-25 was not considered
|
||||
in the design phase.
|
||||
Neil Storey, in the formal methods chapter of ``Safety-Critical Computer Systems'',
|
||||
describes the different formal languages suitable for hardward and software and
|
||||
describes the different formal languages suitable for hardware and software and
|
||||
bemoans the fact that no single language is suitable for such a broad range of tasks \cite{sccs}[pp. 287].
|
||||
|
||||
\paragraph{Requirements for a rigorous FMEA process}
|
||||
@ -276,7 +277,7 @@ a process of modularisation from the bottom~up.
|
||||
\begin{list}{$*$}{}
|
||||
\item The analysis process must be `bottom~up'
|
||||
\item The process must be modular and hierarchical
|
||||
\item The process must be multi-dicipline and must be able to represent hardware, electronics and software
|
||||
\item The process must be multi-discipline and must be able to represent hardware, electronics and software
|
||||
\end{list}
|
||||
|
||||
\section{Safety Critical Systems}
|
||||
@ -300,7 +301,7 @@ it has the potential to become dangerous\cite{sccs}.
|
||||
%Medical electronics for automatically dispensing drugs or maintaining
|
||||
%life support are examples of systems that lives depend upon.
|
||||
|
||||
\subsection{Two approaches : Probablistic, and Deterministic}
|
||||
\subsection{Two approaches : Probabilistic, and Deterministic}
|
||||
|
||||
There are two main philosophies applied to safety critical systems certification.
|
||||
\paragraph{Probabilistic safety Measures}
|
||||
@ -348,7 +349,7 @@ Ref chapter specifically on this but give an overview now
|
||||
|
||||
A modern industrial burner has mechanical, electronic and software
|
||||
elements, that are all safety critical. That is to say
|
||||
unhandled failures could create dangerous faults.
|
||||
unhandled failures could create dangerous faults.
|
||||
|
||||
%To add to these problems
|
||||
%Operators are often under pressure to keep them running. An boiler supplying
|
||||
@ -419,8 +420,8 @@ Automated systems, as opposed to manual ones are now the norm
|
||||
in the home and in industry.
|
||||
%
|
||||
Automated systems have long been recognised as being more efficient and
|
||||
more accurate than a human opperator, and the reason for automating a process
|
||||
can now be more likely to be cost savings due to better effeciency
|
||||
more accurate than a human operator, and the reason for automating a process
|
||||
can now be more likely to be cost savings due to better efficiency
|
||||
than not paying a salary to a human operator \ref{burnereffency}.
|
||||
%
|
||||
For instance
|
||||
@ -431,10 +432,10 @@ A typical control function could be the
|
||||
fuel air mixture profile curves over the firing range.
|
||||
%
|
||||
Because fuels vary slightly in calorific value, and air density changes with the weather, no fixed tuning can be optimal.
|
||||
In fact for asethetic reasons (not wanting smoke to appear at the flue)
|
||||
In fact for aesthetic reasons (not wanting smoke to appear at the flue)
|
||||
the tuning was often air rich, causing air to be heated and
|
||||
unnecessarily passed through the burner, leading to direct loss of energy.
|
||||
An automated system analysing the combustion gasses and automatically
|
||||
An automated system analysing the combustion gases and automatically
|
||||
adjusting the fuel air mix can get the efficiencies very close to theoretical levels.
|
||||
|
||||
|
||||
@ -469,7 +470,8 @@ can make horrendous mistakes. This means that simply reading sensors and applyin
|
||||
corrections cannot be enough.
|
||||
Checking for error conditions must also be incorporated.
|
||||
Equipment can also develop internal faults, and strategies
|
||||
must be in-pcae to recognise and cope with them.
|
||||
must be in-place to firstly recognise internal faults,
|
||||
and then cope with them in the safest possible way.
|
||||
|
||||
\begin{figure}[h]
|
||||
\centering
|
||||
@ -523,9 +525,9 @@ detect several other failure modes of this circuit and a full analysis is given
|
||||
\paragraph{Self Checking}
|
||||
This introduces a level of self checking into the system.
|
||||
Admittedly this is the simplest failure mode scenario (that the
|
||||
sensor is not wired correcly or has become disconnected).
|
||||
sensor is not wired correctly or has become disconnected).
|
||||
%
|
||||
This safety resisitor has a side effect, it also checks for internal errors
|
||||
This safety resistor has a side effect, it also checks for internal errors
|
||||
that could occur in this circuit.
|
||||
Should the input resistor $R22$ go OPEN this would be detected.
|
||||
Should the gain resistors $R30$ or $R26$ go OPEN or SHORT a fault condition will be detected.
|
||||
@ -533,7 +535,7 @@ Should the gain resistors $R30$ or $R26$ go OPEN or SHORT a fault condition will
|
||||
\paragraph{Not rigorous, but tested by time}
|
||||
This is a typical example of an industry standard circuit that has been
|
||||
thought through, and in practice works and detects most commonly encountered failure modes.
|
||||
But it is not rogorous: it does not take into account every failure
|
||||
But it is not rigorous: it does not take into account every failure
|
||||
mode of every component in it.
|
||||
|
||||
However it does lead on to an important concept of three main states of a safety critical system.
|
||||
@ -566,12 +568,12 @@ at the very least that single failures of hardware
|
||||
or software cannot
|
||||
create an unsafe condition in operational plant. Further to this
|
||||
a second fault introduced, must not cause an unsafe state, due
|
||||
to the combation of both faults.
|
||||
to the combination of both faults.
|
||||
\vskip 0.3cm
|
||||
This sounds like an entirely reasonable requirement. But to rigorously
|
||||
check the effect a particular component fault has on the system,
|
||||
we could check its effect on all other components.
|
||||
Should a diode in the powersupply fail in a particular way, by perhaps
|
||||
Should a diode in the power supply fail in a particular way, by perhaps
|
||||
introducing a ripple voltage, we should have to look at all components
|
||||
in the system to see how they will be affected.
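
% Illustrative sketch only: the symbols $N$, $K$ and $C_{1}$ and the numbers used
% below are assumptions introduced for this example, not results of the study.
As a rough, hedged illustration of the scale of such exhaustive checking, assume a
system of $N$ components, each with $K$ failure modes, where every failure mode is
checked against each of the other $N-1$ components. The number of checks for single
failures is then
\[
  C_{1} = K \, N \, (N-1) .
\]
With the assumed values $N = 1000$ and $K = 3$ this gives
$C_{1} = 3 \times 1000 \times 999 = 2\,997\,000$ checks, before any combination of
two simultaneous faults is considered.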
|
||||
|
||||
@ -634,16 +636,16 @@ A technique of modularising, or breaking down the problem is clearly necessary.
|
||||
|
||||
One question that anyone developing a safety critical analysis design tool
|
||||
could do well to answer, is how the methodology would cope with known previous disasters.
|
||||
The Challenger disaster is a good example, and was well documented and invistigated.
|
||||
The Challenger disaster is a good example, and was well documented and investigated.
|
||||
|
||||
The problem lay in a seal that had an operating temperature range.
|
||||
On the day of the launch the temperature of this seal was out of range.
|
||||
A bottom up safety approach would have revealed this as a fault.
|
||||
|
||||
The FTA in use by NASA and the US Nuclear regulatory commisssion
|
||||
allows for enviromental considerations such as temperature\cite{nasafta}\cite{nucfta}.
|
||||
The FTA in use by NASA and the US Nuclear Regulatory Commission
|
||||
allows for environmental considerations such as temperature\cite{nasafta}\cite{nucfta}.
|
||||
But because of the top down nature of the FTA technique, the safety designer must be aware of
|
||||
the environemtnal constraints of all component parts in order to use this correctly.
|
||||
the environmental constraints of all component parts in order to use this correctly.
|
||||
This element of FTA is discussed in \ref{surveysc}.
|
||||
|
||||
\subsection{Therac 25}
|
||||
@ -665,7 +667,7 @@ excluded the software \cite{safeware}[App. A].
|
||||
|
||||
\subsection{Problems with Natural Language}
|
||||
|
||||
Written natural language desciptions can not only be ambiguous or easy to misinterpret, it
|
||||
Not only can written natural language descriptions be ambiguous or easy to misinterpret, it
|
||||
is also not possible to apply mathematical checking to them.
|
||||
|
||||
A mathematical model on the other hand can be checked for
|
||||
@ -680,7 +682,7 @@ specifying systems, developed at Brighton and Kent university
|
||||
have been used and extended by this author to create a methodology
|
||||
for modelling complex safety critical systems, using diagrams.
|
||||
|
||||
This project uses a modified form of euler diagram used to represent propositional logic.
|
||||
This project uses a modified form of Euler diagram used to represent propositional logic.
|
||||
%The propositional logic is used to analyse system components.
|
||||
|
||||
|
||||
@ -693,7 +695,7 @@ failure modes.
|
||||
\subsection{Mechanical}
|
||||
Find refs
|
||||
\subsection{Software}
|
||||
Software must run on a microprocessor/microcontroller, and these devices have a known set of failure modes.
|
||||
Software must run on a microprocessor/micro-controller, and these devices have a known set of failure modes.
|
||||
The most common of these are RAM and ROM failures, but bugs in particular machine instructions
|
||||
can also exist.
|
||||
These can be checked for periodically.
|
||||
@ -715,7 +717,7 @@ temperature being the most typical. Very often what happens to the system outsid
|
||||
|
||||
\begin{itemize}
|
||||
\item To create a Bottom up FMEA technique that permits a connected hierarchy to be
|
||||
built representing the fault behaviour of a system.
|
||||
built representing the fault behaviour of a system.
|
||||
\item To create a procedure where no component failure mode can be accidentally ignored.
|
||||
\item To create a user friendly formal common visual notation to represent fault modes
|
||||
in Software, Electronic and Mechanical sub-systems.
|
||||
@ -727,7 +729,7 @@ highest abstract system 'top level'.
|
||||
\item To produce a software tool to aid in the drawing of diagrams and
|
||||
ensuring that all fault modes are addressed.
|
||||
\item To provide a data model that can be used as a source for deterministic and probabilistic failure mode analysis reports.
|
||||
\item To allow the possiblility of MTTF calculation for statistical
|
||||
\item To allow the possibility of MTTF calculation for statistical
|
||||
reliability/safety calculations.
|
||||
\end{itemize}
|
||||
|
||||
|