Edit on Friday afternoon
parent
148bc7cba9
commit
4a83c4e8e6
@ -169,7 +169,7 @@ the ability to model integrated hardware and software systems.
|
||||
% coverage of the combined FMEA techniques.
|
||||
|
||||
To demonstrate FMMD a small, but complete embedded system
|
||||
(including both software and hardware),
|
||||
(including both software and hardware)
|
||||
worked example is presented to show FMMD applied to an
|
||||
integrated electronics/software system.
|
||||
%, the industry standard
|
||||
@ -183,17 +183,23 @@ integrated electronics/software system.
|
||||
|
||||
FMEA stands for Failure Mode Effects Analysis.
|
||||
%
|
||||
All components used to build a system can fail, also
|
||||
All components used to build a system can fail; also
|
||||
they may fail in more than one way.
|
||||
The ways in which a component can fail are known as its {\fms}.
|
||||
|
||||
At its simplest FMEA means taking taking a {\fm} of a component and predicting
|
||||
At its simplest, FMEA means taking a {\fm} of a component and predicting
|
||||
what problems it may cause for the system it is part of.
|
||||
%
|
||||
One way an electronic component such as a resistor can fail, for instance, is if it were
|
||||
to go open circuit. It could be because it was not soldered on properly and fell off,
|
||||
to go open circuit.
|
||||
%
|
||||
This open circuit could be because it was not soldered on properly and fell off,
|
||||
it could have had an internal mechanical fault or it could have been destroyed/burnt~off by too much
|
||||
electrical current. The cause does not matter. The fact that it can fail by going open circuit does.
|
||||
electrical current.
|
||||
%
|
||||
The cause does not matter.
|
||||
%
|
||||
The fact that it can fail by going open circuit does.
|
||||
%
|
||||
This then is one of the {\fms} of a resistor, $OPEN$.
|
||||
%
|
||||
@ -223,7 +229,7 @@ This means looking at every component in the system, and for each of those compo
|
||||
examining all known failure modes in the context of the system that it is part of.
|
||||
%
|
||||
Various handbooks and international standards list common components and
|
||||
their know failure modes, often with accompanying statistics~\cite{en298, fmd91, mil1991}.
|
||||
their known failure modes often with accompanying statistics~\cite{en298, fmd91, mil1991}.
|
||||
|
||||
\subsection{Origins of FMEA techniques}
|
||||
%FMEA methodologies trace from the 1940's and were designed to
|
||||
@ -250,7 +256,7 @@ programmatic/software elements.
|
||||
Software generally sits on top of most modern safety critical control systems
|
||||
and defines their most important system-wide behaviour and communications.
|
||||
%
|
||||
A typical control system, be in in a car or a microwave oven in the kitchen
|
||||
A typical control system, be it in a car or a microwave oven in the kitchen
|
||||
will generally combine a micro-controller with electronics.
|
||||
It will form a hierarchy where low level electronics
|
||||
is implemented at the bottom, which prepares input/output (IO)
|
||||
@ -276,8 +282,7 @@ do not specify FMEA for software but instead essentially just specify good pract
|
||||
i.e. review processes and language feature constraints.
|
||||
%
|
||||
That is to say FMEA has no formal framework for following
|
||||
failure modes from low level hardware elements through into the software models.
|
||||
|
||||
failure modes from low level hardware elements through into software models.
|
||||
%
|
||||
This is a weakness.
|
||||
%
|
||||
@ -410,7 +415,7 @@ where the aim is to ensure that single component failures (at least) cannot
|
||||
cause unacceptable system level events~\cite{iec60812,boffin},
|
||||
|
||||
\item Failure Mode Effect Criticality Analysis (FMECA) is applied to determine the most potentially dangerous or damaging
|
||||
failure modes to fix, using FMEA in conjunction with severity and failure probability figures~\cite{fmeca,mil1991,fmd91},
|
||||
failure modes using FMEA in conjunction with severity and failure probability figures~\cite{fmeca,mil1991,fmd91},
|
||||
|
||||
\item Failure Mode Effects and Diagnostics Analysis (FMEDA) is FMEA performed to
|
||||
determine a statistical level of safety. This is a fairly standard FMEA but with statistical values attached to each component {\fm};
|
||||
@ -436,13 +441,11 @@ When analysing a failure mode of a component, it is reasonable to
|
||||
look at how the failure mode will affect the other components in the system and to put this then
|
||||
into the context of the system's behaviour.
|
||||
%
|
||||
Components may fail in several ways. European standard EN298~\cite{en298} gives the possible
|
||||
Components may fail in several ways. European standard EN298~\cite{en298} gives two possible
|
||||
failure modes for a resistor as $OPEN$ and $SHORT$ for instance.
|
||||
%
|
||||
The term $f$ is defined as the number of component failure modes for a given component.
|
||||
A system will have $N$ number of components.
|
||||
|
||||
|
||||
%A system will have $N$ number of components.
|
||||
|
||||
In the case of the resistor $f$ is two
|
||||
~\footnote{A resistor is assigned two failure modes by the European Burner standard EN298~\cite{en298}
|
||||
@ -498,7 +501,7 @@ the sum of these multiplications for all its components. % it contains.
|
||||
Take a hypothetical small system with say 100 components, with three failure modes per component,
|
||||
%this
|
||||
%would give an exhaustive reasoning distance for single failure analysis---of $3 \times 100 \times 99$.
|
||||
that means to for each {\fm} of every component, i.e. $3$ checks, would have to be made
|
||||
that means that, for every component, each of its $3$ {\fms} would have to be checked
|
||||
against 99 other components. There are 100 components in this hypothetical example, so
|
||||
for single failure analysis this means $3 \times 100 \times 99$ checks.
|
||||
%
|
||||
@ -547,7 +550,7 @@ of a failure mode with all other components in a system would have to be examine
|
||||
Or in other words, all possible failure scenarios considered.
|
||||
%
|
||||
%to do this completely (all failure modes against all components).
|
||||
This is represented in the equation below, %~\ref{eqn:fmea_state_exp},
|
||||
This is represented in equation~\ref{eqn:fmea_single} below, %~\ref{eqn:fmea_state_exp},
|
||||
where $N$ is the total number of components in the system, $RD_{single}$ is the reasoning~distance and
|
||||
$f$ is the number of failure modes per component:
|
||||
%
|
||||
@ -566,7 +569,7 @@ The hypothetical example described above gives $100 \times 99 \times 3 = 29,700
|
||||
|
||||
%%% SANITY CHECK.
|
||||
%%%
|
||||
When stating a general equation such as equation~\ref{eqn:fmea_single} it can be sanity checked
|
||||
When stating a general equation such as equation~\ref{eqn:fmea_single}, it can be sanity checked
|
||||
by thinking of common examples.
|
||||
For instance a simple amplifier circuit with a handful of components
|
||||
would have a low $RD_{single}$ count of potential failure-mode-to-component checks.
|
||||
@ -576,9 +579,9 @@ how it would react to well defined component failure modes.
|
||||
|
||||
For a larger circuit the problems of tracing side effects of the failure mode through the circuit
|
||||
mean that it is likely to be a far more complex task.
|
||||
|
||||
%
|
||||
The order $O(N^2)$ for FMEA complexity, for single failures, therefore agrees with experience.
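%
As a rough worked check, assume that equation~\ref{eqn:fmea_single} has the form
$RD_{single} = f \times N \times (N-1)$, which is the form implied by the hypothetical
100 component example above, and take $f=3$ throughout:
%
\[
RD_{single}(N=10) = 3 \times 10 \times 9 = 270, \qquad
RD_{single}(N=100) = 3 \times 100 \times 99 = 29{,}700 .
\]
%
Under that assumption, a tenfold increase in the component count gives a $110$-fold
increase in the number of checks, which is the quadratic growth the order $O(N^2)$ predicts.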
|
||||
|
||||
%
|
||||
In general terms, for a very simple small circuit, a better understanding of failure effects is expected,
|
||||
than for a very large system where there are more variables and potential {\fm} interactions.
|
||||
%
|
||||
@ -591,7 +594,7 @@ scenarios\footnote{Certain double failure scenarios are already legal
|
||||
requirements---The European Gas burner standard (EN298:2003~\cite{en298}) for instance---demands the checking of
|
||||
double failure scenarios (for burner lock-out scenarios).}
|
||||
%
|
||||
(two components failing within a given time frame) and the order becomes $O(N^3)$.
|
||||
(two components failing within a given time frame) the order becomes $O(N^3)$.
|
||||
Where $RD_{double}$ is the reasoning~distance for double failure scenarios:
|
||||
\begin{equation}
|
||||
\label{eqn:fmea_double}
|
||||
@ -620,7 +623,7 @@ Current FMEA methodologies cannot consider---for the reason of state explosion--
|
||||
%\fmmdglossSTATEEX
|
||||
%
|
||||
%Because for practical reasons,
|
||||
In practical terms XFMEA cannot be performed for anything other than a trivial system,
|
||||
In practical terms XFMEA cannot be performed for anything other than a trivial system; instead
|
||||
reliance is placed upon experts on the system under investigation
|
||||
to perform a meaningful analysis.
|
||||
%
|
||||
@ -632,7 +635,7 @@ these experts have to select the areas they see as most critical for detailed FM
|
||||
it is usually impossible, for reasons of time to perform the work,
|
||||
to action a detailed level of analysis on all component {\fms}
|
||||
on anything but a very small %hypothetical
|
||||
system.
|
||||
system (i.e. XFMEA).
|
||||
|
||||
% \subsection{Component Tolerance}
|
||||
%
|
||||
@ -732,7 +735,8 @@ The automotive industry, because of mass production, must make products that hav
|
||||
but must also be affordable.
|
||||
%
|
||||
This leads to specialist firms producing modules, such as automatic braking systems,
|
||||
that are bought in and assembled to make an auto-mobile.
|
||||
that are bought in and assembled % better word then assembled???? included???
|
||||
to make an automobile.
|
||||
%
|
||||
Performing failure analysis using the basic component single failure modes to
|
||||
system failure mapping, would thus be very difficult: this would require expert knowledge
|
||||
@ -745,7 +749,7 @@ of the design behaviour and component types used in each module.
|
||||
%
|
||||
Some modular FMEA techniques are starting to be used and specified, and are described below.
|
||||
|
||||
\paragraph{Automotive SIL (ASIL) --- modularisation of FMEDA}
|
||||
\paragraph{Automotive SIL (ASIL) --- modularisation of FMEDA.}
|
||||
%
|
||||
The EN61508 variant for automotive use, as defined in standard ISO~26262, is known as Automotive SIL (ASIL)~\cite{Kafka20122}.
|
||||
%
|
||||
@ -756,13 +760,13 @@ This allows automotive designers to use pre-certified modules in their designs
|
||||
and applies broad statistical guidelines to achieving particular safety levels by
|
||||
use of redundancy and automated diagnostics etc.
|
||||
%
|
||||
Note that the ASIL modules are given a relaibility rating which can be enhanced with redundancy.
|
||||
Note that the ASIL modules are given a reliability rating which can be enhanced with redundancy.
|
||||
It does not introduce traceable {\fm} reasoning in its hierarchy.
|
||||
%%
|
||||
%% IN SOFTWARE THIS WOULD BE TIGHTLY COUPLED AS OPPOSED TO LOOSELY COUPLED FUNCTIONS.
|
||||
|
||||
%
|
||||
\paragraph{Indenture levels --- modularisation of FMECA}
|
||||
\paragraph{Indenture levels --- modularisation of FMECA.}
|
||||
%
|
||||
The US military standard for FMECA~\cite{fmeca} describes a very broad modularity regime that
|
||||
it terms `indenture' levels.
|
||||
@ -776,18 +780,18 @@ an altitude radar: within that finer grained modules may be identified until
|
||||
the base components are listed.
|
||||
%
|
||||
Note that this is a top down approach to modularisation and
|
||||
this can introduce errors into the reliability calculations~\cite{MILSTD1629short}
|
||||
and miss-out some component failure modes.
|
||||
this can introduce errors into the reliability calculations
|
||||
by missing out some component failure modes~\cite{MILSTD1629short}.
|
||||
%
|
||||
|
||||
\paragraph{Integrated Circuits (ICs)}
|
||||
\paragraph{Integrated Circuits (ICs).}
|
||||
|
||||
Consider some commonly used ICs: an op-amp
|
||||
is a good example.
|
||||
%
|
||||
An op-amp will have a high internal component count.
|
||||
It is mainly a collection of transistors on a chip
|
||||
and is a complex circuit designed to give a very high and precise gain.
|
||||
and is a complex circuit designed to give a very high and precise differential gain.
|
||||
%These are made from several components including
|
||||
%ransistos, resistors capactors etc.
|
||||
In order to perform FMEA op-amps are given
|
||||
@ -851,7 +855,7 @@ and treat those sections as components in their own right.
|
||||
\subsection{The problem of Systems using software and FMEA}
|
||||
|
||||
Software systems are becoming part of everyday life.
|
||||
It is getting increasingly rarer to find systems where there is not a computer
|
||||
It is getting increasingly rare to find systems where there is not a computer
|
||||
controlling some part of them.
|
||||
All modern airliners are fly-by-wire. The throttle in a modern car is fly-by-wire.
|
||||
|
||||
@ -967,7 +971,7 @@ in an improved FMEA methodology,
|
||||
|
||||
\section{Proposed Methodology: Failure Mode Modular De-composition (FMMD)}
|
||||
|
||||
The basic concept behind FMMD is to from the bottom-up, modularise the problem.
|
||||
The basic concept behind FMMD is to, from the bottom-up, modularise the problem.
|
||||
|
||||
FMEA cannot easily be modularised from the top-down, because
|
||||
it has to deal with component failure modes.
|
||||
@ -996,7 +1000,7 @@ in the circuit, these modules can then be merged to form
|
||||
bigger modules until there is a hierarchy and one final module representing the whole system.
|
||||
|
||||
|
||||
\paragraph{Broadly FMMD is modularisation from the bottom-up of FMEA}
|
||||
\paragraph{Broadly FMMD is modularisation from the bottom-up of FMEA.}
|
||||
|
||||
Firstly modules are identified (for instance common circuitry formations such as amplifiers or digital outputs) and
|
||||
then failure mode analysis is performed on them.
|
||||
@ -1043,6 +1047,14 @@ They are then considered as higher level components with
|
||||
their own failure mode behaviour. These higher level components
|
||||
are then collected to form {\fgs} and so on until a hierarchy is built
|
||||
representing the entire system.
|
||||
%
|
||||
This means that failure modes can be traced through linking the
|
||||
{\fgs}. Consequently, the system level {\fms} can be traced back to
|
||||
the component {\fms} that can cause them.
|
||||
%
|
||||
This gives rigorous failure mode traceability through the model.
|
||||
|
||||
|
||||
%
|
||||
Any new static failure mode methodology must ensure that it
|
||||
represents all component failure modes and it therefore should be bottom-up,
|
||||
@ -1055,7 +1067,7 @@ bottom-level component failure modes would be handled/used.
|
||||
%
|
||||
Starting at the bottom means having to deal with each component failure mode from the beginning.
|
||||
|
||||
\section{The proposed Methodology: quick guide or `how~to'.}
|
||||
\subsection{The proposed Methodology: quick guide or `how~to'.}
|
||||
|
||||
An FMEA typically begins with a parts list and then from that a series
|
||||
of entries for each component failure mode.
|
||||
@ -1110,7 +1122,7 @@ can be found in~\cite{clark}.
|
||||
|
||||
FMMD is described in more detail in the section below.
|
||||
|
||||
\paragraph{FMMD process description}
|
||||
\subsection{FMMD process detailed description}
|
||||
|
||||
To ensure all component failure modes are modelled and traceable through stages of analysis, the new methodology must be bottom-up.
|
||||
%
|
||||
@ -1163,11 +1175,11 @@ access to frequency analysis of digital samples called the Fast Fourier Transfor
|
||||
This took the Discrete Fourier Transform (DFT), and applied de-composition to its
|
||||
mesh of (often repeated) complex number calculations~\cite{fpodsadsp}[Ch.8].}
|
||||
%
|
||||
By doing this it broke the computing order of complexity down from having a polynomial %n exponential
|
||||
By doing this it breaks the computing order of complexity down from having a polynomial %n exponential
|
||||
%order
|
||||
to logarithmic order~\cite{ctw}[pp.401-3].
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%FFT%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
It also means that modules are re-usable (analogous to software classes).
|
||||
It also means that {\fgs} are re-usable (analogous to software classes).
|
||||
%
|
||||
Where there are repeated sections of circuitry (as in for instance common types of interface)
|
||||
the analysis for that module may be simply re-used.
|
||||
@ -1177,7 +1189,7 @@ A practical example of a hardware FMEA performed both traditionally and using FM
|
||||
software and hardware hybrid example is analysed in~\cite{syssafe2012}
|
||||
and examples of `reasoning~distance' efficiency savings can be found in~\cite{clark}[Ch.7].
|
||||
%
|
||||
\paragraph{Integrating software into the FMMD model.}
|
||||
\subsection{Integrating software into the FMMD model.}
|
||||
%
|
||||
%With modular FMEA i.e. FMMD %(FMMD)
|
||||
%the concepts of failure~modes
|
||||
@ -1220,6 +1232,103 @@ For electrical and mechanical systems, although the original system designers
|
||||
concepts of modularity and sub-systems in design may provide guidance,
|
||||
applying FMMD means deciding on the members for {\fgs} and the subsequent hierarchy.
|
||||
|
||||
\paragraph{Contract Programming and FMMD.}
|
||||
%
|
||||
With electronic components, the literature points to suitable sets of
|
||||
{\fms}~\cite{fmd91}~\cite{mil1991}~\cite{en298}. %~\cite{en61508}~\cite{en298}.
|
||||
%
|
||||
With software only some library functions are well known and rigorously documented
|
||||
enough to have the equivalent of known failure modes;
|
||||
most software is `bespoke'.
|
||||
%
|
||||
A different strategy is required to
|
||||
describe the failure mode behaviour of software functions; %.
|
||||
concepts from contract programming can be used to assist in this. % here.
|
||||
|
||||
\subsection{Contract programming description}
|
||||
\fmmdglossCONTRACTPROG
|
||||
Contract programming~\cite{dbcbe} is a discipline for building software functions in a controlled
|
||||
and traceable way. Each function is subject to pre-conditions (constraints on its inputs),
|
||||
post-conditions (constraints on its outputs) and function-wide invariants (rules).
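%
As a sketch, in notation that is assumed here rather than taken from~\cite{dbcbe},
let a function $g$ take an input $x$, with pre-condition $P(x)$ and
post-condition $Q(x, g(x))$. The contract can then be read as:
%
\[
P(x) \Rightarrow Q(x, g(x)) ,
\]
%
that is, whenever the caller supplies an input satisfying the pre-condition,
the function undertakes to deliver a result satisfying the post-condition.
An invariant, written $I$ in this sketch, is a condition expected to hold both on entry to and on exit from the function.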
|
||||
|
||||
|
||||
\paragraph{Mapping contract `pre-condition' violations to component failure modes.}
|
||||
\fmmdglossCONTRACTPROG
|
||||
A pre-condition, or requirement, for a contract software function
|
||||
defines the correct ranges of input conditions for the function
|
||||
to operate successfully.
|
||||
%
|
||||
% C Garret said this was unclear so I have added the following two sentences.
|
||||
%
|
||||
%If we consider a software function to be a {\fg} in the FMMD sense, i.e.
|
||||
A software function is considered to be
|
||||
a collection of code, functions called and %values/
|
||||
variables used.
|
||||
%
|
||||
In this way it is similar to an electronic circuit, which is a collection
|
||||
of components connected in a specific way.
|
||||
%
|
||||
Using this analogy for software, the connections are the function's code, and the
|
||||
called functions/variables/inputs %and variables
|
||||
are the components.
|
||||
%
|
||||
Erroneous behaviour from called functions and variables/inputs has the same effect as component failure modes
|
||||
on an electronic {\fg}.
|
||||
%
|
||||
%
|
||||
If it is considered that %consider the
|
||||
called functions and variables/inputs are the components of a function,
|
||||
a modular and hierarchical failure mode model
|
||||
from existing software can be built.
|
||||
%
|
||||
Thus for FMMD applied to software, a violation of a pre-condition is considered to be equivalent to a failure mode of `one of its components'.
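%
For illustration only, consider a hypothetical function whose single input is a
voltage reading $v$, with the assumed pre-condition $V_{low} \leq v \leq V_{high}$
(these bounds are placeholders and are not values taken from the example system).
The pre-condition can be violated in two ways, giving two {\fms} for that input:
%
\[
v < V_{low} \quad \mbox{and} \quad v > V_{high} .
\]
%
In the analysis of the function these would play the same role as $OPEN$ and $SHORT$ do for a resistor.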
|
||||
%
|
||||
\paragraph{Mapping contract `post-condition' violations to symptoms.}
|
||||
%\fmmdglossCONTRACTPROG
|
||||
%
|
||||
A post-condition is a definition of correct behaviour of a function.
|
||||
%
|
||||
A violated post-condition is a symptom of failure, or, in FMMD terms, a derived failure mode, for a function.
|
||||
%
|
||||
Post-conditions could relate to either actions performed (i.e. the state of hardware changed) or an output value of a function.
|
||||
%
|
||||
In pure contract programming, a violation of a pre-condition would cause the function to \textbf{not} be executed.
|
||||
%
|
||||
In implementation code, a pre-condition violation should cause
|
||||
an error to be generated, and thus a post-condition to fail.
|
||||
%
|
||||
A function can fail for reasons other than corruption of its input data (i.e.
|
||||
failure caused by variables it uses or return values from functions it calls).
|
||||
%
|
||||
Variables can become corrupted, by radiation affecting RAM~\cite{5488118,5963919} or
|
||||
by another software function erroneously overwriting variables~\cite{swseatbelt}.
|
||||
%
|
||||
Current work on software FMEA generally focuses on mapping
|
||||
variable corruption to failure modes~\cite{procsfmea,procsfmeadb,sfmeaauto,sfmea}.
|
||||
However, errors other than variable corruption can occur.
|
||||
%
|
||||
For instance a microprocessor may have subtle bugs in its instruction set, or
|
||||
incorrectly handled
|
||||
interrupt contention~\cite{concurrency_c_tool} which could cause side effects in software.
|
||||
%
|
||||
For the failure mode model of any software function,
|
||||
it must be considered that all failure modes defined by post-condition
|
||||
violations could simply occur.
|
||||
%`components'.
|
||||
%
|
||||
\paragraph{Mapping contract `invariant' violations to symptoms and failure modes.}
|
||||
Invariants are conditions that are considered to be relied on throughout the execution of
|
||||
a program.
|
||||
%
|
||||
Here they are taken to mean invariants applying to data
|
||||
or conditions that the function under analysis deals with or could be affected by.
|
||||
%
|
||||
Invariants in contract programming may apply to inputs to the function (where violations can be considered {\fms} in FMMD terminology),
|
||||
and to outputs (where violations can be considered symptoms, or derived {\fms}, in FMMD terminology).
|
||||
%\fmmdglossCONTRACTPROG
|
||||
|
||||
|
||||
|
||||
|
||||
%
|
||||
\section{Example for analysis} % : How can we apply FMEA}
|
||||
@ -1240,7 +1349,7 @@ The software then applies a PID~\cite{dcods} algorithm to determine the length/m
|
||||
|
||||
|
||||
|
||||
\section{Closed Loop Control Hardware/Software Hybrid Example}
|
||||
\subsection{Closed Loop Control Hardware/Software Hybrid Example}
|
||||
|
||||
It is desirable to model a complete standalone system with FMMD,
|
||||
not only a standalone system, but ideally a hybrid software/hardware system.
|
||||
@ -1372,13 +1481,16 @@ functions should be called to control a process, or in `C' terms be the main fun
|
||||
Using figure~\ref{fig:contextsoftware} the transform bubble
|
||||
to represent the `main' or controlling function in the software must be chosen.
|
||||
%
|
||||
All software functions will be written in bold with a pair of brackets
|
||||
to distinguish them as such. The `C' main function is thus presented as \cf{main}.
|
||||
%
|
||||
This can be thought of as picking one bubble and holding it up.
|
||||
%
|
||||
The other bubbles hang underneath
|
||||
forming the software call tree hierarchy, see figure~\ref{fig:context_calltree}.
|
||||
%
|
||||
From examining the diagram, and in common with established embedded programming practice,
|
||||
this is clearly going to be the monitor function.
|
||||
this is clearly going to be the \cf{monitor} function.
|
||||
%
|
||||
\begin{figure}[h]
|
||||
\centering
|
||||
@ -1396,7 +1508,7 @@ The monitor function will orchestrate the control process.
|
||||
Firstly it will examine the timer value, and when appropriate, call the \cf{PID} function.
|
||||
%
|
||||
The \cf{PID} function calls \cf{determine\_set\_point\_error} which calls \cf{convert\_ADC\_to\_T}
|
||||
which in turn calls \cf{Read\_ADC} (the function developed in the earlier example)
|
||||
which in turn calls \cf{Read\_ADC} (a function developed and analysed using FMMD in~\cite{syssafe2012})
|
||||
which reads from hardware.
|
||||
%
|
||||
With the set point error value the \cf{PID} function will return an output control value to its calling
|
||||
@ -1548,7 +1660,7 @@ level in the hierarchy is found, the Pt100 sensor.
|
||||
Beginning at the bottom, a {\fg} is formed with
|
||||
the function \cf{read\_ADC} and the Pt100.
|
||||
This gives a {\dc}, %which we call
|
||||
`Read\_Pt100' (see appendix~\ref{sec:readPt100}).
|
||||
`Read\_Pt100'. % (see appendix~\ref{sec:readPt100}).
|
||||
%
|
||||
%
|
||||
%
|
||||
|