massive edit session in the afternoon after painting shelves.
parent
c3416b7fde
commit
74259c22c8
@ -159,7 +159,7 @@ the ability to model integrated hardware and software systems.
|
||||
% reaches conclusions about the effectiveness and failure mode
|
||||
% coverage of the combined FMEA techniques.
|
||||
|
||||
To demonstrate FMMDA a small, but complete embedded system
|
||||
To demonstrate FMMD a small, but complete embedded system
|
||||
(including both software and hardware),
|
||||
worked example is presented to show FMMD applied to an
|
||||
integrated electronics/software system.
|
||||
@ -172,6 +172,46 @@ integrated electronics/software system.
|
||||
|
||||
\section{Introduction}
|
||||
|
||||
FMEA stands for Failure Mode Effects Analysis.
|
||||
%
|
||||
All components used to build a system can fail.
|
||||
They may fail in more than one way.
|
||||
The ways in which a component can fail are known as its failure modes.
|
||||
|
||||
At its simplest, FMEA means taking a failure mode of a component and predicting
|
||||
what problems it may cause for the system it is part of.
|
||||
%
|
||||
One way a resistor can fail, for instance, is
to go open circuit. It may not have been soldered on properly and fallen off,
it may have had an internal mechanical fault, or it may have been burnt out by too much
electrical current. The cause does not matter; the fact that it can fail by going open circuit does.
|
||||
%
|
||||
This then is one of the failure modes of a resistor, $OPEN$.
|
||||
%
|
||||
For instance, an FMEA scenario could be a resistor in a system going $OPEN$. % circuit.
|
||||
%
|
||||
If the resistor was part of an amplifier in the circuit,
it could be predicted, say, that a particular reading,
as measured by the amplifier's output, would go outside of an expected
range.
|
||||
%
|
||||
The erroneous reading may cause the system to fail dangerously or may simply be detected and flagged
|
||||
as a fault.
|
||||
%
|
||||
The description of the outcome is at the discretion of the Engineer
|
||||
responsible for the FMEA report.
|
||||
|
||||
|
||||
The central concept of FMEA is that if all component failures are known,
|
||||
by analysing them the failure behaviour of a system can be determined.
|
||||
%
|
||||
This means looking at every component in the system, and for each of those components
|
||||
examining all known failure modes in the context of the system that it is in.
|
||||
%
|
||||
Various handbooks and international standards list common components and
|
||||
their known failure modes, often with accompanying statistics~\cite{en298, fmd91, mil1991}.
|
||||
|
||||
\subsection{Origins of FMEA techniques}
|
||||
%FMEA methodologies trace from the 1940's and were designed to
|
||||
%model simple electro-mechanical systems.
|
||||
%
|
||||
@ -192,8 +232,8 @@ software elements.
|
||||
Software generally sits on top of most modern safety critical control systems
|
||||
and defines their most important system wide behaviour and communications.
|
||||
%
|
||||
Currently standards that demand FMEA investigations for hardware(HFMEA) (e.g. EN298, EN61508),
|
||||
do not specify it for software but instead essentially just specify good practise,
|
||||
Currently, standards that demand FMEA investigations for hardware (HFMEA), e.g. EN298 and EN61508,
do not specify FMEA for software but instead essentially just specify good practice,
|
||||
i.e. review processes and language feature constraints.
|
||||
%
|
||||
That is to say FMEA has no formal framework for following
|
||||
@ -204,13 +244,31 @@ This is a weakness.
|
||||
%
|
||||
Where HFMEA % scientifically
|
||||
traces component {\fms}
|
||||
to resultant system failures, software until recently, has been left in a non-analytical
|
||||
limbo of best practises and constraints.
|
||||
to resultant system failures, the issue of software has, until recently, been ignored.
|
||||
Most %left in a non-analytical limbo
|
||||
standards that mention software do not have methodologies
|
||||
to apply FMEA; instead they prescribe best practices,
|
||||
defensive programming strategies, redundancy and constraints~\cite{en61508}.
|
||||
%
|
||||
Software FMEA has been proposed
|
||||
Software FMEA (SFMEA) has been proposed
|
||||
in several forms~\cite{modelsfmea,sfmea,procsfmeadb,sfmeaauto}.
|
||||
%
|
||||
However, SFMEA is always performed separately from HFMEA.
|
||||
|
||||
Some work has looked at the software/hardware interface~\cite{sfmeainterface}
|
||||
but in general SFMEA is performed separately from HFMEA.
|
||||
%
|
||||
What this means is that the FMEA analysis cannot guarantee to handle
|
||||
all possible failures from hardware.
|
||||
%
|
||||
The hardware may even flag that it has self-detected some kind of failure, but
|
||||
because the software and hardware analyses are separate
|
||||
there is no way the analysis process can guarantee the hardware error will be made known to the software.
|
||||
%
|
||||
This means that some hardware failure modes are unexpected and therefore
|
||||
un-handled by the software.
|
||||
This means the system can exhibit unpredictable and possibly dangerous behaviour which
|
||||
will not be picked up by the FMEA process.
|
||||
%
|
||||
%
|
||||
% This paper seeks to examine the effectiveness of current and proposed SFMEA
|
||||
% techniques, by analysing a simple hybrid hardware/software system,
|
||||
@ -329,20 +387,33 @@ Because of this a programmer is very unlikely to implement a simple bubble sort
|
||||
if the number of elements $N$ is predicted to be large.
|
||||
|
||||
To evaluate FMEA techniques a metric is required.
|
||||
|
||||
%
|
||||
When analysing a failure mode of a component, it is reasonable to
|
||||
look at how the failure mode will affect the other components in the system and to put this then
|
||||
into the context of the system's behaviour.
|
||||
%
|
||||
Components may fail in several ways. European standard EN298~\cite{en298} gives the possible
|
||||
failure modes for a resistor as $OPEN$ and $SHORT$ for instance.
|
||||
%
|
||||
The term $f$ is defined as the number of component failure modes for a given component.
|
||||
A system will have $N$ number of components.
|
||||
|
||||
For each component then there will be $f \times (N-1)$ other components that could be affected by
|
||||
it failing.
|
||||
|
||||
|
||||
In the case of the resistor $f$ is two
|
||||
~\footnote{A resistor is assigned two failure modes by the European Burner standard EN298~\cite{en298}
|
||||
as long as some specific safety precautions involving voltage and power ratings are kept.} : $OPEN$ or $SHORT$.
|
||||
$N$ is the number of components in the system.
|
||||
In order to check this single resistor's failure modes then,
it must be checked twice, for the condition $OPEN$ and then for the condition $SHORT$, potentially
against all other components in the system $(N-1)$.
|
||||
|
||||
For each component, then, there will be $f (N-1)$ checks to make against the other components that could be affected by
it failing.
|
||||
%
|
||||
By counting the number of checks to make, i.e. each failure mode against all other components in the system,
|
||||
a metric for for evaluating FMEA is defined.
|
||||
a metric for evaluating the maximum number of checks that need to be performed for an FMEA is defined.
|
||||
|
||||
|
||||
%
|
||||
This count of checks is defined as the `reasoning~distance', or in other words, the number of stages of logic and reasoning used
|
||||
@ -380,10 +451,11 @@ in the system.
|
||||
The exhaustive reasoning~distance for a system would be the
|
||||
sum of these multiplications for all its components. % it contains.
|
||||
%
|
||||
If a small system were to have say 100 components, with three failure modes per component, this
|
||||
would give an exhaustive reasoning distance. % ---for single failure analysis---of $3 \times 100 \times 99$.
|
||||
That means to for each {\fm} of every component, $3$, a check would have to be made
|
||||
against 99 other components. There are 100 components in this hypothetical example so
|
||||
Take a hypothetical small system with, say, 100 components and three failure modes per component;
|
||||
%this
|
||||
%would give an exhaustive reasoning distance for single failure analysis---of $3 \times 100 \times 99$.
|
||||
for each {\fm} of every component (three per component) a check would have to be made
against 99 other components. There are 100 components in this hypothetical example, so
for single failure analysis this means $3 \times 100 \times 99$ checks.
|
||||
%
|
||||
This concept of `reasoning~distance' provides a metric to examine
|
||||
@ -392,13 +464,14 @@ methodologies).
|
||||
%
|
||||
%\fmmdglossSTATEEX
|
||||
%
|
||||
A high reasoning distance, because it is a manual process performed by experperts, is both
|
||||
A high reasoning distance, because it is a manual process performed by experts, is
expensive in terms of both time and money.
|
||||
%
|
||||
It is apparent also that the shorter the reasoning distance, the more precisely theoretical examination
|
||||
can determine failure symptoms. A shorter reasoning distance therefore implies a higher quality of safety analysis.
|
||||
%
|
||||
For instance for a very simple small circuit, a better understanding of failure effects is expected,
|
||||
than for a very large system where there are more variables and potential {\fm} interactions.
|
||||
|
||||
|
||||
%
|
||||
%.... general concept... simple ideas about how complex a
|
||||
%failure analysis is the more modules and components are involved
|
||||
@ -446,6 +519,24 @@ to undertake XFMEA for single failures.
|
||||
%Even small systems have typically
|
||||
%100 components, and they typically have 3 or more failure modes each, which would give
|
||||
The hypothetical example described above gives $100 \times 99 \times 3 = 29{,}700$ as a reasoning~distance.
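
As a hedged restatement of that arithmetic, the count can be written as a sum over the $N$ components.
This sketch assumes, as the example does, that every component has the same number of failure modes $f$;
the index $c$ is introduced here only for illustration, and $RD_{single}$ is taken to denote the single failure reasoning~distance:
\[
RD_{single} = \sum_{c=1}^{N} f\,(N-1) = f\,N\,(N-1) = 3 \times 100 \times 99 = 29{,}700 .
\]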
|
||||
|
||||
%%% SANITY CHECK.
|
||||
%%%
|
||||
When stating a general equation such as equation~\ref{eqn:fmea_single} it can be sanity checked
|
||||
by thinking of common examples.
|
||||
For instance, a simple amplifier circuit with a handful of components
would have a low $RD_{single}$, i.e. a low count of potential failure mode to component checks.
|
||||
%
|
||||
From experience, with a simple amplifier circuit it is relatively easy to predict
|
||||
how it would react to well defined component failure modes.
|
||||
|
||||
For a larger circuit the problems of tracing side effects of the failure mode through the circuit
|
||||
mean that it is likely to be a far more complex task.
|
||||
|
||||
The order $O(N^2)$ for FMEA complexity, for single failures, therefore agrees with experience.
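
The quadratic order can be read off by expanding the count of checks.
The following is a sketch that assumes a fixed number of failure modes $f$ per component:
\[
f\,N\,(N-1) = f N^{2} - f N ,
\]
so for constant $f$ the $N^{2}$ term dominates as $N$ grows.
Doubling the hypothetical 100 component system to 200 components, for example, raises the count
from $3 \times 100 \times 99 = 29{,}700$ to $3 \times 200 \times 199 = 119{,}400$, roughly a fourfold increase.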
|
||||
|
||||
In general terms, for a very simple small circuit, a better understanding of failure effects is expected,
|
||||
than for a very large system where there are more variables and potential {\fm} interactions.
|
||||
%
|
||||
%\fmmdglossSTATEEX
|
||||
\paragraph{Exhaustive FMEA and double failure scenarios.}
|
||||
@ -469,6 +560,13 @@ $100 \times 99 \times 98 \times 9 = 8,731,800 $. % failure mode scenarios.
|
||||
%
|
||||
In practice there is an additional complication: that of
|
||||
the circuit topology changes that {\fms} can cause.
|
||||
Double failure analysis is usually only performed on sections
|
||||
of a system considered most critical, and often in the context of redundancy.
|
||||
%
|
||||
For a combustion controller, it is stated~\cite{en298} that there must be two separate
|
||||
fuel shut-off valves, that are controlled from different relays and wiring.
|
||||
%
|
||||
This is actually more an enforcement of redundancy than FMEA for `any~double~combination' of failure modes.
|
||||
|
||||
\paragraph{Reliance on experts for meaningful FMEA Analysis.}
|
||||
Current FMEA methodologies cannot consider---for the reason of state explosion---an exhaustive approach.
|
||||
@ -501,50 +599,104 @@ system.
|
||||
% An example of component tolerance considered for FMEA
|
||||
% is given in section~\ref{sec:resistortolerance}.
|
||||
|
||||
%\section{FMEA in current usage: Five variants}
|
||||
\section{FMEA in current usage: Four variants}
|
||||
|
||||
%\paragraph{Five main Variants of FMEA}
|
||||
\paragraph{Four main Variants of FMEA}
|
||||
\begin{itemize}
|
||||
%\item \textbf{PFMEA - Production} Emphasis on cost reduction and product improvement;
|
||||
\item \textbf{FMECA - Criticality} Emphasis on minimising the effect of critical systems failing~\cite{fmeca}; % Military/Space
|
||||
\item \textbf{FMEDA - Statistical Safety} Statistical analysis giving Safety Integrity Levels~\cite{en61508};
|
||||
\item \textbf{DFMEA - Design or Static/Theoretical} Approval of safety critical systems using FMEA and single or double failure prevention~\cite{en298};% EN298/EN230/UL1998
|
||||
\item \textbf{SFMEA - Software FMEA} --- Usage not enforced by most current standards~\cite{en298,en230,en61508}. %only used in highly critical systems at present.
|
||||
\end{itemize}
|
||||
% %\section{FMEA in current usage: Five variants}
|
||||
% \section{FMEA in current usage: Four variants}
|
||||
%
|
||||
% %\paragraph{Five main Variants of FMEA}
|
||||
% \paragraph{Four main Variants of FMEA}
|
||||
% \begin{itemize}
|
||||
% %\item \textbf{PFMEA - Production} Emphasis on cost reduction and product improvement;
|
||||
% \item \textbf{FMECA - Criticality} Emphasis on minimising the effect of critical systems failing~\cite{fmeca}; % Military/Space
|
||||
% \item \textbf{FMEDA - Statistical Safety} Statistical analysis giving Safety Integrity Levels~\cite{en61508};
|
||||
% \item \textbf{DFMEA - Design or Static/Theoretical} Approval of safety critical systems using FMEA and single or double failure prevention~\cite{en298};% EN298/EN230/UL1998
|
||||
% \item \textbf{SFMEA - Software FMEA} --- Usage not enforced by most current standards~\cite{en298,en230,en61508}. %only used in highly critical systems at present.
|
||||
% \end{itemize}
|
||||
|
||||
|
||||
\nocite{MILSTD1629short}
|
||||
|
||||
\subsection{FMEA and modularity.}
|
||||
\section{FMEA and modularity.}
|
||||
Because modern electronics has become more complex the number
|
||||
of basic components has risen dramatically.
|
||||
To add to this components used to fulfil common functions are often Integrated Circuits (ICs)..
|
||||
of basic components in a typical safety critical system has risen dramatically.
|
||||
%
|
||||
|
||||
%
|
||||
To add to this, components used to fulfil common functions are often Integrated Circuits (ICs).
|
||||
%
|
||||
Typical examples include voltage regulators, op-amps, micro-controllers~\cite{pic18f2523}, memory modules and
|
||||
protocol handlers~\cite{mcp2515}. To build any of these components from scratch would be very expensive and time consuming,
but these IC `components' have very high internal transistor counts, and each has its own unique
|
||||
failure mode behaviour.
|
||||
%
|
||||
Thus modern electronics has already become too large in scope to sensibly apply the paradigm of mapping
base component failure modes directly to system failures.
|
||||
|
||||
\paragraph{Modularity --- breaking large systems into manageable blocks}
|
||||
|
||||
When faced with complex systems, a typical way to make them
|
||||
manageable is to break them into sub-systems, and even sub-systems of sub-systems ad infinitum.
|
||||
|
||||
|
||||
|
||||
\paragraph{History of Modularisation in Software}
|
||||
%
|
||||
It is interesting to compare the development of FMEA methodologies with software.
|
||||
%
|
||||
Software faced a crisis in complexity in the 1960s, when the architecture of
the dominant computer language, FORTRAN~\cite{f77}, became a limiting factor.
|
||||
%
|
||||
Programs written in FORTRAN became clumsy when they became large.
|
||||
%
|
||||
All variables were global.
|
||||
%
|
||||
A misspelled variable could cause chaos.
|
||||
%
|
||||
Also it was often difficult to pull a function
|
||||
out of one program and place it in another if it used some of the global variables.
|
||||
|
||||
|
||||
|
||||
Newer computer languages were invented where modularity was encouraged.
|
||||
Instead of FORTRAN's global scope for variables, individual functions in a newer language like `C'
|
||||
started to have `local' variables. This meant that
|
||||
a programmer could take a function from a `C' program and
|
||||
use it in another one without complication.
|
||||
%
|
||||
Later languages implemented object orientation
|
||||
which grouped functions and data together into modules called classes, where
|
||||
even the internal local variables could be hidden from the
|
||||
programmer using the class.
|
||||
%
|
||||
Software expanded in complexity faster than electronics,
|
||||
and to cope with this software languages developed modularity (function call trees, classes and finally distributed processing mechanisms).
|
||||
%
|
||||
FMEA has, by necessity, started to include some modular features but none yet
|
||||
have defined mechanisms for ensuring that all failure modes
|
||||
from a module are considered in the analysis of the module(s)
|
||||
that incorporate it.
|
||||
|
||||
|
||||
|
||||
|
||||
\paragraph{Modularisation in safety analysis in the automotive industry.}
|
||||
|
||||
The automotive industry, because of mass production, must make products that have high safety integrity %that are very safe but
|
||||
% financial pressure keeps their products
|
||||
but must also be affordable.
|
||||
%
|
||||
This leads to specialist firms producing modules, such as automatic braking systems,
|
||||
that are bought in and assembled to make a auto-mobile.
|
||||
that are bought in and assembled to make an auto-mobile.
|
||||
%
|
||||
Performing failure analysis using the basic component single failure modes to
|
||||
system failure mapping, would be very difficult: this would require expert knowledge
|
||||
system failure mapping, would thus be very difficult: this would require expert knowledge
|
||||
of the design behaviour and component types used in each module.
|
||||
%%
|
||||
%Because modern systems have become more complex and now include software elements,
|
||||
%modularity
|
||||
%of some form (breaking the problem down into smaller sections),
|
||||
%has become necessary to break down the state explosion problems associated with FMEA.
|
||||
%
|
||||
Because modern systems have become more complex and now include software elements,
|
||||
modularity
|
||||
of some form (breaking the problem down into smaller sections),
|
||||
has become necessary to break down the state explosion problems associated with FMEA.
|
||||
%
|
||||
Some modular techniques are starting to be formalised, and are described below.
|
||||
Some modular FMEA techniques are starting to be used and specified, and are described below.
|
||||
|
||||
\paragraph{Automotive SIL (ASIL) --- modularisation of FMEDA}
|
||||
%
|
||||
@ -564,66 +716,55 @@ It does not introduce traceable {\fm} reasoning in its hierarchy.
|
||||
|
||||
%
|
||||
\paragraph{Indenture levels --- modularisation of FMECA}
|
||||
%
|
||||
The US military standard for FMECA~\cite{fmeca}, describes a very broad modularity regime, that
|
||||
it terms `indenture' levels. Indenture levels are arranged from the top down
|
||||
and identify finer and finer grained modules. For instance, an aircraft
|
||||
it terms `indenture' levels.
|
||||
%
|
||||
Indenture levels are arranged from the top down
|
||||
and identify finer and finer grained modules.
|
||||
%
|
||||
For instance, an aircraft
|
||||
may be the first indenture level, and the next may be an identifiable module such as
|
||||
an altitude radar: within that finer grained modules may be identified until
|
||||
the base components are listed. Note that this is a top down approach to modularisation and
|
||||
this can introduce errors into the reliability calculations~\cite{MILSTD1629short}.
|
||||
the base components are listed.
|
||||
%
|
||||
\paragraph{Modularisation in Software}
|
||||
Note that this is a top down approach to modularisation and
|
||||
this can introduce errors into the reliability calculations~\cite{MILSTD1629short}
|
||||
and miss out some component failure modes.
|
||||
%
|
||||
It is interesting to compare the development of FMEA methodologies with software.
|
||||
Software expanded in complexity faster than electronics,
|
||||
and to cope with this software languages developed modularity (function call trees, classes and finally distributed processing mechanisms).
|
||||
%
|
||||
FMEA has, by necessity, started to include some modular features but none yet
|
||||
have defined mechanisms for ensuring that all failure modes
|
||||
from a module must be considered in the analysis of the module(s)
|
||||
that incorporate it.
|
||||
|
||||
\paragraph{Top Down or Bottom-up?}
|
||||
% Because FMEA is a bottom up technique, applying a top down analysis (as in FMECAs indenture levels)
|
||||
% cannot guarantee to consider all component failure modes in the correct context.
|
||||
%
|
||||
% \paragraph{Top Down or Bottom-up?}
|
||||
% % Because FMEA is a bottom up technique, applying a top down analysis (as in FMECAs indenture levels)
|
||||
% % cannot guarantee to consider all component failure modes in the correct context.
|
||||
% % %
|
||||
% A top down approach (such as FTA) can miss~\cite{faa}[Ch.~9] individual failure modes of components,
|
||||
% especially where there are non-obvious or unexpected top-level failures.
|
||||
% %
|
||||
A top down approach (such as FTA) can miss~\cite[Ch.~9]{faa} individual failure modes of components,
|
||||
especially where there are non-obvious or unexpected top-level failures.
|
||||
% In order to ensure that every failure mode is considered, a bottom-up approach
|
||||
% including every base components {\fms} must be used.
|
||||
% %
|
||||
% Going back to the software analogy, the indenture levels of FMECA are similar to
|
||||
% a software call tree where the highest indenture levels would be leaf functions.
|
||||
% %
|
||||
% There is no equivalent of the software `class'.
|
||||
% %
|
||||
% In the real world however there are.
|
||||
% Off the shelf sensors can be purchased which communicate using standard protocols~\cite{Pfeiffer:2003:ENC:1199616}. % consider CANOpen standard sensors, these are%~\footnote{CANopen sensors...}
|
||||
% %modules connected by an industrial data bus.
|
||||
% %
|
||||
% These not only typically have electrical and mechanical
|
||||
% components, they have a firmware and communication bus aspects~\cite{canspec, caninauto}.
|
||||
% %
|
||||
% These type of modules combine hardware, electronics, software, communications
|
||||
% and distributed programming.
|
||||
% %
|
||||
% Current FMEA techniques struggle with software alone, and also, fail to integrate the analysis of hardware and software
|
||||
% systems~\cite{sfmea, embedsfmea, modelsfmea, sfmeaa}. %, sfmeainterface }.
|
||||
%
|
||||
In order to ensure that every failure mode is considered, a bottom-up approach
|
||||
including every base component's {\fms} must be used.
|
||||
%
|
||||
Going back to the software analogy, the indenture levels of FMECA are similar to
|
||||
a software call tree where the highest indenture levels would be leaf functions.
|
||||
%
|
||||
There is no equivalent of the software `class'.
|
||||
%
|
||||
In the real world, however, such equivalents exist.
|
||||
Off the shelf sensors can be purchased which communicate using standard protocols~\cite{Pfeiffer:2003:ENC:1199616}. % consider CANOpen standard sensors, these are%~\footnote{CANopen sensors...}
|
||||
%modules connected by an industrial data bus.
|
||||
%
|
||||
These typically have not only electrical and mechanical
components, but also firmware and communication bus aspects~\cite{canspec, caninauto}.
|
||||
%
|
||||
These types of module combine hardware, electronics, software, communications
|
||||
and distributed programming.
|
||||
%
|
||||
Current FMEA techniques struggle with software alone, and also fail to integrate the analysis of hardware and software
|
||||
systems~\cite{sfmea, embedsfmea, modelsfmea, sfmeaa}. %, sfmeainterface }.
|
||||
|
||||
|
||||
%
|
||||
\subsection{FMEA and software.}
|
||||
In addition to increasing complexity in electronics, modern control systems nearly always have a significant software/firmware element,
|
||||
and not being able to model software with current FMEA methodologies
|
||||
is a cause for criticism~\cite[Ch.~12]{safeware}.
|
||||
%
|
||||
Similar difficulties in integrating mechanical and electronic/software
|
||||
failure models are discussed in~\cite{SMR:SMR580}.
|
||||
%
|
||||
Currently standards that demand FMEA for hardware (e.g. EN298~\cite{en298}, EN61508~\cite{en61508}),
|
||||
do not specify it for software, but instead recommend computer architectures, good software practice,
|
||||
review processes and language feature constraints.
|
||||
|
||||
%
|
||||
|
||||
|
||||
@ -633,8 +774,23 @@ review processes and language feature constraints.
|
||||
|
||||
%FMMD is a modularisation of FMEA and can produce failure~mode models that can be used in
|
||||
%all the above variants of FMEA.
|
||||
\subsection{The problem of Systems using software and FMEA}
|
||||
|
||||
\paragraph{Current work on Software FMEA}
|
||||
Software systems are becoming part of everyday life.
|
||||
It is increasingly rare to find systems where there is not a computer
controlling some part of them.
|
||||
All modern airliners are fly-by-wire. The throttle in a modern car is fly-by-wire.
|
||||
|
||||
|
||||
Because software and hardware FMEAs are separate, tracing failure effects
|
||||
from hardware into software, or even ensuring that all predicted
|
||||
hardware failure modes have been handled in software is difficult.
|
||||
%
|
||||
This problem is recognised and work has been undertaken to
|
||||
begin to redress it.
|
||||
|
||||
|
||||
\paragraph{Current work on Software FMEA.}
|
||||
|
||||
SFMEA usually does not seek to integrate
|
||||
hardware and software models, but to perform
|
||||
@ -663,7 +819,7 @@ The main FMEA methodologies are all based on the concept of taking
|
||||
base component {\fms}, and translating them into system level events/failures~\cite{sfmea,sfmeaa}.
|
||||
%
|
||||
That is there is only one stage of reasoning between the low level component {\fm} and
|
||||
the system level symptom of failure.
|
||||
the system level symptom of failure; this leaves ample room for error.
|
||||
%
|
||||
In a complicated system, mapping a component failure mode to a system level failure
|
||||
will mean a long reasoning distance; that is to say the actions of the
|
||||
@ -671,7 +827,7 @@ failed component will have to be traced through
|
||||
several sub-systems, gauging its effects with and on other components.
|
||||
%
|
||||
With software at the higher levels of these sub-systems,
|
||||
we have yet another layer of complication.
|
||||
there is yet another layer of complication.
|
||||
%
|
||||
%In order to integrate software, %in a meaningful way
|
||||
%we need to re-think the
|
||||
@ -720,12 +876,12 @@ getting too complicated for meaningful analysis using FMEA.
|
||||
% has much higher component counts and more complex components than those in use when FMEA
|
||||
% was designed.
|
||||
%
|
||||
From the above defeciencies, a wish list for a better FMEA is presented, stating the features that should exist
|
||||
From the above deficiencies, a wish list for a better FMEA is presented, stating the features that should exist
|
||||
in an improved FMEA methodology,
|
||||
\begin{itemize}
|
||||
\item Must be able to analyse hybrid software/hardware systems,
|
||||
\item avoid state explosion (i.e. XFMEA is impractical by hand~\cite{cbds}),
|
||||
\item encourage exhaustive checking within each modular, %(total failure coverage within {\fgs} all interacting component and failure modes checked),
|
||||
\item encourage exhaustive checking within each module, %(total failure coverage within {\fgs} all interacting component and failure modes checked),
|
||||
\item traceable reasoning inherent in system failure models,% to aid repeatability and checking,
|
||||
\item re-usable i.e. it should be possible to re-use analysis,
|
||||
\item possibility to analyse simultaneous/multiple failures,
|
||||
@ -737,6 +893,61 @@ in an improved FMEA methodology,
|
||||
|
||||
\section{Proposed Methodology: Failure Mode Modular De-composition (FMMD)}
|
||||
|
||||
The basic concept behind FMMD is to modularise the problem from the bottom up.
|
||||
|
||||
FMEA cannot easily be modularised from the top-down, because
|
||||
it has to deal with component failure modes.
|
||||
|
||||
It may seem a bit counter-intuitive, but this means that if FMEA is to be modularised
|
||||
it must be done from the bottom up.
|
||||
This may seem like a strange idea, but consider how an engineer would look
|
||||
at an electronic circuit/schematic.
|
||||
%
|
||||
The Engineer might, for instance, trace an input signal
|
||||
into some other components following a connection on the schematic.
|
||||
%
|
||||
The Engineer would typically then, following signal paths, try to figure out what
|
||||
those components did.
|
||||
%
|
||||
For instance were it an amplifier, the engineer would
|
||||
recognise the electronic configuration,
|
||||
and maybe get a calculator out and calculate its gain
|
||||
or some other feature, by looking at the other components connected to it.
|
||||
%
|
||||
This is a form of modularisation from the bottom-up.
|
||||
%
|
||||
The Engineer has identified a module, an input amplifier.
|
||||
|
||||
|
||||
\paragraph{Broadly FMMD is modularisation from the bottom-up of FMEA}
|
||||
|
||||
Firstly modules are identified (for instance common circuitry formations such as amplifiers or digital outputs) and
|
||||
then failure mode analysis is performed on them.
|
||||
%
|
||||
By analysing this small group of components as a module
|
||||
the ways in which the module can fail can be listed.
|
||||
%
|
||||
This will give a set of symptoms of failure for the module.
|
||||
%
|
||||
When the lower levels have been analysed, modules can be brought
|
||||
together to form larger modules using the lower ones as though they were
|
||||
components.
|
||||
%
|
||||
These modules can be brought together to form even larger modules.
|
||||
|
||||
Eventually there is one large module which represents the entire system.
|
||||
|
||||
Because the terms module and sub-system are quite general terms, and possibly over-used,
|
||||
a new term has been used to take their place in FMMD.
|
||||
%
|
||||
This is the `functional~group'.
|
||||
%
|
||||
Quite simply when identifying a group of components that perform a particular task
|
||||
the term `functional~group' describes it as a group that performs a function.
|
||||
%
|
||||
It also means that a functional~group can contain other functional~groups without
dragging along the semantic baggage that comes with the terms `module' and `sub-system'.
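
A purely illustrative count, not taken from any analysis in this paper, suggests why such a hierarchy shortens the reasoning~distance.
Assume that the hypothetical system of 100 components with three failure modes each can be divided into functional~groups of five,
that each functional~group is found to have three symptoms of failure, and that groups of five (or fewer) are formed again at each higher level.
Counting $f\,n\,(n-1)$ checks within a group of $n$ members gives
\[
\underbrace{20 \times (3 \times 5 \times 4)}_{\mbox{level 1}} +
\underbrace{4 \times (3 \times 5 \times 4)}_{\mbox{level 2}} +
\underbrace{1 \times (3 \times 4 \times 3)}_{\mbox{level 3}}
= 1200 + 240 + 36 = 1476 ,
\]
compared with $3 \times 100 \times 99 = 29{,}700$ checks when every failure mode is checked against every other component in the system.
The group sizes and symptom counts are assumptions chosen for round numbers; real functional~groups will vary.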
|
||||
|
||||
|
||||
\section{The proposed Methodology}
|
||||
\label{fmmdproc}
|
||||
@ -747,18 +958,27 @@ work together to perform a given function: the failure modes of the components
|
||||
are analysed, and a failure mode behaviour for the group determined: this group
|
||||
can now be used as a component in its own right with a set of its own failure modes.
|
||||
%
|
||||
% In essence, this methodology beginning with low level modules (or {\fgs})
|
||||
% which are analysed and assigned a failure mode behaviour.
|
||||
% They are then considered as higher level components with
|
||||
% their own failure mode behaviour. These higher level components
|
||||
% are then collected to form {\fgs} and so on until a hierarchy is built
|
||||
% representing the entire system.
|
||||
In essence, this methodology begins with low level modules (or {\fgs})
|
||||
which are analysed and assigned a failure mode behaviour.
|
||||
They are then considered as higher level components with
|
||||
their own failure mode behaviour. These higher level components
|
||||
are then collected to form {\fgs} and so on until a hierarchy is built
|
||||
representing the entire system.
|
||||
%
|
||||
% Any new static failure mode methodology must ensure that it
|
||||
% represents all component failure modes and it therefore should be bottom-up,
|
||||
% starting with individual component failure modes.
|
||||
Any new static failure mode methodology must ensure that it
|
||||
represents all component failure modes and it therefore should be bottom-up,
|
||||
starting with individual component failure modes.
|
||||
%
|
||||
That way, all component failure modes must be considered.
|
||||
%
|
||||
If you modularise from the top down, it does not naturally follow that
|
||||
bottom-level component failure modes would be handled/used.
|
||||
%
|
||||
Starting at the bottom means having to deal with each component failure mode from the beginning.
|
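
The bottom-up step described above can be sketched in a few lines of Python.
This is only an illustration under stated assumptions, not the tool used for the analyses in this paper:
the potential divider, the failure mode names and the helper function are all hypothetical.
Every component failure mode in the functional~group must map to a symptom,
and the set of symptoms becomes the failure mode set of the derived component.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """A base or derived component with a named set of failure modes."""
    name: str
    failure_modes: frozenset


def derive_component(name, group, symptom_of):
    """Collapse a functional group into a derived component.

    `group` is a list of Components; `symptom_of` maps each
    (component name, failure mode) pair to a symptom of the group.
    A missing pair raises ValueError, so no failure mode is skipped.
    """
    symptoms = set()
    for comp in group:
        for mode in sorted(comp.failure_modes):
            key = (comp.name, mode)
            if key not in symptom_of:
                raise ValueError(f"unhandled failure mode: {key}")
            symptoms.add(symptom_of[key])
    return Component(name, frozenset(symptoms))


# Hypothetical functional group: a two-resistor potential divider,
# R1 to the supply rail and R2 to ground (an assumption for illustration).
r1 = Component("R1", frozenset({"OPEN", "SHORT"}))
r2 = Component("R2", frozenset({"OPEN", "SHORT"}))
divider = derive_component(
    "PD",
    [r1, r2],
    {
        ("R1", "OPEN"): "LOW",
        ("R1", "SHORT"): "HIGH",
        ("R2", "OPEN"): "HIGH",
        ("R2", "SHORT"): "LOW",
    },
)
```

The derived component can itself be placed in a higher level functional~group,
so repeating the same step builds the hierarchy up to the whole system.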
||||
|
||||
|
||||
\paragraph{FMMD process.}
|
||||
|
||||
To ensure all component failure modes are modelled and traceable through stages of analysis, the new methodology must be bottom-up.
|
||||
%
|
||||
%This seems essential to satisfy criterion 2.
|
||||
@ -824,7 +1044,7 @@ A practical example of a hardware FMEA performed both traditionally and using FM
|
||||
software and hardware hybrid example is analysed in~\cite{syssafe2012}
|
||||
and examples of `reasoning~distance' efficiency savings can be found in~\cite{clark}[Ch.7].
|
||||
%
|
||||
\paragraph{Integrating software into the FMMD model}
|
||||
\paragraph{Integrating software into the FMMD model.}
|
||||
%
|
||||
%With modular FMEA i.e. FMMD %(FMMD)
|
||||
%the concepts of failure~modes
|
||||
@ -881,7 +1101,7 @@ The software reads the temperature from the sensor and applies checks
|
||||
to detect any failures.
|
||||
The software then applies a PID~\cite{dcods} algorithm to determine the length/modulation of the pulses applied to the heater.
|
||||
|
||||
yourdon context diagram here
|
||||
%yourdon context diagram here
|
||||
|
||||
|
||||
|
||||
@ -920,6 +1140,10 @@ as a complete example of an electronic/hardware hybrid analysed using FMMD. %wou
|
||||
When designing a computer program it is often useful to
|
||||
start with a system overview.
|
||||
A structured analysis `Yourdon' context diagram~\cite{Yourdon:1989:MSA:62004} is presented below, see figure~\ref{fig:context_diagram_PID}.
|
||||
A Yourdon context diagram shows an overview of a system, with the data inputs and data outputs.
|
||||
The circle in the middle defines the processing applied to those inputs and outputs.
|
||||
The context diagram can be later refined by introducing more circles with data paths between them.
|
||||
|
||||
%
|
||||
\begin{figure}[h]
|
||||
\centering
|
||||
@ -932,7 +1156,7 @@ A structured analysis `Yourdon' context diagram~\cite{Yourdon:1989:MSA:62004} is
|
||||
Using figure~\ref{fig:context_diagram_PID} the system in terms of its data flow is reviewed, starting
|
||||
with the data sources (the Pt100 temperature sensor inputs) and the data sinks (the heater output and the LED indicators).
|
||||
%
|
||||
There are two voltage inputs (see section~\ref{sec:Pt100}) from the Pt100 temperature sensor.
|
||||
There are two voltage inputs (see section~\ref{clark}[5]) from the Pt100 temperature sensor.
|
||||
%
|
||||
For the Pt100 sensor, the voltages it outputs are read and %for
|
||||
this requires an ADC and MUX.
|
||||
@ -987,7 +1211,9 @@ controlled system that does not use a traditional operating system. These are ge
|
||||
coded in `C' or assembly language and run immediately from power-up.}
|
||||
software architectures, a rudimentary operating system is required, often referred to as the `monitor'.
|
||||
%
|
||||
PID, because the algorithm depends heavily on integral calculus~\cite{dcods}[Ch.3.3] is time sensitive
|
||||
The `monitor' function calls the PID function at a regular and precise interval.
|
||||
%
|
||||
Because its algorithm depends heavily on integral calculus~\cite{dcods}[Ch.3.3], the PID function is time sensitive
|
||||
and it is necessary to execute it at precise intervals determined by its proportional, integral and derivative (PID) coefficients.
|
||||
%
|
||||
Most micro-controllers feature several general purpose timers~\cite{pic18f2523}.
|
||||
@ -997,7 +1223,7 @@ to call the PID algorithm at a regular and precise time interval. % specified in
|
||||
%
|
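
The time sensitivity of the PID function can be illustrated with a minimal discrete-time sketch.
It is written in Python for brevity rather than as firmware, and the class, gains and sample interval are hypothetical values chosen for illustration.
The integral and derivative terms are both scaled by the interval, so the output is only meaningful
if the caller (here standing in for the `monitor') invokes the step at that same fixed interval.

```python
class PID:
    """Textbook discrete PID controller with a fixed sample interval."""

    def __init__(self, kp, ki, kd, dt):
        self.kp = kp          # proportional gain
        self.ki = ki          # integral gain
        self.kd = kd          # derivative gain
        self.dt = dt          # assumed interval between calls, in seconds
        self.integral = 0.0   # running sum of error * dt
        self.prev_error = 0.0

    def step(self, setpoint, measured):
        """Compute one control output; must be called once every dt seconds."""
        error = setpoint - measured
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return (self.kp * error
                + self.ki * self.integral
                + self.kd * derivative)


# Hypothetical gains and a 100 ms interval, for illustration only.
pid = PID(kp=2.0, ki=0.5, kd=0.0, dt=0.1)
first = pid.step(setpoint=10.0, measured=8.0)
```

If the step were called late or early, the accumulated integral would no longer correspond to elapsed time,
which is why the call is driven from a hardware timer.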
||||
\paragraph{Data flow model to programmatic call tree.}
|
||||
The Yourdon methodology also gives guidance as to which software
|
||||
functions should be called to control the process, or in `C' terms be the main function.
|
||||
functions should be called to control a process, or in `C' terms be the main function.
|
||||
%
|
||||
\begin{figure}[h]
|
||||
\centering
|
||||
@ -1058,10 +1284,10 @@ These are listed, and from the bottom-up, FMMD analysis is begun.
|
||||
To summarise from the design stage,
|
||||
the electronic components identified thus far:
|
||||
\begin{itemize}
|
||||
\item ADCMUX --- Electronics, analysed in previous example,
|
||||
\item ADCMUX --- Internal micro controller multiplexer and analogue to digital converter,
|
||||
\item TIMER --- Internal micro controller timer,
|
||||
\item HEATER --- Heating element, essentially a resistor,
|
||||
\item Pt100 --- Pt100 Temperature sensor, as analysed in section~\ref{sec:Pt100},
|
||||
\item Pt100 --- Pt100 Temperature sensor,
|
||||
\item PWM --- Internal micro controller pulse width modulation module,
|
||||
\item General Purpose I/O (GPIO) --- I/O used to drive LEDs, %. %source LED current
|
||||
\item LEDs --- Indication LEDs via GPIO,
|
||||
@ -1070,6 +1296,11 @@ the electronic components identified thus far:
|
||||
%
|
||||
\subsection{Temperature Controller Hardware Elements FMMD.}
|
||||
%
|
||||
|
||||
NEED BETTER REFS HERE FOR THE
|
||||
SOURCES FOR THE FAILURE MODES OF COMPONENTS>
|
||||
|
||||
|
||||
\paragraph{ADCMUX and Read\_ADC.}
|
||||
We re-use the {\dc} from section~\ref{readADC}.
|
||||
$$ fm(RADC) = \{ VV\_ERR, HIGH, LOW \} .$$
|
||||
|