print out-redpen-edit.... now a nice
cycle ride to prince regent-a 2000m swim and then maybe a 6 inch sub with tuna...
This commit is contained in:
parent
05f96697e7
commit
ec7dc38679
11
mybib.bib
@ -860,6 +860,17 @@ strength of materials, the causes of boiler explosions",
|
||||
biburl="http://www.isa.org/InTechTemplate.cfm?template=/ContentManagement/ContentDisplay.cfm\&ContentID=77994",
|
||||
}
|
||||
|
||||
@INPROCEEDINGS{patterns6113886,
|
||||
author={Lopatkin, I. and Iliasov, A. and Romanovsky, A. and Prokhorova, Y. and Troubitsyna, E.},
|
||||
booktitle={High-Assurance Systems Engineering (HASE), 2011 IEEE 13th International Symposium on},
|
||||
title={Patterns for Representing FMEA in Formal Specification of Control Systems},
|
||||
year={2011},
|
||||
pages={146-151},
|
||||
keywords={control engineering computing;control systems;failure analysis;formal specification;program diagnostics;system recovery;effects analysis;error detection;error recovery;failure modes;formal event-B specification;formal system development;inductive safety analysis;requirement tracing;sluice control system;Computational modeling;Logic gates;Safety;Sensor systems;Switches;Event-B;FMEA;control systems;formal specification;patterns;safety},
|
||||
doi={10.1109/HASE.2011.10},
|
||||
ISSN={1530-2059},}
|
||||
|
||||
|
||||
|
||||
@PHDTHESIS{garrett,
|
||||
AUTHOR = "Chris Garrett",
|
||||
|
@ -128,7 +128,7 @@ This means that for each {\cb} node there are at least two hardware software int
|
||||
Because of this it is virtually impossible to apply meaningful traditional FMEA methodologies to
|
||||
{\cb} systems.
|
||||
%
|
||||
This paper firstly highlights the limitations with traditonal FMEA,
|
||||
This paper firstly highlights the limitations with traditional FMEA,
|
||||
and then describes a new modularised variant, Failure Mode Modular De-composition
|
||||
which addresses the problems of applying FMEA to software/hardware hybrid systems.
|
||||
%The paper first discussed work performed on software FMEA, and then shows the need
|
||||
|
@ -36,7 +36,7 @@ defined at the start of this chapter.
|
||||
The act
|
||||
of defining relationships between the data objects
|
||||
in FMEA raises questions about the nature of the process
|
||||
and allow us to analytically discuss its strengths and weaknesses.
|
||||
and allows us to analytically discuss its strengths and weaknesses.
|
||||
|
||||
|
||||
|
||||
@ -176,7 +176,8 @@ component types, but does not detail specific failure modes.
|
||||
Using MIL1991 in conjunction with FMD-91 we can determine statistics for the failure modes
|
||||
of component types.
|
||||
%
|
||||
The FMEDA process from European standard EN61508~\cite{en61508}
|
||||
The FMEA variant\footnote{EN61508 (and related standards) are based on the FMEA variant Failure Mode Effects and Diagnostic Analysis (FMEDA)}
|
||||
used for European standard EN61508~\cite{en61508}
|
||||
requires statistics for Mean Time to Failure (MTTF) for all {\bc} failure modes.
|
||||
|
||||
|
||||
@ -468,7 +469,7 @@ that we got from FMD-91, listed in equation~\ref{eqn:opampfms}.
|
||||
\end{table}
|
||||
|
||||
|
||||
%\clearpage
|
||||
\clearpage
|
||||
|
||||
\subsubsection{Failure modes of an Op-Amp}
|
||||
|
||||
@ -515,11 +516,6 @@ component {\fms} in FMEA or FMMD and require interpretation.
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
\clearpage
|
||||
|
||||
|
||||
%%
|
||||
%% Paragraph using failure modes to build from bottom up
|
||||
%%
|
||||
@ -662,11 +658,11 @@ echoing diagnostic/fault~finding methods~\cite{garrett, maikowski}. % loebowski}
|
||||
%
|
||||
When fault finding, we generally follow the signal path checking for correct behaviour
|
||||
along it: when we find something out of place we zoom in and measure
|
||||
the circuit behaviour until we find a faulty component or module.
|
||||
the circuit behaviour until we find a faulty component or module~\cite{garrett}.
|
||||
%
|
||||
With this style of fault finding, because it is based on experiment,
|
||||
we can hop from module to module eliminating working modules, until we find the
|
||||
failure.
|
||||
failure~\cite{maikowski}.
|
||||
%
|
||||
The rationale and work-culture of those tasked to
|
||||
perform FMEA are generally personnel who have performed fault finding.
|
||||
@ -706,7 +702,7 @@ Also, whether following the effects through the signal path {\em only} is accept
|
||||
would looking at its effect on all other components in the system be necessary.
|
||||
%is a matter for debate.
|
||||
%
|
||||
In practise, it is a compromise between the amount of time/money that can be spent
|
||||
In practice, a compromise is made between the amount of time/money that can be spent
|
||||
on analysis relative to the criticality of the project.
|
||||
Metrics from measuring the amount of work to undertake for FMEA are examined in section~\ref{sec:xfmea}.
|
||||
|
||||
@ -717,7 +713,7 @@ change the circuit topology. For a single failure
|
||||
this effect may cause additional complications for the analyst.
|
||||
For multiple failures this means
|
||||
that the analyst
|
||||
will have to deal altered---or changed circuit topologies---
|
||||
will have to deal with altered---or changed---circuit topologies
|
||||
of the electronic circuit for each analysis.
|
||||
|
||||
|
||||
@ -776,13 +772,29 @@ did not link this failure to the catastrophic failure of the spacecraft~\cite{ch
|
||||
This was not a failure in the objective reasoning, but more of the subjective, or the context in which the leak occurred.
|
||||
%
|
||||
What this means is that for an objectively calculated failure mode outcome, we may have
|
||||
more than one subjective outcome definition for it.
|
||||
more than one subjective outcome. %, or definition, for it.
|
||||
%
|
||||
|
||||
This means that objective reasoning can be applied to determine objective effects, but the criticality---or the seriousness/consequences---
|
||||
of those failures depends upon the Equipment Under Control (EUC)
|
||||
and its environment.
|
||||
%
|
||||
For instance, a leak of nuclear material aboard a spacecraft could have the consequence
|
||||
of loss of mission, but a leak on earth could have serious health and environmental consequences.
|
||||
This means that one line of FMECA describing a system risk is an oversimplification (consider that the same
|
||||
nuclear material will be present during transport and launch, and when outside earth's environment).
|
||||
%
|
||||
Subjective appraisal of the outcome of a system failure mode can also
|
||||
be subject to management and/or political pressure.
|
||||
|
||||
|
||||
\paragraph{Multiple Simultaneous Failure Modes}
|
||||
%
|
||||
FMEA is less useful for determining events for multiple
|
||||
simultaneous
|
||||
failures\footnote{Multiple simultaneous failures are taken to mean failures that occur within the same detection period.}.
|
||||
failures\footnote{Multiple simultaneous failures are taken to mean failures that occur within the same detection period.
|
||||
Detection periods are typically determined for the process under control. For a flame in an industrial burner this
|
||||
could typically be one second~\cite{en298}.}.
|
||||
%
|
||||
Work has been performed using component failure statistics to
|
||||
offer the more likely multiple failures~\cite{FMEAmultiple653556} for analysis.
|
||||
@ -806,14 +818,18 @@ meaning the additional failures might have to be analysed with respect to the ch
|
||||
|
||||
|
||||
\paragraph{Failure modes and their observability criterion: detectable and undetectable.}
|
||||
\label{sec:detectable}
|
||||
Often the effects of a failure mode may be easy to detect,
|
||||
and our equipment can react by raising an alarm or compensating for the resulting fault.
|
||||
%
|
||||
Some failure modes may cause undetectable failures, for instance a component that causes
|
||||
a measured reading to change could have adverse consequences yet not be flagged as a failure.
|
||||
%
|
||||
This type of failure would not be flagged as a failure by the system, because
|
||||
it has no way of knowing the reading is invalid.
|
||||
This type of failure %
|
||||
%would not be flagged as a failure by the system, because
|
||||
cannot be dealt with by passing an error indication to higher level modules
|
||||
because we cannot detect it. The system therefore
|
||||
has no way of knowing the reading is invalid.
|
||||
%
|
||||
The term observable has a specific meaning in the field of control engineering~\cite{721666, ACS:ACS1297};
|
||||
systems submitted for FMEA are generally related to control systems,
|
||||
@ -893,11 +909,13 @@ methodologies.
|
||||
%{sfmeaforwardbackward}
|
||||
\subsection{FMEA and the State Explosion Problem}
|
||||
\label{sec:xfmea}
|
||||
\paragraph{Exhaustive Single Failure FMEA.}
|
||||
\paragraph{Problem of which components to check for a given {\bc} {\fm}.}
|
||||
|
||||
FMEA for a safety critical certification~\cite{en298,en61508} will have to be applied
|
||||
to all known failure modes of all components within a system.
|
||||
%
|
||||
Each one of these, in a typical report, would be one line of a spreadsheet entry.
|
||||
%
|
||||
FMEA does not define or specify the scope of the investigation of each component failure mode.
|
||||
Should we follow the signal path, and all components we encounter along that, or should the scope be wider?
|
||||
%
|
||||
@ -921,7 +939,7 @@ $f$ is the number of failure modes per component.
|
||||
\end{equation}
|
||||
|
||||
|
||||
\paragraph{Exhaustive Single Failure FMEA}
|
||||
\paragraph{Exhaustive FMEA and dual failures.}
|
||||
This would mean that a number of checks of order $O(N^2)$ must be performed
|
||||
to undertake an `exhaustive~FMEA'. Even small systems have typically
|
||||
100 components, and they typically have 3 or more failure modes each.
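%
As a rough worked illustration of these figures (a sketch under an assumption not stated above: every {\fm} is checked once against each of the other $N-1$ components, with $N$ the number of components and $f$ the number of failure modes per component), a system with $N=100$ and $f=3$ would need
\[
N \times (N-1) \times f = 100 \times 99 \times 3 = 29\,700
\]
checks for single failures alone.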
|
||||
@ -955,7 +973,7 @@ we rely on experts in the system under investigation
|
||||
to perform a meaningful FMEA analysis.
|
||||
%
|
||||
These experts must use their judgement and experience to choose
|
||||
sub-sets of the components in the system to check against each {\fm}.
|
||||
sub-sets of the components in the system, to check against each {\fm}.
|
||||
%
|
||||
Also, %In practise
|
||||
these experts have to select the areas they see as most critical for detailed FMEA analysis:
|
||||
@ -1056,7 +1074,7 @@ FMECA has three probability factors for component failures.
|
||||
\textbf{FMECA ${\lambda}_{p}$ value.}
|
||||
This is the overall failure rate of a base component.
|
||||
This will typically be the failure rate per million ($10^6$) or
|
||||
billion ($10^9$) hours of operation. reference MIL1991.
|
||||
billion ($10^9$) hours of operation~\cite{mil1991}.
|
||||
|
||||
\textbf{FMECA $\alpha$ value.}
|
||||
The failure mode probability, usually denoted by $\alpha$, is the probability of
|
||||
@ -1148,13 +1166,13 @@ or across the software/hardware interface.
|
||||
|
||||
%\subsection{ FMEDA - Failure Modes Effects and Diagnostic Analysis}
|
||||
\label{sec:FMEDA}
|
||||
\textbf{Failure Mode Classifications in FMEDA.}
|
||||
\textbf{Failure Mode Classifications and metrics in FMEDA.}
|
||||
\begin{itemize}
|
||||
\item \textbf{Safe or Dangerous} Failure modes are classified SAFE or DANGEROUS
|
||||
\item \textbf{Detectable failure modes} Failure modes are given the attribute DETECTABLE or UNDETECTABLE
|
||||
\item \textbf{Four attributes to Failure Modes} All failure modes may thus be Safe Detected (SD), Safe Undetected (SU), Dangerous Detected (DD) or Dangerous Undetected (DU)
|
||||
\item \textbf{Four statistical properties of a system} \\
|
||||
$ \sum \lambda_{SD}$, $\sum \lambda_{SU}$, $\sum \lambda_{DD}$, $\sum \lambda_{DU}$
|
||||
\item \textbf{Four statistical properties of a system} We sum the statistics for the four classifications of system failures \\
|
||||
$ \sum \lambda_{SD}$, $\sum \lambda_{SU}$, $\sum \lambda_{DD}$, $\sum \lambda_{DU}$ \\
|
||||
\end{itemize}
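
As a sketch of how these four sums are typically combined, the safe failure fraction (SFF) is usually defined for EN61508 as the proportion of the total failure rate that is not dangerous and undetected:
\[
SFF = \frac{\sum \lambda_{SD} + \sum \lambda_{SU} + \sum \lambda_{DD}}
           {\sum \lambda_{SD} + \sum \lambda_{SU} + \sum \lambda_{DD} + \sum \lambda_{DU}}
\]
The following figures are purely hypothetical and chosen only to illustrate the arithmetic: with $\sum \lambda_{SD} = 40$, $\sum \lambda_{SU} = 10$, $\sum \lambda_{DD} = 45$ and $\sum \lambda_{DU} = 5$ failures per $10^9$ hours of operation,
\[
SFF = \frac{40 + 10 + 45}{40 + 10 + 45 + 5} = \frac{95}{100} = 0.95 .
\]
Only the dangerous undetected term lowers the SFF, which is why reclassifying a failure mode from DU to DD (by adding a means of detecting it) raises the figure.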
|
||||
|
||||
% Failure modes are classified as Safe or Dangerous according
|
||||
@ -1334,7 +1352,7 @@ However, as with the components that we should check against a {\fm}, there are
|
||||
the reasoning stages for an FMEA entry.
|
||||
%FMEA does not stipulat which
|
||||
Ideally each FMEA entry would contain a reasoning description
|
||||
for each component the {\fm} is checked against, so that the the entry can be reviewed or revisited.
|
||||
for each component the {\fm} is checked against, so that the entry can be reviewed or revisited/audited.
|
||||
Because FMEA is traditionally performed with one entry per component {\fm} full reasoning descriptions
|
||||
are rare.
|
||||
This means that re-use, review and checking of traditional analysis must be started from `cold'.
|
||||
|
@ -2,13 +2,52 @@
|
||||
|
||||
\section*{Introduction}
|
||||
|
||||
This chapter examines FMEA in a critical light.
|
||||
The problems with the scope---or required reasoning distance---of detail to apply
|
||||
for FMEA analysis, the difficulties of integrating software
|
||||
and hardware in FMEA failure models, and the near-impossibility of performing meaningful
|
||||
multiple failure analysis are examined.
|
||||
Additional problems such as the inability to easily re-use, and validate (through
|
||||
traceable reasoning) FMEA models is presented.
|
||||
This chapter examines current FMEA
|
||||
practice % practice is a noun and practise is a verb
|
||||
in a
|
||||
critical light.
|
||||
Chapter~\ref{sec:chap2} introduced concepts underlying FMEA, and this chapter seeks to
|
||||
use these concepts to determine the drawbacks and advantages in its current usage.
|
||||
%
|
||||
Legally mandatory FMEA for a large proportion of safety critical systems
|
||||
in Europe and the USA, at the very least means that experienced
|
||||
engineers have to discuss a system at a level of detail starting
|
||||
at {\bc} {\fms}.
|
||||
%
|
||||
This undoubtedly reveals dangers inherent in designs and makes
|
||||
our lives safer. This chapter aims to look for the deficiencies in the FMEA process, to probe for weaknesses
|
||||
and look for ways in which it could be done better and more efficiently.
|
||||
|
||||
A major problem is with the scope of examination---or required reasoning distance---to apply
|
||||
for FMEA analysis.
|
||||
Checking all combinations quickly leads to a state explosion problem:
|
||||
limiting the number of components to check against for a given {\bc}
|
||||
{\fm} could address this.
|
||||
%
|
||||
The difficulties of integrating software
|
||||
and hardware in FMEA failure models mean that FMEA is showing its age: designed
|
||||
in an era of simple electro-mechanical systems, the modern world with ubiquitous
|
||||
cheap micro-controllers and processors means that most of today's systems are
|
||||
now software/hardware hybrids.
|
||||
%
|
||||
|
||||
With FMEA it is very difficult to perform %impossibility of performing
|
||||
meaningful
|
||||
multiple failure analysis.
|
||||
The main reasons for this are that in electronics, each failure
|
||||
can introduce a circuit topology change.
|
||||
%
|
||||
In software, in a similar vein,
|
||||
one failure can influence the programmatic behaviour and decisions made
|
||||
complicating the analysis of additional failures.
|
||||
%
|
||||
Dual failure analysis is required by some recent European standards~\cite{en298,en230}
|
||||
and with increasing demands on safety we are likely to see more multiple failure
|
||||
FMEA requirements.
|
||||
|
||||
Other problems such as the inability to easily re-use, and validate/audit (through
|
||||
traceable reasoning) FMEA models are presented.
|
||||
%
|
||||
Finally we conclude with a list of deficiencies in current FMEA methodologies, and present a wish list
|
||||
for an improved methodology.
|
||||
|
||||
@ -33,61 +72,12 @@ each {\bc} {\fm}.
|
||||
This means that the reasoning involved in determining the system level failure/symptom is described (if at all) very briefly.
|
||||
Ideally supporting documentation would give the reasoning and calculations behind each analysis case,
|
||||
but the structure of current FMEA reports does not encourage this.
|
||||
|
||||
\subsection{FMEA does not support modularity.}
|
||||
It is a common practise in the process control industry to buy in sub-systems,
|
||||
typically sensors and actuators connected to an industrially hardened computer bus, i.e. CANbus~\cite{can,canspec}, modbus~\cite{modbus} etc.
|
||||
Most sensor systems now are `smart'~\cite{smartinstruments}, that is to say, they contain programmatic elements
|
||||
even if their outputs are %they supply
|
||||
analogue signals. For instance a liquid level sensor that
|
||||
supplies a {\ft} output, would have been typically have been implemented
|
||||
in analogue electronics before the 1980s. After that time, it would be common to use a micro-processor
|
||||
based system to perform the functions of reading the sensor and converting it to a current (\ft) output.
|
||||
For the non-safety critical systems integrator this brings with it the advantages
|
||||
that come with using a digital system (increased accuracy, self checking and ease of
|
||||
calibration etc. ). For a safety critical systems integrator this can be very problematic when it
|
||||
comes to approvals. Even if the sensor manufacturer will let you see the internal workings and software
|
||||
we have a problem with tracing the FMEA reasoning through the sensor, through the sensors software
|
||||
and then though the system being integrated.
|
||||
This problem is compounded by the fact that traditional FMEA cannot integrate software into FMEA models~\cite{sfmea,safeware}.
|
||||
|
||||
|
||||
\section{Reasoning Distance used to measure Comparison Complexity}
|
||||
\label{sec:reasoningdistance}
|
||||
Traditional FMEA cannot ensure that each failure mode of all its
|
||||
components are checked against any other components in the system which
|
||||
it may affect, due to state explosion.
|
||||
\paragraph{Re-use of FMEA analysis}
|
||||
%
|
||||
FMEA is therefore performed using heuristics to decide
|
||||
which components to check the effect of a component failure mode on.
|
||||
We could term the number of checks made for each failure mode
|
||||
on aspects of the system to be the reasoning distance.
|
||||
%
|
||||
In practise FMEA may be performed by following the signal path
|
||||
of the component failure mode to its system level effect. This is less than ideal
|
||||
and it can easily miss interactions with adjacent components, that could cause
|
||||
other system level symptoms.
|
||||
%
|
||||
Were we to compare the reasoning distance with the theoretical maximum, the sum of all failure
|
||||
modes in a system, multiplied by the number of components in it, we could arrive at a maximum
|
||||
reasoning distance, which we can use as a comparison complexity figure.
|
||||
%
|
||||
This figure would mean we could compare the maximum number of checks (i.e. exhaustive %rigorous
|
||||
analysis) with the number actually performed.
|
||||
|
||||
\paragraph{The ideal of exhaustive FMEA (XFMEA)}
|
||||
Obviously, exhaustively checking every component failure mode in a system,
|
||||
against all other components is the ideal for finding all possible system level failures.
|
||||
While this is impossible for all but trivial systems, it should be possible
|
||||
for small groups of components that work together to provide a well defined function.
|
||||
We could term such a group a `{\fg}'.
|
||||
|
||||
\section{Re-use of FMEA analysis}
|
||||
|
||||
Given the {\bc} {\fm} to system level failure mode paradigm it is
|
||||
difficult to re-use FMEA analysis.
|
||||
%
|
||||
Several strategies to aid re-use have been proposed~\cite{rudov2009language, reuse_of_fmea}, but
|
||||
Several strategies to aid re-use have been proposed~\cite{rudov2009language,patterns6113886,931423}, but
|
||||
the fundamental problem remains, that, with any changes
|
||||
to the component base in a system, it is very difficult to
|
||||
determine which FMEA test scenarios must be re-worked.
|
||||
@ -100,6 +90,66 @@ The failure mode behaviour of these repeated structures will be the same.
|
||||
However with the {\bc} {\fm} to system level failure mode mapping
|
||||
work is likely to be repeated.
|
||||
|
||||
\subsection{FMEA does not support modularity.}
|
||||
It is common practice in the process control industry to buy in sub-systems,
|
||||
typically sensors and actuators connected to an industrially hardened computer bus, e.g. CANbus~\cite{can,canspec} or modbus~\cite{modbus}.
|
||||
With traditional FMEA it is difficult to deal with
|
||||
a `plug~and~play' paradigm. The design philosophy of FMEA is to trace {\bc} failures through to system failures.
|
||||
This is incompatible with a modular approach where the architecture of a
|
||||
system may differ between implementation sites.
|
||||
The modularity problem is exacerbated by FMEA's problems modelling software/hardware hybrids, a problem
|
||||
examined in section~\ref{sec:distributed}.
|
||||
% Most sensor systems now are `smart'~\cite{smartinstruments}, that is to say, they contain programmatic elements
|
||||
% even if their outputs are %they supply
|
||||
% analogue signals. For instance a liquid level sensor that
|
||||
% supplies a {\ft} output, would have been typically have been implemented
|
||||
% in analogue electronics before the 1980s. After that time, it would be common to use a micro-processor
|
||||
% based system to perform the functions of reading the sensor and converting it to a current (\ft) output.
|
||||
% For the non-safety critical systems integrator this brings with it the advantages
|
||||
% that come with using a digital system (increased accuracy, self checking and ease of
|
||||
% calibration etc. ). For a safety critical systems integrator this can be very problematic when it
|
||||
% comes to approvals. Even if the sensor manufacturer will let you see the internal workings and software
|
||||
% we have a problem with tracing the FMEA reasoning through the sensor, through the sensors software
|
||||
% and then though the system being integrated.
|
||||
% This problem is compounded by the fact that traditional FMEA cannot integrate software into FMEA models~\cite{sfmea,safeware}.
|
||||
|
||||
|
||||
\section{Reasoning Distance used to measure Comparison Complexity}
|
||||
\label{sec:reasoningdistance}
|
||||
Traditional FMEA cannot ensure that each failure mode of all its
|
||||
components is checked against any other components in the system which
|
||||
it may affect, due to state explosion.
|
||||
%
|
||||
FMEA is therefore performed using heuristics to decide
|
||||
which components to check the effect of a component failure mode on.
|
||||
%We could term the number of checks made for each failure mode
|
||||
%on aspects of the system to be the reasoning distance.
|
||||
%
|
||||
Typically FMEA will be performed by following the signal path
|
||||
of the component failure mode to its system level effect,
|
||||
echoing fault finding reasoning.
|
||||
%
|
||||
This is less than ideal
|
||||
and it can easily miss interactions with adjacent components, that could cause
|
||||
other system level symptoms.
|
||||
%
|
||||
Were we to compare the reasoning distance with the theoretical maximum, the sum of all failure
|
||||
modes in a system, multiplied by the number of components in it, we could arrive at a maximum
|
||||
reasoning distance, which we can use as a comparison complexity figure.
|
||||
%
|
||||
This figure would mean we could compare the maximum number of checks (i.e. exhaustive %rigorous
|
||||
analysis) with the number actually performed.
|
||||
|
||||
\paragraph{The ideal of exhaustive FMEA (XFMEA).}
|
||||
Obviously, exhaustively checking every component failure mode in a system,
|
||||
against all other components is the ideal for finding all possible system level failures.
|
||||
While this is impossible for all but trivial systems, we note that it should be possible
|
||||
for small groups of components that work together to provide a well defined function.
|
||||
We could term such a group a `{\fg}'. Potentially here we have a way of de-composing
|
||||
the problem and reducing the $O(N^2)$ state explosion effect
|
||||
associated with XFMEA.
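%
To illustrate the potential saving, consider a sketch with hypothetical figures. It counts only the first level of de-composition and ignores any further analysis of how the groups interact with each other. Take $N=100$ components with $f=3$ failure modes each, and assume that they divide evenly into {\fgs} of $k=5$ components. Checking every {\fm} against all other components in the system requires
\[
N \times (N-1) \times f = 100 \times 99 \times 3 = 29\,700
\]
checks, whereas checking each {\fm} only against the other members of its own group requires
\[
\frac{N}{k} \times k \times (k-1) \times f = 20 \times 5 \times 4 \times 3 = 1\,200
\]
checks.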
|
||||
|
||||
|
||||
|
||||
\section{Software and FMEA}
|
||||
|
||||
@ -138,14 +188,16 @@ With the increasing use of micro-controllers in place of analogue electronics
|
||||
for most new designs of electronic product, the poor integration capabilities of FMEA
|
||||
are now being seen as deficiencies.
|
||||
|
||||
This apparent then in the dilemma now faced
|
||||
This is becoming apparent in a dilemma now faced
|
||||
by organisations dealing with highly safety critical systems, and having to rely on `smart~instruments'
|
||||
that they can no longer validate using FMEA.
|
||||
%
|
||||
Smart instruments are dealt with in the section below.
|
||||
Distributed real time systems, which rely on micro-controllers connected in a network
|
||||
using a communications protocol, are also impossible to analyse meaningfully using FMEA.
|
||||
|
||||
\subsection{The rise of the smart instrument}
|
||||
\label{sec:smart}
|
||||
%% AWE --- Atomic Weapons Establishment have this problem....
|
||||
A smart instrument is defined as one that uses a micro-processor and software
|
||||
in conjunction with its sensing electronics, rather than
|
||||
@ -186,25 +238,31 @@ systems. %by traditional FMEA.
|
||||
Currently the only way that some smart~instruments have been permitted for
|
||||
use in highly critical systems is to have them extensively
|
||||
functionally tested~\cite{bishopsmartinstruments}.
|
||||
|
||||
|
||||
%>>>>>>> 1b3d54f0ec2963017e98c4cdadc9a72a8bac911a
|
||||
|
||||
\subsection{Distributed real time systems}
|
||||
|
||||
\label{sec:distributed}
|
||||
Distributed real time systems are control systems where
|
||||
smart sensors communicate over a communications bus to
|
||||
a master controller.
|
||||
%
|
||||
Most modern cars follow this information technology pattern and use CANbus~\cite{canspec,can}.
|
||||
%
|
||||
For instance, in a modern car there will be no mechanical linkage from the pedal to the engine, instead the throttle pedal will be linked to a sensor to determine how
|
||||
For instance, in a modern car there will be no mechanical linkage from the pedal to the engine; instead the throttle pedal
|
||||
will be linked to a sensor to determine how
|
||||
far the pedal is pressed.
|
||||
This sensor will be read by a micro-controller, and passed, via CANbus, to the Engine Control Unit (ECU)
|
||||
which will use that information (along with information from other sensors) to adjust the power required from the engine.
|
||||
%
|
||||
This adjustment could be direct, or could be another CANbus message passed to a micro-controller regulating engine function.
|
||||
In terms of FMEA, see figure~\ref{fig:distcon}, our reasoning path spans four interface layers of electronics to software.
|
||||
%
|
||||
In terms of FMEA, see figure~\ref{fig:distcon}, our reasoning path spans (at least) four interface layers of electronics to software.
|
||||
%
|
||||
Traditional FMEA does not cater for the software hardware interface, and here we have the additional complications
|
||||
%with the additional complications
|
||||
of the communications protocol used to transmit data, and the failure mode characteristics
|
||||
of the communications protocol used to transmit data and the failure mode characteristics
|
||||
of the communications physical layer.
|
||||
|
||||
%(figure~\ref{fig:distcon}
|
||||
@ -235,10 +293,11 @@ utterly anachronistic in the distributed real time system environment.
|
||||
|
||||
\begin{itemize}
|
||||
\item FMEA type methodologies were designed for simple electro-mechanical systems of the 1940s to 1960s.
|
||||
\item Reasoning Distance - component failure to system level symptom process is undefined in regard to the components to check against each given component{\fm}.
|
||||
\item Reasoning Distance - component failure to system level symptom process is undefined in regard
|
||||
to the components to check against each given component {\fm}.
|
||||
\item State explosion - impossible to perform FMEA exhaustively %rigorously
|
||||
\item Difficult to re-use previous analysis work
|
||||
\item Very Difficult to model simultaneous failures.
|
||||
\item Very difficult to model simultaneous failures.
|
||||
\item Software and hardware models are separate (if the software is modelled at all).
|
||||
\item Distributed real time systems are very difficult to analyse with FMEA because they typically involve many hardware/software interfaces.
|
||||
\end{itemize}
|
||||
@ -352,7 +411,8 @@ very good with single failure modes linked to top level events.
|
||||
FMEA has become part of the safety critical and safety certification industries.
|
||||
%
|
||||
SFMEA is in its infancy, and there are corresponding gaps in
|
||||
certification for software, EN61508~\cite{en61508}, recommends hardware redundancy architectures in conjunction
|
||||
certification for software. EN61508~\cite{en61508}, a modern standard based
|
||||
on a modern variant of FMEA, recommends hardware redundancy architectures in conjunction
|
||||
with FMEDA for hardware: for software it recommends language constraints and quality procedures
|
||||
but no inductive fault finding technique.
|
||||
%
|
||||
@ -378,7 +438,7 @@ We now form a wish list, stating the features that we would want
|
||||
in an improved FMEA methodology,
|
||||
\begin{itemize}
|
||||
\item Must be able to analyse hybrid software/hardware systems,
|
||||
\item no state explosion (which would make analysis impractical),
|
||||
\item no state explosion (which has rendered exhaustive analysis impractical),
|
||||
\item exhaustive checking at a modular level, %(total failure coverage within {\fgs} all interacting component and failure modes checked),
|
||||
\item traceable reasoning system models,% to aid repeatability and checking,
|
||||
\item re-usable i.e. it should be possible to re-use analysis,
|
||||
|
@ -348,7 +348,8 @@ we thus reveal design deficiencies.
|
||||
In Safety Integrity Level (SIL)~\cite{en61508} terms, by identifying undetectable faults and fixing them, we raise
|
||||
the safe failure fraction (SFF).
|
||||
|
||||
|
||||
\section{Objective and Subjective Reasoning stages}
|
||||
Opportunity for formal definitions and perhaps an interface or process for achieving it....
|
||||
|
||||
\section{Conclusion}
|
||||
|
||||
|