diff --git a/submission_thesis/CH1_introduction/copy.tex b/submission_thesis/CH1_introduction/copy.tex index 9c1f011..129602a 100644 --- a/submission_thesis/CH1_introduction/copy.tex +++ b/submission_thesis/CH1_introduction/copy.tex @@ -22,8 +22,9 @@ used technique that is legally mandatory for a wide range of equipment certifica The ability to assess the safety of man made equipment has been a concern since the dawn of the industrial age~\cite{usefulinfoengineers,steamboilers}. -The philosophy behind safety measure has progressed -with time, and by World War Two we began to see concepts such as `no single component failure should cause +% +The philosophy behind safety measures has progressed +with time and by World War Two we began to see concepts such as `no single component failure should cause a dangerous system failure'~\cite{boffin} emerging~\cite{echoesofwar}[Ch.13]. % Concepts such as these allow us to apply @@ -142,8 +143,10 @@ In the field of digital signal processing there is an algorithm that revolutioni access to frequency analysis of digital samples called the Fast Fourier transform (FFT)~\cite{fftoriginal}. This took the discrete Fourier transform (DFT), and applied de-composition to its mesh of (often repeated) complex number calculations~\cite{fpodsadsp}[Ch.8]. +% By doing this it broke the computing order of complexity problem down from having a polynomial %n exponential -order to logarithmic order~\cite{ctw}[pp.401-3]. +%order +to logarithmic order~\cite{ctw}[pp.401-3]. I wondered if this thinking could be applied to the state explosion problems encountered in FMEA. % %Following the concept of de-composing a problem, and thus simplifying the state explosion---using the thinking behind @@ -182,7 +185,7 @@ could be used to model failure modes in components it was thought that a diagrammatic notation would be more user friendly than using formal logic. 
% -For an FMEA Spider diagram, contours represent failure modes, and the spider diagram +For an FMEA Spider diagram, contours represent failure modes, and the Spider diagram `existential~points' represent instances of failure modes. % Overlapping contours could represent multiple failure modes. diff --git a/submission_thesis/CH2_FMEA/copy.tex b/submission_thesis/CH2_FMEA/copy.tex index 4e23245..825981c 100644 --- a/submission_thesis/CH2_FMEA/copy.tex +++ b/submission_thesis/CH2_FMEA/copy.tex @@ -23,19 +23,20 @@ are examined in the context of two sources of information that define failure mo % A simple example of an FMEA is given, using a hypothetical {\ft} milli-amp reader. % -The four main variants are described and we develop %conclude by describing concepts +The four main current FMEA variants are described and we develop %conclude by describing concepts the concepts that underlie the usage and philosophy of FMEA. % We return to the overall process of FMEA and model it using UML. % -By using UML we define relationships between the data objects -described at the start of this chapter. +By using UML we define relationships between the FMEA data objects +defined at the start of this chapter. % The act of defining relationships between the data objects -in FMEA raise questions about the nature of this process. +in FMEA raises questions about the nature of the process +and allows us to analytically discuss its strengths and weaknesses. @@ -50,9 +51,9 @@ for a large proportion of safety critical products sold in the European Union. 
The acronym FMEA can be expanded as follows: \begin{itemize} \item \textbf{F - Failures of given component,} Consider a particular component in a system; - \item \textbf{M - Failure Mode,} Choose a component `failure~mode'); + \item \textbf{M - Failure Mode,} Choose a component `failure~mode'; \item \textbf{E - Effects,} Determine the effects this failure mode will cause to the system we are examining; - \item \textbf{A - Analysis,} Analyse how much impact this symptom will have on the environment/people/the system itself. + \item \textbf{A - Analysis,} Analyse how much impact this symptom will have on the environment/operators/the system itself. \end{itemize} % FMEA is a broad term; it could mean anything from an informal check on how @@ -77,17 +78,19 @@ This could be considered a low pass filter in some electrical environments~\cite but for fixed frequencies the same circuit could be used as a phase changer~\cite{electronicssysapproach}[p.114]. The failure modes of the latter, could be `no~signal' and `all~pass', but when used as a phase changer, would be `no~signal' and `no~phase' change. - -This chapter describes basic concepts of FMEA, uses a simple example to -demonstrate a single FMEA analysis stage, describes the four main variants of FMEA in use today -and explores some concepts with which we can discuss and evaluate -the effectiveness of FMEA. +% +% This chapter describes basic concepts of FMEA, uses a simple example to +% demonstrate a single FMEA analysis stage, describes the four main variants of FMEA in use today +% and explores some concepts with which we can discuss and evaluate +% the effectiveness of FMEA. \section{FMEA Process} We begin FMEA with the basic, or starting components. -This components are the sort we buy in or consider as pre-assembled modules. +% +These components are the sort we buy in or consider as pre-assembled modules. We term these the {\bcs}. +% Firstly we need to know how these can fail. 
So our first relationship is between a {\bc} and its failure modes, see figure~\ref{fig:component_fm_rel}. @@ -108,10 +111,10 @@ To perform this we need to know how a failure mode, considering its effect on other components in the system will translate to a system level symptom/failure. % -The result of FMEA is to determine a system level failures, or symptoms for given component failure modes. +The result of FMEA is to determine the system level failures, or symptoms, for each given component failure mode. % -In practise, an FMEA analysis of a {\bc} {\fm} -would typically be one line in a spreadsheet entry. +In practice, each entry of an FMEA analysis of a {\bc} {\fm} +would typically be one line in a spreadsheet. % The analysis to symptom relationship is generally % considered one-to-one, however here (see figure~\ref{fig:component_fm_rel_ana}), we allow for the possibility @@ -127,7 +130,7 @@ of more than one failure symptom. \end{figure} Figure ~\ref{fig:component_fm_rel_ana} defines the data relationships -for FMEA. This model is expanded upon in the conclusion +for FMEA. This model is later extended in the conclusion of this chapter. @@ -266,8 +269,8 @@ EN298, the European gas burner safety standard, tends to be give failure modes more directly usable for performing FMEA than FMD-91. % -EN298 requires that a full FMEA be undertaken, examining all failure modes -of all electronic components~\cite{en298}[11.2 5] as part of the certification process. +The certification process for EN298 requires that a full FMEA be undertaken, examining all failure modes +of all electronic components~\cite{en298}[11.2 5]. % as part of the certification process. % Annex A of EN298, prescribes failure modes for common components and guidance on determining sets of failure modes for complex components (i.e. integrated circuits). @@ -790,9 +793,14 @@ with the merging of Markov chains. 
So for multiple failures we have the objective criteria complicated, and the subjective adds another layer of complication. % -Also with the additional complication of having to change between these two modes of thinking, it becomes more difficult to -get a balance between subjective and objective perspectives. - +% +Traditional FMEA has the translation from objective to subjective +failure modes as an intrinsic part of the process; +this is an additional complication. +%, of having to change between these two modes of thinking, it becomes more difficult to +%get a balance between subjective and objective perspectives. +Another complication for multiple failure analysis is that failure modes may cause a change in circuit topology, +meaning the additional failures might have to be analysed with respect to the changed topology. %subjective/objective become more cluttered when there are multiple possibilities %for the the results of an FMEA line of reasoning. @@ -809,7 +817,7 @@ it has no way of knowing the reading is invalid. % The term observable has a specific meaning in the field of control engineering~\cite{721666, ACS:ACS1297}; systems submitted for FMEA are generally related to control systems, -and so to avoid confusion the terms `detectable' and `undetectable' +and so to avoid confusion the terms `detectable' and `undetectable' (as defined in EN61508~\cite{en61508}) will be used for describing the observability of failure modes in this document. \glossary{name={observability}, description={The property of a system failure in relation to a particular component failure mode, where it can be determined whether the readings/actions associated     with it are valid, or the by-product of a failure. If we cannot determine that there is a fault present, the system level failure is said to be unobservable.}} @@ -848,8 +856,8 @@ induced). 
\paragraph{Reasoning distance.} \label{reasoningdistance} -A reasoning distance is the number of stages of logic and reasoning -required to map a failure cause to its potential outcomes. +A reasoning distance is the number of stages of logic and reasoning used +in {\fm} analysis to map a failure cause to its potential outcomes. % In our basic FMEA example in section~\ref{basicfmea} we were asked to consider one failure mode against all the components in the milli-volt reader. @@ -862,6 +870,11 @@ for a given failure mode to determine a system level symptom. % No current FMEA variant gives guidelines for the components that should be included to analyse a {\fm} in a system. +% +Were we to examine a {\fm} against all the other components in a system +this would give us the maximum reasoning distance. +% +We term this the exhaustive FMEA case. %does not The exhaustive~reasoning~distance would be the sum of the number of failure modes, against all other components @@ -914,7 +927,7 @@ to undertake an `exhaustive~FMEA'. Even small systems have typically 100 components, and they typically have 3 or more failure modes each. $100*99*3=29,700$. - \paragraph{Exhaustive Double Failure FMEA} +%\paragraph{Exhaustive Double Failure FMEA} For looking at potential double failure scenarios\footnote{Certain double failure scenarios are already legal requirements---The European Gas burner standard (EN298:2003)---demands the checking of @@ -937,12 +950,16 @@ Current FMEA methodologies cannot consider---for the reason of state explosion-- We define exhaustive FMEA ({\XFMEA}) as examining the effect of every component failure mode against the remaining components in the system under investigation. % -Because we cannot perform XFMEA, +Because we cannot, for practical reasons, perform XFMEA, we rely on experts in the system under investigation to perform a meaningful FMEA analysis. 
% -In practise these experts have to select the areas they see as most critical for detailed FMEA analysis: -it is usually impossible to perform a detail level of analysis on all component {\fms} +These experts must use their judgement and experience to choose +sub-sets of the components in the system to check against each {\fm}. +% +Also, %In practise +these experts have to select the areas they see as most critical for detailed FMEA analysis: +it is usually impossible to perform a detailed level of analysis on all component {\fms} on anything but a non-trivial system. \subsection{Component Tolerance} @@ -1107,18 +1124,22 @@ and require re-design of some systems. % \end{itemize} FMEDA is the fundamental methodology of the statistical (safety integrity level) -type standards (EN61508/IOC5108). -It provides a statistical overall level of safety -and allows diagnostic mitigation for self checking etc. -It provides guidelines for the design and architecture -of computer/software systems for four levels of -safety integrity, referred to as Safety Integrity Levels (SIL). +type standards (EN61508/IOC5108). +The end result of an EN61508 analysis is an % provides a statistical +overall `level of safety' known as a Safety Integrity Level (SIL), for a system. +There are currently four SIL `levels', one to four, with four being the highest level. +It allows diagnostic mitigation for self checking circuitry. +% + % for four levels of +%safety integrity, referred to as Safety Integrity Levels (SIL). %For Hardware % -FMEDA does force the user to consider all hardware components in a system +FMEDA requires %does force +the analyst to consider all hardware components in a system by requiring that a MTTF value is assigned for each base component failure~mode; the MTTF may be statistically mitigated (improved) if it can be shown that self-checking will detect failure modes. +The MTTF value for each component {\fm} is denoted as $\lambda$. 
% EN61508 in relation to software provides procedural quality guidelines and constraints (such as forbidding certain programming languages and/or features): it does not provide a means to trace failure mode effects in software @@ -1309,10 +1330,14 @@ system failure, the processes are intertwined with the basic analysis itself. Each {\fm} entry in an FMEA report should have a reasoning or comments field. This should provide a guide to someone re-examining, or trying to re-use results on a similar project. -However, as with the compnents that we should check against a {\fm}, there are no guidelines for documenting +However, as with the components that we should check against a {\fm}, there are no guidelines for documenting the reasoning stages for an FMEA entry. %FMEA does not stipulat which - +Ideally each FMEA entry would contain a reasoning description +for each component the {\fm} is checked against, so that the entry can be reviewed or revisited. +Because FMEA is traditionally performed with one entry per component {\fm}, full reasoning descriptions +are rare. +This means that re-use, review and checking of traditional analysis must be started from `cold'. % MOVED TO CH3: 15MAR2013 % diff --git a/submission_thesis/CH3_FMEA_criticism/copy.tex b/submission_thesis/CH3_FMEA_criticism/copy.tex index d5029d2..b21cef8 100644 --- a/submission_thesis/CH3_FMEA_criticism/copy.tex +++ b/submission_thesis/CH3_FMEA_criticism/copy.tex @@ -69,8 +69,10 @@ and it can easily miss interactions with adjacent components, that could cause other system level symptoms. % Were we to compare the reasoning distance with the theoretical maximum, the sum of all failure -modes in a system, multiplied by the number of components in it, we could arrive at a comparison complexity figure. -This figure would mean we could compare the maximum number of checks (i.e. 
exhaustive%rigorous +modes in a system, multiplied by the number of components in it, we could arrive at a maximum +reasoning distance, which we can use as a comparison complexity figure. +% +This figure would mean we could compare the maximum number of checks (i.e. exhaustive %rigorous analysis) with the number actually performed. \paragraph{The ideal of exhaustive FMEA (XFMEA)} @@ -99,7 +101,7 @@ However with the {\bc} {\fm} to system level failure mode mapping work is likely to be repeated. -\section{software and FMEA} +\section{Software and FMEA} Traditional FMEA deals only with electrical and mechanical components, i.e. it does not have provision for software. Modern control systems nearly always have a significant software/firmware element, @@ -168,7 +170,7 @@ Some work has been performed to offer black~box---or functional testing---of the static analysis~\cite{Bishop:2010:ONT:1886301.1886325}. However, black box testing of smart instruments is yet to be a an approved method of validation. -% + Most modern instruments now use highly integrated electronics coupled to micro-controllers, which read and filter the measurements, and interface to an LCD readout. % @@ -218,7 +220,7 @@ utterly anachronistic in the distributed real time system environment. \centering \includegraphics[width=400pt]{./CH3_FMEA_criticism/distcon.png} % distcon.png: 1622x656 pixel, 72dpi, 57.22x23.14 cm, bb=0 0 1622 656 - \caption{Distributed Control System FMEA reasoning path for a single failure.} + \caption{Distributed Control System FMEA signal path for a single input.} \label{fig:distcon} \end{figure} @@ -233,11 +235,11 @@ utterly anachronistic in the distributed real time system environment. \begin{itemize} \item FMEA type methodologies were designed for simple electro-mechanical systems of the 1940's to 1960's. 
- \item Reasoning Distance - component failure to system level symptom + \item Reasoning Distance - component failure to system level symptom process is undefined in regard to the components to check against each given component {\fm}. \item State explosion - impossible to perform FMEA exhaustively %rigorously \item Difficult to re-use previous analysis work \item Very Difficult to model simultaneous failures. - \item Software and hardware models are separate. + \item Software and hardware models are separate (if the software is modelled at all). \item Distributed real time systems are very difficult to analyse with FMEA because they typically involve many hardware/software interfaces. \end{itemize} @@ -375,6 +377,7 @@ All these FMEA based methodologies have the following short comings: We now form a wish list, stating the features that we would want in an improved FMEA methodology, \begin{itemize} + \item Must be able to analyse {\fms} in hybrid software/hardware systems, \item No state explosion making analysis impractical, \item Exhaustive checking (total failure coverage within {\fgs} all interacting component and failure modes checked), \item Reasoning Traceable in system models, diff --git a/submission_thesis/CH4_FMMD/copy.tex b/submission_thesis/CH4_FMMD/copy.tex index 5922815..097af8b 100644 --- a/submission_thesis/CH4_FMMD/copy.tex +++ b/submission_thesis/CH4_FMMD/copy.tex @@ -2213,6 +2213,13 @@ by a symptom within a {\fg}, and therefore the failure modes of a {\dc} are mutu Thus FMMD naturally produces {\dc} failure modes that are mutually exclusive. This property is examined in more detail in section~\ref{ch7:mutex}. +\paragraph{Objective and contextual/subjective failure symptoms.} +Because the top level failure symptoms of an FMMD analysis are objective, or the result of reasoning, +we can have a final stage where we consider the subjective or contextual effects of these symptoms. 
+% +With traditional FMEA methodologies we +have to make this decision (the contextual effects) for each component {\fm} in the system. + \paragraph{State explosion problem of FMEA solved by FMMD.} % Because FMMD considers failure modes within functional groups;