morning edit

2010-11-04 08:05:52 +00:00 · 2010-11-04 08:05:52 +00:00 · 53352f41d8
commit 53352f41d8
parent bf210d6bca
1 changed files with 60 additions and 44 deletions
--- a/fmmd_concept/fmmd_concept.tex
+++ b/fmmd_concept/fmmd_concept.tex
@ -7,14 +7,14 @@
 \abstract{ 
 This paper proposes a methodology for
 creating failure mode models of safety critical systems, which
-has a common notation
+have a common notation
 for mechanical, electronic and software domains and apply an
 incremental and rigorous approach.

 %% What I have done
 %%
 The Four main static failure mode analysis methodologies were examined and 
-in in the context of newer European safety standards assessed.
+in the context of newer European safety standards assessed.
 Some of the deficiencies in these methodologies lead to
 a wish list for a more ideal methodology.

@ -27,7 +27,7 @@ methodology is developed. The has been named Failure Mode Modular De-Composition
 %% Sell it
 %%
 In addition to addressing the traditional weaknesses of
-Fault Tree Analysis (FTA), Fault Mode Effects Analysis (FMEA), Faliue Mode Effects Criticallity Analysis (FMECA)
+Fault Tree Analysis (FTA), Failure Mode Effects Analysis (FMEA), Failure Mode Effects Criticality Analysis (FMECA)
 and Failure Mode Effects and Diagnostic Analysis (FMEDA), FMMD provides the means to model multiple failure mode scenarios 
 as specified in newer European Safety Standards \cite{en298}.
 The proposed methodology is bottom-up and 
@ -145,14 +145,14 @@ are held in a computer program, we can determine if the model is complete

 \subsection{General Comments on bottom-up and top down approaches}

-\paragraph{A general Problem with top-down}
+\paragraph{A general deficiency in top-down systems analysis}
 With a top down approach the investigator has to determine
-a set of undesireable outcomes or accidents.
+a set of undesirable outcomes or accidents.
 As most accidents are unexpected and the causes unforeseen \cite{safeware} 
 it is fair to say that a top down approach is not guaranteed to
 predict all possible undesirable outcomes.
 It also can miss known component failure modes, by
-simple not de-composing down to that level of detail.
+simply not de-composing down to that level of detail.

 \paragraph{A general problem with bottom-up}
 With the bottom up techniques we have all the known component failure modes
@ -165,22 +165,22 @@ we cannot consider them all and human judgement is used to
 decide which interactions are important.

 Let $N$ be the number of components in our system, and $K$ be the average number of component failure modes
-(ways in which the component can fail). The total number of base component failure modes
-is $N \times K$. To even examine the affect that one failure mode has on all the other components
+(ways in which the component can fail). The total number of base component failure modes
+is $N \times K$. Examining the effect that each failure mode has on all the other components
 requires $(N-1) \times N \times K$ checks, in effect a set cross product.


 Complicate this further with applied states or environmental conditions
 and another order of cross product of complexity is added.
-We may have a peice of self checking circuity for instance that
+We may have a piece of self checking circuitry for instance that
 has two states, normal and testing mode commanded by a logic line.
 Or we may have a mechanical device that has a different 
-failure mode behaviour for say, differnet ambient pressures or temperatures.
+failure mode behaviour for say, different ambient pressures or temperatures.

 If $E$ is the number of applied states or environmental conditions to consider
 in a system, the job of the bottom-up analyst is complicated by a cross product factor again 
 $(N-1) \times N \times K \times E$.
-If we put some typical very small embedded system numbersi\footnote{these figures would
+If we put some typical very small embedded system numbers\footnote{these figures would
 be typical of a very simple temperature controller, with a micro-controller sensor and heater circuit} into this, say $N=100$, $K=2.5$ and $E=10$
 we have $99 \times 100 \times 2.5 \times 10 = 247500 $.
 To look in detail at a quarter of a million test cases is obviously impractical.
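The arithmetic above can be reproduced with a short Python sketch. The function name and its parameter names are illustrative assumptions, not part of the paper; only the formula $(N-1) \times N \times K \times E$ and the example figures are taken from the text.

```python
def single_failure_checks(n_components, avg_failure_modes, n_conditions=1):
    """Number of checks a bottom-up analyst faces for single failure modes.

    Implements (N-1) * N * K * E from the text: each of the N*K base
    component failure modes is checked against the other N-1 components,
    under each of the E applied states or environmental conditions.
    """
    return (n_components - 1) * n_components * avg_failure_modes * n_conditions


# Figures quoted in the text for a very small embedded system.
print(single_failure_checks(100, 2.5, 10))
```

With $N=100$, $K=2.5$ and $E=10$ this evaluates to the 247500 checks quoted above; leaving `n_conditions` at its default gives the $(N-1) \times N \times K$ count before environmental conditions are considered.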
@ -188,7 +188,7 @@ To look in detail at a quarter of a million test cases is obviously impractical.
 If we were to consider multiple simultaneous failure modes
 we have yet another complication cross product.

-For instance for looking at double simultaneous failure modes 
+For instance for looking at double simultaneous failure modes, 
 the equation reads $(N-2) \times (N-1) \times N \times K \times E$.

 The bottom-up methodologies FMEA, FMECA and FMEDA take single failure modes and link them
@ -198,8 +198,8 @@ component failure mode to the SYSTEM level.



-\paragraph{Ideal Static failure mode methodology}
-An ideal Static failure mode methodology would build a failure mode model
+\paragraph{Ideal static failure mode methodology}
+An ideal static failure mode methodology would build a failure mode model
 from which the traditional four models could be derived.
 It would address the short-comings in the other methodologies, and
 would have a user friendly interface, with a visual (rather than mathematical/formal) syntax with icons
@ -217,7 +217,7 @@ of missing component failure modes \cite{faa}[Ch.9].
 %, or modelling at
 %a too high level of failure mode abstraction.
 FTA was invented for use on the Minuteman nuclear defence missile
-systems in the early 1960's and was not designed as a rigorous
+systems in the early 1960s and was not designed as a rigorous
 fault/failure mode methodology. It is more like a structure to
 be applied when discussing the safety of a system, with a top down hierarchical
 notation, that guides the analysis. This methodology was designed for
@ -244,7 +244,7 @@ The investigation will typically point to a particular failure
 of a component. 
 The methodology is now applied to find the significance of the failure.
 It is based on a simple equation where $S$ ranks the severity (or cost \cite{fmea}) of the identified SYSTEM failure,
-$O$ its occurrance, and $D$ giving the failures detectability. Muliplying these
+$O$ its occurrence, and $D$ the failure's detectability. Multiplying these
 together, 
 gives a risk priority number (RPN), given by $RPN = S \times O \times D$.
 This gives in effect
@ -286,7 +286,7 @@ The results, as with FMEA are an $RPN$ number determining the significance of th
 %%-WIKI- while various forms of FMEA predominate in other industries.


-\subsubsection{ FMEA weaknesses }
+\subsubsection{ FMECA weaknesses }
 \begin{itemize}
 \item Possibility to miss the effects of failure modes at SYSTEM level.
 \item Possibility to miss environmental effects.
@ -314,7 +314,7 @@ The component may be mitigated by a vatriety of factors
 \item Coverage of self checking
 \end{itemize}

-Ultimately this tequnique calculates a risk factor for each component.
+Ultimately this technique calculates a risk factor for each component.
 The risk factors of all the components are summed and 
 give a value for the `safety level' for the equipment in a given environment.

@ -327,7 +327,7 @@ give a value for the `safety level' for the equipment in a given environment.
 %%-• The design strength (de-rating, safety factors) and
 %%-• The operational profile (environmental stress factors).

-This uses MTFF and other statisical models to determine the probability of
+This uses MTTF and other statistical models to determine the probability of
 failures occurring.
 %
 A component failure mode, given its MTTF
@ -342,21 +342,29 @@ and other factors such as de-rating and environmental stress.
 This can be calculated, with one component failure mode per row, on a spreadsheet
 and these are all summed to give the final assessment figure.

-\paragraph{Two statistical perspectives}
-The Statistical Analysis method is used from two perspectives,
+\subsubsection{Two statistical perspectives}
+The Statistical Analysis method is used from two perspectives,
 Probability of Failure on Demand (PFD), and Probability of Failure
-in continuous Operation, Failure in Time (FIT) and measured in failures per billion ($10^9$) hours of operation.
+in continuous Operation, Failure in Time (FIT).
+\paragraph{Failure in Time (FIT)}
+
+Continuous operation is measured in failures per billion ($10^9$) hours of operation.
+For a continuously running nuclear powerstation
+we would be interested in its operational FIT values. 
+
+\paragraph{Probability of Failure on Demand (PFD)}
 For instance with the anti-lock system on an automobile braking
 system, we would be interested in PFD.
-For a continuously running nuclear powerstation
-we would be interested in its 24/7 operation FIT values. 
+That is to say, the probability that it fails
+to operate when a demand is placed on it.

+\subsubsection{FMEDA and determinability prediction accuracy}
 This suffers from the same problems of 
 lack of determinability prediction accuracy, as FMEA above.
 %
 We have to decide how particular components failing will impact on the SYSTEM or top level.
 This involves a `leap of faith'. For instance, a resistor failing in a sensor circuit
-may be part of a critical monitioring function. 
+may be part of a critical monitoring function. 
 The analyst is now put in a position
 where he must assign a critical failure possibility to it.  
 %
@ -365,10 +373,10 @@ of how that resistor would/could  affect that circuit, but because the circuitry
 it is part of critical section it will be linked to a critical system level fault.
 %
 A $\beta$ factor, the heuristically defined probability
-of the failure causign the system fault may be applied.
+of the failure causing the system fault may be applied.
 %
 But because there is no detailed analysis of the failure mode behaviour
-of the component, traceable to the SYSTEM level, it becomnes more
+of the component, traceable to the SYSTEM level, it becomes more
 guess work than science.
 With FMEDA, there is no rigorous cause and effect analysis for the failure modes. Unintended side
 effects that lead to failure can be missed.
@ -405,6 +413,10 @@ for its results.
 \item It should be capable of producing reliability and danger evaluation statistics.
 \item It should be easy to use, Ideally using a graphical syntax (as opposed to a formal mathematical one).
 \item From the top down, the failure mode model should follow a logical de-composition of the functionality
+for its results.
+\item It should be capable of producing reliability and danger evaluation statistics.
+\item It should be easy to use, ideally using a graphical syntax (as opposed to a formal mathematical one).
+\item From the top down, the failure mode model should follow a logical de-composition of the functionality
 to smaller and smaller functional modules \cite{maikowski}.
 \item Multiple failure modes may be modelled from the base component level up.
 \end{itemize}
@ -412,7 +424,7 @@ to smaller and smaller functional modules \cite{maikowski}.

 \section{Design of a new static failure mode based methodology}

-\paragraph{New methodology Must be bottom-up}
+\paragraph{New methodology must be bottom-up}
 In order to ensure that all component failure modes have been covered
 the methodology will have to work from the bottom-up
 and start with the component failure modes.
@ -422,7 +434,7 @@ The traditional fault finding, or natural fault finding
 is to work from the top down. 
 %
 On encountering a 
-fault, the symptom is first know at the top or
+fault, the symptom is first observed at the top or
 SYSTEM level. By de-composing the functionality of the faulty system and testing
 we can further de-compose the system until we find the
 faulty base level component.
@ -432,10 +444,10 @@ Simpler and simpler functional blocks are discovered as we delve
 further into the way the system works and is built.

 \paragraph{Design Decision: Methodology must be bottom-up.}
-In oder to ensure that all component failure modes are handled,
+In order to ensure that all component failure modes are handled,
 this methodology must start at the bottom, with base component failure modes.
 In this way automated checking can be applied to all component failure modes
-to ensure none have been inadvertantly excluded from the process.
+to ensure none have been inadvertently excluded from the process.

 \paragraph{Need for a `bottom-up' system de-composition}
 There is an apparent conflict here. The natural way to 
@ -450,7 +462,7 @@ and then taking those to form higher level
 The philosophy of top down de-composition is very similar.
 Top down de-composition applies functional 
 de-composition, because it seeks to break the system down
-into manageable and separatetly testable entities.
+into manageable and separately testable entities.
 A second justification for this is that the design process for a product requires both top down and bottom-up
 thinking.

@ -463,17 +475,21 @@ The base components will typically have several failure modes each.
 Given that a typical embedded system may have hundreds of components,
 this means that we have to tie base component failure modes
 to SYSTEM level errors. This is the `possibility to miss failure mode effects
-at SYSTEM level' critism of the FTA, FMEDA and FMECA methodologies.
+at SYSTEM level' criticism of the FTA, FMEDA and FMECA methodologies.

 \paragraph{Design Decision: Methodology must reduce and collate errors at each functional group stage.}
-SYSTEMS typically have far fewer failure modes then the sum of their component failure modes.
+SYSTEMS typically have far fewer failure modes than the sum of their component failure modes.
 SYSTEM level failures may be caused by a variety of component failure modes.
 A SYSTEM level failure mode is an abstracted failure mode, in that
 it is a symptom of some lower level failure or failures.
 % ABSTRACTION
 For instance a failed resistor in a sensor at a base component level is a specific 
-failure mode. For example it could be called `RESISTOR 1 OPEN'. 
-Its symptom in a functional group comprising the sensor channel that reads from it may be more abstract.
+failure mode. 
+%
+For example it could be called `RESISTOR 1 OPEN'. 
+Its symptom in a functional group comprising the sensor channel that reads from it may be more abstract
+or in other words describe the effect more generally.
+%
 We might call it `READING~HIGH' perhaps. At a higher level still
 this may be called `SENSOR CHANNEL 1' fault.
 At a system level it may simply be a `SENSOR FAILURE'.
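The abstraction chain in this example can be sketched as a small lookup structure in Python. This is only an illustration under stated assumptions: the dictionary `SYMPTOM_OF` and the helper names `trace_up` and `causes_of` are hypothetical, and only the four failure mode labels come from the text.

```python
# Hypothetical mapping from a failure mode to the symptom it causes
# one abstraction level higher. Labels are those used in the text.
SYMPTOM_OF = {
    "RESISTOR 1 OPEN": "READING HIGH",
    "READING HIGH": "SENSOR CHANNEL 1 FAULT",
    "SENSOR CHANNEL 1 FAULT": "SENSOR FAILURE",
}


def trace_up(failure_mode):
    """Follow a base component failure mode up to its SYSTEM level symptom."""
    chain = [failure_mode]
    while chain[-1] in SYMPTOM_OF:
        chain.append(SYMPTOM_OF[chain[-1]])
    return chain


def causes_of(symptom):
    """List every lower level failure mode that can lead to the given symptom."""
    found = []
    for mode, parent in SYMPTOM_OF.items():
        if parent == symptom:
            found.append(mode)
            found.extend(causes_of(mode))
    return found


print(trace_up("RESISTOR 1 OPEN"))
print(causes_of("SENSOR FAILURE"))
```

Tracing upwards shows the failure mode becoming more abstract at each level, while tracing downwards from `SENSOR FAILURE` recovers the specific component failure mode that caused it.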
@ -489,7 +505,7 @@ of failure modes as the abstraction level reaches the SYSTEM level.
 The next problem is how we build a failure mode model
 that converges to a finite set of SYSTEM level failure modes.
 %
-It would be better would be to analyse the failure mode behaviour of each
+It would be better to analyse the failure mode behaviour of each
 functional group, and determine the ways in which it, rather than its
 components, can fail.
 % 
@ -506,7 +522,7 @@ The number of symptoms of failure should be equal to or
 less than the number of component failure modes, simply because
 often there are several potential causes of failure symptoms.
 %
-When we have the the symptoms, we can start thinking of the {\fg} as a component in its own right.
+When we have the symptoms, we can start thinking of the {\fg} as a component in its own right.
 %with a simplified and reduced set of failure symptoms.
 %
 We can now create a new {\dc}, where its failure modes
@ -548,7 +564,7 @@ an entire system. It can be considered complete when
 all failure modes from all components are handled
 and connectable to a SYSTEM level failure mode.

-\paragraph{Directed Acyclic Graph}. This will naturally form a DAG
+\paragraph{Directed Acyclic Graph.} This will naturally form a DAG
 meaning that for all SYSTEM failure modes, we will be able to trace
 back through the DAG to possible component failure mode causes.
 If statistical models exist for the component failure modes
@ -577,7 +593,7 @@ We can then treat the {\fg} as a `black box' or component in its own right.
 We can now look at how the {\fg} can fail. 
 %
 Many of the component failure modes will
-cause the same failure symptoms in the {fg} failure behaviour.
+cause the same failure symptoms in the {\fg} failure behaviour.
 We can collect these failures as common symptoms.
 %
 When we have our set of symptoms, we can now create
@ -605,8 +621,8 @@ This ensures that all component failure modes are handled.


 \subsubsection{ It should be easy to integrate mechanical, electronic and software models.}
-Because component failure modes are considered, we have a generic enitity to model.
-We can describe a mecanical, electrical or software component in terms of its failure modes. 
+Because component failure modes are considered, we have a generic entity to model.
+We can describe a mechanical, electrical or software component in terms of its failure modes. 
 %
 Because of this 
 we can model and analyse integrated electro mechanical systems, controlled by computers,
@ -670,7 +686,7 @@ chosing {\fg}s and working bottom-up this hierarchical trait will occur as a nat

 \section{Conclusion}

-This paper provides the backgroud for the need for a new methodology for
+This paper provides the background for the need for a new methodology for
 static analysis that can span the mechanical, electrical and software domains
 using a common notation.
 \vspace{60pt}