Failure Modes and Mechanisms in Safety Critical Task Analysis (SCTA)

In safety critical industries, identifying both failure modes and failure mechanisms is essential for effective risk management and performance improvement. Here we explore what each term means and how the two relate to human performance.

Jamie Henderson

Failure modes describe the different ways in which a task step might fail, e.g. the task step is omitted, done too early or too late, or the right action is done on the wrong object.

Failure mechanisms can be thought of as the potential reasons for a failure, e.g. something is forgotten or confused, or a wrong decision is made.

A useful way of thinking about the difference is that the failure mode is what you would see if you were present when a failure happened. For example, the valve not being opened, the valve being opened too late, or the wrong valve being opened. The failure mechanism, on the other hand, is the reason for the failure and is not usually directly observable. For example, forgetting to open the valve, being too busy to open a valve in time, or thinking that one valve should be opened when it should actually be a different one.

Another way of thinking about it is that failure modes require very little Human Factors knowledge to identify, whereas failure mechanisms are likely to require some understanding of human performance issues.

There is a regulatory expectation, when a Safety Critical Task Analysis (SCTA) is being performed, that both potential failure modes and failure mechanisms should be identified and recorded. In addition, any proposed improvements should be directly linked to the identified failure mechanisms. The HSE Human Factors (HF) Delivery Guide makes specific reference to one type of failure mechanism model: Slips, Mistakes, and Violations (SMV).
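
One way to picture this expectation is to think about what an analysis record has to hold. The sketch below is illustrative only: the class and function names (`TaskStep`, `Improvement`, `unlinked_improvements`) are hypothetical, and neither the structure nor the example improvements come from the HF Delivery Guide or from any particular SCTA tool. It simply records modes, mechanisms and improvements for one task step, then flags any improvement that is not linked to a recorded mechanism.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Improvement:
    description: str
    # Failure mechanisms this improvement is meant to address.
    addresses: list[str] = field(default_factory=list)


@dataclass
class TaskStep:
    description: str
    failure_modes: list[str] = field(default_factory=list)       # what would be observed
    failure_mechanisms: list[str] = field(default_factory=list)  # why it might happen
    improvements: list[Improvement] = field(default_factory=list)


def unlinked_improvements(step: TaskStep) -> list[str]:
    """Return improvements that address none of the mechanisms recorded for the step."""
    recorded = set(step.failure_mechanisms)
    return [
        imp.description
        for imp in step.improvements
        if not recorded.intersection(imp.addresses)
    ]


# Hypothetical record for the valve example used in this article.
step = TaskStep(
    description="Open the valve",
    failure_modes=["valve not opened", "valve opened too late", "wrong valve opened"],
    failure_mechanisms=[
        "forgets to open the valve",
        "believes a different valve is the right one",
    ],
    improvements=[
        Improvement("Add the step to a checklist", ["forgets to open the valve"]),
        Improvement("Repaint the handrail"),  # deliberately not linked to any mechanism
    ],
)

print(unlinked_improvements(step))  # ['Repaint the handrail']
```

An improvement that shows up in this list is not necessarily a bad idea, but it is a prompt to ask which failure mechanism it is supposed to address.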

As this is an expectation, we include information on both failure modes and mechanisms in our analyses. We also agree that understanding the potential reasons for failures is important when analysing tasks and thinking about the scope for improvement. For example, an attentional failure of the kind associated with a slip or lapse might be more effectively addressed through design, alerts, alarms and checklists, whereas a knowledge-based failure associated with a mistake is more effectively addressed through training, guidance and diagnostic aids. However, there are some important nuances to consider, which we set out below.

Firstly, it is important to recognise that, whilst a useful framework, SMV is just one way of characterising possible reasons for failures, and it does not explain all potential causes.

For example, some failures may be the result of issues with motor control (perhaps exacerbated by poor equipment design), while others may be the result of insufficient time to complete a task (arising from poor task planning or resource management). Neither of these types of failure fits into the SMV framework. Rather, these failures arise from the situational context, or what we call Performance Influencing Factors (PIFs).

Secondly, in our experience, in any given task, multiple failure mechanisms may be credible, but their respective likelihoods will be affected by the prevailing PIFs. Labelling a particular failure mode as being most likely to be a slip or a mistake may give misplaced confidence that all issues are being addressed.

For example, in any task which involves identifying and selecting equipment for use from a number of options, it is possible to either inadvertently select the incorrect piece of equipment (i.e. a slip), or think that the incorrect piece of equipment should be selected (i.e. a mistake). Both scenarios will always be possible, but it is the PIFs that will determine the likelihood of each of the failure mechanisms. If a valve is poorly labelled, looks similar to other valves, and is situated in an area where multiple pipes cross over each other, then the likelihood of a slip will be considerably increased.

Therefore, rather than focusing on labelling an identified failure, it is more important to optimise the PIFs to prevent both these types of failure. In other words, the goal is not to attribute a failure mode to one particular failure mechanism or another, but to look for the PIFs that increase the likelihood of any of the credible failure mechanisms for tasks of that type, and to optimise those underlying factors.
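
This PIF-first way of working can be sketched as a simple lookup. It is a minimal illustration, not a scoring method: the `PIF` class, the `pifs_to_optimise` function, and the pairing of each PIF with the mechanisms it influences are all assumptions made for the example, loosely based on the valve scenario above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PIF:
    name: str
    is_poor: bool               # analyst's judgement of the current state
    influences: frozenset[str]  # mechanisms made more likely when this PIF is poor


def pifs_to_optimise(credible_mechanisms, pifs):
    """Return names of poor PIFs that raise the likelihood of any credible mechanism."""
    credible = set(credible_mechanisms)
    return [p.name for p in pifs if p.is_poor and credible & p.influences]


# Hypothetical assessment of a valve-selection step.
# The PIF-to-mechanism pairings are illustrative assumptions, not established data.
pifs = [
    PIF("valve labelling", True, frozenset({"slip", "mistake"})),
    PIF("similar-looking valves nearby", True, frozenset({"slip"})),
    PIF("procedure clarity", False, frozenset({"mistake"})),
]

# Both mechanisms are treated as credible, so no single label drives the result.
print(pifs_to_optimise(["slip", "mistake"], pifs))
# ['valve labelling', 'similar-looking valves nearby']
```

The point of the design is that the function never asks which mechanism is "the" cause. Any poor PIF that bears on any credible mechanism is returned as a candidate for optimisation.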

Understanding both failure modes and failure mechanisms is crucial for SCTAs. While failure modes provide insight into what happens when something goes wrong, failure mechanisms reveal the deeper, often hidden reasons behind these failures. The SMV categorisation in the HF Delivery Guide offers a useful starting point, but it is not exhaustive. Analysts must consider other potential causes of failure and pay close attention to PIFs, which play an equally important role in determining the likelihood of failure. By focusing on optimising PIFs and addressing the root causes of failures, we can improve overall human performance and prevent incidents more effectively.

To explore the role of PIFs further, our colleague Dominic highlights the importance of leveraging positive PIFs, not only to prevent errors but also to enhance overall performance. Focusing on these strengths can be just as powerful in driving improvements.

If you’ve enjoyed this discussion and want to learn more, then join our flagship course on Human Factors Safety Critical Task Analysis (SCTA). Find out more here: https://the.humanreliabilityacademy.com/courses/human-factors-SCTA