Hindsight not Always 20/20 with Problem Management

Written By

George Spafford

Sep 22, 2004

How often have you reviewed an incident and asked, “How could they fail to see the cause of the error before it became such a huge problem?” Certainly there are benefits to reviewing a negative event, or series of events, and determining how to prevent them. During the review process, what must be avoided is allowing knowledge of the outcome to cloud your judgment. The effect that knowledge of the subsequent outcome has on perception is known as “hindsight bias,” and it can definitely harm the quality of the review process.

People involved with problem management must avoid this phenomenon and ask probing questions that dig deeper into the causal elements of the incident, gaining better insight.

Essentially, once you know the outcome of a chain of events, you tend to view all actions performed and decisions made through the lens of the final outcome.

For example, a series of warning alarms goes off in a sequence never before considered. Shortly after, the system begins to fail and the operator makes matters worse by making a decision on the spot with very limited information and time. In going back over the accident, it is very clear to the reviewer that the chaotic warnings were indicating that a key subsystem was failing, because the incident report lists the failure in detail.

In hindsight, it’s obvious. But in the heat of battle, it may have been anything but obvious for a variety of reasons.

Organizations must take care not to rush to judgment. Far too often these days, groups are deploying systems with little testing, little to no documentation, and virtually no training. And in this day of compressed time and high-speed systems, once the operators do encounter an issue, the issues often compound and mushroom out of control at amazing speeds.

How can any operator, or group of operators, be expected to effectively respond without proper training and support mechanisms?

In short, they can’t.

Problem Management

From ITIL, we know that Problem Management essentially involves focusing on an incident, or series of incidents, in order to identify underlying causal factors — “problems” — to prevent repetition. In order to identify the root causes accurately, problem managers and problem review boards must beware of allowing hindsight bias to cloud the problem review process, which invites oversimplification and/or the personalization of causality.

In other words, a board must not look at an accident and literally say, “The sequence is so simple! How could they miss it? It must be operator error.”

When complex systems are involved, there are often far more contributing factors than one might initially think.

Focus on the Processes

First and foremost, instead of personalizing the causality and blaming the operators, reviewers must recognize that there very often are levels of complexity beyond what is superficially visible. Furthermore, they need to take a step back and look at what key control points and processes are lacking.

For example, without exception, as the level of complexity increases in a system, the value of an effective change management process increases. Yet, this incredibly valuable process and the associated controls are all too often overlooked or even discounted as too bureaucratic.

Returning to the point, a great many problems are rooted in process failures that are exacerbated by the human element.

Asking Questions

To reduce the risks associated with hindsight bias, develop post-problem questionnaires in advance for each system, or class of system. When incidents happen and it is time to interview and observe the team, use the questionnaires as guides or templates.

Here are some questions that should give you a few starting points:

Processes — The first category to check is the processes involved at the time of the incident:

and adopted just prior to the incident?

What processes failed and why?

Are people bypassing the formal documented processes? Why?

Are there processes and/or controls that need to be added?

Are there processes and/or controls that need to be changed?

Documentation — Were the organization, system and surrounding processes mature enough to be documented?

Did documentation exist?

Was the documentation readily available?

Was it understandable?

Could they find the needed topics in the manual?

How can the documentation be improved?

Training — Ensuring there is proper training and understanding of the processes and documentation is another major step on the road to maturity.

Were the operators trained appropriately?

How similar was the training to reality?

How could training be improved?

The Operators — Fatigue, emotions and pressures all affect the cognitive abilities of the people operating systems. Be sure to factor them in.

Was fatigue a factor?

Were the operators angry, upset, anxious?

financial constraints?

Were there pressures to meet unrealistic deadlines?

Technical Questions — Yes, the bits, bytes and technical details are last. You can fill in the questions pertinent to the systems used. However, do consider including the following:

Was there failure in multiple subsystems or just one?

Was the failure, or sequence of failures, predictable?

Did the alarms work?

subsystems in production or was there a new variable?

What testing was done prior to going into production?
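
For teams that keep their review material alongside their tooling, the idea of preparing questionnaires in advance for each class of system can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the category names and questions come from the lists above, while `BASE_QUESTIONS`, `Questionnaire` and `build_questionnaire` are hypothetical names invented for the example, not part of ITIL or any product.

```python
from dataclasses import dataclass, field

# Categories and questions are quoted from the article; every identifier
# in this sketch is hypothetical and exists only for illustration.
BASE_QUESTIONS = {
    "Processes": [
        "What processes failed and why?",
        "Are people bypassing the formal documented processes? Why?",
    ],
    "Documentation": [
        "Did documentation exist?",
        "Was the documentation readily available?",
    ],
    "Training": [
        "Were the operators trained appropriately?",
        "How could training be improved?",
    ],
    "The Operators": [
        "Was fatigue a factor?",
        "Were there pressures to meet unrealistic deadlines?",
    ],
    "Technical Questions": [
        "Was there failure in multiple subsystems or just one?",
        "Did the alarms work?",
    ],
}


@dataclass
class Questionnaire:
    """A post-problem questionnaire prepared for one class of system."""
    system_class: str
    questions: dict = field(default_factory=dict)

    def render(self) -> str:
        """Return a plain-text checklist an interviewer can work through."""
        lines = [f"Post-problem questionnaire: {self.system_class}"]
        for category, items in self.questions.items():
            lines.append("")
            lines.append(category)
            lines.extend(f"  [ ] {question}" for question in items)
        return "\n".join(lines)


def build_questionnaire(system_class, extra_technical=()):
    """Copy the shared questions and append system-specific technical ones."""
    questions = {cat: list(items) for cat, items in BASE_QUESTIONS.items()}
    questions["Technical Questions"].extend(extra_technical)
    return Questionnaire(system_class, questions)
```

Calling `build_questionnaire("order entry", ["Was the failover path tested?"])` and printing its `render()` output would give the review board a checklist written before anyone knew how a particular incident ended, which is the point of preparing the questions in advance. The technical questions stay last, as in the list above.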

It is always beneficial to learn from mistakes and outages. Problem review boards analyzing an incident after the fact need to beware of allowing their knowledge of outcomes to bias their examination of the steps that led up to the event. They must pay appropriate attention to the processes and human factors that could create fertile environments for failure, not just the technical elements.

In this age of ever increasing complexity, there will always be incidents and underlying problems that must be addressed with proper organizational learning and corrective actions to keep the problem from popping up again.
