Complex networks require increasingly sophisticated monitoring systems. Far too often, however, monitoring is an afterthought rather than a holistically engineered part of the system. In fact, it is common for the overall monitoring system to be complicated and mission-critical, yet inconsistently documented, trained on, engineered for fault tolerance and secured.
To improve, organizations must recognize that a monitoring system can itself cause problems, and that it brings a unique set of issues that must be taken into account and mitigated.
Perceived Reliability
We must consider how people perceive the accuracy of automated feedback systems. A properly designed monitoring system must allow operators to realistically investigate and record the findings of every alert raised or issue flagged.
In other words, the system must be a closed loop wherein issues are raised, investigated, mitigated (if need be) and the results logged. The problem is that as the number of erroneous alerts increases, the amount of personnel time wasted and the level of frustration increase as well.
This “perceived reliability” is a key dynamic for any form of monitoring. If operators have expectations that are out of alignment with what the system can deliver, then they are far more likely to discount reports coming from that system and even falsify reports in order to “not waste time.”
Far too many accidents have taken place due to operators assuming that messages were false positives when, in fact, the alerts were accurate. From this, we can posit The Law of False Alerts: As the rate of erroneous alerts increases, operator reliance, or belief, in subsequent warnings decreases.
If a complex system has an area where a monitoring system used to detect a security breach, or any critical parameter for that matter, constantly raises false alarms, wouldn't that be a prime target for a hacker or terrorist? Whether it is an intrusion detection system that constantly reports non-existent incursions, a flaky motion sensor flagging movement that doesn't exist, or an open/closed sensor falsely reporting a valve's state, if it is a known weak link, whether through media reports or the office rumor mill, it is at risk of allowing a breach to happen.
What do we do?
First, we must treat monitoring as an intrinsic part of the overall system in question. By bolting monitoring onto a system with little thought, we risk monitoring the wrong events and/or misinterpreting the reported data. In other words, there must be a holistic approach that identifies the system's key performance indicators, their acceptable bounds and the key causal logic: "If these sensors register X, Y and Z, then event Alpha must be taking place and IT operations must be alerted immediately."
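To make that kind of causal rule concrete, here is a minimal sketch in Python. The sensor names, bounds and the notify_operations stub are hypothetical illustrations, not part of any particular monitoring product:

```python
# Sketch of a causal alerting rule: only when a defined combination of
# sensor readings is out of bounds do we infer the higher-level event
# ("Alpha") and alert IT operations. All names and thresholds here are
# hypothetical illustrations.

from dataclasses import dataclass

@dataclass
class Reading:
    sensor: str
    value: float

# Acceptable bounds for each key performance indicator (hypothetical).
BOUNDS = {
    "cpu_load": (0.0, 0.85),      # fraction of capacity
    "queue_depth": (0, 500),      # pending messages
    "error_rate": (0.0, 0.01),    # errors per request
}

def out_of_bounds(reading: Reading) -> bool:
    low, high = BOUNDS[reading.sensor]
    return not (low <= reading.value <= high)

def notify_operations(message: str, sensors: set[str]) -> None:
    # Stand-in for a real paging/ticketing integration.
    print(f"ALERT: {message} (sensors: {sorted(sensors)})")

def evaluate(readings: list[Reading]) -> None:
    """If sensors X, Y and Z all register out of bounds, infer event Alpha."""
    violated = {r.sensor for r in readings if out_of_bounds(r)}
    if {"cpu_load", "queue_depth", "error_rate"} <= violated:
        notify_operations("Event Alpha: correlated overload detected", violated)

if __name__ == "__main__":
    evaluate([
        Reading("cpu_load", 0.97),
        Reading("queue_depth", 1200),
        Reading("error_rate", 0.04),
    ])
```

The design point is that no single out-of-bounds reading pages anyone; only the combination the engineers have identified as meaningful does.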
The human factor must be taken into account, with careful planning of which events trigger an alarm, the processes used to validate results, the layout of the messages, and so on. Always bear in mind that as the level of false positives increases, faith in the monitoring system decreases. The monitoring system must not only be accurate, it must be viewed as accurate and as providing value to the operators, or they will increasingly ignore it over time, perhaps with disastrous results.
Second, build “monitoring in-depth.” This is a play on “defense in-depth” in that multiple sensors are arranged to confirm events.
For example, one potential scenario is that a more sensitive but more error-prone sensor is used to initially indicate a state and a less sensitive but more reliable sensor is used in series to corroborate the earlier “fast alert” probe.
Another scenario could involve an array of sensors used to confirm an event due to the critical need to be certain that the data collected is accurate. A single monitoring system is as susceptible to a single point-of-failure incident as any other system.
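As a rough sketch of both scenarios, the following Python fragment has a fast but noisy probe raise a tentative alert, which slower, more reliable probes must then corroborate by quorum over several samples. Both probe functions and all the numbers are hypothetical placeholders, simulated here with random values:

```python
# Sketch of "monitoring in-depth": a fast but error-prone probe raises
# a tentative alert, and slower, more reliable probes must corroborate
# it (here by majority vote over several samples) before operators are
# paged. Both probe functions are hypothetical placeholders.

import random
import time

def fast_probe() -> bool:
    """Cheap, sensitive check; prone to false positives (simulated)."""
    return random.random() < 0.3

def reliable_probe() -> bool:
    """Slower, more trustworthy check (simulated)."""
    return random.random() < 0.05

def confirmed_alert(samples: int = 3, quorum: int = 2) -> bool:
    if not fast_probe():
        return False  # no tentative signal to confirm
    # Corroborate with an array of reliable samples; requiring a quorum
    # means no single reading is a point of failure.
    votes = 0
    for _ in range(samples):
        votes += reliable_probe()
        time.sleep(0.1)  # pacing for the slower sensor
    return votes >= quorum

if __name__ == "__main__":
    if confirmed_alert():
        print("ALERT: event corroborated by reliable sensors")
    else:
        print("No confirmed event (tentative signal not corroborated)")
```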
Third, plan for continuous improvement. Odds are high that most of the underlying systems monitored will evolve over time for one reason or another. In parallel, the monitoring system must evolve to continue meeting expectations.
A monitoring system that can only handle 10 Mb/s will face a virtually impossible task if the underlying system is upgraded to gigabit speeds and it can no longer sample the data fast enough. Furthermore, these systems must be reviewed over time to ensure that they still align with operator requirements. For example, filters may need to be added or modified to screen out "noise" that operators are contending with that did not exist initially (provided, of course, that proper analysis is done to determine why the noise exists).
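One simple example of such a filter, sketched here with hypothetical alert keys and window lengths, is a deduplication pass that suppresses repeats of the same alert inside a cooldown window so that known noise does not flood operators:

```python
# Sketch of an alert noise filter: suppress repeats of the same alert
# within a cooldown window so operators are not flooded with duplicates.
# The alert key and window length are hypothetical choices; in practice
# each suppression rule should be justified by root-cause analysis.

import time

class DedupFilter:
    def __init__(self, cooldown_seconds: float = 300.0):
        self.cooldown = cooldown_seconds
        self._last_seen: dict[str, float] = {}

    def should_emit(self, alert_key: str) -> bool:
        now = time.monotonic()
        last = self._last_seen.get(alert_key)
        if last is not None and now - last < self.cooldown:
            return False  # duplicate inside the window: suppress
        self._last_seen[alert_key] = now
        return True

if __name__ == "__main__":
    f = DedupFilter(cooldown_seconds=2.0)
    for i in range(3):
        if f.should_emit("link_flap:switch-07"):
            print(f"emit alert (iteration {i})")
        time.sleep(1.0)  # only the first and third pass the 2s window
```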
Fourth, treat monitoring as an important activity, with appropriate engineering resources and processes, such as change advisory boards set up to review, approve and schedule changes.
Monitoring must evolve from a haphazard afterthought to a critical application with specified service levels identifying timeliness, accuracy, uptime and security. For all monitoring, and especially SCADA systems, there must be effective communication between functional groups to ensure the systems are designed, secured and maintained appropriately.
Summary
Complex systems require increasingly sophisticated monitoring systems. Care must be taken to design secure systems that meet requirements and are perceived as accurate by the operators.
If a monitoring system is perceived as not adding value, operators will depend on it less and less. This, in turn, creates a fertile environment for security breaches, accidents and all manner of inefficiencies. With that in mind, monitoring systems must evolve from afterthoughts into well-engineered systems that consistently meet expectations.