Tools, costs, and data volumes pose the biggest challenges to enterprise organizations when it comes to data observability. In recent studies, observability tool vendors Grafana Labs, Dynatrace, and Edge Delta surveyed enterprise organizations about the issues they’re facing around observability. Among a range of other issues, respondents to all three surveys said they struggle in three specific areas: tool selection and sprawl, data storage and tool costs, and the sheer amount of data they’re handling. Here’s our roundup of what the studies found and what it means for enterprise organizations, as well as some expert recommendations for solutions.
Tool Sprawl in Observability: A Major Industry Hurdle
Observability ensures reliability and seamless system performance by providing transparency into internal systems. With early detection of irregularities, performance bottlenecks, and system failures, teams can diagnose and correct issues promptly. But tool sprawl can significantly hamper these efforts: an excessive number of disparate monitoring and analytics tools within observability ecosystems introduces complexity and inefficiency, forcing teams to manage diverse user interfaces, data formats, and methodologies across multiple platforms.
The Grafana Labs “Observability Survey Report,” published in March 2024, reveals a staggering spread of observability technologies, with 62 different tools in use by respondents. In addition, 70 percent of teams rely on four or more observability technologies, underscoring the fragmented nature of these setups. This diversity presents a major toolset management challenge.
The Rise of Multi-Cloud Environments
Because they provide a secure, innovative, adaptable infrastructure for unique business needs, multi-cloud environments are gaining traction among organizations. However, the increasing adoption of these environments makes managing systems and applications even more difficult for IT and cybersecurity teams.
According to Dynatrace’s 2024 “State of Observability” report, 88 percent of technology leaders said the complexity of their tech stacks has increased in the past year. Another 87 percent said multi-cloud complexity makes it more difficult to deliver an outstanding customer experience, and 84 percent said that the resulting complexity makes it harder to protect applications from security vulnerabilities and attacks.
Complex IT environments demand substantial resources for deployment, configuration, and maintenance. Fragmented insights from varying observability tools slow the ability to identify the root causes of issues and can lead to prolonged downtime and reduced system reliability. To address these challenges, organizations need to consolidate and standardize observability technologies to promote consistency, efficiency, and overall effectiveness.
Why Open Source Observability Tools Lead the Market
The Grafana Labs survey found that open source tools lead the market—98 percent of respondents said they use open source software (OSS), with Grafana, Prometheus, Grafana Loki, OpenTelemetry, and Elastic/ELK (Elasticsearch, Logstash, Kibana) being cited as the most popular.
These tools provide a wide range of features and capabilities to meet enterprise observability needs. Grafana’s user-friendly dashboards offer an array of visualization options. Prometheus’ time-series database lets you monitor containerized and microservices architectures. Loki, designed with Kubernetes in mind, specializes in log aggregation. OpenTelemetry enhances observability through standardized instrumentation, and Elastic/ELK is a popular log management platform known for its advanced search and analytics features.
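To give a sense of how lightweight this kind of instrumentation can be, here is a minimal sketch using the official Prometheus Python client (prometheus_client); the metric names, label, and port are illustrative choices, not anything prescribed by Prometheus.

```python
import random
import time

from prometheus_client import Counter, Histogram, start_http_server

# Hypothetical application metrics; the names are illustrative.
REQUESTS = Counter("app_requests_total", "Total requests handled", ["endpoint"])
LATENCY = Histogram("app_request_seconds", "Request latency in seconds")

def handle_request(endpoint: str) -> None:
    """Simulate handling a request and record its latency."""
    with LATENCY.time():                      # times the block and observes it
        time.sleep(random.uniform(0.01, 0.1))  # stand-in for real work
    REQUESTS.labels(endpoint=endpoint).inc()

if __name__ == "__main__":
    start_http_server(8000)  # Prometheus scrapes http://localhost:8000/metrics
    while True:              # demo loop; a real app serves its own traffic
        handle_request("/checkout")
```

Once the endpoint is scraped, Grafana can chart these series directly from Prometheus, which is exactly the pairing most survey respondents reported.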
Integrating Different Open Source Tools
Since each of these open source tools serves specific purposes within the observability stack, integrating them can provide comprehensive monitoring and troubleshooting capabilities across different aspects of a system. Organizations benefit from a holistic approach to observability that gives insights into the performance, health, and behavior of their systems from several data sources and perspectives. Additionally, this integration enables smoother workflows and better interoperability.
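OpenTelemetry is a common glue layer for this kind of integration: instrument once, then route telemetry to whichever backend fits each signal. The sketch below uses the OpenTelemetry Python SDK with the console exporter as a stand-in for a real backend; the service name and span attribute are hypothetical.

```python
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

# "service.name" is a standard OpenTelemetry resource attribute;
# the value here is a hypothetical service.
resource = Resource.create({"service.name": "checkout-service"})

provider = TracerProvider(resource=resource)
# Swap ConsoleSpanExporter for an OTLP exporter to ship spans to a backend.
provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer(__name__)

with tracer.start_as_current_span("process-order") as span:
    span.set_attribute("order.items", 3)  # illustrative attribute
```

Because the exporter is pluggable, the same instrumented code can feed different tools in the stack without being rewritten.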
Challenges of Integrating Multiple Tools
While integrating multiple observability tools brings undeniable benefits, doing so is not without challenges. One major obstacle is configuring components that each have their own setup requirements and specifications. Incompatibilities among tools can cause errors or inconsistencies in data interpretation, and smooth communication and data exchange between tools is difficult when they use varying data formats and protocols.
Overcoming these integration obstacles requires careful planning, an actionable data integration strategy, robust documentation, and possibly the development of custom integrations or middleware to connect the tools seamlessly. Organizations must standardize data formats and protocols and implement centralized architectures that support comprehensive monitoring and observability.
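As a sketch of what standardizing data formats can look like in practice, the following hypothetical adapter maps log records from two differently shaped sources onto one common schema; the source names, field names, and formats are invented for illustration.

```python
from datetime import datetime, timezone
from typing import Any

def to_common_schema(record: dict[str, Any], source: str) -> dict[str, Any]:
    """Normalize a source-specific log record to one shared format."""
    if source == "tool_a":  # e.g. {"ts": 1718000000, "msg": "...", "lvl": "ERR"}
        return {
            "timestamp": datetime.fromtimestamp(
                record["ts"], tz=timezone.utc
            ).isoformat(),
            "message": record["msg"],
            "severity": {"ERR": "error", "WRN": "warning"}.get(record["lvl"], "info"),
            "source": source,
        }
    if source == "tool_b":  # e.g. {"time": "2024-06-10T07:33:20+00:00", ...}
        return {
            "timestamp": record["time"],
            "message": record["body"],
            "severity": record["level"].lower(),
            "source": source,
        }
    raise ValueError(f"unknown source: {source}")
```

With every record in one shape, downstream dashboards and alert rules only need to be written once.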
Unpacking the Costs and Complexity of Observability
While a growing number of organizations rely on observability tools, the rising costs and complexity of observability introduce daunting challenges.
Edge Delta’s “Charting Observability Report,” published in late 2023, found that hidden costs and spikes in observability expenses are the norm: 98 percent of respondents said they experienced unexpected cost increases multiple times per year. Cost worries aren’t unique to that survey, either: 56 percent of respondents to Grafana Labs’ “Observability Survey” cited cost as their primary concern, followed closely by complexity at 50 percent.
Challenge of Cost Management and Simplifying the Observability Stack
Organizations face a dual challenge in their observability efforts: controlling costs while streamlining their tech stacks.
On one side, leadership teams are pushing to cut costs: the Edge Delta report found that 91 percent of respondents anticipate increased cost scrutiny in the next 12 months. On the other, managing numerous observability tools complicates operations and raises maintenance costs. Notably, 85 percent of tech leaders surveyed for the Dynatrace report said the escalating costs of storing and analyzing logs outweighed the benefits.
How to Mitigate Cost and Complexity
To overcome the challenges associated with observability costs and complexity, organizations can combine two strategies: optimizing costs and consolidating tools.
Closely monitoring usage patterns, implementing budget controls, and using cloud cost management tools can help optimize costs by enabling early identification and mitigation of overspending and unexpected spikes, ultimately improving resource management and minimizing operational expenses.
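Here is a minimal sketch of such a budget control, assuming ingestion volume is already measured somewhere upstream; the daily budget and alert threshold are illustrative numbers, not recommendations.

```python
DAILY_BUDGET_GB = 500   # illustrative daily ingestion budget
ALERT_THRESHOLD = 0.8   # warn at 80 percent of budget

def check_ingestion_budget(ingested_gb: float) -> str | None:
    """Return a warning if today's ingestion nears or exceeds the budget."""
    ratio = ingested_gb / DAILY_BUDGET_GB
    if ratio >= 1.0:
        return f"Budget exceeded: {ingested_gb:.0f} GB of {DAILY_BUDGET_GB} GB"
    if ratio >= ALERT_THRESHOLD:
        return f"Approaching budget: {ratio:.0%} of {DAILY_BUDGET_GB} GB used"
    return None  # within budget, no action needed

# Example: a 420 GB day triggers the early warning before the spike hits.
print(check_ingestion_budget(420.0))
```

Catching the 80 percent mark early is what turns a surprise invoice into a routine capacity conversation.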
Organizations should consolidate tools when applicable, standardize data formats, and automate routine tasks to reduce operational overhead and boost efficiency. Switching to a unified observability platform can simplify operations and reduce licensing and infrastructure costs.
The Data Overload Dilemma in Modern Organizations
Edge Delta’s report shows that 38 percent of companies produce between 500GB and 1TB of data on a daily basis, and another 15 percent generate more than 10TB. This massive growth in data volumes creates challenges for modern organizations: 86 percent of technology leaders surveyed by Dynatrace said cloud-native technology stacks produce data in volumes beyond humans’ ability to manage.
This rapid escalation in data volume brings forth several serious security, compliance, and operational efficiency concerns:
- The expanded data landscape broadens the attack surface for cybersecurity threats.
- Compliance becomes more challenging as data volumes grow, warranting stronger governance frameworks to ensure adherence and mitigate risks.
- IT teams may face difficulty extracting useful information from large datasets, slowing problem-solving, reducing flexibility, and negatively affecting operational efficiency.
Using Data Management and Analytics to Extract Value from Data
Organizations can address the challenges of data overload by implementing reliable data management and data analytics strategies. Data categorization, prioritization, and automation let organizations elevate data management practices.
Data categorization aids in identifying and focusing on the most relevant data, reducing noise and improving decision-making. Prioritization enhances resource allocation by handling critical data first, and automation simplifies repetitive tasks so organizations can efficiently manage large data volumes.
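To make the idea concrete, here is a hedged sketch of automated categorization and prioritization; the severity-based rules and the hot/warm/cold tier names are assumptions for illustration, not a standard.

```python
from dataclasses import dataclass

@dataclass
class LogRecord:
    severity: str  # "error", "warning", "info", or "debug"
    service: str
    message: str

# Hypothetical priority map: lower number means handled first.
PRIORITY = {"error": 0, "warning": 1, "info": 2, "debug": 3}

def categorize(record: LogRecord) -> str:
    """Assign a storage/routing category; the rules are illustrative."""
    if record.severity == "error":
        return "hot"    # index immediately, keep fully searchable
    if record.severity == "warning":
        return "warm"   # retain with coarser indexing
    return "cold"       # archive cheaply, rehydrate on demand

def prioritize(records: list[LogRecord]) -> list[LogRecord]:
    """Order records so the most critical data is processed first."""
    return sorted(records, key=lambda r: PRIORITY.get(r.severity, 99))
```

The same severity field drives both decisions, which is what makes the automation cheap to maintain.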
In addition, advanced analytics techniques like machine learning (ML) and artificial intelligence (AI) extract valuable insights from data by uncovering hidden patterns, correlations, and trends. These insights promote informed decision-making, process refinement, and better risk management strategies.
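As a small, self-contained example of the idea (deliberately far simpler than a production ML pipeline), a rolling z-score can flag metric values that deviate sharply from recent history:

```python
from collections import deque
from statistics import mean, stdev

def detect_anomalies(values, window=30, threshold=3.0):
    """Yield (index, value) for points more than `threshold` standard
    deviations away from the rolling mean of the preceding points."""
    history = deque(maxlen=window)
    for i, v in enumerate(values):
        if len(history) >= 2:
            mu, sigma = mean(history), stdev(history)
            if sigma > 0 and abs(v - mu) / sigma > threshold:
                yield i, v
        history.append(v)

# Example: a latency series with one obvious spike at index 6.
series = [102, 99, 101, 100, 98, 103, 450, 101, 99]
print(list(detect_anomalies(series, window=5)))  # -> [(6, 450)]
```

Real AIOps systems layer seasonality handling and correlation across signals on top, but the core question is the same: does this point fit the pattern of what came before?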
Expert Strategies for Enhancing Observability
Observability is becoming increasingly important for organizations managing complex hybrid and multi-cloud environments. To help you resolve the challenges introduced by data explosion, rising complexity, and cost concerns, experts from the three companies whose studies we quoted here—Grafana Labs, Dynatrace, and Edge Delta—shared valuable strategies for enhancing observability at your organization.
Unified Platforms Provide Better Insights
Andi Grabner, DevOps Activist at Dynatrace, underscored the value of having a unified observability platform that uses causal, predictive, and generative AI and connects all pillars of observability: logs, metrics, traces, events, end user behavior, and security.
“By unifying their data and converging multiple AI techniques, organizations can unlock meaningful insights, powering advanced analytics and automation,” Grabner said. “This enables them to drive smarter decision-making and more efficient ways of working.”
Better Groundwork Means Better Data Quality
Organizations should lay solid groundwork to maximize the benefits of their observability strategies: eliminate data silos, find and fill any gaps, and deliver truly data-driven insights, said Grafana Labs’ Chief Technology Officer Tom Wilkie.
“The real difference between having data and being able to use that data is its quality,” Wilkie said. “A centralized observability approach … gives you a holistic view of your system’s behavior, helping you make well-informed decisions while avoiding vendor lock-in.”
An iterative method that uses standards like OpenTelemetry, establishes uniformity through Service Level Objectives (SLOs), and enhances metadata can fill in missing components so you can shift your focus to distilling observability insights.
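For instance, the error-budget arithmetic behind an SLO is simple; here is a sketch for a hypothetical 99.9 percent availability target over a 30-day window, with the downtime figure invented for the example:

```python
SLO_TARGET = 0.999              # 99.9% availability target (illustrative)
WINDOW_MINUTES = 30 * 24 * 60   # 30-day rolling window

# The error budget is the downtime the SLO permits within the window.
error_budget_minutes = (1 - SLO_TARGET) * WINDOW_MINUTES
print(f"Error budget: {error_budget_minutes:.1f} minutes")   # 43.2 minutes

downtime_so_far = 10.0  # minutes of downtime observed (hypothetical)
remaining = error_budget_minutes - downtime_so_far
print(f"Remaining budget: {remaining:.1f} minutes")          # 33.2 minutes
```

Agreeing on numbers like these is what creates the uniformity Wilkie describes: every team measures reliability against the same yardstick.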
Preprocess Data for Better Cost Management
Pre-processing data upstream before it is ingested into your observability platform can help your organization manage exponential data growth and optimize costs, said Riley Peronto, Director of Product Marketing at Edge Delta.
“This approach can help you summarize and derive analytics from large-scale datasets,” he said. “The net benefit here is that you’re reducing your data footprint without sacrificing visibility.” By putting these strategies into action, your organization can strengthen its observability frameworks, leading to better efficiency, optimized costs, and a competitive advantage.
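Here is a minimal sketch of the kind of upstream pre-processing Peronto describes, assuming raw log lines arrive as plain strings; it collapses duplicate lines into counts so only a summary is shipped downstream.

```python
from collections import Counter

def summarize_logs(lines: list[str]) -> list[str]:
    """Collapse duplicate log lines into '<count>x <line>' summaries,
    shrinking the payload sent to the observability platform."""
    counts = Counter(lines)
    return [f"{n}x {line}" for line, n in counts.most_common()]

raw = [
    "GET /health 200",
    "GET /health 200",
    "GET /health 200",
    "ERROR db timeout",  # rare lines survive intact
]
for summary in summarize_logs(raw):
    print(summary)
# 3x GET /health 200
# 1x ERROR db timeout
```

The high-volume health checks shrink to a single line while the rare error stays fully visible, which is the "smaller footprint without lost visibility" trade the quote points to.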
Success Stories: Overcoming Observability Challenges
To show how other organizations are making observability work for them, we asked the vendors to share some customer success stories.
FreedomPay
Dynatrace Observability customer FreedomPay reduced its mean time to repair (MTTR) by streamlining observability and speeding up collaboration.
“In our worst cases, we’ve reduced our MTTR from weeks down to hours by simplifying the observability landscape,” said Mark Tomlinson, Director of Performance and Observability for FreedomPay. “The right people have the exact access they need, conveniently managed, fully enabled and educated, with accelerated collaboration around shared telemetry data. This was unimaginably difficult to do with siloed teams and disparate tooling.”
SailPoint
Grafana Labs customer SailPoint achieved a 33 percent reduction in metrics volume and improved cost efficiency using the platform’s Adaptive Metrics, showcasing the benefits of optimizing metric usage.
“One of the challenges observability teams are currently battling is rising cloud costs,” Tom Wilkie said. “With Grafana’s Adaptive Metrics, a solution aimed at optimizing metric usage in Grafana Cloud, SailPoint reduced their metrics volume by 33 percent, boosting cost efficiency.”
MediaKind
Riley Peronto said that Edge Delta saw a lot of customers facing exponential log data growth and the expenses associated with it.
“When you’re using most traditional observability tooling, costs scale linearly with data volume,” he said. “As a byproduct, many of the teams we talk to are facing bigger and bigger costs.” Teams were discarding data to lower expenses, but doing so created blind spots that put the organization at risk.
By pre-processing data before it reaches the platform, businesses can streamline analytics insights and reduce costs. Some customers, like MediaKind, have reduced log ingestion by as much as 80 percent.
The Next Phase of Observability: Embracing Predictive Analytics and AI
As observability evolves, predictive analytics, AI, and other innovations are poised to play key roles in its future. Predictive analytics will empower organizations to anticipate and overcome issues before they occur, while AI technologies will enable observability tools to analyze vast amounts of data rapidly and efficiently, uncovering insights and patterns that humans might overlook. These innovations will transform how professionals monitor and manage systems, providing real-time insights and predictive capabilities.
Dynatrace’s Grabner highlighted the shift toward leveraging Kubernetes to build Internal Developer Platforms (IDPs), enabling self-service capabilities for various tasks. By embedding observability at the core of these platforms, organizations gain a holistic view of their environment, enhancing decision-making and operational efficiency.
Wilkie said he foresees a pragmatic approach to AI adoption, focusing on tangible benefits, like reducing toil and elevating user experience. The adoption of OpenTelemetry is gaining momentum, with a majority of respondents investing in it. Grafana Labs is concentrating on strengthening the integration between OpenTelemetry and Prometheus, recognizing the power of this combination in delivering comprehensive insights into application and infrastructure performance, availability, and reliability.
Implications for Professionals and Organizations in Staying Ahead
As observability tools become more advanced, they will offer improved problem solving and real-time data-driven insights. However, this advancement requires professionals and organizations to be adaptable.
Professionals must continuously update their skill sets, while organizations must make strategic investments in observability technologies and predictive analytics tools to maintain competitiveness. By taking advantage of these technologies, professionals and organizations can anticipate and overcome challenges.
Bottom Line: Critical Takeaways on the State of Observability
Organizations pursuing observability face a range of challenges, including tool proliferation, data volumes, and cost. Consolidating observability tools can streamline operations, reducing complexity and bolstering efficiency. Proactive management, built on cost-effective solutions and optimized processes, can help rein in soaring costs and complexity and keep operations sustainable.
The challenge of data overload also demands better data management and analytics strategies to maximize data assets while upholding security and compliance. Adopting open source solutions offers flexibility and innovation, aiding in cost control. Emerging trends like predictive analytics and AI highlight the ongoing need to stay informed of technological advancements to boost system reliability. These takeaways underscore the significance of strategic adaptation for professionals and organizations aiming to achieve peak performance.
Read our expert picks for the top data observability tools for 2024.