Staffing levels within IT operations (ITOps) departments are flat or declining, enterprise IT environments grow more complex by the day, and the transition to the cloud is accelerating. Meanwhile, the volume of data generated by monitoring and alerting systems is skyrocketing, and Ops teams are under pressure to respond faster to incidents.
Faced with these challenges, companies are increasingly turning to AIOps – the use of machine learning and artificial intelligence to analyze large volumes of IT operations data – to help automate and optimize IT operations. Yet before investing in a new technology, leaders want confidence that it will indeed bring value to end users, customers and the business at large.
Leaders looking to measure the benefits of AIOps and build KPIs (key performance indicators) for both IT and business audiences should focus on key factors such as uptime, incident response and remediation time, and predictive maintenance so that potential outages affecting employees and customers can be prevented.
Business KPIs connected to AIOps include employee productivity, customer satisfaction, and website metrics such as conversion rate or lead generation. The bottom line: AIOps can help companies cut IT operations costs through automation and rapid analysis, and it can support revenue growth by enabling business processes to run smoothly and deliver excellent user experiences.
KPIs to Measure AIOps
These common KPIs can measure the impact of AIOps on business processes:
1. Mean time to detect (MTTD): This KPI refers to how long it takes for an issue to be identified. AIOps can help companies drive down MTTD by using machine learning to detect patterns, filter out the noise and identify outages. Even amid an avalanche of alerts, ITOps can understand the importance and scope of an issue, which leads to faster identification of an incident, reduced downtime, and better performance of business processes.
2. Mean time to acknowledge (MTTA): Once an issue has been detected, IT teams need to acknowledge it and determine who will address it. AIOps can use machine learning to automate that decision-making process and quickly make sure that the right teams are working on the problem.
3. Mean time to restore/resolve (MTTR): When a key business process or application goes down, speedy restoration of service is critical. AIOps plays an important role by using machine learning to determine whether the issue has been seen previously and, based on past experience, to recommend the most effective way to get the service back up and running.
4. Service availability: This KPI is often expressed as a percentage of uptime over a period of time or as outage minutes per period of time. AIOps can help boost service availability through the application of predictive maintenance.
5. Percentage of automated versus manual resolution: Increasingly, organizations are leveraging intelligent automation to resolve issues without manual intervention. Machine learning models can be trained to identify patterns, such as scripts that were previously executed to remedy a problem, and take the place of a human operator.
6. User-reported versus monitoring-detected: IT operations should be able to detect and remediate a problem before the end user is even aware of it. For example, if application or website performance is slowing down by milliseconds, ITOps wants to be alerted and fix the issue before the slowdown worsens and affects users. AIOps enables the use of dynamic thresholds to ensure that alerts are generated automatically and then either routed to the correct team for investigation or auto-remediated when policies dictate.
7. Time savings and associated cost savings: The use of AIOps, whether to perform automation or to identify and resolve issues more quickly, results in savings in both operator time and business time to value. These savings have a direct impact on the bottom line.
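To make the first six KPIs concrete, here is a minimal Python sketch of how they might be computed from a list of incident records. It is an illustration under stated assumptions, not a description of how any AIOps product calculates them: the `Incident` fields and the `compute_kpis` function are hypothetical names, MTTR is measured from the start of the fault to restoration (one of several conventions in use), and availability assumes that every incident is a full outage and that incidents do not overlap.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean
from typing import Dict, List


@dataclass
class Incident:
    # Hypothetical field names; map them to your own ticketing or monitoring data.
    started: datetime        # when the fault actually began
    detected: datetime       # when monitoring or a user first flagged it
    acknowledged: datetime   # when a responder took ownership
    resolved: datetime       # when service was restored
    auto_resolved: bool      # True if fixed without manual intervention
    user_reported: bool      # True if a user reported it before monitoring did


def _mean_minutes(deltas: List[timedelta]) -> float:
    """Average a list of durations and return the result in minutes."""
    return mean(d.total_seconds() for d in deltas) / 60.0


def compute_kpis(incidents: List[Incident], period: timedelta) -> Dict[str, float]:
    """Compute KPIs 1-6 for the incidents that occurred within `period`."""
    if not incidents:
        raise ValueError("need at least one incident")
    n = len(incidents)
    # Assumption: each incident is a full outage and incidents do not overlap.
    downtime = sum((i.resolved - i.started for i in incidents), timedelta())
    return {
        "mttd_min": _mean_minutes([i.detected - i.started for i in incidents]),
        "mtta_min": _mean_minutes([i.acknowledged - i.detected for i in incidents]),
        # Assumption: MTTR runs from fault start to restoration.
        "mttr_min": _mean_minutes([i.resolved - i.started for i in incidents]),
        "availability_pct": 100.0 * (1 - downtime / period),
        "automated_pct": 100.0 * sum(i.auto_resolved for i in incidents) / n,
        "user_reported_pct": 100.0 * sum(i.user_reported for i in incidents) / n,
    }
```

Tracking these values period over period, before and after an AIOps rollout, gives a baseline for the time and cost savings described in KPI 7.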
These seven KPIs can be correlated to business KPIs around user experience, application performance, customer satisfaction, improved e-commerce sales, employee productivity, and increased revenue. ITOps teams need the ability to quickly connect the dots between infrastructure and business metrics so that IT is prioritizing spend and effort on real business needs. Hopefully, as machine learning matures, AIOps tools can recommend ways to improve business outcomes or provide insights as to why digital programs succeed or miss the mark.
This article is composed of industry information offered by Ciaran Byrne, VP of Product Management at OpsRamp.