THE CHALLENGE
The client’s environment comprised more than 100 subsystems with over 10,000 components. The systems generated between one and four million alerts a day, with an accuracy of 45%. The client wanted to raise alert accuracy to 75% through the use of predictive analytics.
THE SOLUTION
The solution included organizing the existing workflows and knowledge base, labeling data for common issues and resolutions, and then building machine learning (ML) models on that data to identify anomalies and find associations.
By combining repeating alerts and categorizing similar ones, the volume of alerts was reduced by 95.2%. Efficiency was further improved by automating the alert-handling process: automated detection and diagnosis were implemented to minimize human involvement. Alerts were also batch-processed to increase throughput.
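The consolidation step described above can be illustrated with a minimal sketch. It is not the implementation used in this engagement: the alert shape `(timestamp, source, message)`, the `normalize` and `consolidate` helpers, and the 300-second window are all assumptions chosen for illustration. Repeating alerts from the same source are collapsed within a time window, and similar alerts are grouped by a template with the numbers masked out.

```python
import re


def normalize(message):
    """Mask digits so alerts that differ only in numbers share one template."""
    return re.sub(r"\d+", "<N>", message.lower()).strip()


def consolidate(alerts, window_seconds=300):
    """Collapse repeating or similar alerts into groups.

    `alerts` is an iterable of (timestamp_seconds, source, message) tuples.
    This shape and the 300-second default are hypothetical, for illustration.
    """
    groups = []        # consolidated alerts, in order of first occurrence
    open_groups = {}   # (source, template) -> most recent group for that key
    for ts, source, message in sorted(alerts):
        key = (source, normalize(message))
        group = open_groups.get(key)
        if group is not None and ts - group["last_seen"] <= window_seconds:
            # Same source and template, close in time: treat as a repeat.
            group["count"] += 1
            group["last_seen"] = ts
        else:
            group = {
                "source": source,
                "template": key[1],
                "first_seen": ts,
                "last_seen": ts,
                "count": 1,
            }
            open_groups[key] = group
            groups.append(group)
    return groups
```

Called on four raw alerts where two are near-identical repeats from the same source, `consolidate` would return three groups, and the ratio of groups to raw alerts gives the volume reduction for that sample.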
At the system and data level, EveryIT performed association analysis that identified 40 hidden relationships among alerts. Combined with the expert system, this work enabled comprehensive data collection and integration, which in turn facilitated root cause analysis.
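As a rough illustration of what association analysis over alerts can look like, the sketch below counts how often pairs of alert types occur in the same time window and keeps the pairs that pass support and confidence thresholds. The source does not state which algorithm or thresholds were used, so the pairwise approach, the `find_associations` helper, the input shape, and the default thresholds are all assumptions.

```python
from collections import Counter
from itertools import combinations


def find_associations(windows, min_support=0.5, min_confidence=0.8):
    """Return (antecedent, consequent, support, confidence) rules.

    `windows` is a list of collections of alert types seen together in one
    time window. Input shape and thresholds are hypothetical.
    """
    n = len(windows)
    item_counts = Counter()
    pair_counts = Counter()
    for window in windows:
        types = set(window)  # count each alert type once per window
        item_counts.update(types)
        pair_counts.update(combinations(sorted(types), 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of windows containing both types
        if support < min_support:
            continue
        # Check the rule in both directions: a -> b and b -> a.
        for antecedent, consequent in ((a, b), (b, a)):
            confidence = count / item_counts[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    return sorted(rules)
```

A rule such as `("disk_full", "db_slow", 0.75, 1.0)` would read as: both alert types appear together in 75% of windows, and every window containing `disk_full` also contains `db_slow`. Relationships of this kind are candidates for review during root cause analysis, not proof of causation.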
Through accurate prediction, anomaly detection, association analysis, and resolution recommendation, the client achieved several important efficiency improvements:
- 30% reduction in labor cost
- 50% reduction in false alerts
- 80% reduction in time to mitigate
- Reduction in mishandled incidents from 2% to 0.5%