Published in: IEEE International Conference on Networking, Sensing and Control
Root Cause Analysis of Concurrent Alarms Based on Random Walk over Anomaly Propagation Graph

Abstract

As Internet technology develops, IT systems are becoming increasingly complex. Their components are linked by two main kinds of relationship: service call relationships and deployment configuration relationships. Once a local anomaly occurs, it tends to spread through the system and trigger dense bursts of concurrent alarms, so it is important to locate the root cause of those alarms quickly and precisely. In this paper, we first construct an anomaly propagation graph from collected system data. Based on this graph, we then propose two alternative algorithms, random walk and state iteration, to trace the anomaly propagation process and locate the root cause. Simulation experiments demonstrate that the proposed method localizes root causes correctly and rapidly in scenarios with complex call chains and resource competition, and that it is robust to alarm errors. The method relies mainly on system characteristics and depends little on the empirical knowledge of IT operators.
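
The random-walk idea in the abstract can be illustrated with a small sketch. The abstract does not say how the graph is built or how transition probabilities are chosen, so everything below is an assumption made for illustration: the dependency map `depends_on`, the alarm set `alarmed`, the uniform choice among alarmed dependencies, the `restart` probability, and the function name `rank_root_causes` are all hypothetical. The walker starts at an alarmed component and repeatedly steps to one of that component's alarmed dependencies; components it visits most often are ranked as the likeliest root causes.

```python
import random
from collections import Counter

def rank_root_causes(depends_on, alarmed, steps=10_000, restart=0.1, seed=0):
    """Rank alarmed components by random-walk visit count (illustrative only).

    depends_on: dict mapping a component to the components it calls or is
                deployed on (assumed form of the anomaly propagation graph).
    alarmed:    set of components that currently raise alarms.
    """
    rng = random.Random(seed)          # seeded for reproducible runs
    alarmed_list = sorted(alarmed)     # stable order for rng.choice
    visits = Counter()
    node = rng.choice(alarmed_list)
    for _ in range(steps):
        visits[node] += 1
        # Assumption: an anomaly spreads from a dependency to its dependents,
        # so the walker traces it backwards, toward alarmed dependencies.
        candidates = [n for n in depends_on.get(node, []) if n in alarmed]
        if not candidates or rng.random() < restart:
            node = rng.choice(alarmed_list)   # dead end or random restart
        else:
            node = rng.choice(candidates)
    # Most-visited components first.
    return [name for name, _ in visits.most_common()]

# Hypothetical system: a frontend calls two services that share one database.
depends_on = {
    "frontend": ["order", "cart"],
    "order": ["db"],
    "cart": ["db", "cache"],
    "db": [],
    "cache": [],
}
alarmed = {"frontend", "order", "cart", "db"}   # "cache" is healthy

ranking = rank_root_causes(depends_on, alarmed)
print(ranking[0])
```

In this toy example every alarmed call chain ends at `db`, so the walk concentrates there and `db` comes out on top, while the healthy `cache` is never visited. A real deployment would need edge weights derived from the collected system data, which this sketch does not attempt to model.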
