首页> 中文期刊> 《计算机学报》 >云计算系统中基于伴随状态追踪的故障检测机制

云计算系统中基于伴随状态追踪的故障检测机制

         

摘要

在运行时检测分布式系统内所产生的故障需要事先获得故障特征模型.构造故障特征模型的常见做法为将故障注入系统并根据随后系统内所产生的特征症状(如异常事件日志)建模.已有建模方法通常使用从故障发生到给定时间窗口之内的特征症状.然而,根据真实系统观察,不同故障的传播影响时间相差很大,且故障特征会在故障传播过程中发生改变.因此,已有方法对检测时间窗口之后发的故障特征症状不能识别或会产生大量错误报警.为了解决此问题,文中提出一种基于故障注入测试的故障特征提取方法,该方法主要由3步组成:(1)过滤噪声日志;(2)构造1个故障识别器识别不同故障的早期特征;(3)为每类故障构造限状态追踪器追踪该故障的后期传播状态,从而在故障被识别出来后持续跟踪故障传播状态.通过在企业级云计算系统中进行实验验证,与已有方法相比该文方法具备更高的故障检测精确度.%A common way to construct a fault model is injecting the fault into the system and ob-serving the subsequent symptoms, e. G. Event logs. However, fault features would vary during the propagation period, and present different symptoms at different stage of the fault propagation process. The exiting detection window based feature extraction methods can only identify the ear-ly symptoms of a fault, but fail to detect the latter symptoms and cause false alarms. To solve the problem, we present a fault feature extraction method, called Companion State Tracer (CSTracer) ,which consists of 3 integrated steps: (1) pre-process logs to remove the unrelated logst (2) con-struct a general identifier for the early symptoms of a fau (3) construct a finite state machine model for the fault to trace the latter symptoms. CSTracer can persistently monitor a fault after the fault has been identified. We have justified the effectiveness of CSTracer in an enterprise cloud system. Compared with the existing, the results show that CSTracer has a better detection accuracy.

著录项

相似文献

  • 中文文献
  • 外文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号