Conference: Software Maintenance and Reengineering (CSMR), 2009 13th European Conference on

Automatic Failure Diagnosis Support in Distributed Large-Scale Software Systems Based on Timing Behavior Anomaly Correlation

Abstract

Manual failure diagnosis in large-scale software systems is time-consuming and error-prone. Automatic failure diagnosis support mechanisms can potentially narrow down, or even localize, faults within a very short time, which helps to preserve system availability. A large class of automatic failure diagnosis approaches consists of two steps: 1) computation of component anomaly scores; 2) global correlation of the anomaly scores for fault localization. In this paper, we present an architecture-centric approach for the second step. In our approach, component anomaly scores are correlated based on architectural dependency graphs of the software system and a rule set that addresses error propagation. Moreover, the results are visualized graphically in order to support fault localization and to enhance maintainability. The visualization combines architectural diagrams, automatically derived from monitoring data, with failure diagnosis results. In a case study, the approach is applied to a distributed sample Web application that is subject to fault injection.
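
The second step described in the abstract, correlating component anomaly scores over a dependency graph, might be sketched roughly as follows. This is only an illustration and not the paper's rule set, which the abstract does not spell out. The single rule used here is an assumption: an anomalous component whose callee is also anomalous is treated as a likely victim of error propagation, and its score is discounted. The names `correlate`, `rank_suspects`, the `threshold` parameter, the component names, and the example scores are all hypothetical.

```python
from typing import Dict, List


def correlate(scores: Dict[str, float],
              calls: Dict[str, List[str]],
              threshold: float = 0.5) -> Dict[str, float]:
    """Adjust per-component anomaly scores using a caller -> callees graph.

    Assumed rule (hypothetical, not taken from the paper): if a component
    and at least one of its callees are both anomalous, the component's
    anomaly is probably propagated from the callee, so its score is
    scaled down by the worst callee score.
    """
    adjusted: Dict[str, float] = {}
    for component, score in scores.items():
        callee_scores = [scores[c] for c in calls.get(component, []) if c in scores]
        worst_callee = max(callee_scores, default=0.0)
        if score >= threshold and worst_callee >= threshold:
            # Likely a propagated anomaly: discount it.
            adjusted[component] = score * (1.0 - worst_callee)
        else:
            # No anomalous callee: keep the score unchanged.
            adjusted[component] = score
    return adjusted


def rank_suspects(adjusted: Dict[str, float]) -> List[str]:
    """Order components from most to least suspicious."""
    return sorted(adjusted, key=lambda name: adjusted[name], reverse=True)


# Made-up anomaly scores in [0, 1] for a small layered application.
example_scores = {"web": 0.8, "service": 0.9, "db": 0.9, "cache": 0.05}
example_calls = {"web": ["service"], "service": ["db", "cache"], "db": [], "cache": []}

ranking = rank_suspects(correlate(example_scores, example_calls))
print(ranking)
```

In this made-up example the database is ranked first, because the anomalies observed in the web and service layers are attributed to propagation from a component they depend on.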