Conference: ACM Symposium on Applied Computing

Combining supervised and unsupervised monitoring for fault detection in distributed computing systems


Abstract

Fast and accurate fault detection is becoming an essential component of management software for mission-critical systems. A good fault detector makes it possible to initiate repair actions quickly, increasing the availability of the system. The contribution of this paper is twofold. First, a new concept of supervised and unsupervised monitoring is proposed for system fault detection. We use a statistical method, canonical correlation analysis (CCA), to model the contextual dependencies between system inputs u and internal behavior x. By means of CCA, the space of x is transformed into two subsets of variables, which are monitored in a supervised and an unsupervised manner, respectively. By doing so, our approach can reduce the false alarms resulting from unusual workload changes and hence achieve a high fault detection rate. Second, to test the performance of our approach, we simulate a variety of system faults in a real e-commerce application based on the multi-tiered J2EE architecture. Experimental results demonstrate that the CCA-based approach can detect injected failures at their early stages, when the unusual phenomena are still very weak, and hence contribute to enormous time and cost savings in managing large-scale distributed systems.
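
The abstract does not spell out how the two subsets of variables are built or scored, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that CCA is fitted on fault-free data, that the first k canonical variates of x (those correlated with the inputs u) are checked against a prediction from u (the "supervised" part), and that the remaining variates, which are uncorrelated with u, are checked with a simple sum of squares (the "unsupervised" part). The names `fit_cca_monitor`, `monitor_scores`, `k`, and `reg`, as well as both scoring rules, are hypothetical choices made for illustration.

```python
import numpy as np

def _inv_sqrt(S):
    """Inverse square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

def fit_cca_monitor(U, X, k, reg=1e-8):
    """Fit CCA on fault-free data; U is (n, p) inputs, X is (n, q) metrics.

    k (hypothetical parameter) is the number of input-correlated variates
    to monitor in the supervised way; it must not exceed min(p, q).
    """
    mu_u, mu_x = U.mean(axis=0), X.mean(axis=0)
    Uc, Xc = U - mu_u, X - mu_x
    n = U.shape[0]
    # Sample covariances, with a small ridge term (assumption) for stability.
    Suu = Uc.T @ Uc / (n - 1) + reg * np.eye(U.shape[1])
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Sux = Uc.T @ Xc / (n - 1)
    Wu, Wx = _inv_sqrt(Suu), _inv_sqrt(Sxx)
    # Singular values of the whitened cross-covariance are the
    # canonical correlations between u and x.
    L, rho, Rt = np.linalg.svd(Wu @ Sux @ Wx)
    return {
        "mu_u": mu_u, "mu_x": mu_x, "k": k, "rho": rho[:k],
        "A": Wu @ L[:, :k],  # (p, k) canonical directions for u
        "B": Wx @ Rt.T,      # (q, q) directions for x; variates have unit variance
    }

def monitor_scores(model, u, x):
    """Return (supervised, unsupervised) anomaly scores for one sample."""
    k = model["k"]
    z = (x - model["mu_x"]) @ model["B"]  # canonical variates of x
    # Best linear prediction of the first k variates from the inputs.
    predicted = ((u - model["mu_u"]) @ model["A"]) * model["rho"]
    supervised = float(np.sum((z[:k] - predicted) ** 2))
    unsupervised = float(np.sum(z[k:] ** 2))
    return supervised, unsupervised
```

Under these assumptions, alarm thresholds could be taken as high quantiles of both scores on the fault-free training data. A workload surge that still follows the learned relationship between u and x should leave both scores low, which is the false-alarm reduction the abstract describes, whereas a shift in x that the inputs do not explain should raise at least one of them.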
