Journal of Supercomputing

Data flow analysis for anomaly detection and identification toward resiliency in extreme scale systems

Abstract

The increased complexity and scale of high performance computing and future extreme-scale systems have made resilience a key issue, since future systems are expected to experience various faults during critical operations. Current resiliency solutions, which rely mainly on checkpointing in hardware and applications, are also expected to become infeasible because checkpointing and restarting impose unacceptable recovery times. In this paper, we present innovative concepts for anomaly detection and identification that analyze the duration of pattern transition sequences within an execution window. We use a three-dimensional array of features to capture spatial and temporal variability. An anomaly analysis system uses these features to generate an alert immediately and to identify the source of the fault when it captures an abnormal behavior pattern, which indicates some kind of software or hardware failure. The main contributions of this paper are the innovative analysis methodology and the feature selection used to detect and identify anomalous behavior. An evaluation with asynchronously injected faults shows a detection rate above 99.9% with no false alarms across a wide range of scenarios, and an accuracy rate of 100% with short root cause analysis time.
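The abstract does not spell out the algorithm, so the following is only a minimal sketch of one possible reading of "analyzing the duration of pattern transition sequences of an execution window". It assumes each monitored stream has already been reduced to a sequence of pattern labels, learns the range of durations seen for each transition in fault-free windows, and flags windows that contain an unseen transition or an out-of-range duration. All names (`transitions`, `learn_baseline`, `check_window`, `locate`), the min/max rule, and the layout `features[node][feature][time]` are illustrative assumptions, not the authors' method.

```python
def transitions(seq):
    """Collapse a label sequence into (from, to, duration) triples.

    duration = number of consecutive samples the 'from' pattern lasted.
    """
    out = []
    run_start = 0
    for i in range(1, len(seq)):
        if seq[i] != seq[i - 1]:
            out.append((seq[i - 1], seq[i], i - run_start))
            run_start = i
    return out


def learn_baseline(normal_windows):
    """Record min/max duration per transition over fault-free windows (assumed rule)."""
    bounds = {}
    for window in normal_windows:
        for src, dst, dur in transitions(window):
            lo, hi = bounds.get((src, dst), (dur, dur))
            bounds[(src, dst)] = (min(lo, dur), max(hi, dur))
    return bounds


def check_window(window, bounds):
    """Return the anomalous transitions found in one execution window."""
    anomalies = []
    for src, dst, dur in transitions(window):
        if (src, dst) not in bounds:
            anomalies.append((src, dst, dur, "unseen transition"))
        else:
            lo, hi = bounds[(src, dst)]
            if not lo <= dur <= hi:
                anomalies.append((src, dst, dur, "unusual duration"))
    return anomalies


def locate(features, bounds_per_feature):
    """Scan a hypothetical 3-D layout features[node][feature][time].

    Returns {(node, feature): anomalies} for every stream that deviates,
    which points at where the abnormal pattern was observed.
    """
    hits = {}
    for node, streams in enumerate(features):
        for feat, seq in enumerate(streams):
            found = check_window(seq, bounds_per_feature[feat])
            if found:
                hits[(node, feat)] = found
    return hits


if __name__ == "__main__":
    baseline = learn_baseline(["AAABBC", "AABBBC"])
    # Two nodes, one feature each; node 1 stays in pattern A far too long.
    window = [["AAABBC"], ["AAAAAAABBC"]]
    print(locate(window, [baseline]))
```

A min/max range is the simplest possible normal-behavior model; a real detector would more likely use a statistical or learned model of durations, and the reported detection and accuracy figures belong to the paper's own method, not to this sketch.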
