Journal: IEEE Transactions on Knowledge and Data Engineering

Monitoring High-Dimensional Data for Failure Detection and Localization in Large-Scale Computing Systems

Abstract

Processing high-dimensional measurements for failure detection and localization is a major challenge in large-scale computing systems. In information systems, however, these measurements are usually observed to lie on a low-dimensional structure embedded in the high-dimensional space. From this perspective, this paper proposes a novel approach that models the geometry of the underlying data generation and detects anomalies based on that model. We consider both linear and nonlinear data generation models. Two statistics, the Hotelling $T^2$ and the squared prediction error ($SPE$), are used to reflect data variations within and outside the model, respectively. We track the probability density of the extracted statistics to monitor the system's health. After a failure has been detected, a localization process identifies the attributes most likely to be related to the failure. Experimental results on both synthetic data and a real e-commerce application demonstrate the effectiveness of our approach in detecting and localizing failures in computing systems.
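
To make the two statistics concrete, here is a minimal sketch for the linear case only. The abstract does not say how the linear model is fitted, so this assumes a plain PCA subspace and the textbook definitions of $T^2$ (variation inside the subspace) and $SPE$ (squared residual outside it). The function names, the synthetic data, and the per-attribute residual ranking used for localization are all hypothetical illustrations. This is not the paper's procedure, which also covers nonlinear models and tracks the probability density of the statistics.

```python
import numpy as np

def fit_linear_model(X, k):
    """Fit a k-dimensional PCA subspace to normal-operation data X (n x d)."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD of the centered data: rows of Vt are the principal directions.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:k].T                            # d x k loading matrix
    var = (s[:k] ** 2) / (X.shape[0] - 1)   # variance along each kept direction
    return mean, P, var

def t2_and_spe(x, mean, P, var):
    """Return (T^2, SPE, residual) for one measurement vector x."""
    xc = x - mean
    scores = xc @ P                          # coordinates inside the model
    t2 = float(np.sum(scores ** 2 / var))    # variation within the model
    residual = xc - scores @ P.T             # part the model cannot explain
    spe = float(residual @ residual)         # variation outside the model
    return t2, spe, residual

def rank_suspicious_attributes(residual, top=3):
    """Assumed localization heuristic: rank attributes by squared residual."""
    return np.argsort(residual ** 2)[::-1][:top]

# Synthetic example (assumption): 6 attributes driven by 2 latent factors.
rng = np.random.default_rng(0)
A = np.array([[1, 1, 1, 0, 0, 0],
              [0, 0, 0, 1, 1, 1]], dtype=float)
X = rng.normal(size=(500, 2)) @ A + 0.05 * rng.normal(size=(500, 6))
mean, P, var = fit_linear_model(X, k=2)

x_faulty = X[0].copy()
x_faulty[4] += 5.0                           # inject a fault on attribute 4
t2, spe, residual = t2_and_spe(x_faulty, mean, P, var)
print("T2:", t2, "SPE:", spe)
print("most suspicious attributes:", rank_suspicious_attributes(residual))
```

In this sketch a fault that breaks the learned correlation structure mainly inflates $SPE$, while an unusually large excursion along the normal directions would mainly inflate $T^2$. Ranking squared residuals is only one simple way to point at suspicious attributes, and it can smear blame across correlated attributes.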