International Conference on Cloud Computing and Big Data

Analyzing and Predicting Failure in Hadoop Clusters Using Distributed Hidden Markov Model

Abstract

In this paper, we propose a novel approach to analyzing and predicting failures in Hadoop clusters. We enumerate several key challenges that hinder failure prediction in such systems: heterogeneity of the system, hidden complexity, time limitation, and scalability. First, a clustering approach is applied to group similar error sequences, which makes training of the model more effective. Subsequently, Hidden Markov Models (HMMs) are used to predict failures, implemented with the MapReduce programming framework. The effectiveness of the failure prediction algorithm is measured by precision, recall, and accuracy. Our algorithm can predict failures 2 days in advance with an accuracy of 91%, using 87% of the data as the training set. Although the model presented in this paper focuses on Hadoop clusters, it can be generalized to other cloud computing frameworks as well.
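
As an illustration of how an HMM can score an error sequence, the following is a minimal single-machine sketch. It is not the authors' implementation. It assumes one common setup in which two HMMs are trained, one on error sequences that preceded a failure and one on sequences that did not, and a new sequence is flagged when it is more likely under the failure model. The abstract does not state that the paper uses this exact decision rule. All parameter values, the three-symbol error alphabet, and the names `log_likelihood`, `FAILURE_MODEL`, `NORMAL_MODEL`, and `predicts_failure` are hypothetical; the clustering step and the MapReduce distribution are omitted.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: return log P(obs | HMM).

    obs   -- list of observation symbol indices (here: error-type ids)
    start -- start[i] is the initial probability of hidden state i
    trans -- trans[i][j] is the probability of moving from state i to j
    emit  -- emit[i][k] is the probability of state i emitting symbol k
    """
    if not obs:
        return 0.0  # the empty sequence has probability 1
    n_states = len(start)
    # Initialisation: alpha_1(i) = start_i * emit_i(o_1)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n_states)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Induction: alpha_t(j) = (sum_i alpha_{t-1}(i) * trans_ij) * emit_j(o_t)
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n_states)) * emit[j][obs[t]]
                for j in range(n_states)
            ]
        # Rescale to avoid underflow; the log of each scale factor accumulates
        # to log P(obs).
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        alpha = [a / scale for a in alpha]
        log_prob += math.log(scale)
    return log_prob


# Hypothetical toy parameters: 2 hidden states, 3 error types (0, 1, 2).
# In practice these would be learned from training data (e.g. with Baum-Welch).
FAILURE_MODEL = dict(
    start=[0.5, 0.5],
    trans=[[0.7, 0.3], [0.2, 0.8]],
    emit=[[0.2, 0.3, 0.5], [0.1, 0.2, 0.7]],
)
NORMAL_MODEL = dict(
    start=[0.8, 0.2],
    trans=[[0.9, 0.1], [0.5, 0.5]],
    emit=[[0.7, 0.2, 0.1], [0.5, 0.3, 0.2]],
)


def predicts_failure(obs, threshold=0.0):
    """Flag a sequence when its log-likelihood ratio exceeds the threshold."""
    ratio = log_likelihood(obs, **FAILURE_MODEL) - log_likelihood(obs, **NORMAL_MODEL)
    return ratio > threshold
```

With these toy parameters, a sequence dominated by error type 2, such as `[2, 2, 1, 2]`, is flagged, while one dominated by error type 0, such as `[0, 0, 1, 0]`, is not. Raising `threshold` trades recall for precision, which is how a decision rule of this kind connects to the evaluation metrics named in the abstract.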