Conference: International Conference on Cloud Computing and Big Data

Analyzing and Predicting Failure in Hadoop Clusters Using Distributed Hidden Markov Model


Abstract

In this paper, we propose a novel approach to analyzing and predicting failures in Hadoop clusters. We enumerate several key challenges that hinder failure prediction in such systems: heterogeneity of the system, hidden complexity, time limitations, and scalability. First, a clustering approach is applied to group similar error sequences, which makes training of the model effective. Subsequently, Hidden Markov Models (HMMs) are used to predict failures, using the MapReduce programming framework. The effectiveness of the failure prediction algorithm is measured by precision, recall, and accuracy metrics. Our algorithm can predict failures 2 days in advance with an accuracy of 91%, using 87% of the data as the training set. Although the model presented in this paper focuses on Hadoop clusters, it can be generalized to other cloud computing frameworks as well.
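The abstract names precision, recall, and accuracy as its evaluation metrics but does not spell out how they are computed. The following is a minimal sketch using the standard definitions, assuming each prediction window is labeled as failure (True) or no failure (False). The names `ConfusionCounts` and `count_outcomes` are hypothetical, and the sample labels are illustrative only, not data from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ConfusionCounts:
    """Outcome counts for a binary failure predictor (hypothetical helper)."""
    tp: int  # failure predicted, failure occurred
    fp: int  # failure predicted, no failure occurred
    fn: int  # no failure predicted, failure occurred
    tn: int  # no failure predicted, no failure occurred


def count_outcomes(predicted: Sequence[bool], actual: Sequence[bool]) -> ConfusionCounts:
    """Tally outcomes over paired prediction/ground-truth labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(not p and a for p, a in zip(predicted, actual))
    tn = sum(not p and not a for p, a in zip(predicted, actual))
    return ConfusionCounts(tp, fp, fn, tn)


def precision(c: ConfusionCounts) -> float:
    """Fraction of predicted failures that were real failures."""
    denom = c.tp + c.fp
    return c.tp / denom if denom else 0.0


def recall(c: ConfusionCounts) -> float:
    """Fraction of real failures that were predicted."""
    denom = c.tp + c.fn
    return c.tp / denom if denom else 0.0


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all windows that were classified correctly."""
    total = c.tp + c.fp + c.fn + c.tn
    return (c.tp + c.tn) / total if total else 0.0


# Illustrative labels only (not results from the paper).
predicted = [True, True, False, False, True]
actual = [True, False, False, True, True]
counts = count_outcomes(predicted, actual)
print(precision(counts), recall(counts), accuracy(counts))
```

Accuracy alone can look high when failures are rare, because a predictor that never raises an alarm is still correct most of the time. Reporting precision and recall alongside it, as the abstract does, guards against that.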
