Journal: Journal of Theoretical and Applied Information Technology

INCREMENTAL PARALLEL CLASSIFIER FOR BIG DATA WITH CASE STUDY: NAIVE BAYES USING MAPREDUCE PATTERNS



Abstract

Classification methods can derive value from big data in the form of models, which can then be used to predict new cases. Several parallel classification methods for big data have been developed on Hadoop MapReduce as well as on Spark. As big data keeps arriving, the models must be updated from time to time so that they represent both the old and the new data. The computations must be efficient and scalable to handle big data. This research aims to enhance existing parallel classifiers so that they operate as incremental classifiers that handle batches of big data. The results are presented as follows. First, the architecture and main concept of the enhancement are presented. Second, the proposed incremental parallel Naïve Bayes classifier (NBC), which is based on MapReduce and handles datasets with discrete attributes, is discussed in detail. Two series of experiments were performed on Hadoop clusters with 5 and 10 nodes. The results show that the incremental parallel NBC has acceptable accuracy and is efficient and scalable.
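
The abstract does not spell out the update mechanism, so the following is only a minimal sketch of one common way to make a Naïve Bayes classifier over discrete attributes incremental. It assumes the model is stored as class and attribute-value counts, so that a new batch can be counted on its own in a map/reduce style and then added to the stored counts. The function names (`map_batch`, `reduce_counts`, `update_model`, `predict`), the in-memory `Counter` model, and the Laplace smoothing are illustrative assumptions, not the paper's design or a Hadoop API.

```python
from collections import Counter
from functools import reduce
import math


def map_batch(records):
    """Map phase (hypothetical): count classes and attribute values in one partition."""
    counts = Counter()
    for features, label in records:
        counts[("class", label)] += 1
        for index, value in enumerate(features):
            counts[("attr", index, value, label)] += 1
    return counts


def reduce_counts(partials):
    """Reduce phase (hypothetical): sum the partial counts key by key."""
    return reduce(lambda a, b: a + b, partials, Counter())


def update_model(model, batch, partitions=2):
    """Incremental step: count only the new batch, then add it to the stored counts."""
    chunks = [batch[i::partitions] for i in range(partitions)]
    return model + reduce_counts(map_batch(chunk) for chunk in chunks)


def predict(model, features, values_per_attr):
    """Return the class with the highest Laplace-smoothed log posterior.

    values_per_attr[i] is the assumed number of distinct values of attribute i.
    """
    labels = [key[1] for key in model if key[0] == "class"]
    total = sum(model[("class", label)] for label in labels)

    def score(label):
        n = model[("class", label)]
        s = math.log(n / total)
        for index, value in enumerate(features):
            seen = model[("attr", index, value, label)]
            s += math.log((seen + 1) / (n + values_per_attr[index]))
        return s

    return max(labels, key=score)
```

Because the counts are additive, updating with a second batch in this sketch gives the same model as retraining on both batches together, while only the new batch is scanned.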
