2018 8th International Conference on Cloud Computing, Data Science & Engineering

Big Data Analytics Using Multi-Classifier Approach with Rhadoop

Abstract

Big Data refers to data generated in such volume and at such speed that it is very difficult to analyze with traditional tools. Hadoop provides distributed storage and processing to extract useful information from such huge data. R, on the other hand, is an open-source data analysis and programming language that facilitates statistical analysis and data visualization. But R is not scalable: its memory limitations make it difficult to process big data in R alone. To bring R's data visualization and data transformation capabilities to Big Data, in this paper we have integrated R with Hadoop using the RHadoop package and implemented MapReduce forms of the K-Nearest Neighbor, Naive Bayes and Decision Tree classifiers in R. We have also implemented a Multi-Classifier to improve the accuracy of classification. The Multi-Classifier combines the power of the individual classifiers to increase the efficiency and accuracy of classification. We have used a Bayesian combinatorial function and majority voting to combine the above-mentioned classifiers. We have found that the Multi-Classifier approach gives an improvement in measures such as precision, recall and accuracy.
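The two combination strategies named in the abstract can be illustrated with a small sketch. The abstract does not define its Bayesian combinatorial function, so the version below is an assumption: it uses the common product rule, which multiplies the per-classifier posterior probabilities (treating the classifiers as conditionally independent with uniform class priors) and renormalises. The function names `majority_vote` and `bayesian_product` are hypothetical, and the sketch is written in Python rather than the paper's R/RHadoop setting.

```python
from collections import Counter
from math import prod


def majority_vote(predictions):
    """Return the label predicted by most classifiers.

    Ties go to the tied label that appears first in `predictions`.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


def bayesian_product(posteriors):
    """Combine per-classifier posteriors with the product rule.

    ASSUMPTION: classifiers are conditionally independent and class priors
    are uniform; this is not necessarily the function used in the paper.
    `posteriors` is a list of dicts mapping each label to a probability.
    """
    labels = posteriors[0].keys()
    scores = {c: prod(p[c] for p in posteriors) for c in labels}
    total = sum(scores.values())
    if total == 0:
        raise ValueError("every label has zero combined probability")
    return {c: s / total for c, s in scores.items()}


# Hard labels from three classifiers (e.g. KNN, Naive Bayes, Decision Tree).
votes = ["spam", "ham", "spam"]
print(majority_vote(votes))

# Posterior estimates from the same three classifiers for one sample.
probs = [
    {"spam": 0.6, "ham": 0.4},
    {"spam": 0.3, "ham": 0.7},
    {"spam": 0.8, "ham": 0.2},
]
combined = bayesian_product(probs)
print(max(combined, key=combined.get), combined)
```

Majority voting needs only hard labels, whereas the product rule needs each classifier to output probabilities and lets a confident classifier outweigh two uncertain ones.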
