Map Reduce by K-Nearest Neighbor Joins

Abstract

Knowledge discovery and data mining involve computationally intensive tasks with a wide range of applications. As the volume and dimensionality of data increase, distributed approaches are needed to complete these operations in a reasonable time. The MapReduce programming model is suited to distributed large-scale data processing and admits different solutions to the same problem, each with its own constraints and properties. In this paper, we present a comparative analysis of approaches for computing KNN on MapReduce [1], both theoretically and through experimental evaluation. Load balancing, accuracy, and complexity are analyzed for each step: data preprocessing, data partitioning, and computation. The experimental results are produced using a variety of datasets. Time and space complexity are analyzed on each dataset, and the advantages and shortcomings of each algorithm are discussed. Finally, this paper can serve as reference material for handling KNN-based [2] problems with MapReduce in Big Data.
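To make the partition/compute/merge structure mentioned in the abstract concrete, the following is a minimal single-process sketch of a block-based KNN join written in a map/reduce style. It is an illustration under stated assumptions, not the method evaluated in the paper: the round-robin partitioning, the Euclidean distance, and all function names (`partition`, `map_local_knn`, `reduce_merge`, `knn_join`) are hypothetical choices made here, and a real job would run the map and reduce steps on a cluster framework rather than in a local loop.

```python
import heapq
from collections import defaultdict
from itertools import product
from math import dist  # Euclidean distance, Python 3.8+


def partition(points, n_blocks):
    """Data partitioning: split (id, vector) pairs into round-robin blocks (assumed scheme)."""
    blocks = [[] for _ in range(n_blocks)]
    for i, point in enumerate(points):
        blocks[i % n_blocks].append(point)
    return blocks


def map_local_knn(r_block, s_block, k):
    """Map step: for one (R block, S block) pair, emit each r's k nearest s within that S block."""
    for r_id, r_vec in r_block:
        candidates = [(dist(r_vec, s_vec), s_id) for s_id, s_vec in s_block]
        yield r_id, heapq.nsmallest(k, candidates)


def reduce_merge(r_id, partial_lists, k):
    """Reduce step: merge all partial candidate lists for one r into its global k nearest."""
    merged = [cand for partial in partial_lists for cand in partial]
    return r_id, heapq.nsmallest(k, merged)


def knn_join(R, S, k, n_blocks=2):
    """Simulate the full job locally: partition, map over block pairs, shuffle by key, reduce."""
    r_blocks = partition(R, n_blocks)
    s_blocks = partition(S, n_blocks)
    shuffled = defaultdict(list)  # stands in for the framework's shuffle phase
    for r_block, s_block in product(r_blocks, s_blocks):
        for r_id, partial in map_local_knn(r_block, s_block, k):
            shuffled[r_id].append(partial)
    return dict(reduce_merge(r_id, parts, k) for r_id, parts in shuffled.items())


R = [("r1", (0, 0)), ("r2", (10, 10))]
S = [("s1", (1, 0)), ("s2", (0, 2)), ("s3", (9, 10)), ("s4", (20, 20))]
result = knn_join(R, S, k=2)
```

Because every S block contributes its own local top-k for each r, the merged result is exact. The cost is that every R block is paired with every S block, which is the kind of load and complexity trade-off a comparative study of such algorithms would examine.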