Map Reduce by K-Nearest Neighbor Joins

Abstract

Knowledge discovery and data mining play a major role in computationally intensive tasks across a wide range of applications. As the volume and dimensionality of data increase, distributed processing is needed to complete operations in a reasonable time. MapReduce programming is well suited to distributed large-scale data processing and admits different solutions to the same problem, each with its own constraints and properties. In this paper, we give a comparative analysis of approaches for computing KNN on MapReduce [1], both theoretically and through experimental evaluation. Load balancing, accuracy, and complexity are analyzed at each step: data preprocessing, data partitioning, and computation. The experimental results are produced using a variety of datasets. Time and space complexity are analyzed on each dataset, and the advantages and shortcomings of each algorithm are discussed. Finally, this paper can serve as reference material for handling KNN-based [2] problems with MapReduce in Big Data.
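The three steps named in the abstract (preprocessing, partitioning, computation) can be illustrated with a small single-process sketch of a block-based KNN join written in map/reduce style. This is not taken from the paper: the round-robin partitioning, the two-stage reduce, and all function names (`partition`, `map_phase`, `reduce_local`, `reduce_merge`, `knn_join`) are assumptions chosen for illustration, and a real MapReduce job would run the reducers on separate workers.

```python
import heapq
import math
from collections import defaultdict
from itertools import product


def partition(points, n_blocks):
    """Partitioning step: split (id, vector) records into round-robin blocks."""
    blocks = [[] for _ in range(n_blocks)]
    for idx, rec in enumerate(points):
        blocks[idx % n_blocks].append(rec)
    return blocks


def map_phase(r_blocks, s_blocks):
    """Map: emit one keyed pair per (R block, S block) combination."""
    for (i, r_blk), (j, s_blk) in product(enumerate(r_blocks), enumerate(s_blocks)):
        yield (i, j), (r_blk, s_blk)


def reduce_local(r_blk, s_blk, k):
    """First reduce: for each r, keep its k nearest candidates within one S block."""
    for r_id, r_vec in r_blk:
        cands = [(math.dist(r_vec, s_vec), s_id) for s_id, s_vec in s_blk]
        yield r_id, heapq.nsmallest(k, cands)


def reduce_merge(local_results, k):
    """Second reduce: merge per-block candidates into the global k nearest per r."""
    grouped = defaultdict(list)
    for r_id, cands in local_results:
        grouped[r_id].extend(cands)
    return {r_id: heapq.nsmallest(k, cands) for r_id, cands in grouped.items()}


def knn_join(R, S, k, n_blocks=2):
    """Hypothetical driver: runs both stages sequentially in one process."""
    r_blocks = partition(R, n_blocks)
    s_blocks = partition(S, n_blocks)
    local = []
    for _key, (r_blk, s_blk) in map_phase(r_blocks, s_blocks):
        local.extend(reduce_local(r_blk, s_blk, k))
    merged = reduce_merge(local, k)
    # Return neighbor ids only, nearest first.
    return {r_id: [s_id for _d, s_id in cands] for r_id, cands in merged.items()}


if __name__ == "__main__":
    R = [("r1", (0.0, 0.0)), ("r2", (10.0, 10.0))]
    S = [("s1", (1.0, 0.0)), ("s2", (0.0, 2.0)), ("s3", (9.0, 9.0)), ("s4", (20.0, 20.0))]
    print(knn_join(R, S, k=2))
```

Because every R block is paired with every S block, this sketch is exact but shuffles each block `n_blocks` times. That is the kind of trade-off between accuracy, load balancing, and complexity the abstract says is compared across algorithms.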