Conference: International Conference on Frontiers of Materials and Smart System Technologies

ANALYSIS OF KNN ALGORITHM WITH MAPREDUCE TECHNIQUE ON BIG DATA

Abstract

Due to the fast growth of new technology applications such as social media analysis, web data analysis, and medical information network analysis, many types of data are now processed frequently. Managing and analyzing such large amounts of data effectively is a vital goal. To reduce data processing complexity, time complexity, and space complexity in Big Data, this paper proposes using the k-nearest neighbor (KNN) join operation. A KNN join finds, for each query point, its k nearest points in a dataset S. It is a computationally intensive task with a wide range of applications, such as knowledge discovery and data mining. As the volume and dimension of the data increase, only distributed approaches can perform such large operations in a given time. Recent work has focused on implementing efficient solutions with the MapReduce programming model, because it is designed for distributed large-scale data processing. Although these works provide different solutions to the same problem, each has its own constraints and properties. This paper compares the existing approaches to computing KNN on MapReduce. First, it decomposes the solutions into three steps for KNN computation on MapReduce: 1) data processing, 2) data partitioning, and 3) computation. The experiments in this paper use a variety of data sets and analyze the effect of data volume, data dimension, and the value of k from several perspectives, such as time complexity, space complexity, and accuracy.
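The partition, map, and reduce structure described in the abstract can be illustrated with a small single-process sketch. It is not taken from the paper: it assumes the simplest exact scheme (every block of query points is compared against every block of S, and the local results are merged), it skips the data processing step, and it runs in plain Python instead of on a MapReduce cluster. The query set R and all function names (partition, map_local_knn, reduce_merge, knn_join) are hypothetical and chosen only for illustration.

```python
import heapq
import math
from collections import defaultdict
from itertools import product


def partition(points, n_blocks):
    """Partitioning step: split a list of (id, vector) pairs round-robin."""
    blocks = [[] for _ in range(n_blocks)]
    for i, item in enumerate(points):
        blocks[i % n_blocks].append(item)
    return blocks


def map_local_knn(r_block, s_block, k):
    """Map task: for each query point in r_block, emit its k nearest
    candidates found inside this single block of S."""
    for r_id, r_vec in r_block:
        local = heapq.nsmallest(
            k, ((math.dist(r_vec, s_vec), s_id) for s_id, s_vec in s_block)
        )
        yield r_id, local


def reduce_merge(candidate_lists, k):
    """Reduce task: merge the per-block candidates of one query point
    and keep the k nearest overall."""
    return heapq.nsmallest(k, (c for lst in candidate_lists for c in lst))


def knn_join(R, S, k, n_blocks=2):
    """Exact KNN join of R against S, simulated in one process."""
    r_blocks = partition(R, n_blocks)
    s_blocks = partition(S, n_blocks)

    # Simulated shuffle: group the map outputs by query-point id.
    grouped = defaultdict(list)
    for r_block, s_block in product(r_blocks, s_blocks):
        for r_id, local in map_local_knn(r_block, s_block, k):
            grouped[r_id].append(local)

    return {
        r_id: [s_id for _, s_id in reduce_merge(lists, k)]
        for r_id, lists in grouped.items()
    }


if __name__ == "__main__":
    # Toy data, invented for this example.
    R = [("q1", (0.0, 0.0)), ("q2", (5.0, 5.0))]
    S = [("a", (1.0, 0.0)), ("b", (0.0, 2.0)),
         ("c", (4.0, 5.0)), ("d", (9.0, 9.0))]
    print(knn_join(R, S, k=2))
```

Because each map task returns the k best candidates of its own block, the merged result is exact regardless of how S is split. The cost is that every pair of blocks is compared, which is the kind of trade-off between time, space, and accuracy that the compared approaches handle differently.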