Research on improved K-nearest neighbor algorithm based on Spark platform

Abstract

Big data technology is growing rapidly. The advent of Hadoop drew attention to MapReduce, while Spark, by introducing the RDD data model and a memory-based computing model, adapts well to data mining on big data and outperforms Hadoop in iterative computing; it has quickly become a research focus for many enterprises and scholars. The K-nearest neighbor algorithm (hereafter KNN) is a very important classification algorithm. It has been widely studied, but there is no mature solution for parallelizing it on the Spark platform. In this paper, the authors implement a parallelized, improved KNN on the Spark platform. A clustering algorithm is used to find the weight of each training sample in the training set, and these weights are then used to differentiate the K nearest neighbors of the test sample. Experiments show that the improved KNN achieves better accuracy.
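The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the clustering algorithm or the weight formula, so the sketch assumes per-class centroids as a stand-in for the clustering step and a hypothetical weight of 1 / (1 + distance to the sample's own class centroid). It uses plain Python lists in place of Spark partitions; on Spark, the per-partition step would roughly correspond to an RDD `mapPartitions` call followed by a merge on the driver. All function names are illustrative.

```python
import heapq
import math
from collections import defaultdict


def centroid_weights(samples):
    """Assumed weighting: 1 / (1 + distance to the sample's own class centroid)."""
    sums, counts = {}, defaultdict(int)
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] += 1
    centroids = {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}
    return [1.0 / (1.0 + math.dist(f, centroids[lab])) for f, lab in samples]


def local_top_k(partition, query, k):
    """Map step: the k nearest (distance, weight, label) triples in one partition."""
    scored = ((math.dist(f, query), w, lab) for f, w, lab in partition)
    return heapq.nsmallest(k, scored)


def classify(partitions, query, k):
    """Reduce step: merge per-partition candidates, then take a weighted vote."""
    candidates = [c for part in partitions for c in local_top_k(part, query, k)]
    votes = defaultdict(float)
    for _dist, weight, label in heapq.nsmallest(k, candidates):
        votes[label] += weight
    return max(votes, key=votes.get)


# Toy training set (made up for illustration, not data from the paper).
train = [
    ((0.0, 0.0), "a"), ((0.2, 0.0), "a"), ((0.0, 0.2), "a"),
    ((4.0, 4.0), "a"),  # outlier of class "a", far from its class centroid
    ((3.0, 3.0), "b"), ((3.2, 3.0), "b"), ((3.0, 3.2), "b"),
]
weights = centroid_weights(train)
weighted = [(f, w, lab) for (f, lab), w in zip(train, weights)]
partitions = [weighted[:4], weighted[4:]]  # stand-in for two Spark partitions

print(classify(partitions, (0.1, 0.1), 3))
print(classify(partitions, (3.7, 3.7), 2))
```

For the query (3.7, 3.7) with K = 2, the two nearest neighbors are the class "a" outlier and one class "b" sample, so an unweighted vote would tie. Under the assumed weighting the outlier carries a low weight, and the weighted vote favors "b". Because each partition returns its own K nearest candidates, merging them yields the same neighbors as a single pass over the whole training set.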

Bibliographic details

Source
Joint International Information Technology and Mechanical and Electronic Engineering Conference | 2017 | p. 651 | 5 pages
Authors
Yushui Geng; Xianzhao Yan
Original format: PDF
Chinese Library Classification: TN-53
Keywords
Big data; Hadoop; Spark; K-Nearest Neighbor Algorithm; Weight

Similar literature

1. Bark Classification of Trees Using K-Nearest Neighbor & Nearest Neighbor Algorithms [J]. Muhammad Tariq Muhammad Ibrahim. Computer Engineering and Intelligent Systems. 2016, No. 1
2. Identification of Canola Seeds using Nearest Neighbor and K-Nearest Neighbor Algorithms [J]. Altaf Saeed. Computer Engineering and Intelligent Systems. 2015, No. 10
3. Identification of Canola Seeds using Nearest Neighbor and K-Nearest Neighbor Algorithms [J]. Altaf Saeed. Journal of Economics and Sustainable Development. 2015, No. 10
4. Research on improved K-nearest neighbor algorithm based on Spark platform [C]. Yushui Geng, Xianzhao Yan. Joint International Information Technology and Mechanical and Electronic Engineering Conference. 2017
5. Voting Nearest Neighbors: SVM Constraints Selection Algorithm Based on K-Nearest Neighbors [D]. Moreira da Costa, Leandro. 2019
6. Adaptive Residual Weighted K-Nearest Neighbor Fingerprint Positioning Algorithm Based on Visible Light Communication [O]. Shiwu Xu, Chih-Cheng Chen, Yi Wu. 2020
7. Research on improved K-nearest neighbor algorithm based on Spark platform [O]. Yushui Geng, Xianzhao Yan. 2017
