Implementation of the K-means Clustering Algorithm Based on Hadoop

周婷; 张君瑛; 罗成


Abstract

To address the high time complexity of the K-means clustering algorithm, this paper proposes implementing K-means for large data volumes with the MapReduce programming model on the Hadoop platform. The Map function computes the distance from each record to every centroid and labels the record with the cluster of the nearest one. The Reduce function updates the centroids, computes the distance from each record to the center of its cluster, and accumulates the sum of those distances. Experiments with K-means running in parallel on a Hadoop cluster show that, compared with the traditional serial algorithm on large data sets, the approach does reduce the time complexity and exhibits good stability and scalability.
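The Map and Reduce steps described in the abstract can be illustrated with a small single-process sketch. This is not the authors' Hadoop code: the function names `kmeans_map`, `kmeans_reduce`, and `kmeans_iteration` are hypothetical, Euclidean distance is assumed, the shuffle phase is simulated with an in-memory dictionary, and the distance sum is assumed to be measured against the centroid each record was assigned to before the update.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def kmeans_map(record, centers):
    """Map step: emit (index of nearest center, record) for one record."""
    nearest = min(range(len(centers)), key=lambda i: dist(record, centers[i]))
    return nearest, record


def kmeans_reduce(center, records):
    """Reduce step for one cluster: return (updated centroid, summed distance).

    Assumption: distances are measured to the center the records were
    assigned to, not to the freshly updated centroid.
    """
    n = len(records)
    new_center = tuple(sum(coords) / n for coords in zip(*records))
    total_distance = sum(dist(r, center) for r in records)
    return new_center, total_distance


def kmeans_iteration(records, centers):
    """Run one map -> shuffle -> reduce round over all records."""
    groups = defaultdict(list)  # stands in for the MapReduce shuffle phase
    for record in records:
        index, value = kmeans_map(record, centers)
        groups[index].append(value)

    new_centers = list(centers)  # clusters with no records keep their center
    total = 0.0
    for index, members in groups.items():
        new_centers[index], cluster_distance = kmeans_reduce(centers[index], members)
        total += cluster_distance
    return new_centers, total


if __name__ == "__main__":
    points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
    centers, cost = kmeans_iteration(points, [(1.0, 1.0), (9.0, 1.0)])
    print(centers, cost)
```

In a real job the driver would repeat this round, feeding the updated centroids back into the next Map phase, until the centroids or the summed distance stop changing.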

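Bibliographic Record

Source
《计算机技术与发展》 (Computer Technology and Development) | 2013, Issue 7 | pp. 18-21 | 4 pages
Authors
周婷; 张君瑛; 罗成
Author affiliations
College of Electronics and Information Engineering, Tongji University, Shanghai 201804
Shanghai Chenjiazhen Construction and Development Co., Ltd., Shanghai 202150
College of Electronics and Information Engineering, Tongji University, Shanghai 201804
Original format: PDF
Language of text: Chinese
Chinese Library Classification: Algorithm theory
Keywords
Data mining; K-means algorithm; Hadoop; MapReduce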

