Design of intelligent k-means based on Spark for big data clustering

Abstract

The growth of data has brought us to the big data era, in which the volume of data can no longer be processed in a conventional computing environment. Many computational environments have been developed to process big data; one of them is Hadoop, which provides a distributed file system and the MapReduce framework. Spark is a newer framework that can be combined with Hadoop and run on top of it. In this paper, we design an intelligent k-means algorithm based on Spark for big data clustering. Our design operates on batches of data instead of the original Resilient Distributed Dataset (RDD). We compare our design with an implementation that uses the original RDD. Experimental results show that the implementation using batches of data is faster than the implementation using the original RDD.
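
The abstract does not say how the batches are formed or how the design consumes them. As a rough illustration only, the sketch below shows one way a single k-means update can be accumulated batch by batch instead of over the whole dataset at once. It is plain Python rather than Spark, it is not the authors' design, and the names `batched`, `nearest`, and `kmeans_step` are hypothetical.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Sequence, Tuple

Point = Tuple[float, ...]


def batched(points: Iterable[Point], size: int) -> Iterator[List[Point]]:
    """Yield consecutive lists of at most `size` points (hypothetical helper)."""
    it = iter(points)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def nearest(point: Point, centroids: Sequence[Point]) -> int:
    """Index of the centroid with the smallest squared Euclidean distance."""
    return min(
        range(len(centroids)),
        key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centroids[j])),
    )


def kmeans_step(
    points: Iterable[Point], centroids: Sequence[Point], batch_size: int
) -> List[Point]:
    """One k-means update, accumulating per-cluster sums batch by batch."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for batch in batched(points, batch_size):
        for point in batch:
            j = nearest(point, centroids)
            counts[j] += 1
            for d in range(dim):
                sums[j][d] += point[d]
    # Assumption: a cluster that received no points keeps its previous centroid.
    return [
        tuple(s / counts[j] for s in sums[j]) if counts[j] else tuple(centroids[j])
        for j in range(len(centroids))
    ]


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
    print(kmeans_step(data, [(0.0, 0.0), (10.0, 10.0)], batch_size=3))
```

Because only per-cluster sums and counts are carried from one batch to the next, the result of the update does not depend on the batch size; the batch size only controls how many points are held in memory at a time.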

Bibliographic details

Source
International Workshop on Big Data and Information Security | 2016 | pp. 89-96 | 8 pages
Authors
Ilham Kusuma; M. Anwar Masum; Novian Habibie; Wisnu Jatmiko; Heru Suhartanto;
Original format: PDF
Keywords
Sparks; Clustering algorithms; Big data; Algorithm design and analysis; File systems; Distributed databases; Data mining;

Similar literature

1. A Support Vector and K-Means Based Hybrid Intelligent Data Clustering Algorithm [J]. Liang SUN, Shinichi YOSHIDA, Yanchun LIANG. IEICE Transactions on Information and Systems. 2011, no. 11.
2. Design of electricity tariff plans using gap statistic for K-means clustering based on consumers monthly electricity consumption data [J]. Ravindra R. Rathod, Rahul Dev Garg. International Journal of Energy Sector Management. 2017, no. 2.
3. Design of intelligent k-means based on Spark for big data clustering [C]. Ilham Kusuma, M. Anwar Masum, Novian Habibie. International Workshop on Big Data and Information Security. 2016.
4. Electromagnetism Based K-Means Clustering for Big Data [D]. Eerlapati, Abhinav. 2017.
5. Analysis of big data job requirements based on K-means text clustering in China [O]. Dai Debao, Ma Yinxia, Zhao Min. 2021.
6. Design Method of Data Acquisition in Intelligent Sensor based on Web Data Mining Clustering Technology [O]. Tingzhong Wang, Huijuan Sun. 2015.
