Parallel K-Means Implementation for Data Clustering Using Hadoop Map-Reduce

Maithri C; Chandramouli E

Abstract

Electronic information from online newspapers, journals, conference proceedings, web pages, and emails is growing rapidly and generating a huge amount of data. Controlling, indexing, and searching this volume of electronic data is not feasible, especially for humans but also for search engines. Automatic document organization is therefore an important problem for this information. Document clustering methods give insight into the data distribution or pre-process the data for other applications. This paper proposes a parallel clustering algorithm based on K-means clustering that iterates to optimize document upload and access. Specifically, the proposed algorithm is implemented on the Apache Hadoop architecture with a large document-access data set, and it is evaluated under different conditions with different possible input documents. The paper presents frequent-access information for the documents and suggests a pattern of document representation in cloud storage.
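The abstract does not give the algorithm's details, so the following is only a minimal sketch of the textbook way K-means is usually split into map and reduce steps, not the authors' implementation. It runs in a single Python process and does not use any Hadoop API; the function names (`map_assign`, `reduce_mean`, `kmeans_iteration`, `kmeans`) and the use of plain numeric tuples in place of document feature vectors are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, ...]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def map_assign(point: Point, centroids: List[Point]) -> Tuple[int, Point]:
    """Map step: emit (index of the nearest centroid, point)."""
    nearest = min(
        range(len(centroids)),
        key=lambda i: squared_distance(point, centroids[i]),
    )
    return nearest, point


def reduce_mean(points: List[Point]) -> Point:
    """Reduce step: average all points that were assigned to one centroid."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans_iteration(points: List[Point], centroids: List[Point]) -> List[Point]:
    """One map -> shuffle -> reduce round, simulated in-process."""
    groups: Dict[int, List[Point]] = defaultdict(list)
    for p in points:
        key, value = map_assign(p, centroids)
        groups[key].append(value)  # stands in for the shuffle phase
    # A centroid that received no points keeps its previous position.
    return [
        reduce_mean(groups[i]) if i in groups else centroids[i]
        for i in range(len(centroids))
    ]


def kmeans(points: List[Point], centroids: List[Point], max_iter: int = 20) -> List[Point]:
    """Repeat iterations until the centroids stop changing or max_iter is hit."""
    for _ in range(max_iter):
        updated = kmeans_iteration(points, centroids)
        if updated == centroids:
            break
        centroids = updated
    return centroids
```

In a real Hadoop job each iteration would typically be a separate MapReduce job, with the current centroids distributed to every mapper and the reducer output feeding the next iteration; how the paper handles this is not stated in the abstract.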

Bibliographic details

Source
Journal of computational and theoretical nanoscience | 2018, Issue 12 | 6 pages
Authors
Maithri C; Chandramouli E;
Author affiliations

Dept. of CSE Kalpataru Institute of Technology;

Dept. of CSE East Point College of Engineering and Technology;

Indexing information
Original format: PDF
Text language: English
Chinese Library Classification: Thin-film technology;
Keywords
K-Means; Hadoop; Document Clustering; Data Mining; Parallel Clustering;

Similar literature

1. Parallel K-Means Implementation for Data Clustering Using Hadoop Map-Reduce [J]. Maithri C, Chandramouli E. Journal of computational and theoretical nanoscience. 2018, Issue 11a12.
2. Implementation of hadoop optimization K-means parallel clustering algorithm [J]. Huang Suyu, Tan Lingli. Basic & clinical pharmacology & toxicology. 2019, Issue S1.
3. Improved K-Means Clustering Algorithm for Big Data Mining under Hadoop Parallel Framework [J]. Journal of grid computing. 2020, Issue 2.
4. Genetic Algorithm Based Parallel K-Means Data Clustering Algorithm Using MapReduce Programming Paradigm on Hadoop Environment (GAPKCA) [C]. Sayer Alshammari, Maslina Binti Zolkepli, Rusli Bin Abdullah. International Conference on Soft Computing and Data Mining. 2020.
5. Visual data mining: Using parallel coordinate plots with K-means clustering and color to find correlations in a multidimensional dataset. [D]. Peterson, Angela R. 2009.
6. ParaKMeans: Implementation of a parallelized K-means algorithm suitable for general laboratory use [O]. Piotr Kraj, Ashok Sharma, Nikhil Garge. 2008.
7. Perform wordcount Map-Reduce Job in Single Node Apache Hadoop cluster and compress data using Lempel-Ziv-Oberhumer (LZO) algorithm [O]. Nandan Nagarajappa Mirajkar, Sandeep Bhujbal, Aaradhana Arvind Deshmukh. 2013.
8. Parallel k-Means Clustering for Quantitative Ecoregion Delineation Using Large Data Sets. [R]. Kumar, J., Mills, R. T., Hoffman, F. M. 2011.
