Parallel K-prototypes for Clustering Big Data

Abstract

Clustering Big data has become an important challenge in data mining. Big data are often characterized by a huge volume and a variety of attribute types, namely numerical and categorical. To deal with these challenges, we propose the parallel k-prototypes method, which is based on the Map-Reduce model. This method is able to perform efficient groupings on large-scale, mixed-type data. Experiments conducted on huge data sets show the performance of the proposed method in clustering large-scale mixed data.
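
The abstract does not spell out the algorithm, so the following is only a minimal sketch of how a k-prototypes iteration might be split into map and reduce steps, not the authors' implementation. It assumes the standard k-prototypes dissimilarity (squared Euclidean distance on numerical attributes plus a weight times the number of categorical mismatches) and simulates the mappers and reducers in a single Python process. The function names, the `gamma` weight, and the representation of a point as a `(numeric_tuple, categorical_tuple)` pair are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict


def distance(point, proto, gamma=1.0):
    """Mixed dissimilarity: squared Euclidean distance on the numeric part
    plus gamma times the number of categorical mismatches (assumed form)."""
    (p_num, p_cat), (c_num, c_cat) = point, proto
    numeric = sum((a - b) ** 2 for a, b in zip(p_num, c_num))
    mismatches = sum(a != b for a, b in zip(p_cat, c_cat))
    return numeric + gamma * mismatches


def map_phase(chunk, prototypes, gamma=1.0):
    """Map step: emit (cluster_index, point) for every point in one split."""
    for point in chunk:
        best = min(range(len(prototypes)),
                   key=lambda j: distance(point, prototypes[j], gamma))
        yield best, point


def reduce_phase(points):
    """Reduce step: build one prototype from the points of a cluster,
    using the mean for numeric attributes and the mode for categorical ones."""
    n = len(points)
    num = tuple(sum(col) / n for col in zip(*(p[0] for p in points)))
    cat = tuple(Counter(col).most_common(1)[0][0]
                for col in zip(*(p[1] for p in points)))
    return num, cat


def parallel_k_prototypes(chunks, prototypes, gamma=1.0, max_iter=20):
    """Repeat map and reduce until the prototypes stop changing."""
    for _ in range(max_iter):
        groups = defaultdict(list)
        for chunk in chunks:  # each chunk stands in for one mapper's split
            for j, point in map_phase(chunk, prototypes, gamma):
                groups[j].append(point)
        # An empty cluster keeps its previous prototype.
        updated = [reduce_phase(groups[j]) if groups[j] else prototypes[j]
                   for j in range(len(prototypes))]
        if updated == prototypes:
            break
        prototypes = updated
    return prototypes
```

In a real Map-Reduce deployment, each split would be processed by a separate mapper and the grouping by cluster index would be handled by the framework's shuffle stage; the loop over `chunks` here only mimics that behaviour.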

Bibliographic information

Source
International Conference on Computational Collective Intelligence | 2015 | pp. 628-637 | 10 pages
Authors
Mohamed Aymen Ben HajKacem; Chiheb-Eddine Ben Ncir; Nadia Essoussi
Original format: PDF
Keywords
Big data; K-prototypes; Map-reduce; Mixed data

Similar literature

1. Glowworm Swarm Optimization Algorithm- and K-Prototypes Algorithm-Based Metadata Tree Clustering [J]. Yaping Li. Mathematical Problems in Engineering: Theory, Methods and Applications. 2021, Issue a.
2. An efficient privacy preserving on high-order heterogeneous data using fuzzy K-prototype clustering [J]. Dilip Golda. Journal of ambient intelligence and humanized computing. 2021, Issue 5.
3. Clustering the mixed panel dataset using Gower's distance and k-prototypes algorithms [J]. Akay Ozlem, Yuksel Guzin. Communications in Statistics. 2018, Issue 8a10.
4. Parallel K-prototypes for Clustering Big Data [C]. Mohamed Aymen Ben HajKacem, Chiheb-Eddine Ben Ncir, Nadia Essoussi. International Conference on Computational Collective Intelligence. 2015.
5. Visual data mining: Using parallel coordinate plots with K-means clustering and color to find correlations in a multidimensional dataset [D]. Peterson, Angela R. 2009.
6. Fine-grained parallelization of fitness functions in bioinformatics optimization problems: gene selection for cancer classification and biclustering of gene expression data [O]. Juan A. Gomez-Pulido, Jose L. Cerrada-Barrios, Sebastian Trinidad-Amado. 2016.
7. K-prototypes Algorithm for Clustering Schools Based on The Student Admission Data in IPB University [O]. Sri Sulastri, Lismayani Usman, Utami Dyah Syafitri. 2021.
8. Parallel k-Means Clustering for Quantitative Ecoregion Delineation Using Large Data Sets [R]. Kumar, J., Mills, R. T., Hoffman, F. M. 2011.
