k-PbC: an improved cluster center initialization for categorical data clustering

Abstract

The performance of a partitional clustering algorithm is influenced by the initial random choice of cluster centers. Different runs of the clustering algorithm on the same data set often yield different results. This paper addresses that challenge by proposing an algorithm named k-PbC, which uses non-random initialization based on pattern mining to improve clustering quality. Specifically, k-PbC first applies maximal frequent itemset mining to find a set of initial clusters. It then uses a kernel-based method to form cluster centers and an information-theoretic dissimilarity measure to estimate the distance between cluster centers and data objects. An extensive experimental study was performed on various real categorical data sets to compare k-PbC with state-of-the-art categorical clustering algorithms in terms of clustering quality. The comparative results show that the proposed initialization method can enhance clustering results and that k-PbC outperforms the compared algorithms on both internal and external validation metrics.
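
To make the first step concrete, here is a minimal sketch of mining maximal frequent itemsets from categorical data and using them to seed groups of objects. The abstract does not say which mining algorithm k-PbC uses or how itemsets are turned into initial clusters, so everything here is an assumption for illustration: the brute-force Apriori-style search, the encoding of each object as (attribute index, value) items, and the hypothetical function names `maximal_frequent_itemsets` and `seed_clusters`.

```python
from itertools import combinations


def maximal_frequent_itemsets(rows, min_support):
    """Return itemsets found in at least `min_support` rows that have no
    frequent proper superset (brute-force, Apriori-style; illustration only)."""
    # Encode each categorical object as a set of (attribute_index, value) items.
    transactions = [frozenset(enumerate(row)) for row in rows]
    candidates = [frozenset([item]) for item in set().union(*transactions)]
    frequent = []
    size = 1
    while candidates:
        # Keep the candidates that occur in enough objects.
        level = [s for s in candidates
                 if sum(s <= t for t in transactions) >= min_support]
        frequent.extend(level)
        # Join frequent itemsets of the current size into candidates one item larger.
        size += 1
        candidates = list({a | b for a, b in combinations(level, 2)
                           if len(a | b) == size})
    # An itemset is maximal if no other frequent itemset strictly contains it.
    return [s for s in frequent if not any(s < other for other in frequent)]


def seed_clusters(rows, itemsets):
    """Assumed seeding rule: map each itemset to the indices of rows containing it."""
    transactions = [frozenset(enumerate(row)) for row in rows]
    return {s: [i for i, t in enumerate(transactions) if s <= t]
            for s in itemsets}


# Toy data set with three categorical attributes (made up for this sketch).
rows = [
    ("red", "small", "round"),
    ("red", "small", "square"),
    ("red", "small", "round"),
    ("blue", "large", "round"),
    ("blue", "large", "square"),
]
mfis = maximal_frequent_itemsets(rows, min_support=2)
seeds = seed_clusters(rows, mfis)
```

On this toy data the longest pattern covers objects 0 and 2, while the blue/large pattern covers objects 3 and 4. Seed groups can overlap or leave objects uncovered, so a real initialization would still need a rule for choosing k groups and assigning the remaining objects; that rule is not described in the abstract.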

Bibliographic details

Source: Applied Intelligence: The International Journal of Artificial Intelligence, Neural Networks, and Complex Problem-Solving Technologies, 2020, No. 8, 23 pages
Original format: PDF
Language: English
Chinese Library Classification: Automation technology, computer technology
Keywords: Data mining; Distance-based clustering; Pattern mining; Maximal frequent itemsets; Cluster center initialization; Categorical data

Similar literature

1. k-PbC: an improved cluster center initialization for categorical data clustering [J]. Applied Intelligence: The International Journal of Artificial Intelligence, Neural Networks, and Complex Problem-Solving Technologies, 2020, No. 8.
2. An initialization method to simultaneously find initial cluster centers and the number of clusters for clustering categorical data [J]. Liang Bai, Jiye Liang, Chuangyin Dang. Knowledge-Based Systems, 2011, No. 6.
3. A cluster centers initialization method for clustering categorical data [J]. Liang Bai, Jiye Liang, Chuangyin Dang. Expert Systems with Applications, 2012, No. 9.
4. Improving Performance of K-Means Clustering by Initializing Cluster Centers Using Genetic Algorithm and Entropy Based Fuzzy Clustering for Categorization of Diabetic Patients [C]. Asha Gowda Karegowda, Vidya T., Shama. International Conference on Advances in Computing, 2013.
5. Automatic categorical data clustering and spatial data clustering by consecutive resolution refinement [D]. Foss, Andrew Philip Ogilvie. 2002.
6. Does Determination of Initial Cluster Centroids Improve the Performance of K-Means Clustering Algorithm? Comparison of Three Hybrid Methods by Genetic Algorithm Minimum Spanning Tree and Hierarchical Clustering in an Applied Study [O]. Saeedeh Pourahmad, Atefeh Basirat, Amir Rahimi. 2020.
7. Comparing of EA K-modes clustering and NBEA K-modes clustering, A new method for clustering categorical data applying them on the injecting drug users data set [O]. Zamani Nasab Zahra. 2017.
