Journal: IEEE Transactions on Knowledge and Data Engineering

A Link-Based Cluster Ensemble Approach for Categorical Data Clustering


Abstract

Although attempts have been made to cluster categorical data via cluster ensembles, with results competitive with conventional algorithms, these techniques generate the final data partition from incomplete information. The underlying ensemble-information matrix records only cluster-to-data-point relations, leaving many entries unknown. The paper presents an analysis suggesting that this problem degrades the quality of the clustering result, and it proposes a new link-based approach that improves the conventional matrix by discovering unknown entries through the similarity between clusters in an ensemble. In particular, an efficient link-based algorithm is proposed for the underlying similarity assessment. Afterward, to obtain the final clustering result, a graph partitioning technique is applied to a weighted bipartite graph formulated from the refined matrix. Experimental results on multiple real data sets suggest that the proposed link-based method almost always outperforms both conventional clustering algorithms for categorical data and well-known cluster ensemble techniques.
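
The matrix refinement described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not specify the link-based similarity measure, so the shared-neighbour score, the `decay` factor, and every function name below are assumptions made for illustration. The sketch builds the binary cluster-to-data-point matrix from a few base clusterings, then replaces each unknown (zero) entry with a similarity between the cluster in question and the cluster from the same base clustering that the point does belong to. The final graph-partitioning step is not shown.

```python
from itertools import combinations


def build_clusters(partitions):
    """Return {(partition_index, label): set of member point indices}."""
    clusters = {}
    for m, labels in enumerate(partitions):
        for i, label in enumerate(labels):
            clusters.setdefault((m, label), set()).add(i)
    return clusters


def overlap(a, b):
    """Jaccard overlap between two member sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_similarity(clusters, decay=0.9):
    """Hypothetical shared-neighbour similarity for clusters of one base clustering.

    Two clusters from the same base clustering never share points, so they are
    compared through the third clusters that both of them overlap with.
    """
    link = {}
    for x, y in combinations(clusters, 2):
        if x[0] != y[0]:
            continue  # only pairs from the same base clustering are needed
        link[(x, y)] = link[(y, x)] = sum(
            min(overlap(clusters[x], clusters[k]), overlap(clusters[y], clusters[k]))
            for k in clusters
            if k not in (x, y)
        )
    top = max(link.values(), default=0.0)
    # decay < 1 (an assumed parameter) keeps estimated entries below a known 1.0
    return {pair: (decay * v / top if top > 0 else 0.0) for pair, v in link.items()}


def refined_matrix(partitions, decay=0.9):
    """Return (columns, rows): one column per cluster, one row per data point."""
    clusters = build_clusters(partitions)
    columns = sorted(clusters)
    sim = cluster_similarity(clusters, decay)
    rows = []
    for i in range(len(partitions[0])):
        row = []
        for m, label in columns:
            own = (m, partitions[m][i])  # cluster of base clustering m holding point i
            row.append(1.0 if own == (m, label) else sim[(own, (m, label))])
        rows.append(row)
    return columns, rows


# Two base clusterings of six data points (toy data, not from the paper).
partitions = [[0, 0, 1, 1, 2, 2], [0, 0, 0, 1, 1, 1]]
columns, matrix = refined_matrix(partitions)
for row in matrix:
    print([round(v, 2) for v in row])
```

Entries that were known stay at 1.0, while formerly unknown entries take a value between 0 and `decay`. The refined matrix can then be read as the edge weights of a bipartite graph between data points and clusters, which is the input the abstract describes for the graph-partitioning step.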
