Journal: Data & Knowledge Engineering

Determining The Best K For Clustering Transactional Datasets: A Coverage Density-based Approach

Abstract

Determining the optimal number of clusters is an important but poorly understood problem in cluster analysis. In this paper, we propose a novel method for finding a set of candidate optimal numbers of clusters (Ks) in transactional datasets. Concretely, we propose Transactional-cluster-modes Dissimilarity, based on the concept of coverage density, as an intuitive measure of inter-cluster dissimilarity for transactional data. Based on this measure, we develop an agglomerative hierarchical clustering algorithm and use the Merging Dissimilarity Indexes generated during hierarchical cluster merging to find the candidate optimal Ks. Our experimental results on both synthetic and real data show that the new method often estimates the number of clusters in transactional data effectively.
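
The pipeline described in the abstract can be illustrated with a small sketch. The abstract does not give the formulas, so everything below is an assumption: coverage density is taken to be the number of occupied item slots divided by (transactions × distinct items) in a cluster, `merge_dissimilarity` is a hypothetical stand-in (the size-weighted loss of coverage density caused by a merge) rather than the paper's Transactional-cluster-modes Dissimilarity, and `candidate_ks` simply looks for the largest jump in merge cost rather than reproducing the paper's Merging Dissimilarity Indexes.

```python
from itertools import combinations

def coverage_density(cluster):
    """Occupied item slots / (transactions x distinct items); assumed definition."""
    items = set().union(*cluster)
    filled = sum(len(t) for t in cluster)
    return filled / (len(cluster) * len(items))

def merge_dissimilarity(a, b):
    """Hypothetical stand-in: size-weighted coverage density lost by merging a and b."""
    merged = a + b
    before = len(a) * coverage_density(a) + len(b) * coverage_density(b)
    after = len(merged) * coverage_density(merged)
    return (before - after) / len(merged)

def agglomerate(transactions):
    """Greedy agglomerative merging; returns [(k_before_merge, merge_cost), ...]."""
    clusters = [[frozenset(t)] for t in transactions]
    history = []
    while len(clusters) > 1:
        # Pick the pair of clusters that is cheapest to merge.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: merge_dissimilarity(clusters[p[0]], clusters[p[1]]),
        )
        cost = merge_dissimilarity(clusters[i], clusters[j])
        history.append((len(clusters), cost))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return history

def candidate_ks(history, top=1):
    """Ks at which the merge cost jumps most relative to the preceding merge."""
    jumps = [
        (cost - prev_cost, k)
        for (_, prev_cost), (k, cost) in zip(history, history[1:])
    ]
    jumps.sort(reverse=True)
    return [k for _, k in jumps[:top]]

# Toy data: two groups of identical transactions.
data = [{"a", "b", "c"}] * 3 + [{"x", "y"}] * 3
print(candidate_ks(agglomerate(data)))  # [2] for this toy data
```

A sharp rise in merge cost at K means that going from K to K-1 clusters forces dissimilar groups together, which is why such a K is reported as a candidate. The pairwise search is quadratic per merge step and is only meant for small illustrative inputs.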