Foreign-language conference: European Conference on Principles and Practice of Knowledge Discovery in Databases

Agglomerative Hierarchical Clustering with Constraints: Theoretical and Empirical Results



Abstract

We explore the use of instance and cluster-level constraints with agglomerative hierarchical clustering. Though previous work has illustrated the benefits of using constraints for non-hierarchical clustering, their application to hierarchical clustering is not straightforward for two primary reasons. First, some constraint combinations make the feasibility problem (Does there exist a single feasible solution?) NP-complete. Second, some constraint combinations when used with traditional agglomerative algorithms can cause the dendrogram to stop prematurely in a dead-end solution even though there exist other feasible solutions with a significantly smaller number of clusters. When constraints lead to efficiently solvable feasibility problems and standard agglomerative algorithms do not give rise to dead-end solutions, we empirically illustrate the benefits of using constraints to improve cluster purity and average distortion. Furthermore, we introduce the new γ constraint and use it in conjunction with the triangle inequality to considerably improve the efficiency of agglomerative clustering.
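The dead-end behaviour mentioned in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: the function name `constrained_agglomerative`, the use of 1-D points, single linkage, and cannot-link constraints only are all assumptions made for illustration. The sketch greedily merges the closest pair of clusters that does not violate a cannot-link constraint and stops when no such pair remains.

```python
from itertools import combinations


def constrained_agglomerative(points, cannot_link):
    """Greedy single-linkage merging under cannot-link constraints.

    points      -- list of 1-D coordinates (illustrative assumption)
    cannot_link -- iterable of index pairs that must not share a cluster
    Returns the list of partitions, one per dendrogram level.
    """
    forbidden = {frozenset(pair) for pair in cannot_link}
    clusters = [frozenset([i]) for i in range(len(points))]
    levels = [list(clusters)]

    def violates(a, b):
        # A merge is infeasible if any cross pair is a cannot-link pair.
        return any(frozenset((i, j)) in forbidden for i in a for j in b)

    def linkage(a, b):
        # Single linkage: distance between the closest members.
        return min(abs(points[i] - points[j]) for i in a for j in b)

    while True:
        candidates = [
            (linkage(a, b), a, b)
            for a, b in combinations(clusters, 2)
            if not violates(a, b)
        ]
        if not candidates:
            break  # no feasible merge left: the dendrogram stops here
        _, a, b = min(candidates, key=lambda c: c[0])
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        levels.append(list(clusters))
    return levels


# Four points with a chain of cannot-link constraints 0-1, 1-2, 2-3.
# The partition {0, 2}, {1, 3} is feasible with two clusters, but the
# greedy procedure merges the closest pair (0, 3) first and gets stuck.
points = [0.0, 5.0, 9.0, 0.1]
levels = constrained_agglomerative(points, [(0, 1), (1, 2), (2, 3)])
print(len(levels[-1]), "clusters at the last level")
```

In this toy instance the greedy merge order ends with three clusters even though a feasible two-cluster solution exists, which is the kind of premature stop the abstract calls a dead-end solution.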
