
Solving Cluster Ensemble Problems by Correlation's matrix GA


Abstract

Clustering ensembles have emerged as a powerful method for improving both the robustness and the stability of unsupervised classification solutions. However, finding a consensus clustering from multiple partitions is a difficult problem that can be approached from graph-based, combinatorial, or statistical perspectives. We offer a probabilistic model of consensus using a finite mixture of multinomial distributions in the space of clusterings. A combined partition is found as the solution to the corresponding maximum likelihood problem using a genetic algorithm (GA). The excellent scalability of this algorithm and its comprehensible underlying model are particularly important for clustering large datasets. The study has two parts. In the first, we calculate a correlation matrix, which shows the correlation between samples, and use it to find the samples best suited to serve as cluster centers. In the second, a genetic algorithm is employed to produce the most stable partitions from an evolving ensemble (population) of clustering algorithms, guided by a special objective function. The objective function evaluates multiple partitions according to the changes caused by data perturbations and prefers those clusterings that are least susceptible to such perturbations.
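The abstract does not say how cluster centers are chosen from the correlation matrix or how stability under perturbation is scored, so the following is only a minimal sketch under stated assumptions. It assumes Pearson correlation between samples, a greedy rule for picking candidate centers, and pairwise co-membership agreement (the Rand index) as a stand-in stability measure. The function names `sample_correlation_matrix`, `pick_candidate_centers`, and `pairwise_agreement` are hypothetical and not taken from the paper.

```python
import numpy as np


def sample_correlation_matrix(X):
    """Pearson correlation between samples (rows of X).

    Assumes no sample is constant across features; a constant row
    would produce NaN entries.
    """
    X = np.asarray(X, dtype=float)
    return np.corrcoef(X)  # np.corrcoef treats each row as one variable


def pick_candidate_centers(C, k):
    """Greedy choice of k candidate centers (an assumed rule, not the paper's).

    The first center is the sample with the highest mean correlation to
    all other samples. Each further center is the sample whose strongest
    correlation with the already chosen centers is lowest.
    """
    n = C.shape[0]
    mean_corr = (C.sum(axis=1) - np.diag(C)) / (n - 1)
    centers = [int(np.argmax(mean_corr))]
    while len(centers) < k:
        closest = C[:, centers].max(axis=1)
        closest[centers] = np.inf  # never pick the same sample twice
        centers.append(int(np.argmin(closest)))
    return centers


def pairwise_agreement(labels_a, labels_b):
    """Fraction of sample pairs on which two partitions agree (Rand index).

    Used here as an assumed stability score: compare the partition of the
    original data with the partition of a perturbed copy.
    """
    a = np.asarray(labels_a)
    b = np.asarray(labels_b)
    same_a = a[:, None] == a[None, :]
    same_b = b[:, None] == b[None, :]
    upper = np.triu_indices(len(a), k=1)  # each unordered pair once
    return float(np.mean(same_a[upper] == same_b[upper]))


# Toy data: samples 0, 1, 4 rise across features; samples 2, 3 fall.
X = [
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.1, 2.9, 4.2],
    [4.0, 3.0, 2.0, 1.0],
    [4.1, 2.9, 2.2, 0.9],
    [1.0, 2.0, 3.0, 4.1],
]
C = sample_correlation_matrix(X)
centers = pick_candidate_centers(C, k=2)
```

In a GA along the lines the abstract describes, a score such as `pairwise_agreement` could serve as the fitness of each candidate partition, averaged over several perturbed copies of the data. That wiring is likewise an assumption rather than the authors' stated design.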
