Reporting and analyzing alternative clustering solutions by employing multi-objective genetic algorithm and conducting experiments on cancer data

Abstract

Clustering is an essential research problem that has received considerable attention in the research community for decades. It is challenging because no single solution fits all problems and satisfies all applications. We aim to find the most appropriate clustering solution for a given application domain. In general, clustering algorithms require the number of clusters to be specified in advance, which is hard even for domain experts to estimate, especially in a dynamic environment where the data changes and/or becomes available incrementally. In this paper, we describe and analyze the effectiveness of a robust clustering algorithm that integrates a multi-objective genetic algorithm into a framework capable of producing alternative clustering solutions; it is called the Multi-objective K-Means Genetic Algorithm (MOKGA). We investigate its application to clustering a variety of datasets, including microarray gene expression data. The reported results are promising. Though we concentrate on gene expression data, mostly cancer data, the proposed approach is general enough to work equally well on other datasets, as demonstrated by the two datasets Iris and Ruspini. Running MOKGA yields a Pareto-optimal front, which gives the optimal number of clusters as a solution set. The clustering results are then analyzed and validated using several cluster validity techniques proposed in the literature. As a result, the optimal clusters are ranked for each validity index. We apply majority voting to decide on the most appropriate set of validity indexes applicable to every tested dataset. The proposed clustering approach is tested by conducting experiments on seven well-cited benchmark datasets. The obtained results are compared with those reported in the literature to demonstrate the applicability and effectiveness of the proposed approach.
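
The two selection steps mentioned in the abstract, extracting a Pareto-optimal front and applying majority voting, can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes, since the abstract does not name the objectives, that each candidate solution is scored by two quantities to be minimized: the number of clusters and a total within-cluster variation. It also interprets the voting step as a vote over the number of clusters each validity index ranks first, which is only one possible reading. The function names `pareto_front` and `majority_vote`, the index names, and all numbers are hypothetical.

```python
from collections import Counter


def pareto_front(candidates):
    """Return the candidates that no other candidate dominates.

    Each candidate is a (num_clusters, within_cluster_variation) pair and
    both values are minimized (an assumption; the abstract does not state
    the objectives).
    """
    front = []
    for c in candidates:
        dominated = any(
            o[0] <= c[0] and o[1] <= c[1] and (o[0] < c[0] or o[1] < c[1])
            for o in candidates
        )
        if not dominated:
            front.append(c)
    return sorted(set(front))


def majority_vote(preferred_k_by_index):
    """Return the cluster count preferred by the most validity indices.

    Ties are broken in favor of the smallest cluster count.
    """
    counts = Counter(preferred_k_by_index.values())
    best = max(counts.values())
    return min(k for k, n in counts.items() if n == best)


# Hypothetical (number of clusters, within-cluster variation) pairs.
candidates = [(2, 150.0), (3, 80.0), (3, 95.0), (4, 60.0), (5, 62.0), (6, 40.0)]
front = pareto_front(candidates)

# Hypothetical top-ranked cluster count per validity index.
votes = {"index_a": 3, "index_b": 4, "index_c": 3}
chosen_k = majority_vote(votes)

print(front)     # non-dominated trade-offs, one per retained cluster count
print(chosen_k)  # cluster count backed by most indices
```

The front keeps one trade-off per retained cluster count, so the user sees several alternative solutions instead of a single fixed number of clusters; the vote then narrows those alternatives using whichever validity indices are applied.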

Bibliographic details

  • Source
    Knowledge-Based Systems | 2014, Issue 1 | pp. 108-122 | 15 pages
  • Author affiliations

    Department of Computer Science, University of Calgary, Calgary, Alberta, Canada;

    Department of Computer Science, University of Calgary, Calgary, Alberta, Canada;

    Department of Computer Science, University of Calgary, Calgary, Alberta, Canada;

    Department of Computer Engineering, Cankaya University, Ankara, Turkey;

    Department of Computing, University of Bradford, Bradford, UK;

    Department of Computer Science, University of Calgary, Calgary, Alberta, Canada;

    Department of Computer Science, University of Calgary, Calgary, Alberta, Canada;

    Department of Computer Engineering, TOBB University, Ankara, Turkey;

    Department of Computer Engineering, Firat University 23119, Elazig, Turkey;

    Department of Computing, University of Bradford, Bradford, UK;

    Department of Computer Science, University of Calgary, Calgary, Alberta, Canada;

    Department of Computer Science, University of Calgary, Calgary, Alberta, Canada; Department of Computer Science, Global University, Beirut, Lebanon;

  • Indexing information
  • Original format: PDF
  • Text language: English
  • Chinese Library Classification
  • Keywords

    Clustering; Genetic algorithm; Gene expression data; Multi-objective optimization; Cluster validity analysis;
