
Algoritmos de agrupamento aplicados a dados de expressão gênica de câncer: um estudo comparativo

Translation: Clustering algorithms applied to cancer gene expression data: a comparative study

Abstract

The use of clustering methods for the discovery of cancer subtypes has drawn a great deal of attention in the scientific community. While bioinformaticians have proposed new clustering methods that take advantage of characteristics of gene expression data, the medical community prefers classic clustering methods. No study thus far has performed a large-scale evaluation of different clustering methods in this context. This work presents the first large-scale analysis of seven different clustering methods and four proximity measures applied to 35 cancer gene expression data sets. Results reveal that the finite mixture of Gaussians, followed closely by k-means, exhibited the best performance in terms of recovering the true structure of the data sets. These methods also exhibited, on average, the smallest difference between the actual number of classes in the data sets and the best number of clusters as indicated by our validation criteria. Furthermore, hierarchical methods, which have been widely used by the medical community, exhibited poorer recovery performance than the other methods evaluated. Moreover, as a stable basis for the assessment and comparison of different clustering methods for cancer gene expression data, this study provides a common group of data sets (benchmark data sets) to be shared among researchers and used for comparisons with new methods.
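As an illustration of what "recovering the true structure of the data sets" can mean in practice, the sketch below scores a clustering against known class labels. The abstract does not name the index the authors used, so the adjusted Rand index here is an assumption, and the function name and toy labels are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(true_labels, cluster_labels):
    """Agreement between a known partition and a clustering.

    Assumption: the adjusted Rand index stands in for the study's
    (unnamed) recovery measure. 1.0 means identical partitions and
    values near 0 mean chance-level agreement.
    """
    if len(true_labels) != len(cluster_labels):
        raise ValueError("label sequences must have the same length")
    n = len(true_labels)
    if n < 2:
        return 1.0  # no pairs to disagree on

    # Contingency counts: samples per (class, cluster), class, and cluster.
    cell_counts = Counter(zip(true_labels, cluster_labels))
    class_sizes = Counter(true_labels)
    cluster_sizes = Counter(cluster_labels)

    # Number of sample pairs grouped together in each view.
    pairs_both = sum(comb(c, 2) for c in cell_counts.values())
    pairs_classes = sum(comb(c, 2) for c in class_sizes.values())
    pairs_clusters = sum(comb(c, 2) for c in cluster_sizes.values())

    expected = pairs_classes * pairs_clusters / comb(n, 2)
    maximum = (pairs_classes + pairs_clusters) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both partitions are one group
    return (pairs_both - expected) / (maximum - expected)


# Hypothetical toy data: six samples from two known tumor classes.
known_classes = ["A", "A", "A", "B", "B", "B"]
candidate_clusterings = {
    2: [0, 0, 0, 1, 1, 1],
    3: [0, 0, 1, 1, 2, 2],
}
scores = {k: adjusted_rand_index(known_classes, labels)
          for k, labels in candidate_clusterings.items()}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

Under this assumption, comparing `best_k` with the actual number of classes mirrors the abstract's second criterion, the gap between the true class count and the best number of clusters.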
