Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics

Coclustering of Human Cancer Microarrays Using Minimum Sum-Squared Residue Coclustering

Abstract

It is a consensus in microarray analysis that identifying potential local patterns, characterized by coherent groups of genes and conditions, may shed light on previously undetectable biological cellular processes of genes as well as macroscopic phenotypes of related samples. To cluster genes and conditions simultaneously, we previously developed a fast co-clustering algorithm, Minimum Sum-Squared Residue Co-clustering (MSSRCC), which employs an alternating minimization scheme and generates what we call co-clusters in a checkerboard structure. In this paper, we propose specific strategies that enable MSSRCC to escape poor local minima and resolve the degeneracy problem in partitional clustering algorithms. The strategies include binormalization, deterministic spectral initialization, and incremental local search. We assess the effects of these strategies on both synthetic gene expression datasets and real human cancer microarrays, and provide empirical evidence that MSSRCC with the proposed strategies performs better than existing co-clustering and clustering algorithms. In particular, the combination of all three strategies leads to the best performance. Furthermore, we illustrate the coherence of the resulting co-clusters in a checkerboard structure, where genes in a co-cluster manifest the phenotype structure of the corresponding samples, and we evaluate the enrichment of functional annotations in Gene Ontology (GO).
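The alternating minimization over a checkerboard of co-clusters can be illustrated with a minimal NumPy sketch. It rests on assumptions the abstract does not state: the residue is taken as the deviation of each entry from its co-cluster mean (only one possible residue definition), and the function names cocluster_means, sum_squared_residue, and alternate are hypothetical. Binormalization, spectral initialization, and incremental local search are not implemented here, so this plain batch scheme can still end in poor local minima or empty clusters, which are the problems the paper targets.

```python
import numpy as np

def cocluster_means(A, rows, cols, k, l):
    """Mean of every (row-cluster, column-cluster) block; empty blocks get 0."""
    A = np.asarray(A, dtype=float)
    R = np.eye(k)[rows]                      # m x k row-cluster indicator
    C = np.eye(l)[cols]                      # n x l column-cluster indicator
    sums = R.T @ A @ C                       # k x l block sums
    counts = np.outer(R.sum(axis=0), C.sum(axis=0))
    return np.divide(sums, counts, out=np.zeros_like(sums), where=counts > 0)

def sum_squared_residue(A, rows, cols, k, l):
    """Objective under the assumed residue: entry minus its co-cluster mean."""
    A = np.asarray(A, dtype=float)
    M = cocluster_means(A, rows, cols, k, l)
    return float(((A - M[rows][:, cols]) ** 2).sum())

def alternate(A, rows, cols, k, l, n_iter=20):
    """Alternately reassign rows and columns to their best-fitting cluster."""
    A = np.asarray(A, dtype=float)
    for _ in range(n_iter):
        # Row step: compare each row with the k row prototypes (block means
        # expanded along the current column labels).
        proto = cocluster_means(A, rows, cols, k, l)[:, cols]        # k x n
        rows = ((A[:, None, :] - proto[None, :, :]) ** 2).sum(axis=2).argmin(axis=1)
        # Column step: same idea on the transposed problem.
        proto = cocluster_means(A, rows, cols, k, l)[rows, :].T      # l x m
        cols = ((A.T[:, None, :] - proto[None, :, :]) ** 2).sum(axis=2).argmin(axis=1)
    return rows, cols

# Toy 6 x 4 checkerboard with 2 row clusters and 2 column clusters.
block_values = np.array([[1.0, 5.0], [9.0, 2.0]])
true_rows = np.array([0, 0, 0, 1, 1, 1])
true_cols = np.array([0, 0, 1, 1])
A = block_values[true_rows][:, true_cols]

start_rows = np.array([0, 0, 1, 1, 1, 1])    # one row deliberately mislabeled
found_rows, found_cols = alternate(A, start_rows, true_cols, k=2, l=2)
print(sum_squared_residue(A, start_rows, true_cols, 2, 2))
print(sum_squared_residue(A, found_rows, found_cols, 2, 2))
```

Each step in this sketch cannot increase the assumed objective: reassignment picks the best prototype for fixed block means, and recomputing block means is the least-squares optimum for fixed labels. The outcome still depends on the starting labels, which is why a deterministic initialization and a local search matter in practice.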