Journal: Advances in Experimental Medicine and Biology

KMeans greedy search hybrid algorithm for biclustering gene expression data.

Abstract

Microarray technology demands the development of algorithms capable of extracting novel and useful patterns such as biclusters. A bicluster is a submatrix of the gene expression data matrix in which the genes show highly correlated activity across all conditions in the submatrix. A measure called the Mean Squared Residue (MSR) is used to evaluate the coherence of rows and columns within the submatrix. In this paper, a KMeans greedy search hybrid algorithm is developed for finding biclusters in gene expression data. The algorithm has two steps. In the first step, high-quality bicluster seeds are generated using the KMeans clustering algorithm. In the second step, these seeds are enlarged by adding more genes and conditions using a greedy strategy. The objective is to find biclusters of maximum size whose MSR value is lower than a given threshold. The biclusters obtained with this algorithm on both benchmark datasets are of high quality. The statistical significance and biological relevance of the biclusters are verified using the Gene Ontology database.
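
The abstract names the MSR and the greedy enlargement step but gives no formulas, so the following is only a rough sketch under stated assumptions. It assumes the standard Cheng and Church definition of the MSR (the mean of the squared residues a_ij - rowmean_i - colmean_j + overallmean over the submatrix) and one plausible greedy rule: at each iteration, add the single gene (row) or condition (column) that yields the lowest MSR, as long as that MSR stays below the threshold. The paper's actual selection rule may differ. The seed is taken as given; in the paper it would come from the KMeans step. The function names `mean_squared_residue` and `greedy_enlarge` are hypothetical.

```python
import numpy as np


def mean_squared_residue(data, rows, cols):
    """MSR of the submatrix data[rows, cols] (Cheng-Church definition, assumed)."""
    sub = data[np.ix_(rows, cols)]
    row_means = sub.mean(axis=1, keepdims=True)
    col_means = sub.mean(axis=0, keepdims=True)
    # Residue of each cell: value - row mean - column mean + overall mean.
    residue = sub - row_means - col_means + sub.mean()
    return float(np.mean(residue ** 2))


def greedy_enlarge(data, seed_rows, seed_cols, threshold):
    """Grow a seed bicluster while its MSR stays below `threshold`.

    Assumed greedy rule (not taken from the paper): in each pass, add the
    one row or column whose inclusion gives the lowest MSR.
    """
    rows, cols = list(seed_rows), list(seed_cols)
    while True:
        best = None  # (msr, axis, index) of the best admissible candidate
        for r in range(data.shape[0]):
            if r not in rows:
                msr = mean_squared_residue(data, rows + [r], cols)
                if msr < threshold and (best is None or msr < best[0]):
                    best = (msr, "row", r)
        for c in range(data.shape[1]):
            if c not in cols:
                msr = mean_squared_residue(data, rows, cols + [c])
                if msr < threshold and (best is None or msr < best[0]):
                    best = (msr, "col", c)
        if best is None:
            break  # no remaining row or column keeps the MSR below threshold
        (rows if best[1] == "row" else cols).append(best[2])
    return sorted(rows), sorted(cols)
```

Calling `greedy_enlarge(data, seed_rows, seed_cols, threshold)` on an expression matrix returns the sorted row and column indices of the enlarged bicluster. Re-evaluating every candidate from scratch keeps the sketch short but is quadratic in practice; a real implementation would likely update the means incrementally.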