Biclustering of gene expression data by simulated annealing

Abstract

A bicluster of a gene expression dataset is a subset of genes that exhibit similar expression patterns across a subset of conditions. Biclustering algorithms aim to find subsets of genes and subsets of conditions such that a single cellular process is the main contributor to the expression of the gene subset over the condition subset. We believe that biclusters should be small compared to the gene expression data matrix. We have also observed that a conceptually simpler way to bicluster gene expression data is to apply standard one-way clustering algorithms to the rows and columns of the data matrix separately and then combine the results to obtain bicluster seeds. Our algorithm has three phases. In the first phase, we generate a set of high-quality bicluster seeds. In the second phase, these seeds are enlarged by adding more genes and conditions using a technique based on simulated annealing. In the third phase, we compute the p-values of the resulting biclusters for statistical validation.
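
The first two phases can be illustrated with a minimal sketch. It is not the authors' implementation, and everything below that the abstract does not specify is an assumption made for illustration: the mean squared residue (Cheng and Church) as the coherence score, the threshold `delta`, the energy function with its `penalty` weight, the geometric cooling schedule, and the function names `msr`, `combine_seeds`, and `anneal_enlarge`. The row and column clusters are assumed to come from any one-way clustering algorithm run beforehand.

```python
import math
import random
from itertools import product


def msr(data, rows, cols):
    """Mean squared residue of data[rows][cols]; 0 means perfectly additive."""
    rows, cols = list(rows), list(cols)
    n, m = len(rows), len(cols)
    row_mean = {i: sum(data[i][j] for j in cols) / m for i in rows}
    col_mean = {j: sum(data[i][j] for i in rows) / n for j in cols}
    total = sum(row_mean.values()) / n
    return sum((data[i][j] - row_mean[i] - col_mean[j] + total) ** 2
               for i in rows for j in cols) / (n * m)


def combine_seeds(data, row_clusters, col_clusters, delta):
    """Pair each row cluster with each column cluster; keep coherent pairs."""
    return [(set(r), set(c)) for r, c in product(row_clusters, col_clusters)
            if msr(data, r, c) <= delta]


def anneal_enlarge(data, seed_rows, seed_cols, delta, t0=1.0, cooling=0.995,
                   steps=3000, penalty=10.0, rng=None):
    """Grow a seed by simulated annealing; seed rows/columns are never dropped."""
    rng = rng or random.Random(0)
    seed_rows, seed_cols = set(seed_rows), set(seed_cols)
    n_rows, n_cols = len(data), len(data[0])

    def energy(rows, cols):
        # Assumed energy, lower is better: reward relative volume and
        # penalise any residue above the threshold delta.
        excess = max(0.0, msr(data, rows, cols) - delta)
        return penalty * excess - len(rows) * len(cols) / (n_rows * n_cols)

    rows, cols = set(seed_rows), set(seed_cols)
    best = (set(rows), set(cols))
    current, temp = energy(rows, cols), t0
    for _ in range(steps):
        new_rows, new_cols = set(rows), set(cols)
        # Propose toggling one random row or column, then restore the seed.
        if rng.random() < 0.5:
            new_rows ^= {rng.randrange(n_rows)}
            new_rows |= seed_rows
        else:
            new_cols ^= {rng.randrange(n_cols)}
            new_cols |= seed_cols
        candidate = energy(new_rows, new_cols)
        diff = candidate - current
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if diff <= 0 or rng.random() < math.exp(-diff / temp):
            rows, cols, current = new_rows, new_cols, candidate
            feasible = msr(data, rows, cols) <= delta
            if feasible and len(rows) * len(cols) > len(best[0]) * len(best[1]):
                best = (set(rows), set(cols))
        temp *= cooling  # geometric cooling (assumed schedule)
    return best
```

Given a matrix `data` as a list of equal-length rows, `combine_seeds(data, row_clusters, col_clusters, delta)` returns candidate seeds, and `anneal_enlarge(data, rows, cols, delta)` returns the largest enlargement it visited whose residue stays within `delta`. If the seed itself meets the threshold, the result never scores worse than `delta` and never shrinks below the seed. The third phase (p-value computation) is not sketched here because the abstract does not describe the statistical test used.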
