Journal: International Journal of Biometric and Bioinformatics

Biological Significance of Gene Expression Data Using Similarity Based Biclustering Algorithm



Abstract

Unlocking the complexity of a living organism's biological processes, functions and genetic network is vital to learning how to improve human health. Genetic analysis, especially biclustering, is a significant step in this process. Although many biclustering methods exist, only a few provide a query-based approach that lets biologists search for the biclusters containing a particular gene of interest. The proposed query-based biclustering algorithm, SIMBIC+, first identifies a functionally rich query gene. It then identifies sets of genes, including the query gene, that show coherent expression patterns across subsets of experimental conditions. It clusters the row and column dimensions simultaneously, extracting biclusters with a top-down approach. Because it uses a novel ratio-based similarity measure, it identifies biclusters with more coherence and more biological meaning. SIMBIC+ takes a score-based approach that aims to maximize the similarity of the bicluster. Contribution-entropy-based condition selection and multiple row/column deletion are used to reduce the complexity of finding biclusters with the maximum similarity value. Experiments are conducted on a yeast (Saccharomyces) dataset, and the resulting biclusters are compared with those of the popular MSB (Maximum Similarity Bicluster) algorithm. Comparing the biological significance of the biclusters obtained by the two algorithms shows that SIMBIC+ identifies biclusters with greater GO (Gene Ontology) significance.
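The abstract does not define the ratio-based similarity measure or the deletion rules, so the following is only a minimal sketch of the general idea of query-based, top-down biclustering. It is not the SIMBIC+ algorithm. Everything in it is an assumption made for illustration: the function names `ratio_similarity` and `top_down_bicluster`, the similarity definition (each gene's ratio to the query gene is compared with that gene's median ratio), the `threshold` parameter, and the single row-or-column deletion per step. Contribution-entropy-based condition selection and multiple deletion are not modeled. Expression values are assumed to be strictly positive.

```python
from statistics import median


def ratio_similarity(data, query, rows, cols):
    """Hypothetical ratio-based similarity, one value in (0, 1] per cell.

    For gene i, the ratio data[i][j] / data[query][j] is constant across
    conditions when gene i is a scaled copy of the query gene. Each cell is
    scored by how close its ratio is to the gene's median ratio.
    """
    sim = {}
    for i in rows:
        ratios = {j: data[i][j] / data[query][j] for j in cols}
        centre = median(ratios.values())
        for j in cols:
            lo, hi = sorted((ratios[j], centre))
            sim[i, j] = lo / hi  # 1.0 means the ratio equals the median
    return sim


def top_down_bicluster(data, query, threshold=0.9, min_rows=2, min_cols=2):
    """Start from the full matrix and delete the least similar row or
    column until every remaining row and column scores >= threshold.
    The query gene's row is never deleted."""
    rows = list(range(len(data)))
    cols = list(range(len(data[0])))
    while True:
        sim = ratio_similarity(data, query, rows, cols)
        row_score = {i: sum(sim[i, j] for j in cols) / len(cols) for i in rows}
        col_score = {j: sum(sim[i, j] for i in rows) / len(rows) for j in cols}

        # Collect deletable rows and columns, respecting the minimum size.
        candidates = []
        if len(rows) > min_rows:
            candidates += [(row_score[i], "row", i) for i in rows if i != query]
        if len(cols) > min_cols:
            candidates += [(col_score[j], "col", j) for j in cols]
        if not candidates:
            return rows, cols

        score, kind, index = min(candidates)
        if score >= threshold:
            return rows, cols
        (rows if kind == "row" else cols).remove(index)


# Toy data (made up): genes 1 and 2 are scaled copies of gene 0, gene 3 is noise.
data = [
    [1.0, 2.0, 4.0, 8.0],
    [2.0, 4.0, 8.0, 16.0],
    [0.5, 1.0, 2.0, 4.0],
    [5.0, 1.0, 7.0, 2.0],
]
print(top_down_bicluster(data, query=0))  # ([0, 1, 2], [0, 1, 2, 3])
```

On this toy matrix the noisy gene is removed in the first step, after which every remaining row and column has similarity 1.0. Deleting one row or column per iteration keeps the sketch short. The multiple-deletion strategy mentioned in the abstract would remove several at once to cut the number of iterations.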
