BMC Bioinformatics

Prior knowledge driven Granger causality analysis on gene regulatory network discovery


Abstract

Our study focuses on discovering gene regulatory networks from time series gene expression data using the Granger causality (GC) model. However, in biological datasets the number of available time points (T) is usually much smaller than the number of target genes (n). The widely applied pairwise GC model (PGC) and other regularization strategies can produce a significant number of false identifications when n ≫ T. In this study, we proposed a new method, CGC-2SPR (CGC using two-step prior Ridge regularization), that addresses this problem by incorporating prior biological knowledge about a target gene data set. In our simulation experiments, CGC-2SPR showed a significant improvement in accuracy over other widely used GC modeling methods (PGC, Ridge and Lasso) and MI-based methods (MRNET and ARACNE). In addition, we applied CGC-2SPR to a real biological dataset, the yeast metabolic cycle, and discovered more true positive edges than with the other existing methods. We observed a “1+1>2” effect when we combined prior knowledge and gene expression data to discover regulatory networks. Based on the causality networks, we made a functional prediction that the Abm1 gene, whose function was previously unknown, might be related to the yeast’s responses to different levels of glucose. Our research improves causality modeling by combining heterogeneous knowledge, which is well aligned with the future direction of systems biology. Furthermore, we proposed a method of Monte Carlo significance estimation (MCSE) to calculate edge significances, which give statistical meaning to the discovered causality networks. All of our data and source code will be available at https://bitbucket.org/dtyu/granger-causality/wiki/Home .
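To make the idea of prior-informed ridge regularization concrete, the sketch below fits a lag-1 vector autoregression in which edges supported by prior knowledge receive a weaker ridge penalty than unsupported edges, and then estimates edge significance by comparing each coefficient against a shuffled-data null. This is an illustration under stated assumptions, not the CGC-2SPR or MCSE procedure from the paper: the function names, the lag-1 model, the two penalty values (`lam_default`, `lam_prior`) and the shuffling scheme are all hypothetical choices made for this example.

```python
import numpy as np


def fit_prior_ridge_var1(X, prior, lam_default=10.0, lam_prior=0.1):
    """Fit a lag-1 VAR with per-edge ridge penalties (illustrative only).

    X     : (T, n) expression matrix, one row per time point.
    prior : (n, n) boolean matrix; prior[i, j] is True when the edge
            j -> i is supported by prior knowledge (assumed encoding).
    Returns A with A[i, j] = estimated influence of gene j at time t-1
    on gene i at time t.
    """
    X = np.asarray(X, dtype=float)
    prior = np.asarray(prior, dtype=bool)
    X = X - X.mean(axis=0)            # center each gene's time course
    past, future = X[:-1], X[1:]      # lag-1 predictors and targets
    n = X.shape[1]
    gram = past.T @ past
    A = np.zeros((n, n))
    for i in range(n):
        # Weaker penalty on edges backed by prior knowledge.
        penalty = np.where(prior[i], lam_prior, lam_default)
        # Closed-form generalized ridge solution for target gene i.
        A[i] = np.linalg.solve(gram + np.diag(penalty), past.T @ future[:, i])
    return A


def monte_carlo_pvalues(X, prior, n_draws=200, seed=0, **ridge_kwargs):
    """Empirical p-value per edge from a shuffled-time null (assumed scheme)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    observed = np.abs(fit_prior_ridge_var1(X, prior, **ridge_kwargs))
    exceed = np.zeros_like(observed)
    for _ in range(n_draws):
        # Shuffle each gene's time course independently to destroy
        # temporal dependence while keeping its marginal distribution.
        shuffled = rng.permuted(X, axis=0)
        null = np.abs(fit_prior_ridge_var1(shuffled, prior, **ridge_kwargs))
        exceed += null >= observed
    # Add-one correction keeps p-values strictly positive.
    return (exceed + 1) / (n_draws + 1)
```

Because the penalty makes each per-gene linear system well conditioned, the fit remains solvable even when n is larger than T, which is the regime the abstract describes. Setting both penalties to the same value reduces the sketch to ordinary ridge regression with no prior information.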
