Journal: BMC Systems Biology

Modeling gene expression regulatory networks with the sparse vector autoregressive model

Abstract

Background: To understand the molecular mechanisms underlying important biological processes, a detailed description of the gene product networks involved is required. To define and understand such molecular networks, several statistical methods have been proposed in the literature to estimate gene regulatory networks from time-series microarray data. However, several problems still need to be overcome. First, information flow needs to be inferred, in addition to the correlation between genes. Second, we usually try to identify large networks involving a large number of genes (parameters) from a smaller number of microarray experiments (samples). In this situation, which is rather frequent in bioinformatics, it is difficult to perform statistical tests using methods that model large gene-gene networks. In addition, most models rely on dimension reduction using clustering techniques, so the resulting network is not a gene-gene network but a module-module network. Here, we present the Sparse Vector Autoregressive (SVAR) model as a solution to these problems.

Results: We applied the SVAR model to estimate gene regulatory networks from gene expression profiles obtained in time-series microarray experiments. Through extensive simulations in which the SVAR method was applied to artificial regulatory networks, we show that SVAR can infer true positive edges even when the number of samples is smaller than the number of genes. Moreover, it is possible to control for false positives, a significant advantage over other methods described in the literature, which are based on ranks or score functions. By applying SVAR to actual HeLa cell cycle gene expression data, we were able to identify well-known transcription factor targets.

Conclusion: The proposed SVAR method is able to model gene regulatory networks in the frequent situation in which the number of samples is lower than the number of genes, making it possible to naturally infer partial Granger causalities without any a priori information. In addition, we present a statistical test to control the false discovery rate, which was not previously possible using other gene regulatory network models.
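To make the setting concrete, the sketch below fits a first-order vector autoregression with an L1 (lasso) penalty to a simulated time series that has fewer time points than genes. It is an illustration only, not the authors' implementation: the abstract does not state which sparse estimator or statistical test SVAR uses, so the lasso penalty, the lag order of 1, the coordinate-descent solver, the penalty value `lam`, the simulated network, and all function names are assumptions. The false discovery rate test mentioned in the abstract is not reproduced here.

```python
import random


def soft_threshold(z, lam):
    """Shrink z toward zero by lam (the lasso proximal step)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent.

    X is a list of n rows with p values each; y is a list of n responses.
    No intercept is fitted, so the data are assumed to be centred.
    """
    n, p = len(X), len(X[0])
    b = [0.0] * p
    resid = list(y)  # equals y - X b while b is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes b[j].
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * b[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - b[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                b[j] = new
    return b


def fit_sparse_var1(series, lam):
    """Fit x_t = A x_{t-1} + noise, one lasso regression per target gene.

    series is a list of T time points, each a list of p expression values.
    Returns A as a p x p list of lists; a nonzero A[i][j] is read as a
    candidate edge from gene j (at time t-1) to gene i (at time t).
    """
    X = series[:-1]  # predictors: expression at t-1
    p = len(series[0])
    return [lasso_cd(X, [row[i] for row in series[1:]], lam) for i in range(p)]


# Hypothetical toy network: 12 genes, 10 time points (9 usable samples < 12 genes).
random.seed(0)
p, T = 12, 10
true_A = [[0.0] * p for _ in range(p)]
true_A[1][0] = 0.9   # gene 0 activates gene 1
true_A[2][1] = -0.9  # gene 1 represses gene 2

series = [[random.gauss(0.0, 1.0) for _ in range(p)]]
for _ in range(T - 1):
    prev = series[-1]
    series.append([
        sum(true_A[i][j] * prev[j] for j in range(p)) + random.gauss(0.0, 1.0)
        for i in range(p)
    ])

A_hat = fit_sparse_var1(series, lam=0.3)
edges = [(j, i) for i in range(p) for j in range(p) if A_hat[i][j] != 0.0]
print(f"{len(edges)} candidate edges out of {p * p} possible")
```

Because each target gene is regressed on the lagged values of all genes at once, a nonzero coefficient reflects an effect of one gene on another conditional on the rest, which is the sense in which such a model relates to partial Granger causality. The penalty is what keeps the regression usable when samples are fewer than genes; in practice `lam` would have to be chosen by some data-driven criterion rather than fixed by hand as it is here.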
