BMC Bioinformatics

PFBNet: a priori-fused boosting method for gene regulatory network inference

Abstract

In systems biology, understanding the intricate gene regulatory network (GRN) is of great importance, since it provides insight into cell physiology, development and pathogenesis [1, 2]. With the advent of high-throughput experimental techniques such as RNA-Seq and DNA microarrays, inferring GRNs from such data at genomic scale has become feasible. However, it remains challenging: the data are high-dimensional and noisy, and the regulatory network may be obscured by indirect connections. Another problem is that the number of samples is often small relative to the number of genes (i.e., the n≪p problem [3]). So far, various methods have been developed for inferring GRNs from expression data, including Bayesian network-based methods [4–9], information theory-based methods [6, 10–15], ordinary differential equation (ODE) based methods [16–19], ensemble framework-based methods [20–25], etc.

Here we briefly review some algorithms that are related to our work. Among these approaches, algorithms built on the ensemble framework, such as GENIE [22], TIGRESS [21] and BiXGBoost [25], have emerged as strong performers. The key idea of the ensemble framework is to decompose the GRN inference problem into p feature selection subproblems (where p is the number of genes in the data) and to solve each subproblem with a regression model. Once the regression model is chosen, the confidence of the regulatory relationship from each candidate regulator (i.e., transcription factor (TF)) to the corresponding target gene can be calculated as a feature weight. Finally, the outputs of all subproblems are fused to reconstruct the GRN. Several algorithms (e.g., TIGRESS) use a linear model for this purpose; however, they may not perform well if the data exhibit higher-order structure. On the other hand, algorithms that use nonlinear models can easily become computationally intractable as the number of candidate regulators grows.

Although these algorithms are successful, they infer the GRN from only a single type of data (i.e., gene expression), whereas other types of data (e.g., expression from knockout experiments) may provide non-redundant information about the directionality of regulatory relationships [23]. It is therefore important to incorporate prior information (e.g., information from knockout data) into GRN inference, which may make the inferred GRN more reliable and interpretable.
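The decomposition described above (one feature selection subproblem per target gene, feature weights as edge confidences, then fusion into a single network) can be illustrated with a minimal sketch. This is not the implementation of PFBNet or of any of the cited methods: the function names `infer_grn` and `rank_edges`, the `ridge` parameter, and the choice of closed-form ridge regression as the per-target model are all assumptions made for illustration, and the absolute standardized coefficient is used as a simple stand-in for the feature weight.

```python
import numpy as np


def infer_grn(expr, tf_idx, ridge=1.0):
    """Score regulator -> target edges by solving one regression per target gene.

    expr:   (n_samples, p_genes) expression matrix
    tf_idx: column indices of the candidate regulators (TFs)
    Returns a (p_genes, p_genes) matrix W where W[i, j] is the confidence
    that gene i regulates gene j. Rows of non-TF genes stay zero.
    """
    n, p = expr.shape
    # Standardize each gene so coefficients are comparable across subproblems.
    z = (expr - expr.mean(axis=0)) / (expr.std(axis=0) + 1e-12)
    scores = np.zeros((p, p))
    for target in range(p):
        # Candidate regulators for this subproblem: every TF except the target.
        regs = [g for g in tf_idx if g != target]
        if not regs:
            continue
        X, y = z[:, regs], z[:, target]
        # Closed-form ridge regression, an assumed stand-in regression model.
        coef = np.linalg.solve(X.T @ X + ridge * np.eye(len(regs)), X.T @ y)
        # Feature weight: absolute standardized coefficient.
        scores[regs, target] = np.abs(coef)
    return scores


def rank_edges(scores):
    """Fuse the subproblem outputs into one list of (regulator, target, score),
    sorted from most to least confident."""
    reg, tgt = np.nonzero(scores)
    order = np.argsort(-scores[reg, tgt])
    return [(int(reg[k]), int(tgt[k]), float(scores[reg[k], tgt[k]])) for k in order]
```

A small synthetic check, in which gene 2 is driven by gene 0 and genes 0 and 1 are the only candidate regulators:

```python
rng = np.random.default_rng(0)
expr = rng.normal(size=(50, 4))
expr[:, 2] = 2.0 * expr[:, 0] + 0.1 * rng.normal(size=50)
W = infer_grn(expr, tf_idx=[0, 1])
print(rank_edges(W)[:3])
```

Because the per-target model here is linear, the sketch shares the limitation noted above for linear methods: it would miss higher-order dependencies that a nonlinear regressor could capture.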
