BMC Bioinformatics

Predicting viral exposure response from modeling the changes of co-expression networks using time series gene expression data

Abstract

In genomics studies, time-series gene expression data [1–3] often need to be processed and analyzed. In 2016, DREAM Challenges released an open challenge called ‘Respiratory Viral DREAM Challenge: Discovering dynamic molecular signatures in response to virus exposure’ (https://www.synapse.org/#!Synapse:syn5647810/wiki/399108). The aim was to develop early predictors of susceptibility and contagiousness based on expression profiles collected prior to and at early time points following viral exposure. Several studies have reported transcriptomic differences [4–6] in the host response between symptomatic and asymptomatic subjects exposed to respiratory viruses. Additionally, as most participants did (https://www.synapse.org/#!Synapse:syn5647810/wiki/402364), common machine learning algorithms [7] can be applied if the challenge is treated as a prediction problem. The challenge results [see Additional file 1 for parts of the challenge results] demonstrate that prediction performance depends significantly on the participants’ models. However, these approaches must either average the time-series data across time or use cross-sectional data from one time point at a time to perform ensemble learning, so the dynamic information in the time series is lost. Moreover, in the early stage of infection (within 24 h), the gene trajectories of subjects with different clinical responses show little separation. Previous studies [8, 9] also showed that individual responses after exposure to a respiratory virus are influenced not only by the baseline immune status of the host but also by the dynamics of the early host immune response immediately following exposure. If we consider only a single gene, there is no distinct pattern in either the cross-sectional or the dynamic data. It is therefore difficult to differentiate between the positive and negative groups by gene expression levels at an early stage.
In this paper, we resort to gene set analysis to correlate exposure response with dynamic gene expression patterns in gene sets. To consider multiple genes jointly, several methods have been proposed to infer the relationships between genes. For example, the Dynamic Bayesian Network (DBN) was used to establish a dynamic regulatory network [10]. A number of groups have studied time-varying dynamic Bayesian networks (TV-DBN) to model varying network structures and reveal the dynamics of biological systems [11, 12]. The dynamic mixed membership stochastic block model (dMMSB) helps to infer the biological functions of genes by modeling the dynamic tomography of networks [13]. A review of differential network biology [14] argued that differential network mapping at large scales may provide a deeper understanding of complex biological phenomena. The authors of [15] analyzed multiple differential co-expression networks based on time-course RNA-Seq data; using Multiple Differential Modules (M-DMs), they found that dynamic modules are associated with the development of heart failure. These results suggest that considering the dynamics of networks may help us to better understand disease onset and progression. However, how to extract useful dynamic information from time-series gene expression data to build a predictive model remains a challenging problem.
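To make the idea of a changing co-expression network concrete, the following is a minimal sketch, not the method of this paper. It assumes Pearson correlation across subjects as the edge weight, uses synthetic random data in place of real expression profiles, and all function and variable names (`coexpression_networks`, `network_change`, `differential_network`) are hypothetical.

```python
import numpy as np


def coexpression_networks(expr):
    """Build one co-expression network per time point.

    expr has shape (subjects, timepoints, genes). The result has shape
    (timepoints, genes, genes); entry [t, i, j] is the Pearson correlation
    between genes i and j, computed across subjects at time point t.
    """
    n_subjects, n_times, n_genes = expr.shape
    nets = np.empty((n_times, n_genes, n_genes))
    for t in range(n_times):
        # rowvar=False: columns (genes) are the variables, rows are subjects
        nets[t] = np.corrcoef(expr[:, t, :], rowvar=False)
    return nets


def network_change(nets):
    """Mean absolute change in edge weight between consecutive time points."""
    diffs = np.abs(np.diff(nets, axis=0))           # (timepoints - 1, genes, genes)
    rows, cols = np.triu_indices(nets.shape[1], k=1)  # each gene pair once
    return diffs[:, rows, cols].mean(axis=1)


def differential_network(nets_a, nets_b):
    """Edge-wise difference between two groups' networks at each time point."""
    return nets_a - nets_b


# Synthetic stand-in data (assumption): 20 subjects per group,
# 4 time points, a gene set of 5 genes.
rng = np.random.default_rng(0)
expr_positive = rng.normal(size=(20, 4, 5))
expr_negative = rng.normal(size=(20, 4, 5))

nets_positive = coexpression_networks(expr_positive)
nets_negative = coexpression_networks(expr_negative)

change_positive = network_change(nets_positive)
diff_nets = differential_network(nets_positive, nets_negative)

print("network change per step (positive group):", change_positive)
print("differential network shape:", diff_nets.shape)
```

Unlike averaging expression over time, this representation keeps one network per time point, so the step-to-step change in edge weights within a gene set can itself serve as a candidate feature.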
