
Sparse hierarchical Bayesian models for detecting relevant antigenic sites in virus evolution



Abstract

Understanding how virus strains offer protection against closely related emerging strains is vital for creating effective vaccines. For many viruses, including Foot-and-Mouth Disease Virus (FMDV) and the Influenza virus, where multiple serotypes often co-circulate, in vitro testing of large numbers of vaccines can be infeasible. Therefore, the development of an in silico predictor of cross-protection between strains is important to help optimise vaccine choice. Vaccines will offer cross-protection against closely related strains, but not against those that are antigenically distinct. To be able to predict cross-protection we must understand the antigenic variability within a virus serotype (a distinct lineage of a virus) and identify the antigenic residues and evolutionary changes that cause the variability. In this thesis we present a family of sparse hierarchical Bayesian models for detecting relevant antigenic sites in virus evolution (SABRE), as well as an extended version of the method, the extended SABRE (eSABRE) method, which better takes into account the data collection process.

The SABRE methods are a family of sparse Bayesian hierarchical models that use spike and slab priors to identify sites in the viral protein which are important for the neutralisation of the virus. In this thesis we demonstrate how the SABRE methods can be used to identify antigenic residues within different serotypes and show how the SABRE method outperforms established methods (mixed-effects models based on forward variable selection or l1 regularisation) on both synthetic and viral datasets. In addition, we test a number of different versions of the SABRE method, comparing conjugate and semi-conjugate prior specifications and an alternative to the spike and slab prior, the binary mask model. We also propose novel proposal mechanisms for the Markov chain Monte Carlo (MCMC) simulations, which improve mixing and convergence over that of the established component-wise Gibbs sampler. The SABRE method is then applied to datasets from FMDV and the Influenza virus in order to identify a number of known antigenic residues and to provide hypotheses of other potentially antigenic residues. We also demonstrate how the SABRE methods can be used to create accurate predictions of the important evolutionary changes of the FMDV serotypes.

In this thesis we provide an extended version of the SABRE method, the eSABRE method, based on a latent variable model. The eSABRE method takes further into account the structure of the datasets for FMDV and the Influenza virus through the latent variable model and gives an improvement in the modelling of the error. We show how the eSABRE method outperforms the SABRE methods in simulation studies and propose a new information criterion for selecting the random effects factors that should be included in the eSABRE method: the block integrated Widely Applicable Information Criterion (biWAIC). We demonstrate that biWAIC performs on a par with two other methods for selecting the random effects factors and combine it with the eSABRE method to apply it to two large Influenza datasets. Inference in these large datasets is computationally infeasible with the SABRE methods, but as a result of the improved structure of the likelihood, we are able to show how the eSABRE method offers a computational improvement that makes it usable on these datasets. The results of the eSABRE method show that we can use the method in a fully automatic manner to identify a large number of antigenic residues on a variety of the antigenic sites of two Influenza serotypes, as well as making predictions of a number of nearby sites that may also be antigenic and are worthy of further experimental investigation.
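To illustrate the kind of machinery the abstract refers to, the following is a minimal sketch of a spike and slab prior sampled with a component-wise Gibbs sampler. It is not the SABRE or eSABRE model: it assumes a plain linear regression with no random effects, a known noise variance, and fixed hyperparameters. The function name `spike_slab_gibbs`, its parameters (`pi`, `tau2`, `sigma2`), and the synthetic data are all hypothetical choices made for this illustration.

```python
import numpy as np


def spike_slab_gibbs(X, y, n_iter=1500, burn_in=500,
                     pi=0.2, tau2=1.0, sigma2=1.0, seed=0):
    """Posterior inclusion probabilities under a spike and slab prior.

    Assumed toy model (not the thesis's model):
        y = X w + noise,            noise ~ N(0, sigma2)
        w_j = 0                     with probability 1 - pi  (spike)
        w_j ~ N(0, tau2)            with probability pi      (slab)
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    w = np.zeros(p)
    col_ss = np.sum(X ** 2, axis=0)        # x_j^T x_j for each column
    resid = y - X @ w                      # running residual y - X w
    incl_counts = np.zeros(p)
    prior_log_odds = np.log(pi) - np.log1p(-pi)

    for it in range(n_iter):
        for j in range(p):
            # Residual with column j's current contribution removed.
            resid += X[:, j] * w[j]
            z = X[:, j] @ resid
            # Conditional posterior of w_j given inclusion: N(m, v).
            v = 1.0 / (col_ss[j] / sigma2 + 1.0 / tau2)
            m = v * z / sigma2
            # Log odds of inclusion with w_j integrated out.
            log_odds = prior_log_odds + 0.5 * np.log(v / tau2) + 0.5 * m * m / v
            prob = 0.5 * (1.0 + np.tanh(0.5 * log_odds))  # stable sigmoid
            included = rng.random() < prob
            w[j] = rng.normal(m, np.sqrt(v)) if included else 0.0
            resid -= X[:, j] * w[j]
            if it >= burn_in and included:
                incl_counts[j] += 1

    return incl_counts / (n_iter - burn_in)


# Synthetic check: only columns 0 and 3 carry signal.
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(200, 10))
true_w = np.zeros(10)
true_w[[0, 3]] = [2.0, -1.5]
y = X @ true_w + data_rng.normal(size=200)
pip = spike_slab_gibbs(X, y)
print(np.round(pip, 2))
```

In this toy setting the columns with a real effect should receive inclusion probabilities near one, while the remaining columns stay low. The component-wise update shown here corresponds to the baseline sampler that the abstract says the thesis's proposal mechanisms improve upon.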

Bibliographic record

  • Author

    Davies Vinny;

  • Author affiliation
  • Year 2016
  • Total pages
  • Original format PDF
  • Language en
  • Chinese Library Classification
