Source: Systematic Biology (U.S. National Institutes of Health literature)

Online Bayesian Phylogenetic Inference: Theoretical Foundations via Sequential Monte Carlo

Abstract

Phylogenetics, the inference of evolutionary trees from molecular sequence data such as DNA, is an enterprise that yields valuable evolutionary understanding of many biological systems. Bayesian phylogenetic algorithms, which approximate a posterior distribution on trees, have become a popular, if computationally expensive, means of doing phylogenetics. Modern data collection technologies are quickly adding new sequences to already substantial databases. With all current techniques for Bayesian phylogenetics, computation must start anew each time a sequence becomes available, making it costly to maintain an up-to-date estimate of a phylogenetic posterior. These considerations highlight the need for an online Bayesian phylogenetic method that can update an existing posterior with new sequences. Here, we provide theoretical results on the consistency and stability of methods for online Bayesian phylogenetic inference based on Sequential Monte Carlo (SMC) and Markov chain Monte Carlo. We first show a consistency result, demonstrating that the method samples from the correct distribution in the limit of a large number of particles. Next, we derive the first reported set of bounds on how phylogenetic likelihood surfaces change when new sequences are added. These bounds enable us to characterize the theoretical performance of sampling algorithms by bounding from below the effective sample size (ESS) for a given number of particles. We show that the ESS is guaranteed to grow linearly as the number of particles in an SMC sampler grows. Surprisingly, this result holds even though the dimension of the phylogenetic model grows with each newly added sequence.
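The abstract does not define the ESS it bounds. As a minimal sketch, assuming the commonly used estimator ESS = (sum of weights)^2 / (sum of squared weights), the quantity can be computed from a set of particle log-weights as follows. The function name `effective_sample_size` and the example weights are hypothetical, chosen only for illustration, and are not taken from the paper.

```python
import math


def effective_sample_size(log_weights):
    """Estimate ESS = (sum w)^2 / (sum w^2) from unnormalized log-weights.

    Assumption: this is the standard importance-sampling ESS estimator;
    the paper may define or bound the ESS differently.
    """
    # Subtract the maximum log-weight before exponentiating for numerical stability.
    m = max(log_weights)
    w = [math.exp(lw - m) for lw in log_weights]
    total = sum(w)
    return total * total / sum(x * x for x in w)


# Equal weights: every particle contributes, so the ESS equals the particle count.
uniform = effective_sample_size([0.0] * 4)

# One dominant weight: the ESS collapses toward a single effective particle.
degenerate = effective_sample_size([0.0, -1000.0, -1000.0])
```

Under this estimator, if all weights lie in an interval [a, b] with a > 0, then ESS >= N * (a/b)^2 for N particles, which is one elementary way an ESS can scale linearly with the number of particles. Whether the paper's guarantee takes this form is not stated in the abstract.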