Journal: BMC Bioinformatics

Bayesian estimation of scaled mutation rate under the coalescent: a sequential Monte Carlo approach



Abstract

Samples of molecular sequence data from a locus, obtained from random individuals in a population, are often related by an unknown genealogy. More importantly, population genetics parameters are of significant interest because they explain some of the evolutionary history and past qualities of the population from which the samples were obtained; one example is the scaled population mutation rate, Θ = 4N_eμ for diploids or Θ = 2N_eμ for haploids (where N_e is the effective population size and μ is the mutation rate per site per generation). In this paper, we model the evolution of sequence data in a Bayesian framework and approximate the posterior distributions of the unknown model parameters, which include Θ, using sequential Monte Carlo (SMC) samplers for static models. Specifically, we approximate the posterior distributions of the unknown parameters with a set of weighted samples, i.e., the set of highly probable genealogies out of the infinite set of possible genealogies that describe the sampled sequences. The proposed SMC algorithm is evaluated on simulated DNA sequence datasets under different mutational models and on real biological sequences. In terms of the accuracy of the estimates, the proposed SMC method shows performance comparable to, and sometimes better than, that of state-of-the-art MCMC algorithms. We show that the SMC algorithm for static models is a promising alternative to the state-of-the-art approach for simulating from the posterior distributions of population genetics parameters.
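
To make the idea of an SMC sampler for a static model concrete, the following is a minimal toy sketch and not the algorithm of the paper. It assumes a single fixed genealogy with known total branch length `tree_len`, a Poisson number of mutations with mean Θ·tree_len/2, a Gamma(a, b) prior on Θ, and a linear tempering schedule; the paper instead treats the genealogy as unknown. All function and parameter names are hypothetical. Particles drawn from the prior are reweighted step by step towards the posterior, resampled when the effective sample size drops, and moved with a Metropolis-Hastings kernel.

```python
import math
import random


def log_lik(theta, s, tree_len):
    """Poisson log-likelihood (up to a constant) of s mutations on a fixed tree.
    The mean theta * tree_len / 2 is an assumed scaling for this toy model."""
    rate = theta * tree_len / 2.0
    return s * math.log(rate) - rate


def log_prior(theta, a, b):
    """Gamma(shape=a, rate=b) log-density, up to a constant (assumed prior)."""
    return (a - 1.0) * math.log(theta) - b * theta


def normalise(logw):
    """Turn log-weights into normalised weights, guarding against overflow."""
    m = max(logw)
    w = [math.exp(lw - m) for lw in logw]
    total = sum(w)
    return [v / total for v in w]


def smc_posterior_mean(s, tree_len, a=2.0, b=1.0, n_particles=2000,
                       n_steps=20, step_sd=0.5, seed=1):
    """Tempered SMC sampler: targets prior * likelihood**phi, phi from 0 to 1."""
    rng = random.Random(seed)
    # Initial particles come from the prior (random.gammavariate takes a scale).
    particles = [rng.gammavariate(a, 1.0 / b) for _ in range(n_particles)]
    logw = [0.0] * n_particles
    phis = [t / n_steps for t in range(n_steps + 1)]
    for phi_prev, phi in zip(phis, phis[1:]):
        # Reweight: incremental weight is likelihood**(phi - phi_prev).
        logw = [lw + (phi - phi_prev) * log_lik(x, s, tree_len)
                for lw, x in zip(logw, particles)]
        w = normalise(logw)
        ess = 1.0 / sum(v * v for v in w)
        # Resample when the effective sample size falls below half.
        if ess < n_particles / 2.0:
            particles = rng.choices(particles, weights=w, k=n_particles)
            logw = [0.0] * n_particles
        # Move: random walk on log(theta), invariant for the current target.
        moved = []
        for x in particles:
            y = x * math.exp(rng.gauss(0.0, step_sd))
            log_acc = (log_prior(y, a, b) + phi * log_lik(y, s, tree_len)
                       + math.log(y)  # Jacobian of the log-scale proposal
                       - log_prior(x, a, b) - phi * log_lik(x, s, tree_len)
                       - math.log(x))
            accept = rng.random() < math.exp(min(0.0, log_acc))
            moved.append(y if accept else x)
        particles = moved
    w = normalise(logw)
    return sum(wi * xi for wi, xi in zip(w, particles))


if __name__ == "__main__":
    print(smc_posterior_mean(s=10, tree_len=4.0))
```

Under these toy assumptions the posterior is available in closed form, Gamma(a + s, b + tree_len/2), so the weighted particle mean can be checked against (a + s) / (b + tree_len/2). No such check exists in the full problem, where the genealogy is unknown.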
