Molecular Biology and Evolution

Inferring Population Histories Using Genome-Wide Allele Frequency Data

Abstract

The recent development of high-throughput genotyping technologies has revolutionized the collection of data in a wide range of both model and nonmodel species. These data generally contain huge amounts of information about the demographic history of populations. In this study, we introduce a new method to estimate divergence times on a diffusion time scale from large single-nucleotide polymorphism (SNP) data sets, conditional on a population history that is represented as a tree. We further assume that all the observed polymorphisms originate from the most ancestral (root) population; that is, we neglect mutations that occur after the split of the most ancestral population. This method relies on a hierarchical Bayesian model, based on Kimura's time-dependent diffusion approximation of genetic drift. We implemented a Metropolis-Hastings-within-Gibbs sampler to estimate the posterior distribution of the parameters of interest in this model, which we refer to as the Kimura model. Evaluating the Kimura model on simulated population histories, we found that it provides accurate estimates of divergence time. Assessing model fit using the deviance information criterion (DIC) proved efficient for retrieving the correct tree topology among a set of competing histories. We show that this procedure is robust to low-to-moderate gene flow, as well as to ascertainment bias, provided that the most distantly related populations are represented in the discovery panel. As an illustrative example, we finally analyzed published human data consisting of genotypes for 452,198 SNPs from individuals belonging to four populations worldwide. Our results suggest that the Kimura model may help characterize the demographic history of differentiated populations, using genome-wide allele frequency data.
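To make the idea of a divergence time "on a diffusion time scale" concrete, the sketch below simulates pure genetic drift in a single Wright-Fisher population and recovers the elapsed time from the decay of expected heterozygosity. It is not the hierarchical Bayesian Kimura model described in the abstract: it is a simple moment-based illustration, and it assumes that the diffusion time is defined as tau = t / (2N) for t generations in a population of 2N gene copies, that ancestral allele frequencies are known exactly, and that there is no sampling noise, mutation, or gene flow. All function and parameter names are hypothetical.

```python
import math
import random


def drift_one_generation(freq, n_gene_copies, rng):
    """Wright-Fisher resampling: draw n_gene_copies alleles, each carrying
    the focal allele with probability freq, and return the new frequency."""
    count = sum(1 for _ in range(n_gene_copies) if rng.random() < freq)
    return count / n_gene_copies


def simulate_drift(start_freqs, n_gene_copies, generations, rng):
    """Let every SNP drift independently for the given number of generations."""
    freqs = list(start_freqs)
    for _ in range(generations):
        freqs = [drift_one_generation(f, n_gene_copies, rng) for f in freqs]
    return freqs


def mean_heterozygosity(freqs):
    """Average of 2x(1 - x) over SNPs."""
    return sum(2.0 * f * (1.0 - f) for f in freqs) / len(freqs)


def estimate_tau(ancestral_freqs, current_freqs):
    """Moment estimate of tau (assumed to mean t / 2N).

    Under pure drift, expected heterozygosity shrinks by a factor of
    (1 - 1/(2N)) per generation, which is approximately exp(-tau) after
    t generations, so tau is estimated as -log(H_t / H_0).
    """
    ratio = mean_heterozygosity(current_freqs) / mean_heterozygosity(ancestral_freqs)
    return -math.log(ratio)


def run_demo(seed=1, n_snps=4000, n_gene_copies=100, generations=10):
    """Simulate one population and return (true tau, estimated tau)."""
    rng = random.Random(seed)
    # Hypothetical ancestral frequencies, kept away from 0 and 1.
    ancestral = [rng.uniform(0.1, 0.9) for _ in range(n_snps)]
    current = simulate_drift(ancestral, n_gene_copies, generations, rng)
    tau_true = generations / n_gene_copies  # t / (2N), an assumed definition
    return tau_true, estimate_tau(ancestral, current)


if __name__ == "__main__":
    tau_true, tau_hat = run_demo()
    print(f"true tau = {tau_true:.3f}, estimated tau = {tau_hat:.3f}")
```

Because allele frequencies alone carry information only about the ratio t / (2N), a short time in a small population and a long time in a large one are indistinguishable here, which is one reason divergence is reported on this scale rather than in generations. In the setting described in the abstract, the ancestral frequencies are unobserved and several populations share branches of a tree, which is what motivates a full posterior sampler instead of a closed-form estimate like this one.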