Genetics: A Periodical Record of Investigations Bearing on Heredity and Variation

Estimating Variable Effective Population Sizes from Multiple Genomes: A Sequentially Markov Conditional Sampling Distribution Approach

Abstract

Throughout history, the population size of modern humans has varied considerably due to changes in environment, culture, and technology. More accurate estimates of population size changes, and when they occurred, should provide a clearer picture of human colonization history and help remove confounding effects from natural selection inference. Demography influences the pattern of genetic variation in a population, and thus genomic data of multiple individuals sampled from one or more present-day populations contain valuable information about the past demographic history. Recently, Li and Durbin developed a coalescent-based hidden Markov model, called the pairwise sequentially Markovian coalescent (PSMC), for a pair of chromosomes (or one diploid individual) to estimate past population sizes. This is an efficient, useful approach, but its accuracy in the very recent past is hampered by the fact that, because of the small sample size, only few coalescence events occur in that period. Multiple genomes from the same population contain more information about the recent past, but are also more computationally challenging to study jointly in a coalescent framework. Here, we present a new coalescent-based method that can efficiently infer population size changes from multiple genomes, providing access to a new store of information about the recent past. Our work generalizes the recently developed sequentially Markov conditional sampling distribution framework, which provides an accurate approximation of the probability of observing a newly sampled haplotype given a set of previously sampled haplotypes. Simulation results demonstrate that we can accurately reconstruct the true population histories, with a significant improvement over the PSMC in the recent past. We apply our method, called diCal, to the genomes of multiple human individuals of European and African ancestry to obtain a detailed population size change history during recent times.
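
The abstract's point that a pair of chromosomes yields only a few coalescence events in the very recent past, while multiple genomes carry more information about that period, can be illustrated with a small calculation. The sketch below is not part of diCal or the PSMC. It assumes the standard constant-size coalescent at a single non-recombining locus, where n lineages in a diploid population of effective size N coalesce at a total rate of n(n-1)/2 pairs times 1/(2N) per generation. The function name and the example values of N and t are hypothetical, chosen only for illustration.

```python
import math


def prob_coalescence_within(n_lineages: int, generations: float, diploid_size: float) -> float:
    """Probability that at least one pair of lineages coalesces within `generations`.

    Assumption (not from the source): constant-size standard coalescent,
    one non-recombining locus, diploid effective population size `diploid_size`.
    """
    if n_lineages < 2:
        return 0.0  # a single lineage has nothing to coalesce with
    # Number of lineage pairs that could coalesce.
    pairs = n_lineages * (n_lineages - 1) / 2
    # Each pair coalesces at rate 1 / (2N) per generation.
    rate_per_generation = pairs / (2.0 * diploid_size)
    # The waiting time to the first coalescence is exponential with that rate.
    return 1.0 - math.exp(-rate_per_generation * generations)


if __name__ == "__main__":
    N = 10_000  # hypothetical effective population size
    t = 1_000   # hypothetical "recent past" window, in generations
    for n in (2, 4, 10, 20):
        p = prob_coalescence_within(n, t, N)
        print(f"n = {n:2d} lineages: P(coalescence within {t} generations) = {p:.4f}")
```

Under these assumptions the probability rises steeply with sample size because the number of lineage pairs grows quadratically in n, which is the intuition behind using multiple genomes to resolve recent population size changes.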
