Published in: Molecular Ecology Resources

Leveraging whole genome sequencing data for demographic inference with approximate Bayesian computation

Abstract

Accounting for historical demographic features, such as the strength and timing of gene flow and divergence times between closely related lineages, is vital for many inferences in evolutionary biology. Approximate Bayesian computation (ABC) is one method commonly used to estimate demographic parameters. However, the DNA sequences used as input for this method, often microsatellites or RADseq loci, usually represent a small fraction of the genome. Whole genome sequencing (WGS) data, on the other hand, have been used less often with ABC, and questions remain about the potential benefit of, and how to best implement, this type of data; we used pseudo-observed data sets to explore such questions. Specifically, we addressed the potential improvements in parameter estimation accuracy that could be associated with WGS data in multiple contexts; namely, we quantified the effects of (a) more data, (b) haplotype-based summary statistics, and (c) locus length. Compared with a hypothetical RADseq data set with 2.5 Mbp of data, using a 1 Gbp data set consisting of 100 Kbp sequences led to substantial gains in the accuracy of parameter estimates, which was mostly due to haplotype statistics and increased data. We also quantified the effects of including (a) locus-specific recombination rates, and (b) background selection information in ABC analyses. Importantly, assuming uniform recombination or ignoring background selection had a negative effect on accuracy in many cases. Software and results from this method validation study should be useful for future demographic history analyses.
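
To make the ABC workflow with pseudo-observed data sets concrete, here is a minimal rejection-ABC sketch. It is not the study's software: the simulator is a toy Poisson stand-in for a coalescent simulator, and every name (`simulate_loci`, `summarize`, `abc_rejection`, `theta`) as well as the prior bounds and acceptance fraction are assumptions chosen for illustration.

```python
import math
import random
import statistics


def simulate_loci(theta, n_loci, rng):
    """Toy stand-in for a coalescent simulator: per-locus variant counts ~ Poisson(theta)."""
    counts = []
    limit = math.exp(-theta)
    for _ in range(n_loci):
        k, p = 0, rng.random()
        while p > limit:  # Knuth's multiplication method for Poisson draws
            k += 1
            p *= rng.random()
        counts.append(k)
    return counts


def summarize(counts):
    """Summary statistics (hypothetical choice): mean and variance of per-locus counts."""
    return (statistics.fmean(counts), statistics.pvariance(counts))


def abc_rejection(observed_stats, n_loci, n_sims, accept_frac, prior, rng):
    """Keep the prior draws whose simulated summaries lie closest to the observed ones."""
    draws = []
    for _ in range(n_sims):
        theta = rng.uniform(*prior)  # draw a parameter value from a uniform prior
        stats = summarize(simulate_loci(theta, n_loci, rng))
        draws.append((math.dist(stats, observed_stats), theta))
    draws.sort()  # smallest Euclidean distance first
    n_keep = max(1, round(accept_frac * n_sims))
    return [theta for _, theta in draws[:n_keep]]


rng = random.Random(1)
true_theta = 4.0

# Pseudo-observed data set: simulated under a known parameter value,
# so the estimate can be compared against the truth.
pod_stats = summarize(simulate_loci(true_theta, 200, rng))

posterior = abc_rejection(
    pod_stats, n_loci=200, n_sims=1000, accept_frac=0.05, prior=(0.5, 10.0), rng=rng
)
estimate = statistics.median(posterior)
print(f"true theta = {true_theta}, ABC posterior median = {estimate:.2f}")
```

In this toy setting, changing `n_loci` is a rough analogue of comparing a small data set with a larger one: more loci give less noisy summary statistics, so the accepted draws tend to cluster more tightly around the true value.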