Journal: Human Genomics

On the stability of the Bayenv method in assessing human SNP-environment associations


Abstract

Background: Phenotypic variation along environmental gradients has been documented among and within many species, and in some cases, genetic variation has been shown to be associated with these gradients. Bayenv is a relatively new method developed to detect patterns of polymorphisms associated with environmental gradients. Using a Bayesian Markov chain Monte Carlo (MCMC) approach, Bayenv evaluates whether a linear model relating population allele frequencies to environmental variables is more probable than a null model based on observed frequencies of neutral markers. Although this method has been used to detect environmental adaptation in a number of species, including humans, plants, fish, and mosquitoes, stability between independent runs of this MCMC algorithm has not been characterized. In this paper, we explore the variability of results between runs and the factors contributing to it.

Results: Independent runs of the Bayenv program were carried out using genome-wide single-nucleotide polymorphism (SNP) data from samples from 60 worldwide human populations, following previous applications of the Bayenv method. To assess factors contributing to the method's stability, we used varying numbers of MCMC iterations and also analyzed a second, modified data set that excluded two Siberian populations with extreme climate variables. Between any two runs, correlations between Bayes factors and the overlap of SNPs in the empirical p value tails were surprisingly low. Enrichments of genic versus non-genic SNPs in the empirical tails were more robust than the empirical p values; however, the significance of the enrichments for some environmental variables still varied among runs, contradicting previously published conclusions. Runs with a greater number of MCMC iterations slightly reduced run-to-run variability, and excluding the Siberian populations did not have a large effect on the stability of the runs.

Conclusions: Because of high run-to-run variability, we advise against making conclusions about genome-wide patterns of adaptation based on only one run of the Bayenv algorithm and recommend caution in interpreting previous studies that have used only one run. Moving forward, we suggest carrying out multiple independent runs of Bayenv and averaging Bayes factors between runs to produce more stable and reliable results. With these modifications, future discoveries of environmental adaptation within species using the Bayenv method will be more accurate, interpretable, and easily compared between studies.
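The run-to-run checks and the averaging step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline and does not call Bayenv itself. It assumes the Bayes factors from each independent run have already been loaded into one array per run, with SNPs in the same order. The function names, the use of a Spearman-style rank correlation, the 1% tail threshold, and the plain arithmetic mean are all assumptions made for illustration; the abstract does not specify them. The demo data are synthetic.

```python
import numpy as np


def rank_correlation(x, y):
    """Spearman-style rank correlation using ordinal ranks (ties are not averaged)."""
    rx = np.argsort(np.argsort(x))
    ry = np.argsort(np.argsort(y))
    return float(np.corrcoef(rx, ry)[0, 1])


def tail_overlap(bf_a, bf_b, tail=0.01):
    """Share of the top-`tail` SNPs (largest Bayes factors) common to both runs."""
    k = max(1, int(round(tail * len(bf_a))))
    top_a = set(np.argsort(bf_a)[-k:].tolist())
    top_b = set(np.argsort(bf_b)[-k:].tolist())
    return len(top_a & top_b) / k


def average_bayes_factors(runs):
    """Arithmetic mean of Bayes factors across runs (rows = runs, columns = SNPs).

    Assumption: a plain mean on the raw scale; the abstract does not say
    whether averaging should be done on the raw or the log scale.
    """
    return np.mean(np.asarray(runs, dtype=float), axis=0)


# Synthetic stand-in for real output: 5 hypothetical runs over 2000 SNPs,
# each run being a shared signal perturbed by run-specific noise.
rng = np.random.default_rng(0)
signal = rng.normal(size=2000)
runs = np.exp(signal + rng.normal(scale=1.0, size=(5, 2000)))

print("rank correlation, run 1 vs run 2:", rank_correlation(runs[0], runs[1]))
print("top 1% overlap, run 1 vs run 2:", tail_overlap(runs[0], runs[1]))

averaged = average_bayes_factors(runs)
print("averaged Bayes factors shape:", averaged.shape)
```

With real data, each row of `runs` would come from a separate Bayenv run started with a different random seed, and the same two diagnostics could be compared before and after averaging.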
