Journal: BMC Genomics

Imputation of missing genotypes within LD-blocks relying on the basic coalescent and beyond: consideration of population growth and structure

Abstract

Genotypes not directly measured in genetic studies are often imputed to improve statistical power and to increase mapping resolution. The accuracy of standard imputation techniques strongly depends on the similarity of linkage disequilibrium (LD) patterns in the study and reference populations. Here we develop a novel approach for genotype imputation in low-recombination regions that relies on the coalescent and makes it possible to account explicitly for population demographic factors. To test the new method, study and reference haplotypes were simulated, and gene trees were inferred under the basic coalescent and under models that also consider population growth and structure. The reference haplotypes that first coalesced with study haplotypes were used as templates for genotype imputation. Computer simulations were complemented with the analysis of real data. Genotype concordance rates were used to compare the accuracy of coalescent-based and standard (IMPUTE2) imputation. Simulations revealed that, in LD-blocks, imputation relying on the basic coalescent was more accurate and less variable than imputation with IMPUTE2. Explicit consideration of population growth and structure, even when present, did not improve accuracy in practice. The advantage of coalescent-based over standard imputation increased with the minor allele frequency and decreased with population stratification. Results based on real data indicated that, even in low-recombination regions, further research is needed to incorporate recombination into coalescence inference, in particular for studies with genetically diverse and admixed individuals. To exploit the full potential of coalescent-based methods for the imputation of missing genotypes in genetic studies, further methodological research is needed to reduce computing time, to take recombination into account, and to implement these methods in user-friendly computer programs. Here we provide reproducible code that takes advantage of publicly available software to facilitate further developments in the field.
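
The template idea and the concordance measure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the abstract selects the reference haplotype that first coalesces with the study haplotype in an inferred gene tree, whereas the sketch below substitutes a much simpler proxy (the reference haplotype with the fewest mismatches at observed sites). All function names and the toy data are hypothetical.

```python
from typing import List, Optional, Sequence

# Alleles are coded 0/1; None marks a missing (unmeasured) site.
Haplotype = Sequence[Optional[int]]


def pick_template(study: Haplotype, references: List[Sequence[int]]) -> int:
    """Index of the reference haplotype with the fewest mismatches at the
    sites observed in the study haplotype (ties go to the lowest index).
    ASSUMPTION: a crude proxy for 'first to coalesce', not tree inference."""
    def mismatches(ref: Sequence[int]) -> int:
        return sum(1 for s, r in zip(study, ref) if s is not None and s != r)
    return min(range(len(references)), key=lambda i: mismatches(references[i]))


def impute(study: Haplotype, references: List[Sequence[int]]) -> List[int]:
    """Fill missing sites of the study haplotype from its template."""
    template = references[pick_template(study, references)]
    return [r if s is None else s for s, r in zip(study, template)]


def concordance(imputed: Sequence[int], truth: Sequence[int],
                sites: Sequence[int]) -> float:
    """Fraction of the given sites where imputed and true genotypes agree."""
    if not sites:
        raise ValueError("no sites to compare")
    return sum(imputed[i] == truth[i] for i in sites) / len(sites)


# Toy data (hypothetical): three reference haplotypes over five sites.
references = [[0, 0, 1, 1, 0], [1, 0, 1, 0, 1], [0, 1, 0, 1, 0]]
# One study individual = two haplotypes, with sites 1 and 4 unmeasured.
hap_a = impute([1, None, 1, 0, None], references)
hap_b = impute([0, None, 1, 1, None], references)
# A genotype is the allele count across the two haplotypes (0, 1 or 2).
imputed_genotypes = [a + b for a, b in zip(hap_a, hap_b)]
true_genotypes = [1, 1, 2, 1, 1]
rate = concordance(imputed_genotypes, true_genotypes, sites=[1, 4])
```

Concordance is computed only at the masked sites, since observed sites agree by construction and would inflate the rate.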
