Source: Frontiers in Genetics (NIH literature collection)

Optimizing Selection of the Reference Population for Genotype Imputation From Array to Sequence Variants

Abstract

Imputation of high-density genotypes to whole-genome sequences (WGS) is a cost-effective method to increase the density of available markers within a population. Imputed genotypes have been successfully used for genomic selection and discovery of variants associated with traits of interest for the population. To allow for the use of imputed genotypes for genomic analyses, accuracy of imputation must be high. Accuracy of imputation is influenced by multiple factors, such as size and composition of the reference group, and the allele frequency of variants included. Understanding the intended use of imputed WGSs before generating the reference population is important, because accurate imputation may need to focus on, for instance, either common or rare variants. The aim of this study was to present and evaluate new methods to select animals for sequencing relying on a previously genotyped population. The Genetic Diversity Index method optimizes the number of unique haplotypes in the future reference population, while the Highly Segregating Haplotype selection method targets haplotype alleles found throughout the majority of the population of interest. First, the WGSs of a dairy cattle population were simulated. The simulated sequences mimicked the linkage disequilibrium level and the variants’ frequency distribution observed in currently available Holstein sequences. Then, reference populations of different sizes were created, in which animals were selected using both novel methods proposed here as well as two other methods presented in previous studies. Finally, accuracies of imputation obtained with the different reference populations were compared against each other. The novel methods were found to have overall accuracies of imputation of more than 0.85. Accuracies of imputation of rare variants reached values above 0.50.
In conclusion, if imputed sequences are to be used for discovery of novel associations between variants and traits of interest in the population, animals carrying novel information should be selected and, consequently, the Genetic Diversity Index method proposed here may be used. If sequences are to be used to impute the overall genotyped population, a reference population consisting of carriers of common haplotypes, selected using the proposed Highly Segregating Haplotype method, is recommended.
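
The abstract does not spell out how either selection method is computed, so the following is only a rough sketch of the two underlying ideas, not the authors' algorithms. It assumes each animal is described by the set of haplotype alleles it carries (labels such as "b1:h1" for block 1, allele 1 are made up for illustration), and the function names `select_max_unique_haplotypes` and `select_common_haplotype_carriers` are hypothetical. The first function greedily picks animals that add the most haplotype alleles not yet covered, in the spirit of maximizing unique haplotypes; the second ranks animals by how common their haplotype alleles are in the population.

```python
from collections import Counter

def select_max_unique_haplotypes(haplotypes_by_animal, n_select):
    """Greedy sketch: repeatedly pick the animal that adds the most
    haplotype alleles not yet present in the selected group."""
    covered = set()
    selected = []
    candidates = dict(haplotypes_by_animal)
    for _ in range(min(n_select, len(candidates))):
        # sorted() makes ties resolve alphabetically, so results are reproducible.
        best = max(sorted(candidates),
                   key=lambda a: len(set(candidates[a]) - covered))
        selected.append(best)
        covered |= set(candidates.pop(best))
    return selected

def select_common_haplotype_carriers(haplotypes_by_animal, n_select):
    """Sketch: score each animal by the summed population frequency of the
    haplotype alleles it carries, then keep the top-scoring animals."""
    n_animals = len(haplotypes_by_animal)
    carriers = Counter(h for haps in haplotypes_by_animal.values()
                       for h in set(haps))
    score = {a: sum(carriers[h] / n_animals for h in set(haps))
             for a, haps in haplotypes_by_animal.items()}
    return sorted(score, key=lambda a: (-score[a], a))[:n_select]

# Toy data (invented): four animals, two haplotype blocks.
toy_population = {
    "A": ["b1:h1", "b2:h1"],
    "B": ["b1:h1", "b2:h1"],
    "C": ["b1:h2", "b2:h3"],
    "D": ["b1:h1", "b2:h2"],
}

diverse = select_max_unique_haplotypes(toy_population, 2)
common = select_common_haplotype_carriers(toy_population, 2)
print(diverse)  # ['A', 'C']
print(common)   # ['A', 'B']
```

On this toy input the diversity-oriented selection pairs animal A with C, the carrier of rare alleles, while the frequency-oriented selection picks A and B, which share the most common alleles. This mirrors the trade-off described in the abstract between capturing novel information and representing the bulk of the genotyped population.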