Journal: Genetics, Selection, Evolution

A class of Bayesian methods to combine large numbers of genotyped and non-genotyped animals for whole-genome analyses

Abstract

Background: To obtain predictions that are not biased by selection, the conditional mean of the breeding values must be computed given the data that were used for selection. When single nucleotide polymorphism (SNP) effects have a normal distribution, it can be argued that single-step best linear unbiased prediction (SS-BLUP) yields a conditional mean of the breeding values. Obtaining SS-BLUP, however, requires computing the inverse of the dense matrix G of genomic relationships, which will become infeasible as the number of genotyped animals increases. Also, computing G requires the frequencies of SNP alleles in the founders, which are not available in most situations. Furthermore, SS-BLUP is expected to perform poorly relative to variable selection models such as BayesB and BayesC as marker densities increase.

Methods: A strategy is presented for single-step Bayesian regression (SSBR) models that combines all available data from genotyped and non-genotyped animals, as in SS-BLUP, but accommodates a wider class of models. Our strategy uses imputed marker covariates for animals that are not genotyped, together with an appropriate residual genetic effect to accommodate deviations between true and imputed genotypes. Under normality, one formulation of SSBR yields results identical to SS-BLUP, but does not require computing G or its inverse and provides richer inferences. At present, Bayesian regression analyses are used with a few thousand genotyped individuals. However, when SSBR is applied to all animals in a breeding program, there will be a 100- to 200-fold increase in the number of animals and an associated 100- to 200-fold increase in computing time. Parallel computing strategies can be used to reduce computing time. In one such strategy, a 58-fold speedup was achieved using 120 cores.

Discussion: In SSBR and SS-BLUP, phenotype, genotype and pedigree information are combined in a single step. Unlike SS-BLUP, SSBR is not limited to normally distributed marker effects; it can be used when marker effects have a t distribution, as in BayesA, or mixture distributions, as in BayesB or BayesCπ. Furthermore, it has the advantage that inversion of G is not required. We have investigated parallel computing to speed up SSBR analyses so they can be used for routine applications.
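The abstract does not spell out how marker covariates are imputed for non-genotyped animals. As a minimal sketch, assume the imputation is the pedigree-based linear prediction M̂_n = A_ng A_gg⁻¹ M_g, where A is the additive relationship matrix built from the pedigree and M_g holds centered marker covariates of the genotyped animals. The function names, the toy pedigree and the genotype values below are hypothetical, and the residual genetic effect that absorbs imputation error is not modeled here.

```python
import numpy as np

def relationship_matrix(pedigree):
    """Additive relationship matrix A by the tabular method.

    pedigree[i] = (sire, dam) as indices or None; parents must
    appear before their offspring.
    """
    n = len(pedigree)
    A = np.zeros((n, n))
    for i, (s, d) in enumerate(pedigree):
        # Diagonal: 1 plus the inbreeding coefficient (half the parents' relationship).
        both_known = s is not None and d is not None
        A[i, i] = 1.0 + (0.5 * A[s, d] if both_known else 0.0)
        for j in range(i):
            # Off-diagonal: average of j's relationships with i's known parents.
            a = 0.0
            if s is not None:
                a += 0.5 * A[j, s]
            if d is not None:
                a += 0.5 * A[j, d]
            A[i, j] = A[j, i] = a
    return A

def impute_markers(A, genotyped, non_genotyped, M_g):
    """Assumed imputation rule: M_n_hat = A_ng @ inv(A_gg) @ M_g."""
    A_gg = A[np.ix_(genotyped, genotyped)]
    A_ng = A[np.ix_(non_genotyped, genotyped)]
    # Solve the linear system instead of forming an explicit inverse.
    return A_ng @ np.linalg.solve(A_gg, M_g)

# Hypothetical pedigree: animals 0 and 1 are founders, 2 is their
# offspring, 3 is an offspring of 2 with an unknown second parent.
pedigree = [(None, None), (None, None), (0, 1), (2, None)]
A = relationship_matrix(pedigree)

# Hypothetical centered marker covariates for the two genotyped founders.
M_g = np.array([[1.0, -1.0, 0.0],
                [-1.0, 1.0, 1.0]])

M_n_hat = impute_markers(A, genotyped=[0, 1], non_genotyped=[2, 3], M_g=M_g)
print(M_n_hat)
```

In this toy case the imputed covariates of animal 2 are the average of its parents' covariates, and those of animal 3 are half of that again, which is what a pedigree-based expectation would suggest. The sketch only involves pedigree relationships among genotyped animals; it never forms the genomic relationship matrix G.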