Journal: BMC Genetics

Multi-population genomic prediction using a multi-task Bayesian learning model


Abstract

Background: Genomic prediction in multiple populations can be viewed as a multi-task learning problem in which the tasks are to derive prediction equations for each population, and in which learning can be improved by sharing information across populations. The goal of this study was to develop a multi-task Bayesian learning model for multi-population genomic prediction, with a strategy for sharing information effectively across populations. Simulation studies and real data from the Holstein and Ayrshire dairy breeds, with phenotypes on five milk production traits, were used to evaluate the proposed multi-task Bayesian learning model and to compare it with a single-task model and a simple data pooling method.

Results: A multi-task Bayesian learning model was proposed for multi-population genomic prediction. Information was shared across populations through a common set of latent indicator variables, while SNP effects were allowed to vary between populations. Both the simulation studies and the real data analysis showed that the multi-task model improved genomic prediction accuracy for the smaller Ayrshire breed. The simulation studies suggested that the multi-task model was most effective when the number of QTL was small (n = 20): accuracy increased by up to 0.09 when QTL effects were weakly correlated between the two populations (ρ = 0.2), and by up to 0.16 when QTL effects were highly correlated (ρ = 0.8). When QTL genotypes were included for training and validation, the improvements were 0.16 and 0.22 for the low- and high-correlation scenarios, respectively. When the number of QTL was large (n = 200), the improvement was small, with a maximum of 0.02 when QTL genotypes were not included for genomic prediction. The simple pooling method reduced accuracy when the number of QTL was small and the correlation of QTL effects between the two populations was low. For the real data, the multi-task model increased accuracy by between 0 and 0.07 in the Ayrshire validation set when 28,206 SNPs were used, while the simple data pooling method reduced accuracy for all traits except protein percentage. When 246,668 SNPs were used, the multi-task model increased accuracy by 0 to 0.03, while the pooling method reduced accuracy by 0.01 to 0.09. In the Holstein population, the three methods performed similarly.

Conclusions: The results of this study suggest that the proposed multi-task Bayesian learning model for multi-population genomic prediction is effective and has the potential to improve the accuracy of genomic prediction.
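
The sharing strategy described in the Results (one common set of latent indicators, with SNP effects free to differ between populations) can be illustrated with a small simulation. This is only a minimal sketch and not the authors' model or code: the function names `simulate_shared_indicator_effects` and `pearson`, the standard-normal effect sizes, and the way the correlation ρ is induced are all assumptions made for illustration. The abstract specifies only the number of QTL and the correlation of QTL effects between the two populations.

```python
import math
import random


def simulate_shared_indicator_effects(n_snps, n_qtl, rho, seed=0):
    """Draw SNP effects for two populations that share one indicator vector.

    gamma[j] == 1 marks SNP j as having a non-zero effect in BOTH populations.
    The effect sizes differ between populations, with correlation rho.
    Assumption (not from the source): effects are standard normal.
    """
    rng = random.Random(seed)
    qtl = set(rng.sample(range(n_snps), n_qtl))
    gamma = [1 if j in qtl else 0 for j in range(n_snps)]

    effects_pop1, effects_pop2 = [], []
    for j in range(n_snps):
        if gamma[j]:
            z1 = rng.gauss(0.0, 1.0)
            z2 = rng.gauss(0.0, 1.0)
            effects_pop1.append(z1)
            # Mixing two independent normals gives correlation rho with z1.
            effects_pop2.append(rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
        else:
            # Indicator is 0: the SNP has no effect in either population.
            effects_pop1.append(0.0)
            effects_pop2.append(0.0)
    return gamma, effects_pop1, effects_pop2


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Example: 20 QTL among 1,000 SNPs, highly correlated effects (rho = 0.8).
gamma, b1, b2 = simulate_shared_indicator_effects(n_snps=1000, n_qtl=20, rho=0.8)
qtl_b1 = [b for g, b in zip(gamma, b1) if g]
qtl_b2 = [b for g, b in zip(gamma, b2) if g]
print("number of QTL:", sum(gamma))
print("realized correlation of QTL effects:", round(pearson(qtl_b1, qtl_b2), 2))
```

With only 20 QTL the realized correlation can differ noticeably from the target ρ, so the printed value varies with the seed. In this sketch the two populations always agree on which SNPs are non-zero and differ only in effect size, which is the kind of structure a shared indicator vector is meant to capture.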
