Journal: PLoS One

BWGS: A R package for genomic selection and its application to a wheat breeding programme

Abstract

We developed an integrated R library called BWGS to enable easy computation of Genomic Estimates of Breeding Values (GEBV) for genomic selection. BWGS, for BreedWheat Genomic Selection, was developed in the framework of a cooperative private-public partnership project called Breedwheat (https://breedwheat.fr) and relies on existing R libraries, all freely available from CRAN servers. The two main functions run 1) replicated random cross-validations within a training set of genotyped and phenotyped lines and 2) GEBV prediction for a set of genotyped-only lines. Options are available for 1) missing data imputation, 2) marker and training set selection and 3) genomic prediction with 15 different methods, either parametric or semi-parametric. The usefulness and efficiency of BWGS are illustrated using a population of wheat lines from a real breeding programme. Adjusted yield data from historical trials (highly unbalanced design) were used for testing the options of BWGS. In total, 760 candidate lines with adjusted phenotypes and genotypes for 47 839 robust SNPs were used. With a simple desktop computer, we obtained results comparable with previously published results on wheat genomic selection. As predicted by theory, the factors that most influence predictive ability, for a given trait of moderate heritability, are the size of the training population and a minimum number of markers sufficient to capture the information of every QTL. Up to 40% missing data, if randomly distributed, do not degrade predictive ability once imputed, and up to 80% randomly distributed missing data are still acceptable once imputed with the Expectation-Maximization method of the rrBLUP package. It is worth noting that selecting the markers most strongly associated with the trait does improve predictive ability, compared with the whole set of markers, but only when marker selection is made on the whole population. When marker selection is made only on the sampled training set, this advantage nearly disappears, showing that it was due to overfitting. Few differences are observed between the 15 prediction models with this dataset. Although non-parametric methods, which are supposed to capture non-additive effects, have slightly better predictive accuracy, the differences remain small. Finally, the GEBV from the 15 prediction models are all highly correlated with each other. These results are encouraging for an efficient use of genomic selection in applied breeding programmes, and BWGS is a simple and powerful toolbox to apply in breeding programmes or training activities.
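
The overfitting effect described above for marker selection can be illustrated with a small simulation. The sketch below is not BWGS code and does not use its API: it is written in Python with the standard library only, the data are simulated, and the predictor is a simple correlation-weighted marker score rather than any of the 15 methods in the package. All function names (`simulate_null`, `cv_predictive_ability`, and so on) are hypothetical. The phenotype is pure noise, so an honest cross-validation should give a predictive ability near zero, and any clearly positive value can only come from selecting markers on lines that are later used for testing.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0


def simulate_null(n_lines, n_markers, rng):
    """Random 0/1/2 genotypes and a phenotype with no genetic signal (assumption)."""
    geno = [[rng.choice((0, 1, 2)) for _ in range(n_markers)] for _ in range(n_lines)]
    pheno = [rng.gauss(0.0, 1.0) for _ in range(n_lines)]
    return geno, pheno


def marker_correlations(geno, pheno, rows):
    """Correlation of every marker with the phenotype, using only the lines in `rows`."""
    y = [pheno[i] for i in rows]
    return [pearson([geno[i][j] for i in rows], y) for j in range(len(geno[0]))]


def cv_predictive_ability(geno, pheno, n_top, n_reps, test_frac, leaky, rng):
    """Mean test-set correlation over replicated random train/test splits."""
    all_rows = list(range(len(pheno)))
    n_test = int(len(all_rows) * test_frac)
    abilities = []
    for _ in range(n_reps):
        rows = all_rows[:]
        rng.shuffle(rows)
        test, train = rows[:n_test], rows[n_test:]
        # Marker weights are always estimated on the training lines only.
        train_r = marker_correlations(geno, pheno, train)
        # Marker selection uses the whole population (leaky) or the training set (honest).
        select_r = marker_correlations(geno, pheno, all_rows) if leaky else train_r
        top = sorted(range(len(select_r)), key=lambda j: abs(select_r[j]), reverse=True)[:n_top]
        scores = [sum(train_r[j] * geno[i][j] for j in top) for i in test]
        abilities.append(pearson(scores, [pheno[i] for i in test]))
    return statistics.fmean(abilities)


if __name__ == "__main__":
    geno, pheno = simulate_null(100, 200, random.Random(1))
    for leaky in (True, False):
        ability = cv_predictive_ability(geno, pheno, 20, 20, 0.2, leaky, random.Random(2))
        print("selection on whole population" if leaky else "selection on training set", ability)
```

In this toy setting the only difference between the two runs is which lines are visible during marker selection, which is the distinction the abstract draws between selecting on the whole population and selecting on the sampled training set.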