Journal: Genetics, selection, evolution

Application of neural networks with back-propagation to genome-enabled prediction of complex traits in Holstein-Friesian and German Fleckvieh cattle


Abstract

Background: Artificial neural networks (ANN) have recently been proposed as promising machines for marker-based genomic prediction of complex traits in animal and plant breeding. ANN are universal approximators of complex functions that can capture cryptic relationships between single nucleotide polymorphisms (SNPs) and phenotypic values without an explicitly defined genetic model. This is attractive for high-dimensional and noisy data, especially when the genetic architecture of the trait is unknown. However, the properties of ANN for predicting future outcomes of genomic selection from real data are not well characterized, and high computational costs make it difficult to use whole-genome marker sets. We examined different non-linear network architectures and several genomic covariate structures as network inputs to assess their ability to predict milk traits in three dairy cattle data sets using large-scale SNP data. The networks were trained with a regularized back-propagation algorithm. Predictive ability was measured as the average correlation between observed and predicted phenotypes in a 5-fold cross-validation repeated 20 times. A linear network model served as the benchmark.
Results: Predictive abilities of the different ANN models varied markedly, whereas differences between data sets were small. Dimension reduction methods improved prediction performance in all data sets while also reducing computational cost. For the Holstein-Friesian bull data set, an ANN with 10 neurons in the hidden layer achieved a predictive correlation of r = 0.47 for milk yield when the entire marker matrix was used as input. Predictive ability increased to r = 0.64 when the genomic relationship matrix was used as input and was highest (r = 0.67) when principal component scores of the marker genotypes were used. Similar results were found for the other traits in all data sets.
Conclusion: Artificial neural networks are powerful machines for non-linear genome-enabled prediction in animal breeding. However, to produce stable and high-quality outputs, variable selection methods are highly recommended when the number of markers vastly exceeds the sample size.
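
The evaluation protocol named in the abstract (average predictive correlation over a repeated 5-fold cross-validation) can be sketched roughly as follows. This is an illustration only, not the authors' code: the function names, the synthetic toy data, and the single-input regression used as a stand-in model are all assumptions, and the abstract does not say whether correlations were computed per fold (as done here) or per repeat.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def repeated_kfold_correlation(X, y, fit_predict, k=5, repeats=20, seed=0):
    """Mean test-fold correlation over `repeats` random k-fold splits."""
    rng = random.Random(seed)
    n = len(y)
    correlations = []
    for _ in range(repeats):
        order = list(range(n))
        rng.shuffle(order)
        folds = [order[i::k] for i in range(k)]  # k disjoint test folds
        for test_idx in folds:
            held_out = set(test_idx)
            train_idx = [i for i in order if i not in held_out]
            preds = fit_predict(
                [X[i] for i in train_idx],
                [y[i] for i in train_idx],
                [X[i] for i in test_idx],
            )
            correlations.append(pearson([y[i] for i in test_idx], preds))
    return sum(correlations) / len(correlations)


def single_input_regression(X_train, y_train, X_test):
    """Stand-in model (NOT the paper's ANN): least squares on the first input."""
    x = [row[0] for row in X_train]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y_train) / n
    sxx = sum((v - mean_x) ** 2 for v in x)
    sxy = sum((v - mean_x) * (w - mean_y) for v, w in zip(x, y_train))
    slope = sxy / sxx
    return [mean_y + slope * (row[0] - mean_x) for row in X_test]


# Toy data (synthetic, for illustration only): 100 animals, 3 inputs.
rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(100)]
y = [2.0 * row[0] + rng.gauss(0, 1) for row in X]
r = repeated_kfold_correlation(X, y, single_input_regression)
```

Any model can be plugged in through `fit_predict`, as long as it is fitted on the training fold only. The inputs in `X` could equally be raw marker genotypes, rows of a genomic relationship matrix, or principal component scores, which are the three input types the abstract compares.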
