Comparing regression, naive Bayes, and random forest methods in the prediction of individual survival to second lactation in Holstein cattle


Abstract

In this study, we compared multiple logistic regression, a linear method, to naive Bayes and random forest, 2 nonlinear machine-learning methods. We used all 3 methods to predict individual survival to second lactation in dairy heifers. The data set used for prediction contained 6,847 heifers born between January 2012 and June 2013, and had known survival outcomes. Each animal had 50 genomic estimated breeding values available at birth and up to 65 phenotypic variables that accumulated over time. Survival was predicted at 5 moments in life: at birth, at 18 mo, at first calving, at 6 wk after first calving, and at 200 d after first calving. The data sets were randomly split into 70% training and 30% testing sets to evaluate model performance for 20-fold validation. The methods were compared for accuracy, sensitivity, specificity, area under the curve (AUC) value, contrasts between groups for the prediction outcomes, and increase in surviving animals in a practical scenario. At birth and 18 mo, all methods had overlapping performance; no method significantly outperformed the other. At first calving, 6 wk after first calving, and 200 d after first calving, random forest and naive Bayes had overlapping performance, and both machine-learning methods outperformed multiple logistic regression. Overall, naive Bayes has the highest average AUC at all decision points up to 200 d after first calving. Random forest had the highest AUC at 200 d after first calving. All methods obtained similar increases in survival in the practical scenario. Despite this, the methods appeared to predict the survival of individual heifers differently. All methods improved over time, but the changes in mean model outcomes for surviving and non-surviving animals differed by method. Furthermore, the correlations of individual predictions between methods ranged from r = 0.417 to r = 0.700; the lowest correlations were at first calving for all methods. In short, all 3 methods were able to predict survival at a population level, because all methods improved survival in a practical scenario. However, depending on the method used, predictions for individual animals were quite different between methods.
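
The evaluation criteria named in the abstract (accuracy, sensitivity, specificity, AUC, and the correlation of individual predictions between methods) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the 0.5 probability cutoff, the random seed, and the toy data are all assumptions made for illustration. The sketch also reads "20-fold validation" as 20 repeated random 70%/30% splits, which the abstract does not spell out.

```python
import random
from math import sqrt


def confusion_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, sensitivity, specificity at an assumed probability cutoff."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 0 and pred == 1:
            fp += 1
        elif y == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }


def auc(y_true, y_prob):
    """AUC as the share of (survivor, non-survivor) pairs in which the
    survivor receives the higher score; ties count as half a win."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def pearson_r(a, b):
    """Pearson correlation between two methods' individual predictions."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def repeated_splits(n_records, n_repeats=20, train_fraction=0.7, seed=0):
    """Yield (train_idx, test_idx) for repeated random splits.
    Assumption: '20-fold validation' means 20 independent 70/30 splits."""
    rng = random.Random(seed)
    n_train = int(round(n_records * train_fraction))
    for _ in range(n_repeats):
        idx = list(range(n_records))
        rng.shuffle(idx)
        yield idx[:n_train], idx[n_train:]


# Toy data (hypothetical): 1 = survived to second lactation, 0 = did not.
y = [1, 1, 1, 0, 1, 0]
method_a = [0.9, 0.8, 0.4, 0.3, 0.7, 0.6]  # predicted survival probabilities
method_b = [0.6, 0.9, 0.7, 0.2, 0.5, 0.4]

print(confusion_metrics(y, method_a))
print("AUC, method A:", auc(y, method_a))
print("r between methods:", pearson_r(method_a, method_b))
```

Two methods can reach similar AUC values while `pearson_r` between their per-animal predictions stays well below 1, which is the pattern the abstract reports: comparable population-level performance with differing individual predictions.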

Bibliographic details

  • Source
    Journal of dairy science | 2019, Issue 10 | pp. 9409–9421 | 13 pages
  • Author affiliations

    Wageningen University and Research Animal Breeding and Genomics PO Box 338 6700 AH Wageningen the Netherlands;

    Wageningen University and Research Animal Breeding and Genomics PO Box 338 6700 AH Wageningen the Netherlands;

    Cooperation CRV Animal Evaluation Unit PO Box 454 6800 AL Arnhem the Netherlands;

    Wageningen University and Research Information Technology Group 6706 KN Wageningen the Netherlands;

    Wageningen University and Research Information Technology Group 6706 KN Wageningen the Netherlands;

    Wageningen University and Research Animal Breeding and Genomics PO Box 338 6700 AH Wageningen the Netherlands;

  • Indexed in: Science Citation Index (SCI); MEDLINE; Chemical Abstracts (CA)
  • Original format: PDF
  • Language: English
  • Keywords

    machine learning; naive Bayes; regression; random forest; phenotypic prediction;

  • Date added to database: 2022-08-18 22:29:27
