
Short communication: Multivariate outlier detection for routine Nordic dairy cattle genetic evaluation in the Nordic Holstein and Red population


Abstract

It is of practical importance to ensure the data quality from a milk-recording system before use for genetic evaluation. A procedure was developed for detection of multivariate outliers based on an approximation for Mahalanobis distance and was implemented in the Nordic Holstein and Red population. The general target of this procedure is based on the Nordic Cattle Genetic Evaluation yield model, which is a 9-trait model for milk, protein, and fat in the first 3 lactations. The procedure is based on the phenotypic correlation structure as a function of days in milk (DIM) and on computation of trait means and standard deviations within a production year, lactation, and DIM. For each record in the data, a Mahalanobis distance value was computed based on the trait mean and the covariance matrix for the actual production year, lactation, and DIM. A set of cutoff values, ranging from 10 to 100 with steps of 10, for discarding multivariate outliers was investigated. Prediction accuracy was calculated as the Pearson correlations between estimated breeding values predicted by full data set and estimated breeding values predicted by reduced data set for cows without records in the reduced data set and with 1 or more records deleted due to the editing rules on Mahalanobis distance. The results showed that, averaged over all scenarios, gains of 0.005 to 0.048 on prediction accuracy have been obtained by deleting the multivariate outliers. The improvements were more profound for progeny of young bulls compared with progeny of proven bulls. It is easy to implement this multivariate outlier-detection procedure in the routine genetic evaluation for different dairy cattle breeds; however, an optimal cutoff value for Mahalanobis distance needs to be defined to achieve an acceptable compromise between genetic evaluation accuracy and data deletion.
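The per-record distance and the cutoff sweep described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract refers to an approximation of the Mahalanobis distance, while this sketch uses the exact textbook form, D² = zᵀ R⁻¹ z, where z is a record standardized by the trait means and standard deviations of its production year, lactation, and DIM group, and R is the phenotypic correlation matrix for that group. The function names `mahalanobis_sq` and `retained_fraction`, the three-trait toy data, and the choice to apply the cutoffs to the squared distance are all assumptions made for illustration.

```python
import numpy as np


def mahalanobis_sq(records, mean, sd, corr):
    """Squared Mahalanobis distance of each row of `records`.

    records : (n, t) trait values for one year/lactation/DIM group
    mean, sd: (t,) trait means and standard deviations of that group
    corr    : (t, t) phenotypic correlation matrix of that group
    """
    # Standardize each trait with the group mean and standard deviation.
    z = (np.asarray(records, dtype=float) - mean) / sd
    # Solve corr @ x = z.T instead of forming an explicit inverse.
    solved = np.linalg.solve(corr, z.T)          # shape (t, n)
    # Row-wise quadratic form z_i' corr^{-1} z_i.
    return np.einsum("ij,ji->i", z, solved)


def retained_fraction(d2, cutoffs):
    """Fraction of records kept at each cutoff (d2 > cutoff is discarded)."""
    d2 = np.asarray(d2, dtype=float)
    return {int(c): float(np.mean(d2 <= c)) for c in cutoffs}


if __name__ == "__main__":
    rng = np.random.default_rng(0)

    # Hypothetical group statistics for milk, protein, and fat (toy values).
    mean = np.array([30.0, 1.0, 1.2])
    sd = np.array([6.0, 0.2, 0.25])
    corr = np.array([[1.00, 0.90, 0.70],
                     [0.90, 1.00, 0.75],
                     [0.70, 0.75, 1.00]])

    # Simulated records for the group, plus one record whose traits
    # disagree with the correlation structure.
    cov = corr * np.outer(sd, sd)
    records = rng.multivariate_normal(mean, cov, size=1000)
    records[0] = [30.0, 1.6, 0.6]

    d2 = mahalanobis_sq(records, mean, sd, corr)

    # Cutoffs from 10 to 100 in steps of 10, as in the abstract.
    for cutoff, kept in retained_fraction(d2, range(10, 101, 10)).items():
        print(f"cutoff {cutoff:3d}: {kept:.3%} of records retained")
```

A lower cutoff discards more records, which is the trade-off between evaluation accuracy and data deletion that the abstract leaves open when it calls for an optimal cutoff to be defined.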

Bibliographic details

  • Source
    Journal of dairy science | 2018, No. 12 | pp. 11159-11164 | 6 pages
  • Author affiliations

    Nat Resources Inst Finland Luke, FIN-31600 Jokioinen, Finland;

    Nord Cattle Genet Evaluat, DK-8200 Aarhus, Denmark;

    Aarhus Univ, Ctr Quantitat Genet & Genom, Dept Mol Biol & Genet, DK-8830 Tjele, Denmark;

    Faba Co Op, FIN-01301 Vantaa, Finland;

  • Indexed in: Science Citation Index (SCI), USA; MEDLINE, USA; Chemical Abstracts (CA), USA
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification:
  • Keywords

    prediction accuracy; Mahalanobis distance; data deletion;

  • Date added to database: 2022-08-18 04:02:57
