Meeting of the Society for Veterinary Epidemiology and Preventive Medicine (Great Britain)

USING MACHINE LEARNING TO PREDICT DISEASE IN TRANSITION PERIOD DAIRY CATTLE - THE CHALLENGES OF AN IMBALANCED DATASET AND THE IMPORTANCE OF METRICS



Abstract

While fitting predictive models for transition cow disease, we found that the choice of reported metrics can give a misleading impression of a model's performance. When building classification algorithms, whose goal is to correctly assign new data to predetermined classes, it is common to report only accuracy, sensitivity and specificity. For imbalanced datasets, however, where the number of data points in each class differs substantially, omitting metrics such as kappa or balanced accuracy can lead to inaccurate conclusions, especially since overall accuracy is commonly very high. In this research, we illustrate the importance of reporting all relevant metrics associated with a model's performance so that they reflect the model's genuine predictive capability on new data.
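The effect described in the abstract can be illustrated with a small sketch. The confusion-matrix counts below are hypothetical and chosen only for illustration (1000 cows, 50 of them diseased); they are not taken from the study, and the helper name `classification_metrics` is an assumption, not the authors' code. The sketch computes accuracy, sensitivity, specificity, balanced accuracy and Cohen's kappa from the four cells of a binary confusion matrix.

```python
def classification_metrics(tp, fn, fp, tn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    Assumes both classes are present (tp + fn > 0 and tn + fp > 0) and that
    chance agreement is below 1, so no division by zero occurs.
    """
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)   # share of diseased animals detected
    specificity = tn / (tn + fp)   # share of healthy animals correctly cleared
    balanced_accuracy = (sensitivity + specificity) / 2

    # Cohen's kappa: observed agreement corrected for the agreement
    # expected by chance given the row and column totals.
    expected_pos = ((tp + fn) / n) * ((tp + fp) / n)
    expected_neg = ((tn + fp) / n) * ((tn + fn) / n)
    expected = expected_pos + expected_neg
    kappa = (accuracy - expected) / (1 - expected)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": balanced_accuracy,
        "kappa": kappa,
    }


# Hypothetical case 1: a model that labels every cow as healthy.
naive = classification_metrics(tp=0, fn=50, fp=0, tn=950)

# Hypothetical case 2: a model that detects only a few diseased cows.
weak = classification_metrics(tp=5, fn=45, fp=10, tn=940)

for name, result in (("all-healthy", naive), ("weak detector", weak)):
    summary = ", ".join(f"{k}={v:.3f}" for k, v in result.items())
    print(f"{name}: {summary}")
```

In both hypothetical cases accuracy stays around 0.95, while balanced accuracy sits near 0.5 and kappa near zero, which is the kind of discrepancy that goes unnoticed when only accuracy, sensitivity and specificity are reported.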
