Conference: International Conference on Information Reuse and Integration for Data Science

Evaluating Model Predictive Performance: A Medicare Fraud Detection Case Study

Abstract

Evaluating a machine learning model's predictive performance is vital for establishing its practical usability in real-world applications. Two common evaluation approaches are the use of separate training and test datasets, and cross-validation. The former uses two distinct datasets, whereas cross-validation splits a single dataset into smaller training and test subsets. In real-world production applications, it is critical to establish a model's usefulness by validating it on completely new input data, rather than relying only on cross-validation results from a single historical dataset. In this paper, we present results for both evaluation methods, including performance comparisons. To provide meaningful comparative analyses between methods, we perform real-world fraud detection experiments using 2013 to 2016 Medicare durable medical equipment claims data. This Medicare dataset is split into training (2013 to 2015 individual years) and test (2016 only) sets. Using this Medicare case study, we assess fraud detection performance across three learners for both model evaluation methods. We find that using the separate training and test sets generally outperforms cross-validation, indicating a better real-world model performance evaluation. Even so, cross-validation yields comparable, but conservative, fraud detection results.
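The two evaluation methods contrasted in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the data are synthetic stand-ins for claims records, the 2013 to 2015 years are pooled into one training set, the learner is a toy one-feature centroid scorer (`fit_centroid_scorer` is a hypothetical helper), and AUC is used as the metric only because it is a common choice for imbalanced fraud data.

```python
import random
from statistics import mean


def make_year(year, n, rng):
    """Synthetic stand-in for one year of claims: (year, feature, label)."""
    rows = []
    for _ in range(n):
        label = 1 if rng.random() < 0.1 else 0           # assumed ~10% fraud rate
        feature = rng.gauss(2.0 if label else 0.0, 1.0)  # fraud shifts the feature
        rows.append((year, feature, label))
    return rows


def fit_centroid_scorer(rows):
    """Toy learner (hypothetical): score rises toward the positive-class mean."""
    pos = mean(f for _, f, y in rows if y == 1)
    neg = mean(f for _, f, y in rows if y == 0)
    midpoint = (pos + neg) / 2

    def score(f):
        return (f - midpoint) * (pos - neg)

    return score


def auc(scores, labels):
    """Rank-based AUC: chance that a random positive outscores a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def evaluate(train, test):
    """Fit on train, then report AUC on test."""
    score = fit_centroid_scorer(train)
    return auc([score(f) for _, f, _ in test], [y for _, _, y in test])


def cross_validate(rows, k, rng):
    """k-fold cross-validation on a single dataset; returns the mean fold AUC."""
    rows = rows[:]
    rng.shuffle(rows)
    folds = [rows[i::k] for i in range(k)]
    results = []
    for i in range(k):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        results.append(evaluate(train, folds[i]))
    return mean(results)


rng = random.Random(0)
data = [r for year in (2013, 2014, 2015, 2016) for r in make_year(year, 500, rng)]
historical = [r for r in data if r[0] <= 2015]  # training years (pooled here)
holdout = [r for r in data if r[0] == 2016]     # completely new test year

cv_auc = cross_validate(historical, k=5, rng=rng)
holdout_auc = evaluate(historical, holdout)
print(f"5-fold CV AUC on 2013-2015: {cv_auc:.3f}")
print(f"Train 2013-2015, test 2016 AUC: {holdout_auc:.3f}")
```

Because the synthetic years are drawn from the same distribution, the two numbers here should be close. On real claims data they can diverge when the test year differs from the training years, which is the situation the abstract's comparison addresses.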