Journal: Earth and Space Science

The STONE Curve: A ROC‐Derived Model Performance Assessment Tool


Abstract

A new model validation and performance assessment tool is introduced, the sliding threshold of observation for numeric evaluation (STONE) curve. It is based on the relative operating characteristic (ROC) curve technique, but instead of sorting all observations in a categorical classification, the STONE tool uses the continuous nature of the observations. Rather than defining events in the observations and then sliding the threshold only in the classifier/model data set, the threshold is changed simultaneously for both the observational and model values, with the same threshold value for both data and model. This is only possible if the observations are continuous and the model output is in the same units and scale as the observations, that is, the model is trying to exactly reproduce the data.

The STONE curve has several similarities with the ROC curve: plotting probability of detection against probability of false detection, ranging from the (1,1) corner for low thresholds to the (0,0) corner for high thresholds, and values above the zero-intercept unity-slope line indicating better than random predictive ability. The main difference is that the STONE curve can be nonmonotonic, doubling back in both the x and y directions. These ripples reveal asymmetries in the data-model value pairs. This new technique is applied to modeling output of a common geomagnetic activity index as well as energetic electron fluxes in the Earth's inner magnetosphere. It is not limited to space physics applications but can be used for any scientific or engineering field where numerical models are used to reproduce observations.

Plain Language Summary

Scientists often try to reproduce observations with a model, helping them explain the observations by adjusting known and controllable features within the model. They then use a large variety of metrics for assessing the ability of a model to reproduce the observations. One such metric is called the relative operating characteristic (ROC) curve, a tool that assesses a model's ability to predict events within the data. The ROC curve is made by sliding the event-definition threshold in the model output, calculating certain metrics and making a graph of the results.

Here, a new model assessment tool is introduced, called the sliding threshold of observation for numeric evaluation (STONE) curve. The STONE curve is created by sliding the event-definition threshold not only for the model output but also simultaneously for the data values. This is applicable when the model output is trying to reproduce the exact values of a particular data set. While the ROC curve is still a highly valuable tool for optimizing the prediction of known and preclassified events, it is argued here that the STONE curve is better for assessing model prediction of a continuous-valued data set.
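The shared-threshold idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `stone_curve`, the choice of "value at or above the threshold" as the event definition, the decision to skip thresholds where a rate is undefined, and the toy data are all assumptions made for illustration. For each threshold, the same cutoff is applied to both the observations and the model values, and probability of detection (POD) and probability of false detection (POFD) are computed from the resulting contingency counts.

```python
from typing import List, Sequence, Tuple


def stone_curve(
    obs: Sequence[float],
    model: Sequence[float],
    thresholds: Sequence[float],
) -> List[Tuple[float, float, float]]:
    """Return (threshold, POFD, POD) triples, one per usable threshold.

    Assumption: an "event" is a value at or above the threshold, and the
    same threshold is applied to the observations and to the model output.
    """
    if len(obs) != len(model):
        raise ValueError("obs and model must have the same length")

    points: List[Tuple[float, float, float]] = []
    for t in thresholds:
        hits = misses = false_alarms = correct_negatives = 0
        for o, m in zip(obs, model):
            observed_event = o >= t
            modeled_event = m >= t
            if observed_event and modeled_event:
                hits += 1
            elif observed_event:
                misses += 1
            elif modeled_event:
                false_alarms += 1
            else:
                correct_negatives += 1

        # Skip thresholds with no observed events or no observed non-events,
        # because POD or POFD would be undefined there (a choice made here).
        if hits + misses == 0 or false_alarms + correct_negatives == 0:
            continue

        pod = hits / (hits + misses)
        pofd = false_alarms / (false_alarms + correct_negatives)
        points.append((t, pofd, pod))
    return points


# Toy data-model pairs in the same units (hypothetical values).
obs = [1.0, 2.0, 3.0, 4.0, 5.0]
model = [1.5, 1.0, 3.5, 3.0, 6.0]
points = stone_curve(obs, model, thresholds=[2.0, 3.0, 4.0, 10.0])

for t, pofd, pod in points:
    print(f"threshold={t:.1f}  POFD={pofd:.2f}  POD={pod:.2f}")
```

In this toy case POD rises from 0.75 to 1.0 and then falls to 0.5 as the threshold increases, a small example of the nonmonotonic behavior the abstract describes. A ROC curve would instead fix the event definition in `obs` and slide the threshold only over `model`.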
