Conference proceedings: Conference on image perception, observer performance, and technology assessment

Semi-parametric Estimation of the Area Under the Precision-Recall Curve

Abstract

Precision and recall are two common metrics used in the evaluation of information retrieval systems. By changing the number of retrieved documents, one can obtain a precision-recall curve. The area under the precision-recall curve (AUCPR) has been suggested as a performance measure for information retrieval systems, in a manner similar to the use of the area under the receiver operating characteristic curve in binary classification. Limited work has been performed in the literature to investigate the bias and variance of AUCPR estimators. The goal of our study was to investigate the bias and variability of a semi-parametric binormal method for estimating the AUCPR, and to compare it to other techniques, such as average precision (AP) and lower trapezoid (LT) approximation. We show how AUCPR can be obtained given the binormal model parameters, and how its variance can be estimated using the delta method. We performed simulation experiments with normal and non-normal data, and investigated the effect of sample size and prevalence. Our results indicated that the semi-parametric binormal approach provided AUCPR estimates with small bias and confidence intervals with acceptable coverage when the sample size was large, and the performance of the binormal model was comparable to or better than alternative methods evaluated in this study when the sample size was small. We conclude that the semi-parametric binormal model can be used to accurately estimate the AUCPR, and that the confidence intervals derived from the model can be at least as accurate as from other alternatives, even for non-normal decision variable distributions.
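The abstract does not reproduce the paper's formulas, so the following is only a minimal sketch of the two quantities it compares. It assumes the conventional binormal ROC parameterization, TPR = Φ(a + b·Φ⁻¹(FPR)), and the usual identity precision = π·TPR / (π·TPR + (1 − π)·FPR), where π is the prevalence. The function names `binormal_aucpr` and `average_precision`, the midpoint-rule integration, and the grid size are illustrative choices and are not taken from the paper. The delta-method variance estimate is not shown.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal: cdf is Phi, inv_cdf is Phi^{-1}


def binormal_aucpr(a, b, prevalence, n_grid=20000):
    """Approximate AUCPR implied by binormal ROC parameters (a, b).

    Assumes TPR = Phi(a + b * Phi^{-1}(FPR)), so for a given recall r
    (recall equals TPR) the matching FPR is Phi((Phi^{-1}(r) - a) / b).
    Precision is integrated over recall with a midpoint rule, which
    keeps recall strictly inside (0, 1) where inv_cdf is defined.
    """
    total = 0.0
    for i in range(n_grid):
        recall = (i + 0.5) / n_grid
        fpr = _STD.cdf((_STD.inv_cdf(recall) - a) / b)
        true_pos = prevalence * recall
        false_pos = (1.0 - prevalence) * fpr
        total += true_pos / (true_pos + false_pos)
    return total / n_grid


def average_precision(scores, labels):
    """Non-parametric average precision (AP).

    Ranks cases by descending score and averages the precision observed
    at the rank of each positive case (label == 1).
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    if hits == 0:
        raise ValueError("average precision is undefined without positive cases")
    return precision_sum / hits
```

One sanity check on the sketch: with a = 0 and b = 1 the two score distributions coincide, precision equals the prevalence at every recall level, and `binormal_aucpr` returns the prevalence itself. In a semi-parametric analysis, a and b would first be estimated from the data (for example by a rank-based binormal ROC fit) and then passed to a function of this kind.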
