Source: Bioinformatics (U.S. National Institutes of Health literature)

A CROC stronger than ROC: measuring, visualizing and optimizing early retrieval

Abstract

Motivation: The performance of classifiers is often assessed using Receiver Operating Characteristic (ROC) curves [or accumulation curves (AC) or enrichment curves] and the corresponding areas under the curves (AUCs). However, in many fundamental problems ranging from information retrieval to drug discovery, only the very top of the ranked list of predictions is of any interest, and ROCs and AUCs are not very useful. New metrics, visualizations and optimization tools are needed to address this ‘early retrieval’ problem.

Results: To address the early retrieval problem, we develop the general concentrated ROC (CROC) framework. In this framework, any relevant portion of the ROC (or AC) curve is magnified smoothly by an appropriate continuous transformation of the coordinates with a corresponding magnification factor. Appropriate families of magnification functions confined to the unit square are derived, and their properties are analyzed together with the resulting CROC curves. The area under the CROC curve (AUC[CROC]) can be used to assess early retrieval. The general framework is demonstrated on a drug discovery problem and used to discriminate more accurately between the early retrieval performance of five different predictors. From this framework, we propose a novel metric and visualization, the CROC(exp), which is an exponential transform of the ROC curve, as an alternative to other methods. The CROC(exp) provides a principled, flexible and effective way of measuring and visualizing early retrieval performance with excellent statistical power. Corresponding methods for optimizing early retrieval are also described.

Availability: Datasets are publicly available. Python code and command-line utilities implementing CROC curves and metrics are also available.
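The idea of magnifying the early part of the ROC curve can be illustrated with a minimal Python sketch. It is not the authors' released code. It assumes one particular exponential magnification function, f(x) = (1 - exp(-alpha x)) / (1 - exp(-alpha)), applied to the false positive rate; the function names (`exp_magnify`, `roc_points`, `area_under`, `auc_croc`) and the value alpha = 7 are hypothetical choices made for this illustration.

```python
import math


def exp_magnify(x, alpha=7.0):
    """Assumed exponential magnification: maps [0, 1] onto [0, 1],
    stretching the region near x = 0 (the top of the ranked list)."""
    return (1.0 - math.exp(-alpha * x)) / (1.0 - math.exp(-alpha))


def roc_points(scores, labels):
    """Return (FPR, TPR) points, sweeping the threshold from the highest
    score to the lowest. Assumes both classes occur in `labels`."""
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    ranked = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        j = i
        # Tied scores are consumed together, giving one diagonal step.
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1]:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / n_neg, tp / n_pos))
        i = j
    return points


def area_under(points):
    """Trapezoidal area under a piecewise-linear curve."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def auc_croc(scores, labels, alpha=7.0):
    """Area under the ROC curve after magnifying the FPR axis.
    Exact for step curves (no tied scores); an approximation otherwise,
    because diagonal segments become curved under the transform."""
    transformed = [(exp_magnify(x, alpha), y)
                   for x, y in roc_points(scores, labels)]
    return area_under(transformed)


if __name__ == "__main__":
    scores = [0.9, 0.8, 0.7, 0.6]
    labels = [1, 0, 1, 0]
    print("AUC[ROC]  =", area_under(roc_points(scores, labels)))
    print("AUC[CROC] =", auc_croc(scores, labels))
```

Because the transform stretches the low false positive rate region, a false positive ranked near the top of the list costs far more area under the magnified curve than it does under the plain ROC curve. Larger values of alpha concentrate the measure further on the top of the list.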