Journal: Bioinformatics

A CROC stronger than ROC: measuring, visualizing and optimizing early retrieval

Abstract

Motivation: The performance of classifiers is often assessed using Receiver Operating Characteristic (ROC) curves [or accumulation curves (AC), also called enrichment curves] and the corresponding areas under the curves (AUCs). However, in many fundamental problems ranging from information retrieval to drug discovery, only the very top of the ranked list of predictions is of any interest, and ROCs and AUCs are not very useful. New metrics, visualizations and optimization tools are needed to address this 'early retrieval' problem.

Results: To address the early retrieval problem, we develop the general concentrated ROC (CROC) framework. In this framework, any relevant portion of the ROC (or AC) curve is magnified smoothly by an appropriate continuous transformation of the coordinates with a corresponding magnification factor. Appropriate families of magnification functions confined to the unit square are derived, and their properties are analyzed together with the resulting CROC curves. The area under the CROC curve (AUC[CROC]) can be used to assess early retrieval. The general framework is demonstrated on a drug discovery problem and used to discriminate more accurately the early retrieval performance of five different predictors. From this framework, we propose a novel metric and visualization, the CROC(exp), an exponential transform of the ROC curve, as an alternative to other methods. The CROC(exp) provides a principled, flexible and effective way of measuring and visualizing early retrieval performance with excellent statistical power. Corresponding methods for optimizing early retrieval are also described in the Appendix.

Availability: Datasets are publicly available. Python code and command-line utilities implementing CROC curves and metrics are available at http://pypi.python.org/pypi/CROC/

Contact: pfbaldi@ics.uci.edu
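To make the idea of an exponentially magnified ROC curve concrete, the following is a minimal sketch, not the authors' implementation and not the API of the CROC package. It assumes the magnification function has the form f(x) = (1 - exp(-alpha * x)) / (1 - exp(-alpha)), applied to the false positive rate only; the function names (`exp_magnify`, `roc_points`, `area_under`, `auc_croc_exp`) and the value alpha = 7 are illustrative choices, and tied scores are not handled.

```python
import numpy as np

def exp_magnify(x, alpha=7.0):
    """Map [0, 1] onto [0, 1], stretching the region near 0.

    Assumed form of the exponential magnification function;
    alpha is an illustrative value, not one taken from the abstract.
    """
    x = np.asarray(x, dtype=float)
    return (1.0 - np.exp(-alpha * x)) / (1.0 - np.exp(-alpha))

def roc_points(scores, labels):
    """Return (fpr, tpr) arrays for a ranking by descending score.

    Labels are 1 for positives and 0 for negatives; ties are ignored.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    order = np.argsort(-scores, kind="stable")
    hits = labels[order]
    tp = np.concatenate(([0], np.cumsum(hits)))
    fp = np.concatenate(([0], np.cumsum(1 - hits)))
    return fp / fp[-1], tp / tp[-1]

def area_under(x, y):
    """Trapezoidal area under a piecewise-linear curve."""
    return float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2.0))

def auc_roc(scores, labels):
    fpr, tpr = roc_points(scores, labels)
    return area_under(fpr, tpr)

def auc_croc_exp(scores, labels, alpha=7.0):
    """Area under the ROC curve after magnifying the x-axis."""
    fpr, tpr = roc_points(scores, labels)
    return area_under(exp_magnify(fpr, alpha), tpr)

# Two toy rankings with the same ROC AUC but different early retrieval:
# "early" places a positive first, "late" places a negative first.
scores = [0.9, 0.8, 0.7, 0.1]
early = [1, 0, 0, 1]
late = [0, 1, 1, 0]
print(auc_roc(scores, early), auc_roc(scores, late))
print(auc_croc_exp(scores, early), auc_croc_exp(scores, late))
```

In this toy case both rankings have the same ordinary AUC, while the magnified area is larger for the ranking that retrieves a positive first, which is the kind of distinction the abstract says ROC and AUC fail to capture.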