Journal: Statistics in Biosciences

Binormal Precision-Recall Curves for Optimal Classification of Imbalanced Data


Abstract

Binary classification on imbalanced data, i.e., data with a large skew in the class distribution, is a challenging problem. Classifiers in binary classification are commonly evaluated via the receiver operating characteristic (ROC) curve, and techniques for developing classifiers that optimize the area under the ROC curve have been proposed. For imbalanced data, however, the ROC curve tends to give an overly optimistic view. To address this weakness, we propose an approach based on the Precision-Recall (PR) curve under the binormal assumption: we choose the classifier that maximizes the area under the binormal PR curve. The asymptotic distribution of the resulting estimator is shown. Simulations, as well as real data results, indicate that the binormal Precision-Recall method outperforms approaches based on the area under the ROC curve.
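To make the quantity being maximized more concrete, the following is a minimal sketch of how the area under a binormal PR curve could be computed numerically. It is an illustration, not the authors' estimator. It takes the binormal assumption to mean that classifier scores are normally distributed within each class. The function name `binormal_auprc`, its parameters, and the midpoint-rule integration are all assumptions made for this example.

```python
from statistics import NormalDist


def binormal_auprc(mu_neg, sd_neg, mu_pos, sd_pos, prevalence, n_grid=2000):
    """Approximate the area under the binormal PR curve (hypothetical helper).

    Assumes scores are N(mu_neg, sd_neg^2) in the negative class and
    N(mu_pos, sd_pos^2) in the positive class, and that `prevalence` is the
    proportion of positives. Integrates precision over recall in (0, 1)
    with a midpoint rule.
    """
    neg = NormalDist(mu_neg, sd_neg)
    pos = NormalDist(mu_pos, sd_pos)
    total = 0.0
    for i in range(n_grid):
        recall = (i + 0.5) / n_grid  # midpoint of the i-th recall bin
        # Threshold c such that P(positive score > c) equals this recall.
        threshold = pos.inv_cdf(1.0 - recall)
        # False positive rate at the same threshold.
        fpr = 1.0 - neg.cdf(threshold)
        true_pos = prevalence * recall
        false_pos = (1.0 - prevalence) * fpr
        total += true_pos / (true_pos + false_pos)  # precision at this recall
    return total / n_grid


# Example: 5% positives, positive scores shifted by one or two standard deviations.
weak = binormal_auprc(0.0, 1.0, 1.0, 1.0, prevalence=0.05)
strong = binormal_auprc(0.0, 1.0, 2.0, 1.0, prevalence=0.05)
```

Under these assumptions, a classifier whose scores do not separate the classes has precision equal to the prevalence at every recall level, so its area equals the prevalence. This is one way to see why the PR curve is sensitive to class imbalance while the ROC curve is not.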
