Journal: Pattern Recognition: The Journal of the Pattern Recognition Society
Optimizing area under the ROC curve using semi-supervised learning

Abstract

Receiver operating characteristic (ROC) analysis is a standard methodology for evaluating the performance of a binary classification system. The area under the ROC curve (AUC) is a performance metric that summarizes how well a classifier separates two classes. Traditional AUC optimization techniques are supervised learning methods that use only labeled data (i.e., data whose true class is known) to train the classifiers. In this work, inspired by semi-supervised and transductive learning, we propose two new AUC optimization algorithms, hereafter referred to as semi-supervised learning receiver operating characteristic (SSLROC) algorithms, which use unlabeled test samples in classifier training to maximize AUC. Unlabeled samples are incorporated into the AUC optimization process, and their ranking relationships to labeled positive and negative training samples are treated as optimization constraints. The introduced test samples cause the learned decision boundary in a multidimensional feature space to adapt not only to the distribution of labeled training data, but also to the distribution of unlabeled test data. We formulate the semi-supervised AUC optimization problem as a semi-definite programming problem based on margin maximization theory. The proposed methods SSLROC1 (1-norm) and SSLROC2 (2-norm) were evaluated using 34 (determined by power analysis) randomly selected datasets from the University of California, Irvine machine learning repository. Wilcoxon signed rank tests showed that the proposed methods achieved significant improvement compared with state-of-the-art methods. The proposed methods were also applied to a CT colonography dataset for colonic polyp classification and showed promising results. Published by Elsevier Ltd.
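
The abstract describes AUC as a summary of how well a classifier separates two classes, and ties it to ranking relationships between positive and negative samples. As a minimal sketch of that quantity only (not the SSLROC algorithms, whose semi-definite programming formulation is not given here), the empirical AUC can be computed as the fraction of positive-negative pairs that the classifier scores in the right order, with ties counted as half. The function name `pairwise_auc` and the scores below are hypothetical, chosen purely for illustration.

```python
from typing import Sequence


def pairwise_auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """Empirical AUC as the fraction of correctly ranked (positive, negative) pairs.

    A pair is correct when the positive sample scores higher than the
    negative one; a tie counts as half a correct pair.
    """
    if len(pos_scores) == 0 or len(neg_scores) == 0:
        raise ValueError("need at least one positive and one negative score")
    correct = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos_scores) * len(neg_scores))


# Hypothetical classifier scores, made up for illustration only.
pos = [0.9, 0.8, 0.4]
neg = [0.7, 0.3, 0.2]
auc = pairwise_auc(pos, neg)
print(auc)
```

Seen this way, maximizing AUC means reducing the number of mis-ordered pairs, which is why ranking relationships are a natural form for the optimization constraints the abstract mentions. The double loop is quadratic in the number of samples; it is kept here for readability rather than speed.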
