Optimizing area under the ROC curve using semi-supervised learning

Abstract

Receiver operating characteristic (ROC) analysis is a standard methodology for evaluating the performance of a binary classification system. The area under the ROC curve (AUC) is a performance metric that summarizes how well a classifier separates two classes. Traditional AUC optimization techniques are supervised learning methods that use only labeled data (i.e., data for which the true class is known) to train the classifiers. In this work, inspired by semi-supervised and transductive learning, we propose two new AUC optimization algorithms, hereafter referred to as semi-supervised learning receiver operating characteristic (SSLROC) algorithms, which use unlabeled test samples in classifier training to maximize AUC. Unlabeled samples are incorporated into the AUC optimization process, and their ranking relationships to labeled positive and negative training samples are treated as optimization constraints. The introduced test samples cause the learned decision boundary in a multidimensional feature space to adapt not only to the distribution of labeled training data, but also to the distribution of unlabeled test data. We formulate the semi-supervised AUC optimization problem as a semi-definite programming problem based on margin maximization theory. The proposed methods, SSLROC1 (1-norm) and SSLROC2 (2-norm), were evaluated on 34 randomly selected datasets from the University of California, Irvine (UCI) Machine Learning Repository; the number of datasets was determined by power analysis. Wilcoxon signed rank tests showed that the proposed methods achieved significant improvement compared with state-of-the-art methods. The proposed methods were also applied to a CT colonography dataset for colonic polyp classification and showed promising results.
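The link between AUC and ranking mentioned in the abstract can be illustrated with a small sketch. It relies on the standard fact that AUC equals the fraction of (positive, negative) pairs in which the positive sample receives the higher score, with ties counted as one half. This is not the SSLROC algorithm itself, only the quantity it aims to maximize; the function name `pairwise_auc` and the toy data are assumptions made for illustration.

```python
import numpy as np

def pairwise_auc(scores, labels):
    """AUC as the fraction of correctly ranked (positive, negative) pairs.

    Hypothetical helper for illustration; ties contribute 0.5.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    pos = scores[labels == 1]  # scores of positive samples
    neg = scores[labels == 0]  # scores of negative samples
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one positive and one negative sample")
    # diff[i, j] > 0 means positive i is ranked above negative j
    diff = pos[:, None] - neg[None, :]
    correct = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(correct / (pos.size * neg.size))

# Toy example: three of the four positive-negative pairs are ordered correctly.
auc = pairwise_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

Each correctly ordered pair corresponds to one satisfied ranking relationship, which is why pairwise orderings between samples are a natural form for constraints in an AUC-maximizing formulation.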
