Venue: International Conference on Cloud Computing and Big Data

Semi-supervised Based Training Set Construction for Outlier Detection

Abstract

Outliers are sparse and few. Obtaining a training set with enough outliers is costly, so existing approaches to outlier detection are seldom supervised. However, given a training set with sufficient outliers, supervised outlier detection performs better than other methods. A traditional training set requires every sample to be labeled, but here only the outliers need to be labeled; the remaining unlabeled samples can be marked directly as inliers to construct the training set. In most cases, the number of samples we can label is limited, whereas a large number of unlabeled samples can be obtained easily. Semi-supervised learning methods have a natural advantage in using the information in a few labeled samples and many unlabeled samples to predict unlabeled instances. Based on this idea, we propose an algorithm, CRLC, that constructs the training set by incorporating semi-supervised outlier detection. Our experiments show that our algorithm achieves better performance than other methods at the same cost.
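The labeling idea in the abstract (confirm only the outliers, mark everything else as an inlier) can be illustrated with a small sketch. This is not the CRLC algorithm, whose steps the abstract does not give. The candidate ranking by distance from the median is a stand-in heuristic, and the names `rank_candidates`, `build_training_set`, `oracle`, and `budget` are hypothetical, introduced only for illustration.

```python
from statistics import median


def rank_candidates(values, budget):
    """Pick the `budget` indices farthest from the median.

    Assumption: a stand-in heuristic for choosing which samples to
    show the labeler; the abstract does not say how CRLC chooses them.
    """
    m = median(values)
    order = sorted(range(len(values)),
                   key=lambda i: abs(values[i] - m),
                   reverse=True)
    return order[:budget]


def build_training_set(values, oracle, budget):
    """Label only confirmed outliers; mark every other sample as an inlier.

    `oracle(i)` stands for the human labeler (hypothetical callable) and
    returns True when sample i is an outlier. Only `budget` samples are
    ever shown to it, so the labeling cost is bounded by `budget`.
    """
    asked = rank_candidates(values, budget)
    confirmed = {i for i in asked if oracle(i)}
    # 1 = outlier, 0 = inlier (including samples that were never inspected)
    labels = [1 if i in confirmed else 0 for i in range(len(values))]
    return labels, asked


# Toy one-dimensional data, made up for this example.
values = [1.0, 1.2, 0.9, 1.1, 9.5, 1.0, -7.0, 1.3]
true_outliers = {4, 6}
labels, asked = build_training_set(values, lambda i: i in true_outliers, budget=3)
```

The resulting `labels` list can be passed to any supervised classifier. Samples that were never inspected are assumed to be inliers, which is reasonable only because outliers are rare.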
