Effect of training strategy for positive and unlabelled learning classification: test on Landsat imagery


Abstract

The positive and unlabelled learning (PUL) algorithm, which uses Bayes' rule to turn a traditional binary classifier into a one-class classifier, has received attention in remote sensing classification. PUL is trained on a positive sample set and an unlabelled sample set. Because it is a one-class classifier, users label only the positive samples, whereas the unlabelled samples are drawn randomly from all pixels. However, the effect of the training strategy on PUL has not yet been investigated. This study tested the performance of PUL-SVM (support vector machine) using training samples of different sizes, purity levels and quality levels. Balanced sizes for the positive and unlabelled sample sets were found to be the optimal strategy when the sets were sampled randomly. However, pure positive samples, which users tend to select in order to guarantee correctness, make the optimal sample size difficult to determine. The quality of positive samples is also an important factor in classification accuracy. A positive sample set that contains random-error samples drastically reduces classification accuracy, whereas confusion-error samples, whose spectra are similar to those of the target class, affect it only slightly. On the basis of these findings, we recommend random sampling of positive samples and balanced sizes for the positive and unlabelled sample sets. Random-error samples must be avoided, whereas confusion-error samples can be largely tolerated.
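The abstract does not spell out the Bayes' rule step, so the following is only a minimal sketch under stated assumptions. It assumes the widely used formulation in which a classifier g(x) ≈ P(s=1|x) is trained to separate labelled positives (s=1) from unlabelled samples (s=0), a constant c = P(s=1|y=1) is estimated as the mean of g over held-out positives, and the one-class score is recovered as P(y=1|x) = g(x)/c. A small logistic regression stands in for the SVM used in the study so that only the standard library is needed. The 1-D synthetic "spectral" feature, the class means, the sample counts, and every function name are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(xs, ss, epochs=500, lr=0.2):
    """Fit g(x) ~ P(s=1 | x) on a 1-D feature by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, s in zip(xs, ss):
            err = sigmoid(w * x + b) - s
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def pu_posterior(g_value, c):
    """Bayes-rule correction P(y=1|x) = P(s=1|x) / c, clipped to [0, 1]."""
    return min(1.0, g_value / c)


rng = random.Random(0)

# Hypothetical scene: target pixels cluster near +2, background near -2.
target = [rng.gauss(2.0, 0.5) for _ in range(300)]
background = [rng.gauss(-2.0, 0.5) for _ in range(700)]
all_pixels = target + background

# User-labelled positives (assumed to be a random sample of the target class).
labelled = [rng.gauss(2.0, 0.5) for _ in range(150)]
train_pos, holdout_pos = labelled[:100], labelled[100:]

# Unlabelled set: drawn randomly from all pixels, same size as the positives.
unlabelled = rng.sample(all_pixels, len(train_pos))

xs = train_pos + unlabelled
ss = [1.0] * len(train_pos) + [0.0] * len(unlabelled)
w, b = train_logistic(xs, ss)


def g(x):
    # Positive-vs-unlabelled score, an estimate of P(s=1 | x).
    return sigmoid(w * x + b)


# Estimate c = P(s=1 | y=1) as the mean score on held-out positives.
c = sum(g(x) for x in holdout_pos) / len(holdout_pos)


def classify(x):
    # One-class score, an estimate of P(y=1 | x).
    return pu_posterior(g(x), c)


print(f"estimated c = {c:.2f}")
print(f"score near target = {classify(2.0):.2f}, near background = {classify(-2.0):.2f}")
```

Changing the size of `unlabelled` relative to `train_pos`, or replacing some entries of `train_pos` with background-like values, gives a rough way to explore the size and quality effects the abstract describes, though this toy setup is not expected to reproduce the study's results.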

Bibliographic information

  • Source
    Remote Sensing Letters | 2016, Issue 12 | pp. 1063-1072 | 10 pages
  • Author affiliations

    Beijing Normal Univ, State Key Lab Earth Surface Proc & Resource Ecol, Beijing 100875, Peoples R China;

    SUNY Buffalo, Dept Geog, Buffalo, NY 14260 USA;

    Beijing Normal Univ, State Key Lab Earth Surface Proc & Resource Ecol, Beijing 100875, Peoples R China;

    Beijing Normal Univ, State Key Lab Earth Surface Proc & Resource Ecol, Beijing 100875, Peoples R China;

  • Original format: PDF
  • Language: English
