Journal: Swarm and Evolutionary Computation

Using decomposition-based multi-objective evolutionary algorithm as synthetic example optimization for self-labeling

Abstract

The coexistence of abundant unlabeled data and scarce labeled data is one of the most common problems in real datasets. Semi-supervised classification methods handle this problem well and achieve good performance. Among them, self-labeling is one of the most successful techniques for dealing with a shortage of labeled data. One difficulty of this technique is that data can be labeled incorrectly during the iterative self-labeling process. The main reasons are 1) the presence of outliers and noisy data, 2) an inappropriate distribution of labeled data in the problem space, and 3) a shortage of labeled data, which limits diversity in the learning hypotheses. In this paper, a method called DMSS is developed that generates synthetic labeled data by using a decomposition-based multi-objective evolutionary algorithm as synthetic example optimization for self-labeling. In DMSS, synthetic labeled datasets with high diversity and high classification accuracy are generated and then added to the labeled datasets to train the algorithm better. In previous research, the diversity of the generated data and their distribution in the problem space have not been well investigated. The proposed method is a data preparation method that can be employed with any self-labeled technique. To demonstrate this, the proposed method is applied to four self-labeled algorithms with different features, and their performance is evaluated on 25 pattern datasets. The results show that DMSS achieves high classification accuracy compared to existing methods in the literature. In addition, non-parametric statistical tests show that the proposed method significantly outperforms the other existing methods in the literature.
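
The iterative self-labeling process that the abstract refers to can be illustrated with a minimal sketch. This is a generic self-training loop, not the DMSS method itself: the nearest-centroid classifier, the margin-based confidence score, and the names `nearest_centroid_predict`, `self_label`, `per_round`, and `max_rounds` are all assumptions made for illustration. In a DMSS-style pipeline, the optimized synthetic examples would presumably be appended to `X_lab` and `y_lab` before a loop like this one runs.

```python
import numpy as np

def nearest_centroid_predict(X_lab, y_lab, X):
    """Predict a label and a confidence margin for each row of X.

    Hypothetical base learner; assumes at least two classes in y_lab.
    """
    classes = np.unique(y_lab)
    centroids = np.stack([X_lab[y_lab == c].mean(axis=0) for c in classes])
    # Distance from every point to every class centroid.
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    ordered = np.sort(dists, axis=1)
    # Margin: gap between the two closest centroids (larger = more confident).
    margin = ordered[:, 1] - ordered[:, 0]
    return classes[dists.argmin(axis=1)], margin

def self_label(X_lab, y_lab, X_unlab, per_round=2, max_rounds=10):
    """Toy self-training: move the most confident predictions into the labeled set."""
    X_lab, y_lab, X_unlab = X_lab.copy(), y_lab.copy(), X_unlab.copy()
    for _ in range(max_rounds):
        if len(X_unlab) == 0:
            break
        pred, margin = nearest_centroid_predict(X_lab, y_lab, X_unlab)
        # Indices of the per_round most confident unlabeled points.
        pick = np.argsort(-margin)[:per_round]
        X_lab = np.vstack([X_lab, X_unlab[pick]])
        y_lab = np.concatenate([y_lab, pred[pick]])
        X_unlab = np.delete(X_unlab, pick, axis=0)
    return X_lab, y_lab

# Usage on two well-separated synthetic clusters (illustrative data only).
rng = np.random.default_rng(0)
X_unlab = np.vstack([rng.normal(0, 0.1, (5, 2)), rng.normal(5, 0.1, (5, 2))])
X_lab = np.array([[0.0, 0.0], [5.0, 5.0]])
y_lab = np.array([0, 1])
X_out, y_out = self_label(X_lab, y_lab, X_unlab)
```

A wrong label accepted in an early round shifts the centroids and can propagate to later rounds, which is the mislabeling difficulty the abstract describes. Starting from a more diverse, better distributed labeled set is one way to reduce that risk.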
