Journal: Neurocomputing

Multi-view based unlabeled data selection using feature transformation methods for semiboost learning

Abstract

SemiBoost (Mallapragada et al., 2009) is a boosting framework for semi-supervised learning in which both unlabeled and labeled data contribute to learning. Various strategies have been proposed in the literature for selecting useful unlabeled data in SemiBoost. Recently, Le and Kim (2016) proposed a multi-view based strategy in which the feature set of the data is decomposed into subsets (i.e., multiple views) using a feature-decomposition method. This decomposition inevitably results in some loss of information. To avoid this drawback, this paper considers feature-transformation methods, rather than the decomposition method, to obtain the multiple views. More specifically, in the feature-transformation method, a number of views are obtained from the entire feature set using the same number of different mapping functions. Once the views have been derived, each view is used to measure a corresponding confidence for the examples that are candidates for selection. Then, the confidence levels measured from the multiple views are combined as a weighted average to derive a target confidence. The experimental results, obtained using support vector machines on well-known benchmark data, demonstrate that the proposed mechanism can compensate for the shortcomings of the traditional strategies. In addition, the results demonstrate that when the data is transformed appropriately into multiple views, the strategy can achieve further improvement in classification accuracy. (C) 2017 Elsevier B.V. All rights reserved.
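
The selection step described in the abstract (several views obtained from the full feature set, one confidence per view, a weighted average as the target confidence) might be sketched roughly as follows. The abstract names neither the mapping functions nor the confidence measure, so everything specific here is an assumption made for illustration: the three mappings (centring, PCA, Gaussian random projection), the RBF-similarity label-agreement score used as a stand-in confidence, the uniform default weights, and all function names. It is not the authors' implementation.

```python
import numpy as np

def make_views(X, n_components=2, seed=0):
    """Build hypothetical mapping functions, each acting on the full feature set."""
    rng = np.random.default_rng(seed)
    mean = X.mean(axis=0)
    # Principal directions from the SVD of the centred data.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    pca = vt[:n_components].T
    # Gaussian random projection matrix (assumed mapping, not from the paper).
    rp = rng.normal(size=(X.shape[1], n_components)) / np.sqrt(n_components)
    return [
        lambda Z: Z - mean,          # view 1: centred original features
        lambda Z: (Z - mean) @ pca,  # view 2: PCA projection
        lambda Z: (Z - mean) @ rp,   # view 3: random projection
    ]

def view_confidence(Xl, yl, Xu, sigma=1.0):
    """Stand-in confidence in [0, 1]: similarity-weighted agreement of the
    labels (in {-1, +1}) of the labeled points around each unlabeled point."""
    d2 = ((Xu[:, None, :] - Xl[None, :, :]) ** 2).sum(axis=2)
    S = np.exp(-d2 / (2.0 * sigma ** 2))
    return np.abs(S @ yl) / (S.sum(axis=1) + 1e-12)

def combined_confidence(Xl, yl, Xu, views, weights=None):
    """Weighted average of the per-view confidences (uniform by default)."""
    conf = np.stack([view_confidence(f(Xl), yl, f(Xu)) for f in views])
    if weights is None:
        weights = np.ones(len(views))
    weights = np.asarray(weights, dtype=float)
    return (weights / weights.sum()) @ conf

def select_unlabeled(Xl, yl, Xu, k, views, weights=None):
    """Return the indices of the k most confident unlabeled examples."""
    conf = combined_confidence(Xl, yl, Xu, views, weights)
    return np.argsort(-conf)[:k], conf

# Toy data: two labeled clusters, two unlabeled points near them, one in between.
Xl = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [5.0, 5.0, 5.0], [5.1, 5.0, 5.0]])
yl = np.array([-1.0, -1.0, 1.0, 1.0])
Xu = np.array([[0.0, 0.1, 0.0], [5.0, 5.0, 5.1], [2.5, 2.5, 2.5]])

views = make_views(np.vstack([Xl, Xu]))
idx, conf = select_unlabeled(Xl, yl, Xu, k=2, views=views)
print("selected:", idx, "confidences:", conf)
```

In this toy setup the point lying between the two clusters receives the lowest combined confidence, so it is the one left out of the selection. The selected examples would then be passed to the base learner (support vector machines in the paper's experiments), a step this sketch omits.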