2011 2nd International Conference on Artificial Intelligence, Management Science and Electronic Commerce

Confident wrapper-type semi-supervised feature selection using an ensemble classifier


Abstract

Feature selection is an important data preprocessing step in pattern recognition. Recently, a wrapper-type semi-supervised feature selection method, known as FW-SemiFS, was proposed to overcome the small labeled sample problem of supervised feature selection. FW-SemiFS does not consider the confidence of its predictions on unlabeled data; instead, it evaluates the relevance of features according to their frequency. These frequencies are obtained via iterative supervised sequential forward feature selection (SFFS). However, iterative SFFS requires a large amount of computational time, which is a drawback of FW-SemiFS. Furthermore, this relevance evaluation method eliminates the primary advantage of wrapper-type feature selection: the ability to evaluate the discriminative power of a combination of features. In this paper, we propose a new wrapper-type semi-supervised feature selection framework that can select a more relevant feature subset using confident unlabeled data. The proposed framework, called ensemble-based semi-supervised feature selection (EN-SemiFS), employs an ensemble classifier that supports the estimation of the confidence of unlabeled data. We analyze the relationship between wrapper-type feature selection and the confidence of unlabeled data, and explore how this relationship can make the semi-supervised feature selection framework faster and more accurate. The experimental results show that the proposed method selects a more relevant feature subset than existing methods.
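
The abstract does not spell out the EN-SemiFS procedure, so the following is only a minimal sketch of the general idea it describes: an ensemble estimates how confident each prediction on unlabeled data is, and only confident samples are used when a wrapper scores a feature subset. Everything concrete here is an assumption made for illustration and is not taken from the paper: the nearest-centroid base learner, the bagging scheme, vote agreement as the confidence measure, the 0.8 threshold, scoring by accuracy on the labeled set, the greedy forward search, and all function names.

```python
import random
from collections import Counter

def centroid_fit(X, y, feats):
    """Nearest-centroid base learner restricted to the feature indices in feats."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(feats))
        for j, f in enumerate(feats):
            acc[j] += row[f]
    return {c: [s / counts[c] for s in acc] for c, acc in sums.items()}

def centroid_predict(model, row, feats):
    """Return the class whose centroid is closest (ties go to the smaller label)."""
    def dist(center):
        return sum((row[f] - m) ** 2 for f, m in zip(feats, center))
    return min(sorted(model), key=lambda c: dist(model[c]))

def ensemble_fit(X, y, feats, n_members=15, seed=0):
    """Bagging (assumed): each member is trained on a bootstrap resample."""
    rng = random.Random(seed)
    members = []
    for _ in range(n_members):
        idx = [rng.randrange(len(X)) for _ in X]
        members.append(centroid_fit([X[i] for i in idx], [y[i] for i in idx], feats))
    return members

def predict_with_confidence(members, row, feats):
    """Majority label plus the fraction of members that voted for it."""
    votes = Counter(centroid_predict(m, row, feats) for m in members)
    label, n = votes.most_common(1)[0]
    return label, n / len(members)

def subset_score(X_lab, y_lab, X_unl, feats, threshold=0.8):
    """Wrapper score of a feature subset, using only confident unlabeled samples."""
    members = ensemble_fit(X_lab, y_lab, feats)
    X_aug, y_aug = list(X_lab), list(y_lab)
    for row in X_unl:
        label, conf = predict_with_confidence(members, row, feats)
        if conf >= threshold:  # discard low-confidence pseudo-labels
            X_aug.append(row)
            y_aug.append(label)
    model = centroid_fit(X_aug, y_aug, feats)
    # Simplification: accuracy is measured on the labeled set itself.
    hits = sum(centroid_predict(model, r, feats) == t for r, t in zip(X_lab, y_lab))
    return hits / len(y_lab)

def forward_select(X_lab, y_lab, X_unl, n_select):
    """Greedy forward search: add the feature that most improves the subset score."""
    selected, remaining = [], list(range(len(X_lab[0])))
    while remaining and len(selected) < n_select:
        best = max(remaining,
                   key=lambda f: subset_score(X_lab, y_lab, X_unl, selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy data: feature 0 separates the classes, feature 1 carries no class information.
X_lab = [[0.0, 0.0], [0.1, 1.0], [0.2, 0.0], [0.3, 1.0],
         [1.0, 0.0], [1.1, 1.0], [1.2, 0.0], [1.3, 1.0]]
y_lab = [0, 0, 0, 0, 1, 1, 1, 1]
X_unl = [[0.05, 1.0], [0.15, 0.0], [1.05, 1.0], [1.25, 0.0], [0.65, 0.0]]

selected = forward_select(X_lab, y_lab, X_unl, n_select=1)
```

Because each candidate is scored as a whole subset (`selected + [f]`), the search keeps the wrapper property the abstract emphasizes: features are judged in combination rather than one at a time.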
