An Exact Feature Selection Algorithm Based on Rough Set Theory

Abstract

Feature reduction based on rough set theory is an effective feature selection method in pattern recognition applications. Finding a minimal subset of the original features is inherent in the rough set approach to feature selection. Because feature reduction is an NP-hard problem, fast optimal or near-optimal feature selection algorithms are needed. This article proposes an exact rough set feature selection algorithm that is efficient in terms of computation time. The proposed algorithm begins by examining a solution tree with a breadth-first strategy. The pruned nodes are held in a version of the trie data structure. By the monotonic property of the dependency degree, no subset of a pruned node can be an optimal solution. Thus, once these subsets are detected in the trie, their dependency degree does not need to be calculated. The search on the tree continues until the optimal solution is found. The algorithm is improved by starting the search at a level determined by the hill-climbing method instead of at the level below the root. The length of the minimal reduct and the size of the data set can influence which starting search level is more efficient. Experimental results on some of the standard UCI data sets demonstrate that the proposed algorithm is effective and efficient for data sets with more than 30 features. (c) 2014 Wiley Periodicals, Inc. Complexity 20: 50-62, 2015
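
The pruning idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `dependency_degree` and `minimal_reduct` are hypothetical, a plain list of pruned subsets stands in for the trie, the search always starts at the level below the root (no hill-climbing start), and the dependency degree is assumed to be the usual rough set ratio of positive-region size to universe size.

```python
from itertools import combinations


def dependency_degree(rows, labels, features):
    """Fraction of objects whose equivalence class under `features`
    carries a single decision label (|POS_B(D)| / |U|)."""
    classes = {}
    for row, label in zip(rows, labels):
        key = tuple(row[f] for f in features)
        classes.setdefault(key, []).append(label)
    consistent = sum(len(v) for v in classes.values() if len(set(v)) == 1)
    return consistent / len(rows)


def minimal_reduct(rows, labels):
    """Level-by-level search from the full feature set downward.

    Returns (a minimal reduct as a sorted list, number of dependency
    evaluations). A list of pruned subsets replaces the trie here,
    which is a simplification made for brevity.
    """
    n = len(rows[0])
    target = dependency_degree(rows, labels, range(n))
    pruned = []                    # subsets that lost dependency degree
    best = frozenset(range(n))     # the root always preserves the target
    evaluations = 0
    for size in range(n - 1, -1, -1):
        found = None
        for combo in combinations(range(n), size):
            cand = frozenset(combo)
            # Monotonicity: a subset of a pruned node cannot preserve
            # the dependency degree, so skip it without evaluating.
            if any(cand <= p for p in pruned):
                continue
            evaluations += 1
            if dependency_degree(rows, labels, combo) == target:
                if found is None:
                    found = cand
            else:
                pruned.append(cand)
        if found is None:
            break                  # no smaller subset can succeed either
        best = found
    return sorted(best), evaluations


# Toy decision table: the label depends only on features 0 and 2;
# feature 1 is irrelevant and feature 3 is constant.
rows = [(a, b, c, 0) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
labels = [a ^ c for (a, b, c, _) in rows]
reduct, evaluations = minimal_reduct(rows, labels)
print(reduct, evaluations)
```

On this toy table the sketch returns `[0, 2]` as the minimal reduct. Subsets of the pruned nodes are skipped without computing their dependency degree, which is the saving the abstract attributes to the trie lookup. A real trie would make the subset test cheaper than the linear scan used here.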
