International Conference on Machine Learning, Optimization, and Data Science

Covering Arrays to Support the Process of Feature Selection in the Random Forest Classifier

Abstract

The Random Forest (RF) algorithm is an ensemble of base decision trees, each built from a bootstrap subset of the original dataset. Each subset combines a sample of instances (rows) with a random subset of the features (variables or columns) of the dataset to be classified. RF does not prune its base trees. To classify a new record, each tree casts a vote, and the class with the most votes is selected. The literature reports that random feature selection when constructing the bootstrap subsets lowers the quality of the results achieved with RF. To address this, this work proposes integrating covering arrays (CA) into RF, in an algorithm called RFCA. In RFCA, the number N of rows of the CA sets the minimum number of base trees that must be generated, and each row of the CA specifies the features that the corresponding bootstrap subset uses to build its tree. To evaluate the proposal, RFCA is compared with the RF implementation available in Weka on 32 datasets from the UCI repository. The experiments show that using a CA of strength 2 to 7 obtains promising results in terms of accuracy.
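
The idea of letting each CA row pick the features of one base tree can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the binary encoding (1 = feature used, 0 = feature left out), the small strength-2 array `CA`, and the helper names `is_covering_array`, `feature_subsets`, `bootstrap_indices`, `plan_forest`, and `majority_vote` are all assumptions made here, and the tree learner itself is left out.

```python
import random
from collections import Counter
from itertools import combinations, product

# Hypothetical binary covering array with N=5 rows, k=4 columns, strength t=2.
# Assumed encoding: 1 = the tree uses that feature, 0 = it does not.
CA = [
    (1, 1, 1, 1),
    (1, 0, 0, 0),
    (0, 1, 0, 0),
    (0, 0, 1, 0),
    (0, 0, 0, 1),
]

def is_covering_array(rows, strength, levels=(0, 1)):
    """True if every choice of `strength` columns shows every combination of levels."""
    k = len(rows[0])
    needed = set(product(levels, repeat=strength))
    for cols in combinations(range(k), strength):
        seen = {tuple(row[c] for c in cols) for row in rows}
        if seen != needed:
            return False
    return True

def feature_subsets(rows):
    """One list of feature indices per CA row, i.e. per base tree."""
    return [[j for j, bit in enumerate(row) if bit] for row in rows]

def bootstrap_indices(n_instances, rng):
    """Sample instance indices with replacement (a bootstrap sample)."""
    return [rng.randrange(n_instances) for _ in range(n_instances)]

def plan_forest(rows, n_instances, seed=0):
    """Pair each CA row's feature subset with its own bootstrap sample of instances."""
    rng = random.Random(seed)
    return [(bootstrap_indices(n_instances, rng), feats)
            for feats in feature_subsets(rows)]

def majority_vote(votes):
    """Return the class that received the most votes from the base trees."""
    return Counter(votes).most_common(1)[0][0]
```

Under these assumptions, `plan_forest(CA, n_instances)` yields N = 5 (instances, features) pairs, one per base tree, so N is the smallest forest that uses every CA row once. Any decision-tree learner could then be trained on each pair, with `majority_vote` combining the predictions.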
