Journal: Computer Engineering and Applications (《计算机工程与应用》)

A Divide-and-Conquer Algorithm for Cost-Sensitive Attribute Selection under Dynamic Misclassification Costs

         

Abstract

Cost-sensitive attribute selection aims to find an attribute subset with minimal total cost by trading off test costs against misclassification costs. Most existing cost-sensitive attribute selection methods assume fixed misclassification costs and therefore handle problems such as imbalanced class distributions poorly. Low efficiency on large-scale datasets is another major problem. To address both issues, this paper designs a new dynamic misclassification cost mechanism with the goal of minimizing total cost. Following the divide-and-conquer idea, each dataset is adaptively split by columns according to its scale. The minimal-cost attribute selection problem is then redefined on the basis of dynamic misclassification costs, and a divide-and-conquer algorithm for cost-sensitive attribute selection under dynamic misclassification costs is proposed. The proposed algorithm is compared with two other algorithms on seven UCI datasets. The experiments show that it improves efficiency while also obtaining the optimal misclassification costs, which ensures that the total cost of the resulting attribute subset is minimal.
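The two ingredients named in the abstract, total cost as test cost plus misclassification cost and a column-wise split of the attributes, can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything here is an assumption: the function names `total_cost` and `split_columns`, the per-attribute test cost list, the averaging of misclassification cost over samples, and the fixed block size are all hypothetical. The paper's dynamic misclassification cost mechanism is not described in the abstract and is not reproduced; the cost matrix is simply passed in as a parameter.

```python
from typing import List, Sequence


def total_cost(
    subset: Sequence[int],
    test_costs: Sequence[float],
    y_true: Sequence[int],
    y_pred: Sequence[int],
    mis_cost: Sequence[Sequence[float]],
) -> float:
    """Assumed total cost: summed test cost of the selected attributes
    plus the misclassification cost averaged over all samples.

    mis_cost[t][p] is the cost of predicting class p when the true class is t.
    """
    test = sum(test_costs[a] for a in subset)
    mis = sum(mis_cost[t][p] for t, p in zip(y_true, y_pred)) / len(y_true)
    return test + mis


def split_columns(n_attributes: int, block_size: int) -> List[List[int]]:
    """Partition attribute indices 0..n_attributes-1 into consecutive blocks.

    A fixed block size is an assumption; the paper chooses the split
    adaptively from the dataset scale.
    """
    return [
        list(range(start, min(start + block_size, n_attributes)))
        for start in range(0, n_attributes, block_size)
    ]


# Hypothetical example values, not taken from the paper.
example_cost = total_cost(
    subset=[0, 2],
    test_costs=[2.0, 1.0, 3.0],
    y_true=[0, 0, 1, 1],
    y_pred=[0, 1, 1, 0],
    mis_cost=[[0.0, 10.0], [40.0, 0.0]],
)
example_blocks = split_columns(7, 3)
```

In a divide-and-conquer scheme of this kind, each block from `split_columns` would be searched for a low-cost subset on its own and the partial results merged afterwards; how the paper performs the merge is not stated in the abstract.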
