
An Efficient Cost-Sensitive Feature Selection Using Chaos Genetic Algorithm for Class Imbalance Problem


Abstract

In the era of big data, feature selection is an essential step in machine learning. Although the class imbalance problem has recently attracted a great deal of attention, little effort has gone into developing feature selection techniques for it. In addition, most applications of feature selection focus on classification accuracy and ignore cost, even though cost matters in practice. To cope with imbalance problems, we developed a cost-sensitive feature selection algorithm, referred to as CSFSG, that adds a cost-based evaluation function to filter feature selection driven by a chaos genetic algorithm. The evaluation function considers both feature-acquisition costs (test costs) and misclassification costs in the field of network security, thereby weakening the influence of the many majority-class instances in large-scale datasets. The CSFSG algorithm reduces the total cost of feature selection and trades off the two cost factors. Its behavior is tested on a large-scale network security dataset using two classifiers: C4.5 and k-nearest neighbor (KNN). The experimental results show that the approach is efficient, improving classification accuracy and decreasing classification time. In addition, the results of our method are more promising than those of other cost-sensitive feature selection algorithms.
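The abstract does not give the CSFSG evaluation function or its genetic operators, so the following is only a minimal sketch of the two ingredients it names: a chaotic sequence for generating candidate feature subsets, and a total cost that combines test costs with misclassification costs. Everything here is an assumption made for illustration: the logistic map as the chaos source, the 0.5 threshold, the function names `logistic_map`, `chaotic_mask`, and `total_cost`, and the additive form of the cost. It is a generic total-cost objective computed from a confusion matrix, not the paper's filter evaluation function.

```python
from typing import List, Sequence


def logistic_map(x0: float, n: int, r: float = 4.0) -> List[float]:
    """Return n iterates of the logistic map x <- r * x * (1 - x).

    Assumption: the logistic map is used as the chaotic sequence; the
    abstract does not say which chaotic map CSFSG uses.
    """
    xs = []
    x = x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        xs.append(x)
    return xs


def chaotic_mask(x0: float, n_features: int, threshold: float = 0.5) -> List[int]:
    """Turn a chaotic sequence into a 0/1 feature mask (1 = feature selected).

    The thresholding rule is hypothetical, chosen only for illustration.
    """
    return [1 if x > threshold else 0 for x in logistic_map(x0, n_features)]


def total_cost(
    mask: Sequence[int],
    test_costs: Sequence[float],
    confusion: Sequence[Sequence[int]],
    misclass_costs: Sequence[Sequence[float]],
) -> float:
    """Test cost of the selected features plus average misclassification cost.

    confusion[i][j] counts instances of true class i predicted as class j;
    misclass_costs[i][j] is the cost of that outcome (0 on the diagonal).
    The additive combination is an assumption, not the paper's formula.
    """
    # Sum the acquisition cost of every selected feature.
    test = sum(cost for bit, cost in zip(mask, test_costs) if bit)
    # Average the per-instance misclassification cost over all instances.
    n_instances = sum(sum(row) for row in confusion)
    mis = sum(
        confusion[i][j] * misclass_costs[i][j]
        for i in range(len(confusion))
        for j in range(len(confusion[i]))
    ) / n_instances
    return test + mis


if __name__ == "__main__":
    mask = chaotic_mask(0.3, 5)
    # Hypothetical costs: missing a minority-class instance (row 1) costs
    # ten times as much as a false alarm on the majority class (row 0).
    cost = total_cost(
        mask,
        test_costs=[2.0, 5.0, 1.0, 3.0, 4.0],
        confusion=[[90, 10], [5, 95]],
        misclass_costs=[[0.0, 1.0], [10.0, 0.0]],
    )
    print(mask, cost)
```

Weighting minority-class errors more heavily in `misclass_costs` is one way such an objective can reduce the influence of majority-class instances; a genetic algorithm would then search for the mask that minimizes this cost.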

Bibliographic details

  • Source
    Mathematical Problems in Engineering | 2016, Issue 6 | 8752181.1-8752181.9 | 9 pages
  • Author affiliations

    Taiyuan Univ Technol, Coll Comp Sci & Technol, Yingze St 79, Taiyuan 030024, Peoples R China|Shanxi Med Coll Continuing Educ, Ctr Informat & Network, Shuangtasi St 22, Taiyuan 030012, Peoples R China;

    Taiyuan Univ Technol, Coll Comp Sci & Technol, Yingze St 79, Taiyuan 030024, Peoples R China;

    Taiyuan Univ Technol, Coll Comp Sci & Technol, Yingze St 79, Taiyuan 030024, Peoples R China;

    Shanxi Branch Agr Bank China, Technol & Prod Management, Nanneihuan St 33, Taiyuan 030024, Peoples R China;

  • Original format: PDF
  • Language: English
