An Efficient Cost-Sensitive Feature Selection Using Chaos Genetic Algorithm for Class Imbalance Problem

Abstract

In the era of big data, feature selection is an essential process in machine learning. Although the class imbalance problem has recently attracted a great deal of attention, little effort has been devoted to developing feature selection techniques for it. In addition, most applications involving feature selection focus on classification accuracy and ignore cost, although costs are important. To cope with imbalance problems, we developed a cost-sensitive feature selection algorithm, referred to as CSFSG, that adds a cost-based evaluation function to a filter feature selection method using a chaos genetic algorithm. The evaluation function considers both feature-acquiring costs (test costs) and misclassification costs in the field of network security, thereby weakening the influence of the many instances from the majority classes in large-scale datasets. The CSFSG algorithm reduces the total cost of feature selection and trades off the two factors. The behavior of the CSFSG algorithm is tested on a large-scale network security dataset using two kinds of classifiers: C4.5 and k-nearest neighbor (KNN). The experimental results show that the approach is efficient, improves classification accuracy, and decreases classification time. In addition, the results of our method are more promising than those of other cost-sensitive feature selection algorithms.
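The abstract gives no formulas, so the following is only a minimal sketch of the general idea, not the CSFSG algorithm itself. It assumes a fitness equal to misclassification cost plus weighted test cost, a logistic map as the chaotic sequence, and a toy misclassification cost in place of a real filter evaluation. All names (`logistic_stream`, `total_cost`, `chaos_ga`, `toy_misclass_cost`) and all numeric values are hypothetical.

```python
import random


def logistic_stream(x0=0.37, r=4.0):
    """Yield a chaotic sequence in (0, 1) from the logistic map x <- r*x*(1-x)."""
    x = x0
    while True:
        x = r * x * (1.0 - x)
        if not 0.0 < x < 1.0:  # guard against floating-point collapse to 0 or 1
            x = x0
        yield x


def total_cost(mask, test_costs, misclass_cost, weight=1.0):
    """Assumed fitness: misclassification cost plus weighted feature-acquiring cost."""
    acquired = sum(c for c, bit in zip(test_costs, mask) if bit)
    return misclass_cost(mask) + weight * acquired


def chaos_ga(test_costs, misclass_cost, pop_size=20, generations=40,
             mutation_threshold=0.1, seed=0):
    """Search for a low-cost feature mask; returns (best_mask, best_cost_history)."""
    rng = random.Random(seed)
    chaos = logistic_stream()
    n = len(test_costs)

    def cost(mask):
        return total_cost(mask, test_costs, misclass_cost)

    # Chaotic initialisation: threshold the logistic sequence into bit masks.
    pop = [[int(next(chaos) > 0.5) for _ in range(n)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=cost)
        history.append(cost(pop[0]))
        children = [pop[0][:]]  # elitism: carry the cheapest mask forward unchanged
        while len(children) < pop_size:
            # Tournament selection of two parents, then single-point crossover.
            p1 = min(rng.sample(pop, 3), key=cost)
            p2 = min(rng.sample(pop, 3), key=cost)
            cut = rng.randrange(1, n)
            child = p1[:cut] + p2[cut:]
            # Chaotic mutation: flip a bit whenever the chaotic value is below the
            # threshold (the flip frequency is not equal to the threshold, because
            # logistic-map values are not uniformly distributed).
            child = [b ^ 1 if next(chaos) < mutation_threshold else b for b in child]
            children.append(child)
        pop = children
    best = min(pop, key=cost)
    history.append(cost(best))
    return best, history


# Toy problem (made-up numbers): features 0, 2 and 4 are informative, and each
# missing informative feature adds a misclassification cost of 10.
TEST_COSTS = [1, 5, 2, 8, 1, 3, 9, 2]
INFORMATIVE = {0, 2, 4}


def toy_misclass_cost(mask):
    return 10 * sum(1 for i in INFORMATIVE if not mask[i])


if __name__ == "__main__":
    best_mask, cost_history = chaos_ga(TEST_COSTS, toy_misclass_cost)
    print(best_mask, cost_history[-1])
```

In this toy setup the cheapest possible mask selects exactly the three informative features, for a total cost of 4. Because of elitism, the best cost recorded in `cost_history` never increases from one generation to the next, although the sketch does not guarantee that the optimum is reached.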
