International Workshop on Big Data and Information Security

Optimization of Phishing Website Classification Based on Synthetic Minority Oversampling Technique and Feature Selection


Abstract

This paper presents a new approach to optimizing phishing website classification that combines the Synthetic Minority Oversampling Technique (SMOTE) with feature selection. Classification is a supervised machine learning technique that learns from features to identify the class of an instance. However, not all features are relevant to identifying phishing websites, and class imbalance leads to suboptimal performance. We therefore propose SMOTE to handle class imbalance by generating new synthetic instances of the minority class. Filter-based feature selection using Information Gain and Correlation is proposed to remove irrelevant features. Classification performance is evaluated using a K-Nearest Neighbor (KNN) classifier. The results demonstrate that SMOTE effectively improves classification performance in terms of accuracy, precision, recall, and F-measure while also being more time-efficient. The performance of SMOTE combined with feature selection is validated and benchmarked against different techniques on both the full and the reduced feature sets. The results show that the proposed technique achieves the highest accuracy, i.e., 97.47% on the full feature set and 94.87% on the reduced feature set. Hence, the proposed technique is promising for optimizing phishing website classification.
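
The two preprocessing ideas named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `smote` and `information_gain`, the neighbor count `k`, the seed, and the toy data are all assumptions made for illustration. The SMOTE part interpolates between a minority sample and one of its nearest minority neighbors (which assumes numeric features), and the Information Gain part scores a discrete feature by how much it reduces class entropy.

```python
import math
import random
from collections import Counter


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by linear interpolation.

    `minority` is a list of numeric feature vectors (at least two of them).
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Up to k nearest minority neighbours of the base point, excluding itself.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position on the segment from base to neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in class entropy obtained by splitting on a discrete feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature_values):
        subset = [y for x, y in zip(feature_values, labels) if x == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the paper.
    minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    print(smote(minority, n_new=4, k=2))
    labels = ["phish", "phish", "legit", "legit"]
    print(information_gain([1, 1, -1, -1], labels))  # feature separates the classes
    print(information_gain([1, -1, 1, -1], labels))  # feature carries no class signal
```

In a pipeline like the one the abstract describes, features with low scores would be dropped and the oversampled training set would then be passed to a classifier such as KNN. Selection thresholds and the order of the two steps are design choices this sketch leaves open.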
