GA-based feature subset selection in a spam/non-spam detection system

Abstract

Spam has created a significant security problem for computer users everywhere. Spammers use deceptive techniques to disguise the parts of a message that could be used to identify it as spam. Moreover, sending junk mail costs a spammer little in money or bandwidth, even for more than one hundred emails. From the feature selection perspective, one specific problem that decreases the accuracy of spam and non-spam email classification is high data dimensionality. Reducing dimensionality therefore amounts to reducing the number of irrelevant features. In this paper, a genetic algorithm (GA) is applied during feature selection in an effort to decrease the number of useless features in a high-dimensional collection of email bodies and subjects. Next, a Multi-Layer Perceptron (MLP) is employed to classify emails using the features selected by the GA. Using the LingSpam benchmark corpus as the dataset, the experimental results show that a GA feature selector combined with the MLP classifier not only decreases data dimensionality but also increases the spam detection rate compared with other classifiers such as SVM and Naïve Bayes.
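The pipeline described in the abstract (a GA searching over feature subsets, with a classifier scoring each subset) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the chromosome encoding, the GA operators, the fitness function, or any parameters, so the binary mask encoding, tournament selection, one-point crossover, bit-flip mutation, subset-size penalty, and all default values below are assumptions. To stay dependency-free, a nearest-centroid classifier stands in for the MLP, and synthetic data stands in for LingSpam.

```python
import random

def centroid_accuracy(X_train, y_train, X_test, y_test, mask):
    """Accuracy of a nearest-centroid classifier using only features where mask is 1.
    Assumption: this simple classifier is a stand-in for the paper's MLP."""
    idx = [i for i, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0  # an empty feature subset cannot classify anything
    centroids = {}
    for label in set(y_train):
        rows = [x for x, y in zip(X_train, y_train) if y == label]
        centroids[label] = [sum(r[i] for r in rows) / len(rows) for i in idx]
    correct = 0
    for x, y in zip(X_test, y_test):
        pred = min(centroids, key=lambda c: sum(
            (x[i] - m) ** 2 for i, m in zip(idx, centroids[c])))
        correct += pred == y
    return correct / len(y_test)

def ga_select(X_train, y_train, X_val, y_val, pop_size=20, generations=30,
              mutation_rate=0.05, penalty=0.01, seed=0):
    """Search for a binary feature mask with a simple GA (all settings hypothetical)."""
    rng = random.Random(seed)
    n = len(X_train[0])

    def fitness(mask):
        # Validation accuracy minus a small penalty for keeping many features.
        acc = centroid_accuracy(X_train, y_train, X_val, y_val, mask)
        return acc - penalty * sum(mask) / n

    def pick(pop):
        # Binary tournament selection.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        children = [best[:]]  # elitism: carry the best mask forward unchanged
        while len(children) < pop_size:
            p1, p2 = pick(pop), pick(pop)
            cut = rng.randrange(1, n)          # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
    return best

def make_toy_data(n_samples, n_features, rng):
    """Synthetic data: features 0 and 1 carry the class signal, the rest are noise."""
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0, 1) for _ in range(n_features)]
        row[0] += 3 * label
        row[1] -= 3 * label
        X.append(row)
        y.append(label)
    return X, y

rng = random.Random(42)
X, y = make_toy_data(100, 8, rng)
mask = ga_select(X[:60], y[:60], X[60:], y[60:])
print("selected features:", [i for i, bit in enumerate(mask) if bit])
```

In a setting closer to the paper, the rows of `X` would be term features extracted from email bodies and subjects, and the fitness function would train and evaluate an MLP on the masked features instead of the centroid stand-in.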

Bibliographic details

Source
2012 International Conference on Computer and Communication Engineering | 2012 | pp. 675-679 | 5 pages
Conference location: Kuala Lumpur (MY)
Authors
Behjat Amir Rajabi; Mustapha Aida; Nezamabadi-pour Hossein; Sulaiman Md. Nasir; Mustapha Norwati
Author affiliation

Faculty of Computer Science and Information Technology, Universiti Putra Malaysia, 43400 UPM Serdang, Selangor, Malaysia

Original format: PDF
Text language: English
Chinese Library Classification: Communications; Computing technology, computer technology

Similar literature

1. A feature selection approach to find optimal feature subsets for the network intrusion detection system [J]. Kang Seung-Ho, Kim Kuinam J. Cluster computing. 2016, No. 1
2. EEG signal processing for epilepsy seizure detection using 5-level Db4 discrete wavelet transform, GA-based feature selection and ANN/SVM classifiers [J]. Omidvar Mehdi, Zahedi Abdulhamid, Bakhshi Hamidreza. Journal of ambient intelligence and humanized computing. 2021, No. 11
3. A hybrid method for intrusion detection with GA-based feature selection [J]. Zhi-Xian Chen, Hao Huang. Intelligent automation and soft computing. 2011, No. 2
4. GA-based feature subset selection in a spam/non-spam detection system [C]. Behjat Amir Rajabi, Mustapha Aida, Nezamabadi-pour Hossein. International Conference on Computer and Communication Engineering. 2012
5. Feature Selection Via Random Subsets of Uncorrelated Features [D]. Long, Dang Kim. 2020
6. GA-Based Selection of Vaginal Microbiome Features Associated with Bacterial Vaginosis [O]. Joi Carter, Daniel Beck, Henry Williams
7. GA-based feature subset selection for myoelectric classification, in [O]. Mohammadreza Asghari, Oskoei Huosheng Hu. 2006
