Journal: Knowledge and Information Systems

A modified content-based evolutionary approach to identify unsolicited emails


Abstract

This computational study classifies emails as unsolicited or legitimate. A modified version of an existing genetic programming (GP) classifier, termed modified genetic programming (MGP), is implemented to build an ensemble of classifiers that identifies unsolicited emails. The proposed classifier is assessed on informative features extracted from two corpora (Enron and SpamAssassin) using the greedy stepwise feature search method. A comparative study is then performed against other popular classifiers, such as Bayesian network, naive Bayes, decision tree, random forest (RF), support vector machine (SVM), and GP. The results are validated with 20-fold cross-validation and a paired t-test. They show that the proposed classifier performs better in accuracy and false-positive detection than the other machine learning classifiers tested in this study. Using different training and testing sets of email files from the Enron corpus, ensemble-based classifiers, such as boosted SVM, boosted Bayesian, boosted naive Bayesian, RF, and the proposed MGP classifier, are tested and compared on all metrics, including training and testing time. The findings suggest that the MGP classifier with the greedy stepwise feature search method offers an improvement over alternative methods in detecting unsolicited emails.
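
The abstract names two generic techniques without giving their settings: a greedy stepwise feature search and a paired t-test over cross-validation folds. The following is a minimal sketch of both, assuming a forward (add-one-feature-at-a-time) variant of the search and a hypothetical scoring function. The names `greedy_forward_selection`, `paired_t_statistic`, and `toy_score`, along with every number below, are illustrative assumptions and are not taken from the paper.

```python
import math
import statistics
from typing import Callable, List, Sequence, Tuple


def greedy_forward_selection(
    n_features: int,
    score: Callable[[Sequence[int]], float],
) -> List[int]:
    """Add one feature index at a time, keeping the best-scoring addition.

    Stops as soon as no remaining feature strictly improves the score.
    """
    selected: List[int] = []
    best = score(selected)
    while len(selected) < n_features:
        candidates = [f for f in range(n_features) if f not in selected]
        top_score, top_feature = max((score(selected + [f]), f) for f in candidates)
        if top_score <= best:
            break  # no strict improvement: stop the search
        selected.append(top_feature)
        best = top_score
    return selected


def paired_t_statistic(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Return (t statistic, degrees of freedom) for paired samples a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally sized samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    t = statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical scoring function: in practice this would be a cross-validated
# merit of the feature subset. Here each feature just has a made-up weight.
TOY_WEIGHTS = [3, 0, 2, -1]


def toy_score(subset: Sequence[int]) -> float:
    return sum(TOY_WEIGHTS[f] for f in subset)


print(greedy_forward_selection(len(TOY_WEIGHTS), toy_score))  # prints [0, 2]

# Made-up per-fold accuracies for two classifiers evaluated on the same folds.
acc_a = [0.96, 0.95, 0.97, 0.94, 0.96]
acc_b = [0.93, 0.94, 0.95, 0.93, 0.92]
t_stat, dof = paired_t_statistic(acc_a, acc_b)
print(f"t = {t_stat:.3f}, degrees of freedom = {dof}")
```

With 20-fold cross-validation, as in the abstract, each list would hold 20 per-fold values and the test would have 19 degrees of freedom. The sketch stops at the t statistic; turning it into a p-value needs the Student t distribution, which the Python standard library does not provide.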
