Journal: Knowledge-Based Systems

A comparative study for content-based dynamic spam classification using four machine learning algorithms


Abstract

The growth in the number of email users has led to a dramatic increase in spam email over the past few years. In this paper, four machine learning algorithms, namely Naive Bayes (NB), neural network (NN), support vector machine (SVM) and relevance vector machine (RVM), are proposed for spam classification. An empirical evaluation of the four algorithms on benchmark spam filtering corpora is presented. The experiments are performed with different training set sizes and different extracted feature sizes. Experimental results show that the NN classifier is unsuitable for use on its own as a spam rejection tool. In general, the SVM and RVM classifiers clearly outperform the NB classifier. Compared with SVM, RVM provides similar classification results with fewer relevance vectors and a much faster testing time. Despite its slower learning procedure, RVM is more suitable than SVM for spam classification in applications that require low complexity.
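To make the experimental setup concrete, the following is a minimal sketch of one of the four classifiers, a multinomial Naive Bayes model with Laplace smoothing, evaluated at several feature sizes. It is not the authors' implementation: the toy corpus, the whitespace tokenizer, the frequency-based feature selection, and the names `train_nb`, `predict` and `vocab_size` are all assumptions made for illustration.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumed tokenizer: lowercase and split on whitespace.
    return text.lower().split()

def train_nb(docs, labels, vocab_size=None):
    """Fit multinomial Naive Bayes with Laplace (add-one) smoothing.

    vocab_size keeps only the most frequent tokens as features
    (None keeps every token seen in training).
    """
    classes = sorted(set(labels))
    word_counts = {c: Counter() for c in classes}
    for text, label in zip(docs, labels):
        word_counts[label].update(tokenize(text))

    # Assumed feature selection: rank tokens by overall frequency.
    total = Counter()
    for c in classes:
        total.update(word_counts[c])
    vocab = {w for w, _ in total.most_common(vocab_size)}

    doc_counts = Counter(labels)
    model = {}
    for c in classes:
        in_vocab = sum(n for w, n in word_counts[c].items() if w in vocab)
        denom = in_vocab + len(vocab)
        log_like = {w: math.log((word_counts[c][w] + 1) / denom) for w in vocab}
        model[c] = (math.log(doc_counts[c] / len(labels)), log_like)
    return model

def predict(model, text):
    """Return the class with the highest log posterior; unknown tokens are ignored."""
    tokens = tokenize(text)

    def score(c):
        log_prior, log_like = model[c]
        return log_prior + sum(log_like[t] for t in tokens if t in log_like)

    return max(sorted(model), key=score)

# Hypothetical toy data, standing in for a benchmark spam corpus.
TRAIN_DOCS = [
    "win cash prize now", "cheap pills win money", "claim free prize money",
    "meeting agenda for monday", "project report attached", "lunch on monday",
]
TRAIN_LABELS = ["spam", "spam", "spam", "ham", "ham", "ham"]
TEST_DOCS = ["free cash prize", "lunch on monday"]
TEST_LABELS = ["spam", "ham"]

# Vary the extracted feature size, as the experiments in the abstract do.
for size in (4, 8, None):
    model = train_nb(TRAIN_DOCS, TRAIN_LABELS, vocab_size=size)
    correct = sum(predict(model, d) == y for d, y in zip(TEST_DOCS, TEST_LABELS))
    print(size, correct / len(TEST_DOCS))
```

The same loop could be repeated over subsets of the training data to study the effect of training set size. The SVM, RVM and NN classifiers would need a numerical library and are not sketched here.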
