
Using online linear classifiers to filter spam Emails


Abstract

The performance of two online linear classifiers, the Perceptron and Littlestone's Winnow, is explored on two anti-spam filtering benchmark corpora, PU1 and Ling-Spam. We study performance for varying numbers of features and for three feature selection methods: Information Gain (IG), Document Frequency (DF) and Odds Ratio. The size of the training set and the number of training iterations are also investigated for both classifiers. The experimental results show that both the Perceptron and Winnow perform much better with IG or DF than with Odds Ratio. With IG or DF, the classifiers are further shown to be insensitive to the number of features and the number of training iterations, and not greatly sensitive to the size of the training set. Winnow slightly outperforms the Perceptron, and both online classifiers perform much better than a standard Naïve Bayes method. Both classifiers have very low computational complexity, in theory and in implementation, and both are easy to update adaptively. They outperform most of the published results while being significantly easier to train and adapt. The analysis and the promising experimental results indicate that the Perceptron and Winnow are two very competitive classifiers for anti-spam filtering.
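To make the two update rules named in the abstract concrete, here is a minimal sketch of mistake-driven training for a Perceptron and for a basic Winnow variant with promotion and demotion steps. The abstract does not give the paper's exact variants or parameters, so everything here is an assumption for illustration: binary feature vectors, the function names, the learning rate, the Winnow factor `alpha = 2`, the threshold `n_features / 2`, and the toy data set `DATA`.

```python
from typing import List, Sequence, Tuple

# (binary feature vector, label), where label 1 = spam and 0 = legitimate.
Example = Tuple[Sequence[int], int]


def dot(w: Sequence[float], x: Sequence[int]) -> float:
    return sum(wi * xi for wi, xi in zip(w, x))


def predict_perceptron(w: Sequence[float], b: float, x: Sequence[int]) -> int:
    return 1 if dot(w, x) + b > 0 else 0


def train_perceptron(data: List[Example], n_features: int,
                     epochs: int = 5, lr: float = 1.0) -> Tuple[List[float], float]:
    """Additive updates: change the weights only when a mistake is made."""
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        for x, y in data:
            if predict_perceptron(w, b, x) != y:
                sign = 1.0 if y == 1 else -1.0
                for i, xi in enumerate(x):
                    w[i] += lr * sign * xi
                b += lr * sign
    return w, b


def predict_winnow(w: Sequence[float], theta: float, x: Sequence[int]) -> int:
    return 1 if dot(w, x) >= theta else 0


def train_winnow(data: List[Example], n_features: int,
                 epochs: int = 5, alpha: float = 2.0) -> Tuple[List[float], float]:
    """Multiplicative updates on the active features of a misclassified example."""
    w = [1.0] * n_features
    theta = n_features / 2.0  # assumed threshold, not taken from the paper
    for _ in range(epochs):
        for x, y in data:
            if predict_winnow(w, theta, x) != y:
                # Promote on a missed spam message, demote on a false alarm.
                factor = alpha if y == 1 else 1.0 / alpha
                for i, xi in enumerate(x):
                    if xi:
                        w[i] *= factor
    return w, theta


# Hypothetical toy data: a message is spam exactly when feature 0 is present.
DATA: List[Example] = [
    ([1, 0, 0, 0], 1), ([1, 1, 0, 0], 1), ([0, 1, 0, 0], 0),
    ([0, 0, 1, 1], 0), ([1, 0, 1, 0], 1), ([0, 1, 1, 1], 0),
]

if __name__ == "__main__":
    pw, pb = train_perceptron(DATA, n_features=4)
    ww, wt = train_winnow(DATA, n_features=4)
    print([predict_perceptron(pw, pb, x) for x, _ in DATA])
    print([predict_winnow(ww, wt, x) for x, _ in DATA])
```

Both learners touch their weights only when they misclassify a message, which is why they are cheap to train and easy to update as new mail arrives. The difference lies in the update: the Perceptron adds or subtracts, while Winnow multiplies or divides the weights of the features present in the message.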
