Published in: The International Arab Journal of Information Technology

An Anti-Spam Filter Based on One-Class IB Method in Small Training Sets



Abstract

We present an approach to email filtering based on the one-class Information Bottleneck (IB) method for small training sets. When the themes of emails change continually, the available training set that is highly relevant to the current theme will be small. Hence, we further show how to estimate the learning algorithm and how to filter spam with small training sets. First, in order to preserve classification accuracy and avoid over-fitting while substantially reducing the training set size, we treat the learning framework as the solution of a one-class centroid averaged only over highly positive emails. Second, we design a simple binary classification model that filters spam by comparing the similarity between emails and centroids. Experimental results show that, on small training sets, our method can significantly improve classification accuracy compared with currently popular methods such as Naive Bayes, AdaBoost, and SVM.
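The centroid-and-similarity idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the IB-based similarity measure or the rule for selecting "highly positive" emails, so this sketch assumes cosine similarity over normalized term frequencies, assumes one centroid per class, and uses hypothetical function names (`term_distribution`, `centroid`, `cosine`, `is_spam`).

```python
import math
from collections import Counter, defaultdict


def term_distribution(text):
    """Turn an email body into a normalized term-frequency distribution."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()}


def centroid(distributions):
    """Average several term distributions into a single class centroid."""
    summed = defaultdict(float)
    for dist in distributions:
        for term, weight in dist.items():
            summed[term] += weight
    return {term: weight / len(distributions) for term, weight in summed.items()}


def cosine(p, q):
    """Cosine similarity between two sparse vectors stored as dicts.

    Assumption: a stand-in for the paper's IB-based similarity,
    which the abstract does not specify.
    """
    dot = sum(weight * q.get(term, 0.0) for term, weight in p.items())
    norm_p = math.sqrt(sum(w * w for w in p.values()))
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0
    return dot / (norm_p * norm_q)


def is_spam(email, spam_centroid, ham_centroid):
    """Label an email as spam if it is closer to the spam centroid."""
    dist = term_distribution(email)
    return cosine(dist, spam_centroid) > cosine(dist, ham_centroid)
```

Because each centroid is just an average, it can be built from a handful of labeled emails, which is the small-training-set setting the abstract targets. How the training emails are chosen is left out here.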
