An Anti-Spam Filter Based on One-Class IB Method in Small Training Sets

Yang Chen; Zhao Shaofeng; Zhang Dan; Ma Junxia


Abstract

We present an approach to email filtering based on the one-class Information Bottleneck (IB) method for small training sets. When the themes of emails change continually, the available training set that is highly relevant to the current theme will be small. Hence, we further show how to estimate the learning algorithm and how to filter spam with small training sets. First, in order to preserve classification accuracy and avoid over-fitting while substantially reducing training set size, we treat the learning framework as the solution of a one-class centroid averaged only over highly positive emails. Second, we design a simple binary classification model that filters spam by comparing the similarity between emails and centroids. Experimental results show that, on small training sets, our method can significantly improve classification accuracy compared with currently popular methods such as Naive Bayes, AdaBoost, and SVM.
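The centroid-and-similarity idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the representation, the similarity measure, or how "highly positive" emails are selected. The sketch assumes word-frequency distributions, one averaged centroid per class, and Jensen-Shannon divergence as the (dis)similarity measure; all function names and the toy emails are hypothetical.

```python
import math
from collections import Counter


def term_distribution(text):
    """Normalized word-frequency distribution of one email (assumed representation)."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def centroid(distributions):
    """Average the term distributions of a set of emails into one centroid."""
    acc = Counter()
    for dist in distributions:
        for word, prob in dist.items():
            acc[word] += prob
    n = len(distributions)
    return {word: prob / n for word, prob in acc.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2), in [0, 1]; lower means more similar."""
    div = 0.0
    for word in set(p) | set(q):
        pw, qw = p.get(word, 0.0), q.get(word, 0.0)
        m = 0.5 * (pw + qw)
        if pw > 0:
            div += 0.5 * pw * math.log2(pw / m)
        if qw > 0:
            div += 0.5 * qw * math.log2(qw / m)
    return div


def is_spam(text, spam_centroid, ham_centroid):
    """Label an email as spam if it is closer to the spam centroid (assumed rule)."""
    dist = term_distribution(text)
    return js_divergence(dist, spam_centroid) < js_divergence(dist, ham_centroid)


# Toy training data, invented for illustration only.
spam_emails = ["win free money now", "free prize claim now"]
ham_emails = ["meeting agenda for monday", "project report attached for review"]

spam_centroid = centroid([term_distribution(t) for t in spam_emails])
ham_centroid = centroid([term_distribution(t) for t in ham_emails])

print(is_spam("claim your free money", spam_centroid, ham_centroid))
print(is_spam("monday project meeting", spam_centroid, ham_centroid))
```

Because each centroid is just an average over a handful of emails, the model has very few degrees of freedom, which is one plausible reading of why a centroid-based approach would resist over-fitting on small training sets.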


Bibliographic details

Source
The International Arab Journal of Information Technology | 2016, Issue 6 | pp. 677-685 | 9 pages
Authors
Yang Chen; Zhao Shaofeng; Zhang Dan; Ma Junxia
Author affiliations

Renmin Univ China, Sch Informat, Beijing, Peoples R China|Zhengzhou Univ Light Ind, Sch Software Engn, Zhengzhou, Peoples R China;

Henan Univ Econ & Law, Coll Comp & Informat Engn, Zhengzhou, Peoples R China;

China Earthquake Adm, Geophys Explorat Ctr, Beijing, Peoples R China;

Zhengzhou Univ Light Ind, Sch Software Engn, Zhengzhou, Peoples R China;

Original format: PDF
Language: English
Keywords
IB method; one-class IB; anti-spam filter; small training sets


Similar literature

1. Optimization-based methodology for training set selection to synthesize composite correlation filters for face recognition [J]. Santiago-Ramirez Everardo, Gonzalez-Fraga Jose A., Gutierrez Everardo. Signal Processing. Image Communication: A Publication of the European Association for Signal Processing. 2016.
2. Optimization of Linear Filtering Model to Predict Post-LASIK Corneal Smoothing Based on Training Data Sets [J]. Anatoly Fabrikant, Guang-Ming Dai, Dimitri Chernyak. Applied Mathematics. 2013, Issue 12.
3. Comparing set reconciliation methods based on bloom filters and their variants [J]. Z. Hu, X. Teng, D. Guo. Tsinghua Science and Technology. 2016, Issue 2.
4. Research in Anti-Spam Method Based on Bayesian Filtering [C]. Jiansheng Wu, Tao Deng. Pacific-Asia Workshop on Computational Intelligence and Industrial Application. 2008.
5. Statistical methods for gene set annotation optimization, unsupervised gene set testing and independent gene set filtering [D]. Frost, Hildreth Robert. 2015.
6. Vascular Tree Segmentation in Medical Images Using Hessian-Based Multiscale Filtering and Level Set Method [O]. Jiaoying Jin, Linjun Yang, Xuming Zhang. 2013.
7. Patch-recovery filters for curvature in discontinuous Galerkin-based level-set methods [O]. Kummer, Florian; Warburton, Tim. 2015.
