Application of Linear Classifier on Chinese Spam Filtering

Yongqin Qiu; Yan Xu; Dan Li

Abstract

Spam is a key problem in electronic communication, especially in large-scale email systems. Content-based filtering is one mainstream method of combating this threat in its various forms: an email filtering system can learn directly from a user's mail set. However, previous content-based filtering methods struggle to balance efficiency and effectiveness. Text categorization algorithms such as Naive Bayes, kNN, Decision Tree and Boosting can be applied to spam filtering, but the effectiveness of Naive Bayes is limited and it is not suited to instant feedback learning. Other algorithms, such as SVM, are more effective but computationally expensive. Because a real email system often has to handle a large volume of email in a short time, efficiency is often as important as effectiveness when implementing an anti-spam filter. We therefore look for a linear classifier to solve this problem and explore two fast online linear classifiers for the task: the Perceptron and Winnow. Training for both methods is online and mistake-driven, and both are suitable for feedback. We apply the two methods to three benchmark corpora (PU1, Ling-Spam and 2005-Jun), and the experiments on these public email corpora show effective results. We conclude that the two online linear classifiers achieve state-of-the-art performance for spam filtering, especially for Chinese spam email.
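
The abstract describes both classifiers as online and mistake-driven: each message is scored with a linear function of its features, and the weights change only when the prediction is wrong. The following Python sketch illustrates that idea under stated assumptions and is not the authors' implementation. It assumes binary token-presence features, an additive Perceptron update, and a simplified "positive" Winnow with a fixed threshold; the class names, the `train` helper and all parameter values are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# One training example: (set of tokens present in the message, is_spam).
Example = Tuple[Set[str], bool]


class Perceptron:
    """Additive, mistake-driven linear classifier over binary token features."""

    def __init__(self, learning_rate: float = 1.0) -> None:
        self.weights: Dict[str, float] = {}  # unseen tokens weigh 0.0
        self.bias = 0.0
        self.learning_rate = learning_rate  # assumed value, not from the paper

    def predict(self, tokens: Set[str]) -> bool:
        score = self.bias + sum(self.weights.get(t, 0.0) for t in tokens)
        return score > 0.0

    def update(self, tokens: Set[str], is_spam: bool) -> bool:
        """Learn from one labelled message; return True if it was a mistake."""
        if self.predict(tokens) == is_spam:
            return False  # correct prediction: weights stay untouched
        step = self.learning_rate if is_spam else -self.learning_rate
        for t in tokens:
            self.weights[t] = self.weights.get(t, 0.0) + step
        self.bias += step
        return True


class Winnow:
    """Multiplicative, mistake-driven linear classifier (positive weights only)."""

    def __init__(self, promotion: float = 2.0, threshold: float = 4.0) -> None:
        self.weights: Dict[str, float] = {}  # unseen tokens weigh 1.0
        self.promotion = promotion  # assumed value, not from the paper
        self.threshold = threshold  # assumed value, not from the paper

    def predict(self, tokens: Set[str]) -> bool:
        score = sum(self.weights.get(t, 1.0) for t in tokens)
        return score > self.threshold

    def update(self, tokens: Set[str], is_spam: bool) -> bool:
        """Learn from one labelled message; return True if it was a mistake."""
        if self.predict(tokens) == is_spam:
            return False
        # Missed spam: promote active weights. False alarm: demote them.
        factor = self.promotion if is_spam else 1.0 / self.promotion
        for t in tokens:
            self.weights[t] = self.weights.get(t, 1.0) * factor
        return True


def train(model, examples: List[Example], max_epochs: int = 20) -> int:
    """Pass over the examples until an epoch has no mistakes; return total mistakes."""
    total = 0
    for _ in range(max_epochs):
        mistakes = sum(model.update(tokens, label) for tokens, label in examples)
        total += mistakes
        if mistakes == 0:
            break
    return total
```

Because `update` touches only the tokens of the message at hand, the same call can be used both for initial training and for later user feedback on a single misclassified message. For Chinese email, the token sets would come from a word segmenter or character n-grams, which this sketch leaves out.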

Bibliographic information

Source
Journal of software | 2011, Issue 1 | pp. 116-123 | 8 pages
Authors
Yongqin Qiu; Yan Xu; Dan Li
Affiliations
Beijing Language and Culture University, Beijing, China (all three authors)
Original format: PDF
Language: English
Keywords
anti-spam; information filtering; Winnow; Perceptron; linear classifier
