A study of spam filtering using support vector machines

Ola Amayri; Nizar Bouguila


Abstract

Electronic mail has revolutionized traditional communication systems because it is convenient, economical, fast, and easy to use. A major bottleneck in electronic communication is the enormous dissemination of unwanted, harmful emails known as spam. A major concern is therefore the development of suitable filters that can adequately capture those emails and achieve a high performance rate. Machine learning (ML) researchers have developed many approaches to tackle this problem. Within machine learning, support vector machines (SVMs) have made a large contribution to the development of spam email filtering. Based on SVMs, different schemes have been proposed through text classification (TC) approaches. A crucial problem when using SVMs is the choice of kernel, as it directly affects the separation of emails in the feature space. This paper presents a thorough investigation of several distance-based kernels and specifies spam filtering behaviors using SVMs. The majority of kernels used in recent studies concern continuous data and neglect the structure of the text. In contrast to classical kernels, we propose the use of various string kernels for spam filtering. We show how effectively string kernels suit the spam filtering problem. In addition, data preprocessing is a vital part of text classification, where the objective is to generate feature vectors usable by SVM kernels. We detail feature mapping variants in TC that yield improved performance for the standard SVM in the filtering task. Furthermore, to cope with real-time scenarios, we propose an online active framework for spam filtering. We present empirical results from an extensive study of online, transductive, and online active methods for classifying spam emails in real time. We show that the active online method using string kernels achieves higher precision and recall rates.
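To illustrate what a string kernel computes, here is a minimal sketch of a k-spectrum kernel, which compares two messages by the contiguous substrings of length k they share. The abstract does not say which string kernels the paper evaluates, so the choice of the spectrum kernel, the function names, and the default k=3 are all assumptions made for illustration only.

```python
from collections import Counter
import math


def spectrum_features(text, k=3):
    """Count every contiguous substring of length k in text (its k-spectrum).

    Illustrative helper; not taken from the paper.
    """
    return Counter(text[i:i + k] for i in range(len(text) - k + 1))


def spectrum_kernel(a, b, k=3):
    """Inner product of the k-spectrum count vectors of strings a and b."""
    fa, fb = spectrum_features(a, k), spectrum_features(b, k)
    # Iterate over the smaller spectrum; a Counter returns 0 for missing keys.
    if len(fa) > len(fb):
        fa, fb = fb, fa
    return sum(count * fb[gram] for gram, count in fa.items())


def normalized_spectrum_kernel(a, b, k=3):
    """Cosine-normalized kernel value, so message length matters less."""
    denom = math.sqrt(spectrum_kernel(a, a, k) * spectrum_kernel(b, b, k))
    return spectrum_kernel(a, b, k) / denom if denom else 0.0


def gram_matrix(messages, k=3):
    """Pairwise normalized kernel values for a list of raw message strings."""
    return [[normalized_spectrum_kernel(x, y, k) for y in messages]
            for x in messages]
```

A Gram matrix built this way from raw message text could be handed to any SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`, without first mapping the messages to fixed-length feature vectors.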


Bibliographic details

Source: Artificial Intelligence Review: An International Science and Engineering Journal | 2010, Issue 1 | 36 pages
Authors: Ola Amayri; Nizar Bouguila
Original format: PDF
Language: English
Chinese Library Classification: Artificial intelligence theory
Keywords: Spam filtering; Support vector machines; String kernels; Feature mapping; Online active


