Improving Statistical Bayesian Spam Filtering Algorithms

Abstract

The aim of this thesis is to improve the accuracy of Bayesian spam filtering, the most popular and widely used approach to spam filtering. Among the various possible approaches to this aim, two that improved filtering performance are presented in this thesis. Three popular evolutions of Bayesian spam filtering algorithms are reviewed: Naive Bayes, Paul Graham's, and Gary Robinson's. The proposed algorithms build on those evolutions and incorporate novel ideas.

The first approach is co-weighting of multiple probability estimations. Although all are based on Bayes' theorem, several ways of computing probability estimations have been proposed and used. Those estimations are examined, and a new, more effective combined estimation based on co-weighted multi-estimations is proposed. The approach is compared with the individual estimations.

The second approach is based on co-weighted multi-area information. Bayesian spam filters, in general, compute probability estimations for tokens either without considering the email areas in which they occur, except the body, or by treating the same token occurring in different areas as different tokens. In reality, however, occurrences of the same token in different areas are interrelated, and this relation could also play a role in classification. This novel idea is incorporated by correlating multi-area information through co-weighting, yielding more effective integrated probability estimations for tokens. It is shown that this approach also improves spam filtering performance. The new approach is compared with individual area-wise estimations and with traditional separate estimations in all areas.

The filters are tested through thorough experiments with three well-known public corpora, Ling Spam, Spam Assassin, and Annexia/Xpert, and are evaluated using several performance measures. Both proposed approaches are shown to exhibit significant improvement, stability, robustness, and consistency in spam filtering.
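
The abstract does not give the thesis's formulas, so the following is only a rough sketch of what co-weighting could look like. It assumes the combination is a normalized weighted average; the function names, the weights, and the neutral value for unseen tokens are hypothetical. The two per-token estimates follow the commonly cited forms of Graham's estimate (ham counts doubled, result clamped) and Robinson's smoothed estimate f(w) = (s*x + n*p(w)) / (s + n).

```python
def graham_estimate(spam_count, ham_count, n_spam, n_ham):
    """Graham-style token spam probability: ham counts doubled, clamped to [0.01, 0.99]."""
    b = min(1.0, spam_count / n_spam)
    g = min(1.0, 2 * ham_count / n_ham)
    if b + g == 0:
        return 0.5  # assumption: neutral value for a token never seen in training
    return max(0.01, min(0.99, b / (b + g)))


def robinson_estimate(spam_count, ham_count, n_spam, n_ham, s=1.0, x=0.5):
    """Robinson-style smoothed estimate with strength s and prior x."""
    b = spam_count / n_spam
    g = ham_count / n_ham
    p = b / (b + g) if (b + g) > 0 else x
    n = spam_count + ham_count  # number of training emails containing the token
    return (s * x + n * p) / (s + n)


def co_weighted(estimates, weights):
    """Hypothetical co-weighting: normalized weighted average of several estimates."""
    return sum(w * e for w, e in zip(weights, estimates)) / sum(weights)


# A token seen in 30 of 100 spam emails and 5 of 100 legitimate emails.
counts = dict(spam_count=30, ham_count=5, n_spam=100, n_ham=100)
multi_estimate = co_weighted(
    [graham_estimate(**counts), robinson_estimate(**counts)],
    weights=[1.0, 1.0],  # illustrative equal weights, not values from the thesis
)

# The same helper can combine per-area estimates of one token
# (for example subject, body, headers); the numbers here are made up.
multi_area = co_weighted([0.9, 0.7, 0.6], weights=[2.0, 1.0, 1.0])
```

Under this assumption, the same averaging helper serves both ideas in the abstract: combining several estimators for one token, and combining one token's estimates across email areas instead of treating each area's occurrence as a separate token.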

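Bibliographic Record

Author: Raju Shrestha
Author affiliation: Hunan University
Degree-granting institution: Hunan University
Discipline: Computer Science and Technology
Degree: Master's
Supervisor: Yaping Lin
Year: 2005
Original format: PDF
Language: English
Chinese Library Classification: Automated reasoning, machine learning
Keywords: Bayesian spam filtering; performance; Algorithms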
