An experimental comparison of naive Bayesian and keyword-based anti-spam filtering with personal e-mail messages

Abstract

The growing problem of unsolicited bulk e-mail, also known as "spam", has generated a need for reliable anti-spam e-mail filters. Filters of this type have so far been based mostly on manually constructed keyword patterns. An alternative approach has recently been proposed, whereby a Naive Bayesian classifier is trained automatically to detect spam messages. We test this approach on a large collection of personal e-mail messages, which we make publicly available in "encrypted" form contributing towards standard benchmarks. We introduce appropriate cost-sensitive measures, investigating at the same time the effect of attribute-set size, training-corpus size, lemmatization, and stop lists, issues that have not been explored in previous experiments. Finally, the Naive Bayesian filter is compared, in terms of performance, to a filter that uses keyword patterns, and which is part of a widely used e-mail reader.
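The approach the abstract describes, a Naive Bayesian classifier paired with a cost-sensitive decision, can be illustrated with a minimal sketch. The abstract does not give the decision rule, features, or smoothing, so everything here is an assumption made for illustration: the toy corpus, the function names `train`, `spam_probability`, and `is_spam`, the add-one smoothing over word counts, and the `cost_ratio` parameter, which assumes that blocking a legitimate message costs `cost_ratio` times as much as passing a spam message.

```python
import math
from collections import Counter

# Hypothetical toy corpus; the paper's actual corpus is not reproduced here.
TRAIN = [
    ("win free money now".split(), "spam"),
    ("free prize claim now".split(), "spam"),
    ("meeting agenda for monday".split(), "legit"),
    ("project report draft attached".split(), "legit"),
]

def train(messages):
    """Count documents and words per class. Both classes must be present."""
    doc_counts = Counter()
    word_counts = {"spam": Counter(), "legit": Counter()}
    vocab = set()
    for tokens, label in messages:
        doc_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return doc_counts, word_counts, vocab

def spam_probability(tokens, model):
    """Return P(spam | tokens) under a word-count Naive Bayes model."""
    doc_counts, word_counts, vocab = model
    total_docs = sum(doc_counts.values())
    log_score = {}
    for label in ("spam", "legit"):
        score = math.log(doc_counts[label] / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing, an assumption of this sketch.
            score += math.log((word_counts[label][tok] + 1)
                              / (total_words + len(vocab)))
        log_score[label] = score
    # Normalize in log space to avoid underflow on long messages.
    top = max(log_score.values())
    weights = {k: math.exp(v - top) for k, v in log_score.items()}
    return weights["spam"] / (weights["spam"] + weights["legit"])

def is_spam(tokens, model, cost_ratio=9.0):
    """Flag as spam only if P(spam) exceeds cost_ratio / (1 + cost_ratio)."""
    threshold = cost_ratio / (1.0 + cost_ratio)
    return spam_probability(tokens, model) > threshold

model = train(TRAIN)
```

On this toy data, `is_spam("free money".split(), model, cost_ratio=1.0)` flags the message, while `cost_ratio=9.0` lets it through: raising the assumed cost of blocking legitimate mail makes the filter more conservative.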

Bibliographic details

Source
Annual international ACM SIGIR conference on Research and development in information retrieval; International ACM SIGIR conference on Research and development in information retrieval | 2000 | pp. 160-167 | 8 pages
Authors
Ion Androutsopoulos; John Koutsias; Konstantinos V. Chandrinos; Constantine D. Spyropoulos;
Original format: PDF
Chinese Library Classification: various specialized databases
Keywords
text categorization;

Similar literature

1. Spam influence on business and economy: Theoretical and experimental studies for textual anti-spam filtering using mature document processing and naive Bayesian classifier [J]. A. A. Zaidan, N. N. Ahmed, H. Abdul Karim, Gazi Mahabubul Alam. African Journal of Business Management. 2011, No. 2
2. Social network based filtering of unsolicited messages from e-mails [J]. Kiliroor Cinu C., Valliyammai C. Journal of intelligent & fuzzy systems: Applications in Engineering and Technology. 2019, No. 5
3. Evaluating Rule-Based and Statistical Filters to Detecting Arabic E-Mail Alert Messages [J]. Emad M. Al-Shawakfa, Qasem A. Al-Radaideh, Ahmed F. Al-Eroud. International Journal of Computer Processing of Oriental Languages. 2012, No. 2
4. An Anti-spam Filtering System Based on the Naive Bayesian Classifier and Distributed Checksum Clearinghouse [C]. Wang Haiyan, Zhou Runsheng, Wang Yi. International Symposium on Intelligent Information Technology Application; IITA 2009. 2009
5. Adaptive anti-spam e-mail filtering using Huffman coding and statistical learning [D]. Nerellapalli, Praveen R. 2005
6. A Comparison of FPGA and GPGPU Designs for Bayesian Occupancy Filters [O]. Luis Medina, Miguel Diez-Ochoa, Raul Correal. 2017
7. An Experimental Comparison of Naive Bayesian and Keyword-Based Anti-Spam Filtering with Personal E-mail Messages [O]. Androutsopoulos, Ion; Koutsias, John; Chandrinos, Konstantinos V. 2000
