A large-scale empirical analysis of email spam detection through network characteristics in a stand-alone enterprise

Tu Ouyang; Soumya Ray; Mark Allman; Michael Rabinovich

Abstract

Spam is a never-ending issue that constantly consumes resources to no useful end. In this paper, we envision spam filtering as a pipeline consisting of DNS blacklists, filters based on SYN packet features, filters based on traffic characteristics, and filters based on message content. Each stage of the pipeline examines more information in the message but is more computationally expensive. A message is rejected as spam once any layer is sufficiently confident. We analyze this pipeline, focusing on the first three layers, from a single-enterprise perspective. To do this we use a large email dataset collected over two years. We devise a novel ground truth determination system to allow us to label this large dataset accurately. Using two machine learning algorithms, we study (i) how the different pipeline layers interact with each other and the value added by each layer, (ii) the utility of individual features in each layer, (iii) the stability of the layers across time and network events, and (iv) an operational use case investigating whether this architecture can be practically useful. We find that (i) the pipeline architecture is generally useful in terms of accuracy as well as in an operational setting, (ii) it generally ages gracefully across long time periods, and (iii) in some cases, later layers can compensate for poor performance in the earlier layers. Among the caveats, we find that (i) the utility of network features is not as high from the single-enterprise viewpoint as reported in prior work, (ii) major network events can sharply affect the detection rate, and (iii) the operational (computational) benefit of the pipeline may depend on the efficiency of the final content filter.
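The early-exit behavior of the pipeline described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Stage` class, the per-stage thresholds and costs, the example address, and the precomputed `*_score` fields standing in for classifier outputs are all hypothetical, chosen only to show how a message stops at the first layer that is sufficiently confident and how cost accumulates when it does not.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Message = Dict[str, object]


@dataclass
class Stage:
    """One pipeline layer (hypothetical structure, for illustration only)."""
    name: str
    score: Callable[[Message], float]  # estimated spam confidence in [0, 1]
    threshold: float                   # reject when score >= threshold
    cost: float                        # arbitrary relative cost of running the layer


def run_pipeline(message: Message, stages: List[Stage]) -> Tuple[str, Optional[str], float]:
    """Run layers in order; stop at the first one confident enough to reject."""
    total_cost = 0.0
    for stage in stages:
        total_cost += stage.cost
        if stage.score(message) >= stage.threshold:
            return "reject", stage.name, total_cost
    # No layer was confident enough, so the message is accepted.
    return "accept", None, total_cost


# Assumed data: a documentation-range address standing in for a blacklist entry.
BLACKLISTED = {"203.0.113.7"}

# Assumed thresholds and costs; later layers are costlier, as in the abstract.
STAGES = [
    Stage("dns_blacklist", lambda m: 1.0 if m["src_ip"] in BLACKLISTED else 0.0, 0.5, 1.0),
    Stage("syn_features", lambda m: float(m["syn_score"]), 0.9, 2.0),
    Stage("traffic_features", lambda m: float(m["traffic_score"]), 0.9, 5.0),
    Stage("content", lambda m: float(m["content_score"]), 0.5, 50.0),
]

listed = {"src_ip": "203.0.113.7", "syn_score": 0.0, "traffic_score": 0.0, "content_score": 0.0}
bursty = {"src_ip": "198.51.100.1", "syn_score": 0.2, "traffic_score": 0.95, "content_score": 0.1}
clean = {"src_ip": "198.51.100.2", "syn_score": 0.1, "traffic_score": 0.1, "content_score": 0.1}

for msg in (listed, bursty, clean):
    print(run_pipeline(msg, STAGES))
```

In this toy setup, a message that no earlier layer rejects pays the full cost of the content filter, which mirrors the abstract's caveat that the computational benefit of the pipeline may depend on how efficient that final filter is.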

Bibliographic details

Source
Computer Networks | 2014, Issue 11 | pp. 101-121 | 21 pages
Authors
Tu Ouyang; Soumya Ray; Mark Allman; Michael Rabinovich
Author affiliations

Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH, USA;

Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH, USA;

International Computer Science Institute, Berkeley, CA, USA;

Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH, USA;

Original format: PDF
Language: English
Keywords
Spam; Network-level characteristics; Longitudinal analysis
