
Behavior-based email analysis with application to spam detection.


Abstract

Email is the "killer network application": it is ubiquitous and pervasive. In a relatively short time, the Internet has become irrevocably and deeply entrenched in modern society, primarily because of the power of its communication substrate, which links people and organizations around the globe. Much work on email technology has focused on making email easy to use, permitting a wide variety of information and information types to be sent conveniently, reliably, and efficiently throughout the Internet. However, the analysis of the vast storehouse of email content accumulated or produced by individual users has received relatively little attention other than for specific tasks such as spam and virus filtering. As one paper in the literature puts it, "the state of the art is still a messy desktop" (Denning, 1982).

The Problem: Email clients provide only partial information. Users have to manage much on their own, which makes it hard to search or prioritize large amounts of email. Our thesis is that advanced data mining can provide new opportunities for applications to increase email productivity and extract new information from email archives.

This thesis presents an implemented framework for mining behavior models from email data. The Email Mining Toolkit (EMT) is a data mining toolkit designed to analyze offline email corpora, including the entire set of email sent and received by an individual user, revealing much information about individual users as well as the behavior of groups of users in an organization. A number of machine learning and anomaly detection algorithms are embedded in the system to model the user's email behavior in order to classify email for a variety of tasks. The work has been successfully applied to the clustering and classification of similar emails, spam detection, and forensic analysis to reveal information about users' behavior.

We organize the core functionality of EMT into a lightweight package called the Profiling Email Toolkit (PET). A novel contribution in PET is the focus on analyzing real-time email flow information from both an individual and an organization in a standard framework. PET includes new algorithms that combine multiple models, using a variety of features extracted from email, to achieve higher accuracy and a lower false-positive rate than any individual model for a variety of analytical tasks.
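The abstract does not describe how PET combines its models, so the following is only a minimal sketch of the general idea of score combination, not the thesis's algorithm. The `Email` type, the two toy models, the weights, and the 0.5 threshold are all hypothetical and chosen purely for illustration: each model maps an email to a spam score in [0, 1], and a weighted average of the scores is compared against a threshold.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Email:
    """Hypothetical minimal email record used only for this sketch."""
    body: str
    recipients: List[str] = field(default_factory=list)


# A model is any function mapping an email to a spam score in [0, 1].
Model = Callable[[Email], float]


def keyword_model(email: Email) -> float:
    """Content-based toy model: density of made-up trigger words."""
    triggers = {"free", "winner", "urgent"}
    words = email.body.lower().split()
    if not words:
        return 0.0
    density = sum(w in triggers for w in words) / len(words)
    return min(1.0, density * 5)


def recipient_model(email: Email) -> float:
    """Behavior-based toy model: more recipients scores as more suspicious."""
    return min(1.0, len(email.recipients) / 50)


def combine(scores: List[float], weights: List[float]) -> float:
    """Weighted average of the individual model scores."""
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)


def is_spam(email: Email, models: List[Model], weights: List[float],
            threshold: float = 0.5) -> bool:
    """Flag an email when the combined score reaches the threshold."""
    scores = [m(email) for m in models]
    return combine(scores, weights) >= threshold


models = [keyword_model, recipient_model]
weights = [1.0, 1.0]  # assumed equal trust in both toy models

bulk = Email("free winner urgent free", [f"user{i}@example.com" for i in range(100)])
note = Email("meeting notes attached for tomorrow", ["a@example.com", "b@example.com"])
```

In this sketch `is_spam(bulk, models, weights)` flags the bulk message and `is_spam(note, models, weights)` does not. In practice the weights would be learned from labeled data rather than fixed by hand.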

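Bibliographic record

  • Author: Hershkop, Shlomo
  • Author affiliation: Columbia University
  • Degree-granting institution: Columbia University
  • Discipline: Computer Science
  • Degree: Ph.D.
  • Year: 2006
  • Pages: 209
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification: Automation technology, computer technology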
