
Digital Waste Sorting: A Goal-Based, Self-Learning Approach to Label Spam Email Campaigns


Abstract

Fast analysis of correlated spam emails may be vital to finding and prosecuting spammers who commit cybercrimes such as phishing and online fraud. This paper presents a self-learning framework that automatically divides large amounts of spam email into correlated, labeled groups. Building on large datasets collected daily through honeypots, the emails are first divided into homogeneous groups of similar messages (campaigns), each of which can be related to a specific spammer. Each campaign is then associated with a class that specifies the goal of the spammer, e.g. phishing or advertisement. The proposed framework uses a categorical clustering algorithm to group similar emails and a classifier to subsequently label each email group. Its main advantage is that it can be applied to large spam email datasets for which no prior knowledge is provided. The approach has been tested on more than 3200 real and recent spam emails, divided into more than 60 campaigns, reporting a classification accuracy of 97% on the classified data.
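The two-stage idea in the abstract (group similar emails into campaigns, then label each campaign with the spammer's goal) can be illustrated with a minimal sketch. This is not the paper's algorithm: the attribute names (`language`, `has_link`, `asks_credentials`, `attachment`), the greedy matching-based grouping, the 0.75 threshold, the hand-written labeling rule, and the `other` label are all assumptions chosen for illustration. A real system would presumably use a proper categorical clustering algorithm and a trained classifier.

```python
from collections import Counter


def matching_similarity(a, b):
    """Fraction of shared categorical attributes on which two emails agree."""
    keys = a.keys() & b.keys()
    if not keys:
        return 0.0
    return sum(a[k] == b[k] for k in keys) / len(keys)


def cluster_campaigns(emails, threshold=0.75):
    """Greedy single-pass grouping (a stand-in for categorical clustering).

    Each email joins the first campaign whose first member is similar
    enough; otherwise it starts a new campaign.
    """
    campaigns = []
    for email in emails:
        for campaign in campaigns:
            if matching_similarity(email, campaign[0]) >= threshold:
                campaign.append(email)
                break
        else:
            campaigns.append([email])
    return campaigns


def campaign_mode(campaign, key):
    """Most common value of one attribute across a campaign."""
    return Counter(e[key] for e in campaign).most_common(1)[0][0]


def label_campaign(campaign):
    """Toy rule-based stand-in for a trained classifier (hypothetical rules)."""
    has_link = campaign_mode(campaign, "has_link")
    asks_credentials = campaign_mode(campaign, "asks_credentials")
    if has_link and asks_credentials:
        return "phishing"
    if has_link:
        return "advertisement"
    return "other"


# Hypothetical feature vectors; a real pipeline would extract these from raw emails.
emails = [
    {"language": "en", "has_link": True, "asks_credentials": True, "attachment": "none"},
    {"language": "en", "has_link": True, "asks_credentials": True, "attachment": "html"},
    {"language": "it", "has_link": True, "asks_credentials": False, "attachment": "none"},
    {"language": "it", "has_link": True, "asks_credentials": False, "attachment": "image"},
]

campaigns = cluster_campaigns(emails)
labels = [label_campaign(c) for c in campaigns]
print(labels)
```

Labeling whole campaigns rather than individual emails means the classifier runs once per group, which is one plausible reason a cluster-then-classify design scales to large datasets.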
