O-means: An Optimized Clustering Method for Analyzing Spam Based Attacks

Jungsuk SONG; Daisuke INOUE; Masashi ETO; Hyung Chan KIM; Koji NAKAO

Abstract

In recent years, the number of spam emails has increased dramatically, and spam is now recognized as a serious Internet threat. Most recent spam is sent by bots, which often operate together as a botnet, and skillful spammers try to conceal their activities from spam analyzers and spam detection technology. In addition, most spam messages contain URLs that lure recipients to malicious Web servers in order to carry out cyber attacks such as malware infection and phishing. To cope with spam-based attacks, many efforts have been made to cluster spam emails based on the similarities between them. The resulting spam clusters can be used to identify the infrastructure of spam-sending systems and malicious Web servers, to determine how they are grouped and correlated with each other, and to minimize the time needed to analyze Web pages. It is therefore very important to improve the accuracy of spam clustering as much as possible so that spam-based attacks can be analyzed more accurately. In this paper, we present an optimized spam clustering method, called O-means, based on K-means, one of the most widely used clustering methods. By examining three weeks of spam gathered at our SMTP server, we observed that the accuracy of O-means is about 87%, which is superior to that of previous clustering methods. In addition, we define 12 statistical features for comparing the similarity between spam emails, and we determine a set of optimized features that makes O-means more effective.
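
As a rough illustration of the baseline the abstract builds on, the following is a minimal sketch of plain K-means (Lloyd's algorithm) over numeric feature vectors. It is not O-means: the abstract does not describe the optimizations O-means applies or name the 12 statistical features, so the two features used here (message size in KB and number of URLs), the sample values, and the function name `kmeans` are assumptions made only for this example.

```python
import random
from math import dist


def kmeans(points, k, iterations=100, seed=0):
    """Plain K-means (Lloyd's algorithm) over equal-length numeric vectors.

    Returns (labels, centroids). This is the textbook baseline only,
    not the O-means method from the paper.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids: k of the input points
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # empty cluster: keep old centroid
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return labels, centroids


# Hypothetical feature vectors: (message size in KB, number of URLs).
emails = [
    (1.0, 1.0), (1.2, 0.0), (0.9, 2.0),        # small messages, few URLs
    (9.8, 11.0), (10.1, 12.0), (10.4, 10.0),   # large messages, many URLs
]

labels, centroids = kmeans(emails, k=2)
print(labels)
print(centroids)
```

In practice, features measured on different scales would usually be normalized before computing distances, since Euclidean distance otherwise lets the largest-valued feature dominate.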

Bibliographic information

Source
IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, 2011, No. 1, pp. 245-254 (10 pages)
Authors
Jungsuk SONG; Daisuke INOUE; Masashi ETO; Hyung Chan KIM; Koji NAKAO
Author affiliations

The authors are with the National Institute of Information and Communications Technology, Koganei-shi, 184-8795 Japan.

Indexing information
Original format: PDF
Language: English
Keywords
spam; clustering; feature; k-means clustering method


