Conference: International Conference on Information and Communication Systems

Spam profile detection in social networks based on public features

Abstract

In the context of online social networks, spam profiles are not just a source of unwanted ads but a serious security threat, used by online criminals and terrorists for various malicious purposes. Recently, such criminals were able to steal a number of accounts belonging to NatWest bank customers. Their attack vector was based on spam tweets posted by a Twitter account that closely resembled the NatWest customer support account and led users to a phishing site. In this study, we investigate the nature of spam profiles on Twitter with the goal of improving social spam detection. Based on a set of publicly available features, we develop spam profile detection models. At this stage, a dataset of 82 Twitter profiles is collected and analyzed. With feature engineering, we investigate ten simple, binary features that can be used to classify spam profiles. Moreover, a feature selection process is used to identify the most influential features for detecting spam profiles. For feature selection, two methods are used: ReliefF and Information Gain. For classification, four algorithms are applied and compared: Decision Trees, Multilayer Perceptron, k-Nearest Neighbors and Naive Bayes. Preliminary experiments in this work show that promising detection rates can be obtained using such features, regardless of the language of the tweets.
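To make the feature selection step more concrete, the following is a minimal sketch of how Information Gain can rank binary profile features. It is not the authors' implementation. The abstract does not list the ten features, so the feature names and the eight toy profiles below are hypothetical and serve only as an illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels).values()
    return -sum((n / total) * math.log2(n / total) for n in counts)


def information_gain(feature_values, labels):
    """Label entropy minus the label entropy left after splitting on the feature."""
    total = len(labels)
    conditional = 0.0
    for value in set(feature_values):
        subset = [y for x, y in zip(feature_values, labels) if x == value]
        conditional += (len(subset) / total) * entropy(subset)
    return entropy(labels) - conditional


def rank_features(rows, labels, names):
    """Return (name, gain) pairs sorted from most to least informative."""
    gains = [
        (name, information_gain([row[i] for row in rows], labels))
        for i, name in enumerate(names)
    ]
    return sorted(gains, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data (not from the paper): one row per profile,
# one binary column per feature; label 1 = spam, 0 = legitimate.
FEATURE_NAMES = ["default_profile_image", "has_description"]
ROWS = [[1, 0], [1, 1], [1, 0], [0, 1], [0, 0], [0, 1], [0, 0], [1, 1]]
LABELS = [1, 1, 1, 0, 0, 0, 0, 1]

if __name__ == "__main__":
    for name, gain in rank_features(ROWS, LABELS, FEATURE_NAMES):
        print(f"{name}: {gain:.3f}")
```

In this toy data the first feature separates the two classes perfectly and therefore receives the maximum gain of one bit, while the second feature carries no information about the label. ReliefF, the other selection method named in the abstract, weights features by comparing each instance with its nearest neighbors and is not shown here.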
