International Conference on Social Informatics

Hunting Malicious Bots on Twitter: An Unsupervised Approach



Abstract

Malicious bots violate Twitter's terms of service: they include bots that post spam content, adware, and malware, as well as bots designed to sway public opinion. How prevalent are such bots on Twitter? Estimates vary, with Twitter [3] itself stating that less than 5% of its over 300 million active accounts are bots. Using a supervised machine learning approach with a manually curated set of Twitter bots, [12] estimate that between 9% and 15% of active Twitter accounts are bots (both benign and malicious). In this paper, we propose an unsupervised approach to hunt for malicious bot groups on Twitter. Key structural and behavioral markers for such bot groups are the use of URL shortening services, duplicate tweets, and content coordination over extended periods of time. While these markers have been identified in prior work [9,15], we devise a new protocol to automatically harvest such bot groups from live tweet streams. Our experiments with this protocol show that between 4% and 23% (mean 10.5%) of all accounts that use shortened URLs are bots and bot networks that evade detection over a long period of time, with significant heterogeneity in this distribution across URL shortening services. We compare our detection approach with two state-of-the-art methods for bot detection on Twitter: a supervised learning approach called BotOrNot [10] and an unsupervised technique called DeBot [8]. We show that BotOrNot misclassifies around 40% of the malicious bots identified by our protocol. The overlap between bots detected by our approach and DeBot, which uses synchronicity of tweeting as a primary behavioral marker, is around 7%, indicating that the detection approaches target very different types of bots. Our protocol identifies malicious bots in real time within a language-, topic-, and keyword-independent framework, in an entirely unsupervised manner, and is a useful supplement to existing bot detection tools.
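The abstract describes the detection markers only at a high level. The minimal Python sketch below is not the paper's protocol; it is a hypothetical illustration of how two of the markers named above, shortened-URL usage and duplicate tweet content shared across accounts, could be grouped in a batch of tweets. The shortener domain list and the tweet dictionary layout (`user_id`, `text`, `urls`) are assumptions, and content coordination over extended periods of time is not modeled here.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Hypothetical set of URL shortening domains (an assumption, not from the paper).
SHORTENER_DOMAINS = {"bit.ly", "tinyurl.com", "goo.gl", "ow.ly"}


def uses_shortener(urls):
    """Return True if any URL in the tweet points at a known shortening service."""
    return any(urlparse(u).netloc.lower() in SHORTENER_DOMAINS for u in urls)


def group_duplicate_tweets(tweets):
    """Group accounts that post identical tweet text containing shortened URLs.

    `tweets` is assumed to be an iterable of dicts with 'user_id', 'text', and
    'urls' keys (a simplified stand-in for objects from a live tweet stream).
    Returns a mapping from normalized tweet text to the set of accounts that
    posted it; groups with more than one account are candidate coordinated
    bot groups under the markers described in the abstract.
    """
    text_to_accounts = defaultdict(set)
    for tw in tweets:
        if not uses_shortener(tw.get("urls", [])):
            continue
        normalized = " ".join(tw["text"].split()).lower()
        text_to_accounts[normalized].add(tw["user_id"])
    return {text: accts for text, accts in text_to_accounts.items() if len(accts) > 1}


if __name__ == "__main__":
    sample = [
        {"user_id": "a1", "text": "Great deal! http://bit.ly/xyz", "urls": ["http://bit.ly/xyz"]},
        {"user_id": "a2", "text": "Great deal!  http://bit.ly/xyz", "urls": ["http://bit.ly/xyz"]},
        {"user_id": "a3", "text": "Just my own post", "urls": []},
    ]
    print(group_duplicate_tweets(sample))
```

In the paper's setting, such per-batch groups would additionally need to persist over extended periods of time to count as content coordination; this sketch only shows the grouping step on a single batch.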
