Journal: Neurocomputing

Detection of spam-posting accounts on Twitter

Abstract

Online social media platforms, such as Facebook and Twitter, enable all users, regardless of their characteristics, to freely generate and consume huge amounts of data. While this data is exploited by individuals and organisations to gain competitive advantage, a substantial amount of it is generated by spam or fake users. An estimated one in every 200 social media messages and one in every 21 tweets are spam. The rapid growth in the volume of global spam is expected to compromise research that uses social media data, calling the credibility of that data into question. Motivated by the need to identify and filter out spam content in social media data, this study presents a novel approach for distinguishing spam from non-spam social media posts and offers more insight into the behaviour of spam users on Twitter. The approach proposes an optimised set of features independent of historical tweets, which are only available for a short time on Twitter. We take into account features related to Twitter users, their accounts and their pairwise engagement with each other. We experimentally demonstrate the efficacy and robustness of our approach and compare it to a typical feature set for spam detection in the literature, achieving a significant improvement in performance. In contrast to prior research findings, we observe that an average automated spam account posted at least 12 tweets per day at well-defined periods. Our method is suitable for real-time deployment in a social media data collection pipeline as an initial preprocessing strategy to improve the validity of research data. © 2018 The Authors. Published by Elsevier B.V.
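
The abstract does not list the actual features, so the following is only an illustrative sketch of what "features independent of historical tweets" could look like: values derived from account metadata alone, with no access to a user's past tweets. The function name `account_features`, the three features chosen, and the field names (`created_at`, `statuses_count`, `followers_count`, `friends_count`, modelled on a Twitter-style user object) are assumptions for illustration, not the authors' feature set.

```python
from datetime import datetime, timezone


def account_features(user: dict, now: datetime) -> dict:
    """Derive features from account metadata only (no historical tweets).

    Hypothetical example features; not the feature set used in the paper.
    """
    # Assumed timestamp format, e.g. "Mon Jan 01 00:00:00 +0000 2018".
    created = datetime.strptime(user["created_at"], "%a %b %d %H:%M:%S %z %Y")
    # Clamp to at least one day so brand-new accounts do not divide by zero.
    age_days = max((now - created).total_seconds() / 86400.0, 1.0)
    return {
        "account_age_days": age_days,
        # Lifetime average posting rate, computed from the total tweet count.
        "tweets_per_day": user["statuses_count"] / age_days,
        # Followers relative to accounts followed; guard against zero friends.
        "follower_friend_ratio": user["followers_count"] / max(user["friends_count"], 1),
    }


# Made-up account used purely to exercise the function.
example_user = {
    "created_at": "Mon Jan 01 00:00:00 +0000 2018",
    "statuses_count": 1200,
    "followers_count": 50,
    "friends_count": 200,
}
features = account_features(example_user, datetime(2018, 4, 11, tzinfo=timezone.utc))
print(features)
```

Features of this kind can be computed the moment a tweet and its author's profile arrive, which is the property the abstract relies on for real-time preprocessing. Any classifier trained on them would be a separate design choice that the abstract does not specify.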