Twitter spam account detection based on clustering and classification methods

Abstract

The Twitter social network has gained popularity as the social activity of its registered users has increased. Twitter serves a dual function as an online social network (OSN): it acts as a microblogging OSN and, at the same time, as a news update platform. Recently, the growth in Twitter social interactions has attracted the attention of cybercriminals. Spammers have used Twitter to spread malicious messages, post phishing links, flood the network with fake accounts, and engage in other malicious activities. Detecting the network of spammers who engage in these activities is an important step toward identifying individual spam accounts. Researchers have proposed a number of approaches to identify groups of spammers. However, each of these approaches addresses a specific category of spammer. This paper proposes a different approach that detects spammers on Twitter based on the similarities that exist among spam accounts. A number of features were introduced to improve the performance of the three classification algorithms selected in this study. The proposed approach applied principal component analysis and a tuned K-means algorithm to cluster over 200,000 accounts, randomly selected from more than 2 million tweets, in order to detect clusters of spammers. Experimental results show that Random Forest achieved the highest accuracy, 96.30%, followed by the multilayer perceptron with 96.00% and the support vector machine with 95.60%. An evaluation of the selected classifiers under class imbalance also showed that Random Forest achieved the highest accuracy, precision, recall, and F-measure.
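The abstract reports accuracy, precision, recall, and F-measure, and singles out class imbalance. The following is a minimal sketch of how those four measures are computed from predicted and true labels, and why accuracy alone can mislead when spam accounts are rare. It is not the authors' evaluation code: the function name `binary_metrics`, the `BinaryMetrics` container, and all label counts are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f_measure: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F-measure for one positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    # Each ratio falls back to 0.0 when its denominator is zero.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return BinaryMetrics(accuracy, precision, recall, f_measure)


# Hypothetical toy labels (made-up numbers, not data from the paper):
# 1 = spam account, 0 = legitimate account, 10 spam accounts among 100.
y_true = [1] * 10 + [0] * 90

# A trivial classifier that labels every account as legitimate.
all_legit = [0] * 100

# A classifier that catches 8 of the 10 spammers and raises 4 false alarms.
better = [1] * 8 + [0] * 2 + [1] * 4 + [0] * 86

trivial_scores = binary_metrics(y_true, all_legit)
better_scores = binary_metrics(y_true, better)
print(trivial_scores)
print(better_scores)
```

On this toy data the trivial classifier reaches 90% accuracy while its recall and F-measure are zero. That is one reason to report precision, recall, and F-measure alongside accuracy when the classes are imbalanced.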

Bibliographic record

  • Source
    Journal of Supercomputing | 2020, Issue 7 | pp. 4802-4837 | 36 pages
  • Author affiliations

    Univ Ilorin Fac Commun & Informat Sci Ilorin Nigeria;

    Dongguan Univ Technol DGUT CNAM Inst Dongguan Guangdong Peoples R China;

    Shenzhen Inst Adv Technol SIAT CAS Key Lab Human Machine Intelligence Synergy Sy Shenzhen 518055 Peoples R China|Chinese Acad Sci Inst Biomed & Hlth Engn SIAT Shenzhen 518055 Peoples R China;

    Embry Riddle Aeronaut Univ Dept Elect Comp Software & Syst Engn Daytona Beach FL 32114 USA;

    Vellore Inst Technol Sch Comp Sci & Engn Vellore 632014 Tamil Nadu India;

  • Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
  • Original format: PDF
  • Language: English
  • Keywords

    Online social network; Spam detection; Fake account; Clustering; Classification;

