Journal: Quality Control, Transactions

Using Improved Conditional Generative Adversarial Networks to Detect Social Bots on Twitter

Abstract

The detection and removal of malicious social bots in social networks has become an area of interest in both industry and academia. Widely used machine-learning-based bot detection methods suffer from an imbalance in the number of samples across categories, and the resulting classifier bias leads to a low detection rate for minority-class samples. We therefore propose an improved conditional generative adversarial network (improved CGAN) that augments imbalanced data sets before classifiers are trained, in order to improve the detection accuracy of social bots. To generate the auxiliary condition, we propose a modified clustering algorithm, the Gaussian kernel density peak clustering algorithm (GKDPCA), which avoids generating noise during data augmentation and eliminates imbalances both between and within social bot class distributions. Furthermore, we improve the CGAN convergence criterion by introducing the Wasserstein distance with a gradient penalty, which addresses the mode collapse and vanishing gradients of the traditional CGAN. We study the effects of the imbalance degree and the expansion ratio of the original data on oversampling. Experimental comparisons with three common oversampling algorithms show that the improved CGAN achieves higher F1-score, G-mean and AUC.
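The abstract names GKDPCA but does not spell out its steps. As a rough illustration only, the sketch below implements the generic density peak clustering procedure with a Gaussian kernel density estimate, which may differ from the authors' algorithm. The function name `density_peaks`, the cutoff parameter `d_c`, the choice of centers by the product of density and separation, and the toy data are all assumptions made for this example.

```python
import math


def density_peaks(points, d_c, n_clusters):
    """Generic density peak clustering with a Gaussian kernel (illustrative sketch)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density: Gaussian-weighted count of neighbours, with cutoff scale d_c.
    rho = [
        sum(math.exp(-((dist[i][j] / d_c) ** 2)) for j in range(n) if j != i)
        for i in range(n)
    ]

    # Visit points from highest to lowest density (ties broken by index).
    order = sorted(range(n), key=lambda i: -rho[i])

    delta = [0.0] * n   # distance to the nearest point earlier in the density order
    parent = [-1] * n   # index of that point
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest point has no denser neighbour; use its largest distance.
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        parent[i] = j

    # Cluster centers: points that are both dense and far from any denser point.
    gamma = [rho[i] * delta[i] for i in range(n)]
    centers = sorted(range(n), key=lambda i: -gamma[i])[:n_clusters]

    labels = [-1] * n
    for cluster_id, c in enumerate(centers):
        labels[c] = cluster_id
    # Every other point inherits the label of its nearest denser neighbour.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers


# Toy data (hypothetical): two well-separated groups of five 2-D points.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
    (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5),
]
labels, centers = density_peaks(points, d_c=1.0, n_clusters=2)
print(labels, centers)
```

In a setting like the one the abstract describes, cluster labels of this kind could serve as the auxiliary condition fed to a conditional GAN, so that generated minority samples follow the sub-structure of the bot class rather than its overall average.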
