Conference: IEEE International Conference on Communications

Detecting spam comments posted in micro-blogs using the self-extensible spam dictionary

Abstract

The popularity of Weibo has enriched people's lives by letting online users share their feelings through comments. However, a growing number of spam comments are also being posted on users' micro-blogs. To detect spam comments in Chinese micro-blogs effectively, this paper uses semantic analysis to construct a Self-Extensible Spam Dictionary, which expands automatically as new words emerge frequently on micro-blogs. Semantic analysis also provides additional features that help detect spam comments. We further propose a Proportion-Weight Filter (PWF) model that detects two kinds of spam comments (AD and vulgar comments) by filtering on the spam-weight and the spam-proportion of Weibo comments according to the Self-Extensible Spam Dictionary. Our experimental results show an average detection accuracy of 87.9% when AD and vulgar spam comments are detected together. For AD spam comments alone, the average accuracy is 96.2%, which compares favorably with machine learning methods. Statistical analysis of the results confirms that the proposed methods identify spam comments effectively and with relatively high accuracy.
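
The abstract names the two quantities the PWF model filters on but gives no formulas, so the following Python sketch only illustrates the general idea under loudly stated assumptions. Here spam-weight is taken to be the sum of dictionary weights of the matched words, spam-proportion is the share of a comment's tokens found in the dictionary, a comment is flagged when both exceed a threshold, and the dictionary is extended with words that recur in flagged comments. All function names, thresholds, and weights are hypothetical, the comments are assumed to be already segmented into tokens, and a plain frequency count stands in for the paper's semantic analysis.

```python
from collections import Counter


def spam_scores(tokens, spam_dict):
    """Return (spam_weight, spam_proportion) for one tokenized comment.

    Assumed definitions, not taken from the paper:
    weight = sum of dictionary weights of matched tokens,
    proportion = matched tokens / all tokens.
    """
    if not tokens:
        return 0.0, 0.0
    hits = [t for t in tokens if t in spam_dict]
    weight = sum(spam_dict[t] for t in hits)
    proportion = len(hits) / len(tokens)
    return weight, proportion


def is_spam(tokens, spam_dict, min_weight=1.0, min_proportion=0.3):
    """Flag a comment when both scores reach hypothetical thresholds."""
    weight, proportion = spam_scores(tokens, spam_dict)
    return weight >= min_weight and proportion >= min_proportion


def extend_dictionary(spam_dict, spam_comments, min_count=3, new_weight=0.5):
    """Add unseen words that occur in at least min_count flagged comments.

    This frequency rule is a stand-in for semantic analysis; it mutates
    spam_dict in place and returns the newly added entries.
    """
    counts = Counter(
        t for tokens in spam_comments for t in set(tokens) if t not in spam_dict
    )
    added = {w: new_weight for w, c in counts.items() if c >= min_count}
    spam_dict.update(added)
    return added
```

With a dictionary such as `{"buy": 0.5, "discount": 1.0}`, the comment `["buy", "discount", "now", "friends"]` scores a weight of 1.5 and a proportion of 0.5, so it is flagged under the default thresholds. Requiring both scores to pass is one possible design choice; the source does not say how the two filters are combined.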