Journal: Procedia Computer Science

N-Gram Assisted Youtube Spam Comment Detection

Abstract

This paper proposes a novel methodology for detecting intrusive comments, or spam, on the video-sharing website Youtube. We define spam comments as those that have a promotional intent or that are deemed contextually irrelevant to a given video. Over the years, the prospect of monetisation through advertising on popular social media channels has attracted an increasingly large number of users. This has in turn led to a growth in malicious users who have begun to develop automated bots capable of large-scale, orchestrated deployment of spam messages across multiple channels simultaneously. The presence of these comments significantly hurts the reputation of a channel as well as the experience of normal users. Youtube itself has tackled this issue with very limited methods that revolve around blocking comments that contain links. Such methods have proven to be extremely ineffective, as spammers have found ways to bypass such heuristics. Standard machine learning classification algorithms have proven to be somewhat effective, but there is still room for better accuracy with new approaches. In this work, we attempt to detect such comments by applying conventional machine learning algorithms such as Random Forest, Support Vector Machine, and Naive Bayes, along with certain custom heuristics such as N-Grams, which have proven to be very effective in detecting and subsequently combating spam comments.
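
The abstract names N-Grams and Naive Bayes but gives no implementation details. As a rough illustration of how word N-gram features can feed such a classifier, here is a minimal Python sketch. It is not the authors' method: the tokenizer, the choice of unigrams plus bigrams, the add-one smoothing, the names `ngrams` and `NgramNaiveBayes`, and the toy comments are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def ngrams(text, n_max=2):
    """Return word n-grams (n = 1..n_max) of a lowercased comment."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    grams = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


class NgramNaiveBayes:
    """Multinomial Naive Bayes over n-gram counts with add-one smoothing."""

    def fit(self, comments, labels):
        self.counts = {}              # label -> Counter of n-gram counts
        self.docs = Counter(labels)   # label -> number of training comments
        for text, label in zip(comments, labels):
            self.counts.setdefault(label, Counter()).update(ngrams(text))
        self.vocab = set()
        for counter in self.counts.values():
            self.vocab.update(counter)
        return self

    def predict(self, text):
        total_docs = sum(self.docs.values())
        best_label, best_score = None, -math.inf
        for label, counter in self.counts.items():
            # Log prior plus smoothed log likelihood of each known n-gram.
            score = math.log(self.docs[label] / total_docs)
            denom = sum(counter.values()) + len(self.vocab)
            for gram in ngrams(text):
                if gram in self.vocab:  # n-grams never seen in training are skipped
                    score += math.log((counter[gram] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy data for illustration only; not taken from the paper.
train_comments = [
    "check out my channel for free gift cards",
    "subscribe to my channel and win a free phone",
    "click here for free followers",
    "great video, the editing is really smooth",
    "this song brings back so many memories",
    "thanks for the clear explanation",
]
train_labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

model = NgramNaiveBayes().fit(train_comments, train_labels)
print(model.predict("win free gift cards on my channel"))   # prints "spam"
print(model.predict("really great explanation, thanks"))    # prints "ham"
```

Bigrams such as "my channel" or "gift cards" carry promotional phrasing that single words can miss, which is one plausible reason N-gram features help on short comments. A real evaluation would need a labelled comment dataset and held-out test data, neither of which this page provides.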
