Journal: Expert Systems with Applications

Detecting and visualizing hate speech in social media: A cyber Watchdog for surveillance


Abstract

The multi-fold growth of the social media user base has fuelled a substantial increase in the number of hate speech posts on social media platforms. The enormous data volume makes it hard to capture such posts and either moderate or delete them. This paper presents an approach to detect and visualize online aggression, a special case of hate speech, on social media. Aggression is categorized into three labels: overtly aggressive (OAG), covertly aggressive (CAG), and non-aggressive (NAG). We have designed a user interface, based on a web browser plugin for Facebook and Twitter, to visualize the aggressive comments posted on social media users' timelines. This plugin interface might help security agencies keep tabs on the social media stream. It also provides citizens with a tool that is typically available only to large enterprises, which alleviates the technological imbalance between industry and citizens. In addition, the system might help the research community create further tools and prepare weakly labeled training data in a few minutes, using comments posted by users on celebrities' Facebook and Twitter timelines. We report results on a newly created dataset of user comments posted on Facebook and Twitter, collected using our proposed plugins, and on the standard Trolling, Aggression and Cyberbullying (TRAC) 2018 dataset in English and code-mixed Hindi. Several classifiers have been used for aggression classification: Support Vector Machine (SVM), logistic regression, a deep learning model based on a Convolutional Neural Network (CNN), an attention-based model, and the recently proposed BERT pre-trained language model by Google AI. Weighted F1-scores of around 0.64 and 0.62 are achieved on the TRAC Facebook English and Hindi datasets, respectively, while on the Twitter English and Hindi datasets the weighted F1-scores are 0.58 and 0.50, respectively. (c) 2020 Elsevier Ltd. All rights reserved.
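
The abstract reports weighted F1-scores over the three aggression labels. As a minimal sketch of how such a score is usually computed (per-class F1 averaged with weights proportional to each class's share of the true labels), the following may help. The function name `weighted_f1` and the toy labels are illustrative assumptions and are not taken from the paper or its data.

```python
from collections import Counter

# The three labels named in the abstract.
LABELS = ("OAG", "CAG", "NAG")


def weighted_f1(y_true, y_pred, labels=LABELS):
    """Support-weighted average of per-class F1 scores.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    total = len(y_true)
    if total == 0:
        return 0.0
    support = Counter(y_true)  # number of true examples per class
    pairs = list(zip(y_true, y_pred))
    score = 0.0
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0  # F1 = 2TP / (2TP + FP + FN)
        score += f1 * support[label] / total   # weight by class support
    return score


# Toy example with made-up labels (not data from the paper).
toy_true = ["OAG", "OAG", "CAG", "NAG", "NAG", "NAG"]
toy_pred = ["OAG", "CAG", "CAG", "NAG", "NAG", "OAG"]
toy_score = weighted_f1(toy_true, toy_pred)
```

Weighting by support means that frequent classes dominate the score, which is worth keeping in mind when the three labels are unevenly distributed.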
