Conference: Workshop on Language in Social Media
Detecting Hate Speech on the World Wide Web

Abstract

We present an approach to detecting hate speech in online text, where hate speech is defined as abusive speech targeting specific group characteristics, such as ethnic origin, religion, gender, or sexual orientation. While hate speech against any group may exhibit some common characteristics, we have observed that hatred against each different group is typically characterized by the use of a small set of high-frequency stereotypical words; however, such words may be used in either a positive or a negative sense, making our task similar to that of word sense disambiguation. In this paper we describe our definition of hate speech, the collection and annotation of our hate speech corpus, and a mechanism for detecting some commonly used methods of evading common "dirty word" filters. We describe pilot classification experiments in which we classify anti-semitic speech, reaching an accuracy of 94%, a precision of 68%, and a recall of 60%, for an F1 measure of 0.6375.
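The reported F1 measure follows directly from the precision and recall figures in the abstract, since F1 is the harmonic mean of the two. The short check below is only an illustrative sketch: the function name `f1_score` is a hypothetical helper introduced here, not something taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (the F1 measure)."""
    if precision + recall == 0:
        # F1 is conventionally taken as 0 when both inputs are 0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as stated in the abstract (68% and 60%).
reported_f1 = f1_score(0.68, 0.60)
print(round(reported_f1, 4))  # 0.6375
```

The 94% accuracy is considerably higher than the F1 measure; this is the pattern one would expect when the positive class is a small share of the data, although the abstract does not state the class balance.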
