Workshop on Online Abuse and Harms

Reducing Unintended Identity Bias in Russian Hate Speech Detection


Abstract

Toxicity has become a grave problem for many online communities and has been growing across many languages, including Russian. Hate speech creates an environment of intimidation and discrimination, and may even incite real-world violence. Both researchers and social platforms have been working for some time on models that detect toxicity in online communication. A common problem with these models is bias towards certain words (e.g. woman, black, jew, or женщина, черный, еврей) that are not toxic in themselves but act as triggers for the classifier due to model caveats. In this paper, we describe our efforts towards classifying hate speech in Russian and propose simple techniques for reducing unintended bias, such as generating training data with language models using terms and words related to protected identities as context, and applying word dropout to such words.
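The word-dropout idea mentioned in the abstract can be sketched roughly as follows: during training, tokens belonging to a list of protected-identity terms are randomly removed, so the classifier cannot lean on them as shortcuts for toxicity. The term list, function name, and dropout probability below are illustrative assumptions, not details taken from the paper.

```python
import random

# Illustrative lexicon of protected-identity terms, using the examples
# quoted in the abstract; a real system would use a curated list.
IDENTITY_TERMS = {"woman", "black", "jew", "женщина", "черный", "еврей"}

def identity_word_dropout(tokens, p=0.5, seed=None):
    """Drop each protected-identity token with probability p.

    tokens: list of word strings from a training example.
    p: dropout probability for identity terms (non-identity words are kept).
    """
    rng = random.Random(seed)
    return [
        t for t in tokens
        if t.lower() not in IDENTITY_TERMS or rng.random() >= p
    ]

# With p=1.0 every identity term is removed deterministically:
print(identity_word_dropout("the woman wrote a kind comment".split(), p=1.0))
```

Applied only at training time, this forces the model to judge toxicity from the surrounding context rather than from the mere presence of an identity word.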
