Offensive, aggressive, and hate speech analysis: From data-centric to human-centered approach

Jan Kocon; Alicja Figas; Martin Gruza; Daria Puchalska; Tomasz Kajdanowicz; Przemyslaw Kazienko


Abstract

Analysis of subjective texts such as offensive content or hate speech is a great challenge, especially regarding the annotation process. Most current annotation procedures aim to achieve a high level of agreement in order to generate a high-quality reference source. However, annotation guidelines for subjective content may restrict the annotators' freedom of decision making. Motivated by the moderate annotation agreement in offensive content datasets, we hypothesize that personalized approaches to offensive content identification should be in place. Thus, we propose two novel perspectives of perception: group-based and individual. Using demographics of annotators as well as embeddings of their previous decisions (annotated texts), we are able to train multimodal models (including transformer-based ones) adjusted to personal or community profiles. Based on the agreement of individuals and groups, we experimentally showed that annotator group agreeability strongly correlates with offensive content recognition quality. The proposed personalized approaches enabled us to create models adaptable to personal user beliefs rather than to an agreed understanding of offensiveness. Overall, our individualized approaches to offensive content classification outperform classic data-centric methods that generalize offensiveness perception, and this holds for all six tested models. Additionally, we developed requirements for annotation procedures, personalization, and content processing to make the solutions human-centered.
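
The contrast the abstract draws between a single agreed label and per-annotator labels can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy annotations, the function names, and the choice of simple pairwise agreement as the agreement measure are hypothetical and are not taken from the paper, which may use different measures and certainly uses real datasets and trained models.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data (not from the paper):
# annotations[text_id][annotator_id] = 1 (offensive) or 0 (not offensive)
annotations = {
    "t1": {"a1": 1, "a2": 1, "a3": 0},
    "t2": {"a1": 0, "a2": 1, "a3": 0},
    "t3": {"a1": 1, "a2": 1, "a3": 1},
}


def majority_label(votes):
    """Data-centric target: one label per text, the most common vote."""
    return Counter(votes.values()).most_common(1)[0][0]


def pairwise_agreement(data):
    """Share of annotator pairs, counted per text, that chose the same label."""
    agree = 0
    total = 0
    for votes in data.values():
        for x, y in combinations(votes.values(), 2):
            agree += int(x == y)
            total += 1
    return agree / total


def personalized_targets(data):
    """Human-centered target: one label per (text, annotator) pair."""
    return {
        (text_id, annotator_id): label
        for text_id, votes in data.items()
        for annotator_id, label in votes.items()
    }


if __name__ == "__main__":
    for text_id, votes in annotations.items():
        print(text_id, "majority label:", majority_label(votes))
    print("pairwise agreement:", round(pairwise_agreement(annotations), 3))
    print("personalized targets:", len(personalized_targets(annotations)))
```

In this sketch the majority label discards the dissenting votes on t1 and t2, whereas the personalized targets keep all nine individual decisions, which is the kind of signal a personalized model could in principle be trained on.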


Bibliographic details

Source: Information Processing & Management, 2021, No. 5, 102643.1-102643.26 (26 pages)
Authors: Jan Kocon; Alicja Figas; Martin Gruza; Daria Puchalska; Tomasz Kajdanowicz; Przemyslaw Kazienko
Author affiliation: Wroclaw University of Science and Technology, 50-370 Wroclaw, Poland (listed for each of the six authors)
Original format: PDF
Language of text: English
Keywords: Hate speech; Offensive content; Human-centered NLP; Multimodal deep learning; Personalization; Subjective content perception; Annotator agreement
Date added to database: 2022-08-19 02:25:57

Similar literature

1. Mapping Twitter hate speech towards social and sexual minorities: a lexicon-based approach to semantic content analysis [J]. Lingiardi Vittorio, Carone Nicola, Semeraro Giovanni. Behaviour & Information Technology, 2020, No. 7a9
2. Short Semantic Patterns: A Linguistic Pattern Mining Approach for Content Analysis Applied to Hate Speech [J]. International Journal of Artificial Intelligence Tools: Architectures, Languages, Algorithms, 2020, No. 2
3. A Triadic Formal Concept Analysis Approach to Analyzing Online Hate Speech in Facebook Comments [J]. Radu Mihai Meza, Șerban Nicolae Meza. BRAIN. Broad Research in Artificial Intelligence and Neurosciences, 2019, No. 1
4. Toxic, Hateful, Offensive or Abusive? What Are We Really Classifying? An Empirical Analysis of Hate Speech Datasets [C]. Paula Fortuna, Juan Soler-Company, Leo Wanner. International Conference on Language Resources and Evaluation, 2020
5. On the Detection of Hate Speech, Hate Speakers and Polarized Groups in Online Social Media [D]. Warmsley, Dana. 2017
6. Mapping online hate: A scientometric analysis on research trends and hotspots in research on online hate [O]. Ahmed Waqas, Joni Salminen, Soon-gyo Jung. 2012
7. CIC at SemEval-2019 Task 5: Simple Yet Very Efficient Approach to Hate Speech Detection, Aggressive Behavior Detection, and Target Classification in Twitter [O]. Iqra Ameer, Muhammad Hammad Fahim Siddiqui, Grigori Sidorov. 2019
