Automated identification of bias inducing words in news articles using linguistic and context-oriented features

Timo Spinde; Lada Rudnitckaia; Jelena Mitrovic; Felix Hamborg; Michael Granitzer; Bela Gipp; Karsten Donnay

Abstract

Media has a substantial impact on public perception of events, and the way media presents events can therefore alter the beliefs and views of the public. One way bias enters news articles is through word choice. This form of bias is very challenging to identify automatically because it is highly context-dependent and no large-scale gold-standard data set exists. In this paper, we present a prototypical yet robust and diverse data set for media bias research. It consists of 1,700 statements representing various media bias instances and contains labels for media bias identification on the word and sentence level. In contrast to existing research, our data incorporate background information on the participants' demographics, political ideology, and their opinion about media in general. Based on our data, we also present a way to detect bias-inducing words in news articles automatically. Our approach is feature-oriented, which provides strong descriptive and explanatory power compared to deep learning techniques. We identify and engineer various linguistic, lexical, and syntactic features that can potentially be media bias indicators. To the best of our knowledge, our resource collection is the most complete within the media bias research area. We evaluate all of our features in various combinations and assess their importance both for future research and for the task in general. We also evaluate various Machine Learning approaches with all of our features. XGBoost, a gradient-boosted decision tree implementation, yields the best results. Our approach achieves an F1-score of 0.43, a precision of 0.29, a recall of 0.77, and a ROC AUC of 0.79, which outperforms current feature-based media bias detection methods. We propose future improvements and discuss the prospects of the feature-based approach and of combining neural networks and deep learning with our current system.
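For readers less familiar with the reported metrics, the following is a minimal sketch of how precision, recall, and the F1-score relate. It is not the authors' evaluation code; the function names are illustrative assumptions. Plugging the rounded precision and recall from the abstract into the harmonic-mean formula gives roughly 0.42, close to the reported 0.43. The small gap may come from rounding or from how scores were averaged, which the abstract does not specify.

```python
from typing import Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Compute precision and recall from true/false positive and false negative counts.

    Illustrative helper (an assumption, not taken from the paper).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f1_from_precision_recall(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Rounded values as reported in the abstract.
    f1 = f1_from_precision_recall(0.29, 0.77)
    print(round(f1, 3))
```

The high recall combined with low precision indicates that a classifier with these scores flags most biased words but also marks many unbiased words as biased.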


Bibliographic details

Source
Information Processing & Management | 2021, Issue 3 | 102505.1-102505.15 | 15 pages
Authors
Timo Spinde; Lada Rudnitckaia; Jelena Mitrovic; Felix Hamborg; Michael Granitzer; Bela Gipp; Karsten Donnay;
Author affiliations

University of Konstanz Universitaetsstrasse 10 DE-78464 Konstanz Germany University of Wuppertal Gaussstrasse 20 DE-42119 Wuppertal Germany;

University of Konstanz Universitaetsstrasse 10 DE-78464 Konstanz Germany;

University of Passau Innstrasse 41 DE-94032 Passau Germany;

University of Konstanz Universitaetsstrasse 10 DE-78464 Konstanz Germany Heidelberg Academy of Sciences and Humanities Germany;

University of Passau Innstrasse 41 DE-94032 Passau Germany;

University of Wuppertal Gaussstrasse 20 DE-42119 Wuppertal Germany Heidelberg Academy of Sciences and Humanities Germany;

University of Zurich Raemistrasse 71 CH-8006 Zuerich Switzerland Heidelberg Academy of Sciences and Humanities Germany;

Original format: PDF
Language: English
Keywords
Media bias; Feature engineering; Text analysis; Context analysis; News analysis; Bias data set;


