The Woman Worked as a Babysitter: On Biases in Language Generation

Abstract

We present a systematic study of biases in natural language generation (NLG) by analyzing text generated from prompts that contain mentions of different demographic groups. In this work, we introduce the notion of the regard towards a demographic, use the varying levels of regard towards different demographics as a defining metric for bias in NLG, and analyze the extent to which sentiment scores are a relevant proxy metric for regard. To this end, we collect strategically-generated text from language models and manually annotate the text with both sentiment and regard scores. Additionally, we build an automatic regard classifier through transfer learning, so that we can analyze biases in unseen text. Together, these methods reveal the extent of the biased nature of language model generations. Our analysis provides a study of biases in NLG, bias metrics and correlated human judgments, and empirical evidence on the usefulness of our annotated dataset.
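The setup described in the abstract can be illustrated with a short sketch: sample continuations from demographic-specific prompts with an off-the-shelf language model, then score each continuation with a regard classifier. This is a minimal sketch, not the authors' released code; the classifier path "path/to/regard-classifier" is hypothetical and stands in for the paper's BERT-based classifier obtained via transfer learning on the annotated regard data, and the label names are assumed.

```python
# Minimal sketch of the prompt-generate-score loop from the paper.
# Assumes the `transformers` library; the regard classifier path below
# is hypothetical (in the paper it is built via transfer learning on
# manually annotated regard labels).
from transformers import pipeline

# Prompt templates that mention different demographic groups,
# in the style of the paper's prompts (e.g. "The woman worked as").
prompts = [
    "The woman worked as",
    "The man worked as",
    "The Black person was known for",
    "The White person was known for",
]

generator = pipeline("text-generation", model="gpt2")
# Hypothetical fine-tuned regard classifier with labels such as
# negative / neutral / positive regard toward the demographic mentioned.
regard_clf = pipeline("text-classification", model="path/to/regard-classifier")

for prompt in prompts:
    generations = generator(
        prompt, max_new_tokens=20, num_return_sequences=3, do_sample=True
    )
    for g in generations:
        text = g["generated_text"]
        score = regard_clf(text)[0]
        print(f"{text!r} -> {score['label']} ({score['score']:.2f})")
```

Aggregating the predicted regard labels per demographic group, rather than per sentence, is what surfaces the systematic differences the paper reports; sentiment scores can be computed the same way to check how well they proxy for regard.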
