It's All in the Name: Mitigating Gender Bias with Name-Based Counterfactual Data Substitution


Abstract

This paper treats gender bias latent in word embeddings. Previous mitigation attempts rely on the operationalisation of gender bias as a projection over a linear subspace. An alternative approach is Counterfactual Data Augmentation (CDA), in which a corpus is duplicated and augmented to remove bias, e.g. by swapping all inherently-gendered words in the copy. We perform an empirical comparison of these approaches on the English Gigaword and Wikipedia, and find that whilst both successfully reduce direct bias and perform well in tasks which quantify embedding quality, CDA variants outperform projection-based methods at the task of drawing non-biased gender analogies by an average of 19% across both corpora. We propose two improvements to CDA: Counterfactual Data Substitution (CDS), a variant of CDA in which potentially biased text is randomly substituted to avoid duplication, and the Names Intervention, a novel name-pairing technique that vastly increases the number of words being treated. CDA/S with the Names Intervention is the only approach which is able to mitigate indirect gender bias: following de-biasing, previously biased words are significantly less clustered according to gender (cluster purity is reduced by 49%), thus improving on the state-of-the-art for bias mitigation.
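The difference between CDA and CDS described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the word pairs, the name pair, the per-document granularity, and the swap probability `p` are all assumptions chosen for illustration, and the function names `swap_gendered`, `cda`, and `cds` are hypothetical.

```python
import random

# Hypothetical, tiny pair lists for illustration only.
GENDER_PAIRS = [("he", "she"), ("him", "her"), ("man", "woman"), ("king", "queen")]
# A name pair in the spirit of the Names Intervention (this pairing is made up).
NAME_PAIRS = [("John", "Mary")]

# Build a symmetric lookup so each word maps to its counterpart.
SWAP = {}
for a, b in GENDER_PAIRS + NAME_PAIRS:
    SWAP[a] = b
    SWAP[b] = a


def swap_gendered(doc):
    """Replace each listed token with its counterpart (whitespace tokens, case-sensitive)."""
    return " ".join(SWAP.get(tok, tok) for tok in doc.split())


def cda(corpus):
    """Augmentation: keep every document and append a swapped copy (corpus doubles)."""
    return corpus + [swap_gendered(doc) for doc in corpus]


def cds(corpus, p=0.5, seed=0):
    """Substitution: swap each document in place with probability p (corpus size unchanged)."""
    rng = random.Random(seed)
    return [swap_gendered(doc) if rng.random() < p else doc for doc in corpus]


if __name__ == "__main__":
    corpus = ["he saw the king", "Mary met him"]
    print(len(cda(corpus)), len(cds(corpus)))
```

With this sketch, `cda` returns twice as many documents as it receives, whereas `cds` returns the same number, which is the duplication the abstract says CDS avoids. A real pipeline would also need to handle ambiguous words such as "her", which a one-to-one lookup like this cannot resolve.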


Bibliographic details

Source
International joint conference on natural language processing; Conference on empirical methods in natural language processing | 2019 | pp. 5266-5274 | 9 pages
Authors
Rowan Hall Maudslay; Hila Gonen; Ryan Cotterell; Simone Teufel
Original format: PDF

Similar literature

1. Managing bias and unfairness in data for decision support: a survey of machine learning and data engineering approaches to identify and mitigate bias and unfairness within data management and analytics systems [J]. Balayn Agathe, Lofi Christoph, Houben Geert-Jan. The VLDB journal. 2021, no. 5

2. Gender-Specific Response in Pain and Function to Biologic Treatment of Knee Osteoarthritis: A Gender-Bias-Mitigated, Observational, Intention-to-Treat Study at Two Years [J]. Tiffanie-Marie Borg, Nima Heidari, Ali Noorani. Stem cells international. 2021, no. a

3. Towards Explainable Classifiers Using the Counterfactual Approach - Global Explanations for Discovering Bias in Data [J]. Agnieszka Mikołajczyk, Michał Grochowski, Arkadiusz Kwasigroch. Journal of Artificial Intelligence and Soft Computing Research. 2021, no. 1

4. It's All in the Name: Mitigating Gender Bias with Name-Based Counterfactual Data Substitution [C]. Rowan Hall Maudslay, Hila Gonen, Ryan Cotterell. International joint conference on natural language processing. 2019

5. Gender Dynamics in the Workplace: A Nuanced Look at Gender Bias and How to Mitigate It [D]. Kirk, Jessica Frances. 2019

6. Gender-Specific Response in Pain and Function to Biologic Treatment of Knee Osteoarthritis: A Gender-Bias-Mitigated Observational Intention-to-Treat Study at Two Years [O]. Tiffanie-Marie Borg, Nima Heidari, Ali Noorani. 2021

7. Counterfactual Data Augmentation for Mitigating Gender Stereotypes in Languages with Rich Morphology [O]. Ran Zmigrod, Sebastian J. Mielke, Hanna Wallach. 2019
