Annual Conference on Neural Information Processing Systems
Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings

Abstract

The blind application of machine learning runs the risk of amplifying biases present in data. Such a danger is facing us with word embedding, a popular framework to represent text data as vectors which has been used in many machine learning and natural language processing tasks. We show that even word embeddings trained on Google News articles exhibit female/male gender stereotypes to a disturbing extent. This raises concerns because their widespread use, as we describe, often tends to amplify these biases. Geometrically, gender bias is first shown to be captured by a direction in the word embedding. Second, gender-neutral words are shown to be linearly separable from gender definition words in the word embedding. Using these properties, we provide a methodology for modifying an embedding to remove gender stereotypes, such as the association between the words receptionist and female, while maintaining desired associations such as between the words queen and female. Using crowd-worker evaluation as well as standard benchmarks, we empirically demonstrate that our algorithms significantly reduce gender bias in embeddings while preserving their useful properties, such as the ability to cluster related concepts and to solve analogy tasks. The resulting embeddings can be used in applications without amplifying gender bias.
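
The geometric idea in the abstract (a bias direction, and its removal from gender-neutral words only) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the direction is estimated from a single word pair, uses made-up 3-dimensional toy vectors, and the helper names `gender_direction` and `neutralize` are hypothetical.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    """Rescale a non-zero vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]

def gender_direction(emb, pair=("she", "he")):
    """Hypothetical helper: unit vector along the difference of one word pair.

    Assumption: a single pair stands in for the bias direction in this sketch.
    """
    a, b = pair
    return normalize([x - y for x, y in zip(emb[a], emb[b])])

def neutralize(vec, direction):
    """Remove the component of vec along direction, then rescale to unit length."""
    proj = dot(vec, direction)
    return normalize([x - proj * d for x, d in zip(vec, direction)])

# Toy 3-d vectors, invented for illustration only (not real embedding values).
emb = {
    "he": normalize([1.0, 0.2, 0.1]),
    "she": normalize([-1.0, 0.2, 0.1]),
    "queen": normalize([-0.9, 0.5, 0.3]),
    "receptionist": normalize([-0.6, 0.1, 0.8]),
}

g = gender_direction(emb)

# Assumption: only "receptionist" is treated as gender-neutral here;
# "queen" is a gender definition word and is left untouched.
neutral_words = ["receptionist"]
debiased = dict(emb)
for word in neutral_words:
    debiased[word] = neutralize(emb[word], g)

print("receptionist before:", dot(emb["receptionist"], g))
print("receptionist after: ", dot(debiased["receptionist"], g))
print("queen (unchanged):  ", dot(debiased["queen"], g))
```

In this toy setup, the projection of "receptionist" onto the direction drops to zero while "queen" keeps its association, which mirrors the distinction the abstract draws between stereotypes to remove and associations to maintain.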