The VLDB Journal

Managing bias and unfairness in data for decision support: a survey of machine learning and data engineering approaches to identify and mitigate bias and unfairness within data management and analytics systems



Abstract

The increasing use of data-driven decision support systems in industry and government is accompanied by the discovery of a plethora of bias and unfairness issues in the outputs of these systems. Multiple computer science communities, and especially machine learning, have started to tackle this problem, often developing algorithmic solutions to mitigate biases and obtain fairer outputs. However, one of the core underlying causes of unfairness is bias in the training data, which is not fully covered by such approaches. In particular, bias in data is not yet a central topic in data engineering and management research. We survey research on bias and unfairness in several computer science domains, distinguishing between data management publications and other domains. This covers the creation of fairness metrics, fairness identification and mitigation methods, software engineering approaches, and biases in crowdsourcing activities. We identify relevant research gaps and show which data management activities could be repurposed to handle biases and which ones might reinforce such biases. In the second part, we argue for a novel data-centered approach that overcomes the limitations of current algorithm-centered methods. This approach focuses on eliciting and enforcing fairness requirements and constraints on the data that systems are trained, validated, and used on. We argue for the need to extend database management systems to handle such constraints and mitigation methods. We discuss the associated future research directions regarding algorithms, formalization, modelling, users, and systems.
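To make the abstract's notion of "fairness requirements and constraints on data" concrete, here is a minimal, self-contained sketch of one widely used data-level check: comparing per-group selection rates against the four-fifths (80%) disparate-impact rule. This example is not from the survey itself; the record layout, field names (`group`, `approved`), and the 0.8 threshold are illustrative assumptions.

```python
# Illustrative sketch of a data-level fairness constraint:
# demographic parity checked via the four-fifths rule.

def selection_rates(records, group_key, outcome_key):
    """Positive-outcome rate per group in a list of dict records."""
    totals, positives = {}, {}
    for r in records:
        g = r[group_key]
        totals[g] = totals.get(g, 0) + 1
        positives[g] = positives.get(g, 0) + (1 if r[outcome_key] else 0)
    return {g: positives[g] / totals[g] for g in totals}

def satisfies_disparate_impact(records, group_key, outcome_key, threshold=0.8):
    """True if the lowest group's selection rate is at least
    `threshold` times the highest group's rate (four-fifths rule)."""
    rates = selection_rates(records, group_key, outcome_key)
    return min(rates.values()) >= threshold * max(rates.values())

# Toy dataset: group A is approved at twice the rate of group B.
data = [
    {"group": "A", "approved": True},
    {"group": "A", "approved": True},
    {"group": "A", "approved": False},
    {"group": "B", "approved": True},
    {"group": "B", "approved": False},
    {"group": "B", "approved": False},
]

print(selection_rates(data, "group", "approved"))
print(satisfies_disparate_impact(data, "group", "approved"))  # → False
```

A constraint of this kind could in principle be enforced at data-management time (e.g. validated on ingest or before training), which is the direction the survey's second part argues for.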
