Should I Tear down This Wall? Optimizing Social Metrics by Evaluating Novel Actions

Abstract

One of the fundamental challenges of governance is deciding when and how to intervene in multi-agent systems in order to impact group-wide metrics of success. This is particularly challenging when proposed interventions are novel and expensive. For example, one may wish to modify a building's layout to improve the efficiency of its escape route. Evaluating such interventions would generally require access to an elaborate simulator, which must be constructed ad-hoc for each environment, and can be prohibitively costly or inaccurate. Here we examine a simple alternative: Optimize By Observational Extrapolation (OBOE). The idea is to use observed behavioural trajectories, without any interventions, to learn predictive models mapping environment states to individual agent outcomes, and then use these to evaluate and select changes. We evaluate OBOE in socially complex gridworld environments and consider novel physical interventions that our models were not trained on. We show that neural network models trained to predict agent returns on baseline environments are effective at selecting among the interventions. Thus, OBOE can provide guidance for challenging questions like: "which wall should I tear down in order to minimize the Gini index of this group?"
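The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `gini`, `select_intervention`, and the `predict_returns` callable are hypothetical names introduced here, and `predict_returns` stands in for a model trained on baseline trajectories that maps an environment state to predicted per-agent returns. The sketch also assumes returns are non-negative, which the Gini index formula used below requires.

```python
from typing import Callable, Dict, Hashable, Sequence, Tuple


def gini(returns: Sequence[float]) -> float:
    """Gini index of non-negative per-agent returns (0 = perfectly equal)."""
    x = sorted(float(r) for r in returns)
    n = len(x)
    total = sum(x)
    if n == 0 or total == 0:
        # No agents, or nobody earned anything: treat as perfectly equal.
        return 0.0
    # Rank-weighted sum over the ascending-sorted values (ranks start at 1).
    weighted = sum(rank * value for rank, value in enumerate(x, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


def select_intervention(
    candidates: Dict[Hashable, object],
    predict_returns: Callable[[object], Sequence[float]],
) -> Tuple[Hashable, Dict[Hashable, float]]:
    """Score each candidate environment state and return the lowest-Gini one.

    `candidates` maps an intervention label (e.g. which wall to remove) to the
    modified environment state. `predict_returns` is a hypothetical stand-in
    for the learned model; it is only queried, never retrained.
    """
    scores = {
        name: gini(predict_returns(state)) for name, state in candidates.items()
    }
    best = min(scores, key=scores.get)
    return best, scores
```

In use, each candidate would be the baseline environment with one wall removed, and the caller would pick the label returned as `best`. Any other group-wide metric could replace `gini` without changing the selection loop.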