JMLR: Workshop and Conference Proceedings

Competitive Multi-agent Inverse Reinforcement Learning with Sub-optimal Demonstrations


Abstract

This paper considers the problem of inverse reinforcement learning in zero-sum stochastic games when expert demonstrations are known to be suboptimal. Compared to previous works that decouple agents in the game by assuming optimality in expert policies, we introduce a new objective function that directly pits experts against Nash Equilibrium policies, and we design an algorithm to solve for the reward function in the context of inverse reinforcement learning with deep neural networks as model approximations. To find Nash Equilibrium in large-scale games, we also propose an adversarial training algorithm for zero-sum stochastic games, and show the theoretical appeal of non-existence of local optima in its objective function. In numerical experiments, we demonstrate that our Nash Equilibrium and inverse reinforcement learning algorithms address games that are not amenable to existing benchmark algorithms. Moreover, our algorithm successfully recovers reward and policy functions regardless of the quality of the sub-optimal expert demonstration set.
