Annual Conference on Neural Information Processing Systems

Inverse Reinforcement Learning through Structured Classification

Abstract

This paper addresses the inverse reinforcement learning (IRL) problem, that is, inferring a reward for which a demonstrated expert behavior is optimal. We introduce a new algorithm, SCIRL, whose principle is to use the so-called feature expectation of the expert as the parameterization of the score function of a multi-class classifier. This approach produces a reward function for which the expert policy is provably near-optimal. Contrary to most existing IRL algorithms, SCIRL does not require solving the direct RL problem. Moreover, with an appropriate heuristic, it can succeed using only trajectories sampled according to the expert behavior. This is illustrated on a car driving simulator.
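The abstract only sketches the principle, so the following is a minimal, illustrative Python sketch of the SCIRL idea in a toy tabular setting: estimate the expert's feature expectations mu(s, a) by Monte Carlo from expert trajectories, then fit a multi-class classifier whose score function is theta . mu(s, a), and read off the reward r(s) = theta . phi(s). All names, shapes, the softmax loss, the fallback heuristic for unvisited pairs, and the synthetic data are assumptions made for this sketch, not the paper's exact construction.

```python
import numpy as np

# Illustrative toy problem: S states, A actions, k reward features.
rng = np.random.default_rng(0)
S, A, k, gamma = 20, 4, 5, 0.9
phi = rng.normal(size=(S, k))  # phi[s] = feature vector of state s

def mc_feature_expectations(trajectories, phi, S, A, gamma):
    """Monte-Carlo estimate of mu(s, a) = E[sum_t gamma^t phi(s_t) | s, a, expert].
    Pairs never visited by the expert keep the immediate features phi(s)
    as a crude stand-in (one possible form of the heuristic the abstract mentions)."""
    mu = np.tile(phi[:, None, :], (1, A, 1))        # default: phi(s)
    counts = np.zeros((S, A))
    sums = np.zeros((S, A, phi.shape[1]))
    for traj in trajectories:                       # traj = [(s0, a0), (s1, a1), ...]
        G = np.zeros(phi.shape[1])                  # discounted feature return
        per_step = []
        for (s, a) in reversed(traj):
            G = phi[s] + gamma * G
            per_step.append((s, a, G.copy()))
        for s, a, G in per_step:
            sums[s, a] += G
            counts[s, a] += 1
    visited = counts > 0
    mu[visited] = sums[visited] / counts[visited][:, None]
    return mu

def scirl(trajectories, mu, lr=0.1, epochs=200):
    """Classification step: scores q(s, a) = theta . mu(s, a), trained by
    softmax cross-entropy so the expert action gets the highest score.
    The learned theta parameterizes the reward r(s) = theta . phi(s)."""
    theta = np.zeros(mu.shape[2])
    data = [(s, a) for traj in trajectories for (s, a) in traj]
    for _ in range(epochs):
        grad = np.zeros_like(theta)
        for s, a in data:
            scores = mu[s] @ theta                  # (A,)
            p = np.exp(scores - scores.max())
            p /= p.sum()
            grad += mu[s].T @ p - mu[s, a]          # softmax gradient
        theta -= lr * grad / len(data)
    return theta

# Usage with purely synthetic "expert" trajectories:
trajs = [[(int(rng.integers(S)), int(rng.integers(A))) for _ in range(10)]
         for _ in range(30)]
mu = mc_feature_expectations(trajs, phi, S, A, gamma)
theta = scirl(trajs, mu)
reward = phi @ theta                                # r(s) = theta . phi(s)
print(reward.round(2))
```

Note that nothing above solves the direct RL problem: the only inputs are the expert trajectories themselves, which is the point the abstract makes about SCIRL.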

