Conference: IEEE Data Driven Control and Learning Systems Conference

Maximum Entropy Inverse Reinforcement Learning Based on Behavior Cloning of Expert Examples



Abstract

This study proposes a preprocessing framework for expert examples, based on behavior cloning (BC), to address the problem that inverse reinforcement learning (IRL) becomes inaccurate when the expert examples are noisy. To remove the noise, we first use supervised learning to learn an approximate expert policy, and then use this approximate policy to clone new expert examples from the old ones; the idea behind this preprocessing framework is BC, and after preprocessing IRL obtains higher-quality expert examples. The IRL framework takes the maximum entropy form. Experiments demonstrate the effectiveness of the proposed approach: when the expert examples are noisy, the reward functions recovered after BC preprocessing are better than those recovered without preprocessing, and the advantage becomes especially pronounced as the noise level increases.
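For context on "the form of maximum entropy": the standard maximum entropy IRL model (Ziebart et al., 2008) makes trajectories exponentially more likely as their reward increases; the paper's exact variant may differ, so this is shown only as the canonical formulation:

```latex
P(\tau \mid \theta) \;=\; \frac{\exp\!\big(\theta^{\top}\mathbf{f}_{\tau}\big)}{Z(\theta)},
\qquad
Z(\theta) \;=\; \sum_{\tau'} \exp\!\big(\theta^{\top}\mathbf{f}_{\tau'}\big)
```

where $\theta$ parameterizes a linear reward over trajectory feature counts $\mathbf{f}_{\tau}$, and $\theta$ is fit by maximizing the likelihood of the (here, BC-preprocessed) expert examples.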
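The two-step preprocessing described above can be sketched as follows. This is a minimal illustration on a hypothetical synthetic task (the task, noise rate, and logistic-regression policy are all assumptions, not the paper's experiments): a noisy expert dataset is fit with a supervised policy, and the cloned examples are then produced by relabeling every state with that policy's action.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic task: the true expert picks action 1 when the
# first state feature is positive, action 0 otherwise.
states = rng.normal(size=(500, 4))
true_actions = (states[:, 0] > 0).astype(float)

# Noisy expert examples: flip roughly 20% of the action labels.
noise_mask = rng.random(500) < 0.2
noisy_actions = np.where(noise_mask, 1 - true_actions, true_actions)

# Step 1 (BC): learn an approximate expert policy by supervised learning
# (plain logistic regression trained with gradient descent).
X = np.hstack([states, np.ones((500, 1))])   # append a bias column
w = np.zeros(X.shape[1])
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-X @ w))         # predicted P(action = 1 | state)
    w -= 0.1 * X.T @ (p - noisy_actions) / len(X)

# Step 2: clone new expert examples by relabeling each state with the
# approximate policy's greedy action.
cloned_actions = (X @ w > 0).astype(float)

noisy_err = np.mean(noisy_actions != true_actions)
cloned_err = np.mean(cloned_actions != true_actions)
print(f"noisy error {noisy_err:.2f}, cloned error {cloned_err:.2f}")
```

On this toy task the cloned dataset disagrees with the true expert far less often than the noisy one does, which is the effect the preprocessing framework relies on before handing the examples to IRL.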
