Journal of Zhejiang University Science

Convergence analysis of an incremental approach to online inverse reinforcement learning

Abstract

Interest in inverse reinforcement learning (IRL), the problem of recovering the reward function underlying a Markov decision process (MDP) given the dynamics of the system and the behavior of an expert, has recently increased. This paper deals with an incremental approach to online IRL. First, the convergence of the incremental method for the IRL problem was investigated, and bounds on both the number of mistakes made during learning and the regret were established with a detailed proof. An online algorithm based on incremental error correction was then derived for the IRL problem. The key idea is to add an increment to the current reward estimate each time an action mismatch occurs, so that the estimate approaches a target optimal value. The proposed method was tested in a driving simulation experiment and was found to recover an adequate reward function efficiently.
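The increment-on-mismatch rule described in the abstract admits a natural perceptron-style reading under a linear reward model r(s, a) = w · φ(s, a): whenever the action that is greedy under the current reward estimate disagrees with the expert's action, the weight vector is nudged toward the expert's feature direction. The sketch below illustrates that reading only; it is not the paper's algorithm, and the names phi, expert_policy, eta, and incremental_irl are illustrative assumptions.

```python
import numpy as np

def greedy_action(w, phi, state, actions):
    """Action maximizing the current reward estimate w . phi(s, a)."""
    return max(actions, key=lambda a: w @ phi(state, a))

def incremental_irl(states, actions, phi, expert_policy, eta=0.1, dim=4):
    """Perceptron-style sketch of incremental error-correcting IRL.

    On each observed state, compare the greedy action under the current
    reward estimate with the expert's action; on a mismatch, add an
    increment in the direction of the expert's features.
    """
    w = np.zeros(dim)        # current reward-weight estimate
    mistakes = 0             # mismatch counter (bounded in the paper's analysis)
    for s in states:
        a_hat = greedy_action(w, phi, s, actions)
        a_star = expert_policy(s)
        if a_hat != a_star:  # action mismatch: apply an increment
            w += eta * (phi(s, a_star) - phi(s, a_hat))
            mistakes += 1
    return w, mistakes

# Toy usage: 20 states, 3 actions, random 4-dim features,
# and a hypothetical expert that always picks action 0.
rng = np.random.default_rng(0)
feats = {(s, a): rng.normal(size=4) for s in range(20) for a in range(3)}
w, m = incremental_irl(range(20), range(3),
                       lambda s, a: feats[(s, a)], lambda s: 0)
print(w, m)
```

The mistake counter in the sketch corresponds to the quantity whose bound the paper proves; under the linear-reward assumption, each increment moves the estimate toward a reward under which the expert's actions are greedy.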