Abstraction from demonstration for efficient reinforcement learning in high-dimensional domains

Luis C. Cobo; Kaushik Subramanian; Charles L. Isbell Jr.; Aaron D. Lanterman; Andrea L. Thomaz

Abstract

Reinforcement learning (RL) and learning from demonstration (LfD) are two popular families of algorithms for learning policies for sequential decision problems, but they are often ineffective in high-dimensional domains unless provided with either a great deal of problem-specific domain information or a carefully crafted representation of the state and dynamics of the world. We introduce new approaches inspired by these two techniques, which we broadly call abstraction from demonstration. Our first algorithm, state abstraction from demonstration (AfD), uses a small set of human demonstrations of the task the agent must learn to determine a state-space abstraction. Our second algorithm, abstraction and decomposition from demonstration (ADA), is additionally able to determine a task decomposition from the demonstrations. These abstractions allow RL to scale up to higher-complexity domains, and offer much better performance than LfD with orders of magnitude fewer demonstrations. Using a set of videogame-like domains, we demonstrate that using abstraction from demonstration can obtain up to exponential speed-ups in table-based representations, and polynomial speed-ups when compared with function approximation-based RL algorithms such as fitted Q-learning and LSPI.
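
The abstract does not say how AfD derives its state-space abstraction from the demonstrations, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' algorithm. It assumes discrete state features, ranks each feature by its empirical mutual information with the demonstrated action, and keeps the features above a threshold. The names `mutual_information`, `select_features`, `abstract_state`, and the `threshold` value are hypothetical and chosen for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def select_features(demos, threshold=0.1):
    """Return indices of features that are informative about the demonstrated action.

    demos: list of (state, action) pairs, where state is a tuple of discrete
    feature values. The threshold is an illustrative assumption.
    """
    states, actions = zip(*demos)
    return [
        i
        for i in range(len(states[0]))
        if mutual_information([s[i] for s in states], actions) > threshold
    ]


def abstract_state(state, kept):
    """Project a full state onto the kept feature indices."""
    return tuple(state[i] for i in kept)


# Toy demonstrations: feature 0 determines the action, feature 1 is irrelevant.
demos = [
    ((0, 0), "left"), ((0, 1), "left"), ((1, 0), "right"), ((1, 1), "right"),
    ((0, 1), "left"), ((1, 0), "right"), ((0, 0), "left"), ((1, 1), "right"),
]
kept = select_features(demos)
print(kept, abstract_state((1, 0), kept))
```

A tabular learner such as Q-learning would then key its table on `abstract_state(state, kept)` instead of the full state. With binary features, each dropped feature halves the number of table entries, which is one way to read the abstract's claim of up to exponential speed-ups for table-based representations.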

Bibliographic record

Source: Artificial Intelligence, 2014, No. 11, pp. 103-128 (26 pages)
Authors: Luis C. Cobo; Kaushik Subramanian; Charles L. Isbell Jr.; Aaron D. Lanterman; Andrea L. Thomaz
Author affiliations:

School of Electrical and Computer Engineering, Georgia Tech, Atlanta, GA, 30332, USA;

College of Computing, Georgia Tech, Atlanta, GA, 30332, USA;

College of Computing, Georgia Tech, Atlanta, GA, 30332, USA;

School of Electrical and Computer Engineering, Georgia Tech, Atlanta, GA, 30332, USA;

College of Computing, Georgia Tech, Atlanta, GA, 30332, USA;

Original format: PDF
Language: English
Keywords: Reinforcement learning; Learning from demonstration; Dimensionality reduction; Function approximation

Similar literature

1. Efficient Insertion Control for Precision Assembly Based on Demonstration Learning and Reinforcement Learning [J]. Ma Yanqin, Xu De, Qin Fangbo. IEEE Transactions on Industrial Informatics, 2021, No. 7.
2. Accelerated deep reinforcement learning with efficient demonstration utilization techniques [J]. Yeo Sangho, Oh Sangyoon, Lee Minsu. World Wide Web, 2021, No. 4.
3. Efficient hindsight reinforcement learning using demonstrations for robotic tasks with sparse rewards [J]. Guoyu Zuo, Qishen Zhao, Jiahao Lu. International Journal of Advanced Robotic Systems, 2020, No. 1.
4. Neural Discrete Abstraction of High-Dimensional Spaces: A Case Study In Reinforcement Learning [C]. Petros Giannakopoulos, Aggelos Pikrakis, Yannis Cotronis. European Signal Processing Conference, 2020.
5. Learning Hierarchical Abstractions from Human Demonstrations for Application-Scale Domains [D]. Leece, Michael. 2019.
6. Scaled free-energy based reinforcement learning for robust and efficient learning in high-dimensional state spaces [O]. Stefan Elfwing, Eiji Uchibe, Kenji Doya. 2013.
7. Scaled Free-Energy Based Reinforcement Learning for Robust and Efficient Learning in High-Dimensional State Spaces [O]. Stefan Elfwing, Eiji Uchibe, Kenji Doya. 2013.
