Journal: Artificial Intelligence

Abstraction from demonstration for efficient reinforcement learning in high-dimensional domains


Abstract

Reinforcement learning (RL) and learning from demonstration (LfD) are two popular families of algorithms for learning policies for sequential decision problems, but they are often ineffective in high-dimensional domains unless provided with either a great deal of problem-specific domain information or a carefully crafted representation of the state and dynamics of the world. We introduce new approaches inspired by these two techniques, which we broadly call abstraction from demonstration. Our first algorithm, state abstraction from demonstration (AfD), uses a small set of human demonstrations of the task the agent must learn to determine a state-space abstraction. Our second algorithm, abstraction and decomposition from demonstration (ADA), is additionally able to determine a task decomposition from the demonstrations. These abstractions allow RL to scale up to higher-complexity domains, and offer much better performance than LfD with orders of magnitude fewer demonstrations. Using a set of videogame-like domains, we demonstrate that using abstraction from demonstration can obtain up to exponential speed-ups in table-based representations, and polynomial speed-ups when compared with function approximation-based RL algorithms such as fitted Q-learning and LSPI.
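
The abstract does not say how AfD derives its state-space abstraction from the demonstrations. As a rough illustration only, the sketch below assumes one plausible approach: score each discrete state feature by its empirical mutual information with the demonstrated action, keep the top `k` features, and project states onto them before running table-based RL. The function names, the scoring rule, and the toy demonstrations are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def select_features(demos, k):
    """Return the indices of the k features most informative about the action.

    demos: list of (state, action) pairs; each state is a tuple of
    discrete feature values. This scoring rule is an assumption made
    for illustration, not the method described in the paper.
    """
    states, actions = zip(*demos)
    n_features = len(states[0])
    scores = [
        mutual_information([s[i] for s in states], actions)
        for i in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def abstract_state(state, kept):
    """Project a full state onto the kept feature indices."""
    return tuple(state[i] for i in kept)


# Hypothetical demonstrations: feature 0 determines the action,
# features 1 and 2 are distractors the demonstrator ignores.
demos = [
    ((0, 0, 0), "left"),
    ((0, 1, 0), "left"),
    ((0, 0, 1), "left"),
    ((0, 1, 1), "left"),
    ((1, 0, 0), "right"),
    ((1, 1, 0), "right"),
    ((1, 0, 1), "right"),
    ((1, 1, 1), "right"),
]

kept = select_features(demos, k=1)
print(kept)                              # indices of the retained features
print(abstract_state((1, 0, 1), kept))   # state as seen by the tabular learner
```

With `n` binary features a Q-table needs one row per each of the `2**n` states, while keeping only `k` features reduces that to `2**k` rows. This is one way to read the claim of up to exponential speed-ups for table-based representations.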
