Automatic Task Decomposition and State Abstraction from Demonstration

Abstract

Both Learning from Demonstration (LfD) and Reinforcement Learning (RL) are popular approaches for building decision-making agents. LfD applies supervised learning to a set of human demonstrations to infer and imitate the human policy, while RL uses only a reward signal and exploration to find an optimal policy. For complex tasks, both of these techniques may be ineffective: LfD may require many more demonstrations than it is feasible to obtain, and RL can take an impractically long time to converge. We present Automatic Decomposition and Abstraction from demonstration (ADA), an algorithm that uses mutual information measures over a set of human demonstrations to decompose a sequential decision process into several subtasks and to find a state abstraction for each subtask. ADA then projects the human demonstrations into the abstracted state space to build a policy. This policy can later be improved using RL algorithms to surpass the performance of the human teacher. We find empirically that ADA can find satisfactory policies for problems that are too complex to be solved with traditional LfD and RL algorithms. In particular, we show that mutual information across state features can be used to leverage human demonstrations and reduce the effects of the curse of dimensionality by finding subtasks and abstractions in sequential decision processes.
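
The abstract does not spell out how the mutual information measures are computed, so the following is only a minimal sketch of the general idea, not the authors' ADA implementation. It assumes discrete state features and a single task (no subtask decomposition), scores each feature by its empirical mutual information with the demonstrated action, keeps features above a threshold, and builds a majority-vote policy in the reduced state space. The function names, the `threshold` parameter, and the toy demonstrations are all hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y), in bits, of two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def select_features(demos, threshold=0.1):
    """Score each state feature against the action; keep those >= threshold.

    demos: list of (state_tuple, action) pairs (hypothetical format).
    """
    states = [s for s, _ in demos]
    actions = [a for _, a in demos]
    scores = [
        mutual_information([s[i] for s in states], actions)
        for i in range(len(states[0]))
    ]
    kept = [i for i, score in enumerate(scores) if score >= threshold]
    return kept, scores


def build_policy(demos, kept):
    """Project demonstrations onto the kept features and take a majority vote."""
    votes = {}
    for state, action in demos:
        key = tuple(state[i] for i in kept)
        votes.setdefault(key, Counter())[action] += 1
    return {key: counts.most_common(1)[0][0] for key, counts in votes.items()}


# Toy demonstrations (made up): feature 0 determines the action, feature 1 is noise.
demos = [((0, 0), "L"), ((0, 1), "L"), ((1, 0), "R"), ((1, 1), "R")]
kept, scores = select_features(demos)
policy = build_policy(demos, kept)
```

In this toy data the irrelevant feature is dropped, so the policy is defined over a smaller state space than the raw demonstrations. That reduction is the effect the abstract attributes to state abstraction; the threshold rule used here is an assumption made for illustration.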

Bibliographic details

Source: International Conference on Autonomous Agents and Multiagent Systems, 2012, 8 pages
Authors: Luis C. Cobo; Charles L. Isbell; Andrea L. Thomaz
Original format: PDF
Chinese Library Classification: TP311.1-53
Keywords: Reinforcement learning; Learning from demonstration; Task decomposition; State abstraction
