Lifelong Learning of Structure in the Space of Policies



Abstract

We address the problem faced by an autonomous agent that must achieve quick responses to a family of qualitatively-related tasks, such as a robot interacting with different types of human participants. We work in the setting where the tasks share a state-action space and have the same qualitative objective but differ in the dynamics and reward process. We adopt a transfer approach where the agent attempts to exploit common structure in learnt policies to accelerate learning in a new one. Our technique consists of a few key steps. First, we use a probabilistic model to describe the regions in state space which successful trajectories seem to prefer. Then, we extract policy fragments from previously-learnt policies for these regions as candidates for reuse. These fragments may be treated as options with corresponding domains and termination conditions extracted by unsupervised learning. Then, the set of reusable policies is used when learning novel tasks, and the process repeats. The utility of this method is demonstrated through experiments in the simulated soccer domain, where the variability comes from the different possible behaviours of opponent teams, and the agent needs to perform well against novel opponents.
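
The fragment-extraction steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a discrete grid state space, replaces the probabilistic model with empirical visit frequencies, and replaces the unsupervised learning step with grouping of adjacent states. The names `Option`, `preferred_states`, `connected_regions`, `extract_options`, and the `min_prob` threshold are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Option:
    """A policy fragment: where it applies, what it does, and when it stops."""
    region: frozenset  # states in which the fragment may be invoked (assumed form)
    policy: dict       # state -> action, restricted to the region

    def terminates(self, state):
        # Assumed termination rule: the fragment ends when the agent leaves its region.
        return state not in self.region


def preferred_states(trajectories, min_prob):
    """Keep states whose empirical visit probability is at least min_prob."""
    counts = Counter(s for traj in trajectories for s in traj)
    total = sum(counts.values())
    return {s for s, c in counts.items() if c / total >= min_prob}


def connected_regions(states):
    """Group grid states into 4-connected regions (a simple stand-in for clustering)."""
    remaining, regions = set(states), []
    while remaining:
        frontier = [remaining.pop()]
        region = set(frontier)
        while frontier:
            x, y = frontier.pop()
            for n in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                if n in remaining:
                    remaining.remove(n)
                    region.add(n)
                    frontier.append(n)
        regions.append(frozenset(region))
    return regions


def extract_options(trajectories, policy, min_prob):
    """Restrict a previously learnt policy to each preferred region."""
    regions = connected_regions(preferred_states(trajectories, min_prob))
    return [Option(r, {s: policy[s] for s in r if s in policy}) for r in regions]


# Toy data: three successful trajectories and one learnt policy (all made up).
trajectories = [
    [(0, 0), (0, 1), (5, 5)],
    [(0, 0), (0, 1), (5, 5)],
    [(0, 0), (3, 3), (5, 5)],
]
learnt_policy = {(0, 0): "up", (0, 1): "right", (3, 3): "left", (5, 5): "shoot"}
options = extract_options(trajectories, learnt_policy, min_prob=0.2)
```

In a lifelong setting, the resulting `options` would be added to the action set available when learning the next task, and the extraction would be repeated on the newly learnt policy.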


Bibliographic details

Source: AAAI Symposium on Lifelong Machine Learning, 2013, 6 pages
Authors: Majd Hawasly; Subramanian Ramamoorthy
Original format: PDF
Chinese Library Classification: TP18-53

