Paladyn: Journal of Behavioral Robotics

Active Choice of Teachers, Learning Strategies and Goals for a Socially Guided Intrinsic Motivation Learner


Abstract

We present an active learning architecture that allows a robot to actively learn which data collection strategy is most efficient for acquiring motor skills to achieve multiple outcomes, and to generalise over its experience to achieve new outcomes. The robot explores its environment through both interactive learning and goal-babbling. It learns simultaneously when, whom and what to actively imitate from several available teachers, and when to forgo social guidance in favour of active goal-oriented self-exploration. This is formalised in the framework of life-long strategic learning.

The proposed architecture, called Socially Guided Intrinsic Motivation with Active Choice of Teacher and Strategy (SGIM-ACTS), relies on hierarchical active decisions about what and how to learn, driven by an empirical evaluation of learning progress for each learning strategy. We illustrate it with an experiment in which a simulated robot learns to control its arm to realise two different kinds of outcomes. At each learning episode it has to choose actively and hierarchically: 1) what to learn: which outcome is most interesting to select as a goal for goal-directed exploration; 2) how to learn: which data collection strategy to use among self-exploration, mimicry and emulation; 3) once it has decided when and what to imitate by choosing mimicry or emulation, whom to imitate from a set of different teachers. We show that SGIM-ACTS learns significantly more efficiently than when using a single learning strategy, and coherently selects the best strategy with respect to the chosen outcome, taking advantage of the available teachers (who have different levels of skill).
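The idea of choosing what, how and from whom to learn based on empirical learning progress can be illustrated with a small sketch. This is not the SGIM-ACTS implementation: the class `ProgressBasedChooser`, the helper `build_options`, the sliding-window progress measure and the epsilon-greedy rule are all assumptions made for illustration, and the hierarchy of decisions is flattened here into a single list of (outcome, strategy, teacher) options.

```python
import random
from collections import defaultdict, deque


def build_options(outcomes, teachers):
    """Enumerate (outcome, strategy, teacher) options (hypothetical helper).

    Self-exploration needs no teacher; mimicry and emulation need one.
    """
    options = []
    for outcome in outcomes:
        options.append((outcome, "self-exploration", None))
        for strategy in ("mimicry", "emulation"):
            for teacher in teachers:
                options.append((outcome, strategy, teacher))
    return options


class ProgressBasedChooser:
    """Pick the option whose recent errors have decreased the most.

    Assumption: learning progress is estimated as the drop in mean error
    between the older and newer halves of a sliding window.
    """

    def __init__(self, options, window=10, epsilon=0.2, seed=0):
        self.options = list(options)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        # One bounded error history per option.
        self.errors = defaultdict(lambda: deque(maxlen=window))

    def record(self, option, error):
        """Store the error observed after one episode with this option."""
        self.errors[option].append(error)

    def progress(self, option):
        """Return the estimated learning progress for an option."""
        hist = list(self.errors[option])
        if len(hist) < 2:
            return float("inf")  # options with too little data are tried first
        half = len(hist) // 2
        older = sum(hist[:half]) / half
        newer = sum(hist[half:]) / (len(hist) - half)
        return older - newer

    def choose(self):
        """Explore at random with probability epsilon, else maximise progress."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.options)
        return max(self.options, key=self.progress)
```

In use, each learning episode would call `choose()`, run the selected strategy (with the selected teacher, if any) on the selected outcome, and then call `record()` with the resulting error, so that options which stop yielding progress are gradually abandoned.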