Journal: Autonomous Robots

Efficient policy search in low-dimensional embedding spaces by generalizing motion primitives with a parameterized skill memory


Abstract

Motion primitives are an established paradigm to generate complex motions from simpler building blocks. A much less addressed issue is at which level to encode and how to organize a library of motion primitives. Typically, the intrinsic variability of a skill is significantly lower-dimensional than the parameter space of motion primitive models. This paper therefore proposes a parameterized skill memory in a first step, which organizes a set of motion primitives in a low-dimensional, topology-preserving embedding space. The skill memory acts as a pivotal mechanism that links low-dimensional skill parametrization to motion primitive parameters and complete motion trajectories. The skill memory is implemented by means of a dynamical system which features continuous generalization of motion shapes and the multi-directional retrieval of motion primitive parameters from low-dimensional skill parametrizations. The skill parametrization can be predefined or automatically discovered, e.g. by unsupervised dimension reduction techniques. The paper shows that parameterized skill memories achieve excellent generalization of motion shapes from few training examples in several scenarios, including the bi-manual manipulation of a rod with the humanoid robot iCub. In a second step, the low-dimensional and topological skill parametrization is leveraged for efficient, gradient-based policy search. Policy search by generalizing motion shapes from low-dimensional parametrizations is compared to conventional policy search in the parameter space of a motion primitive model. It turns out that the reduced search space accessible through the skill memory significantly accelerates the policy improvement.
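The search-space argument in the abstract can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and not the paper's method: the skill memory is replaced by a hypothetical linear decoder (`decode`) from a 2-dimensional skill parameter to 20 motion primitive parameters, the task cost is a simple squared distance to a target parameter vector, and the gradient is estimated by central finite differences, so each cost evaluation stands in for one rollout. The paper's skill memory is a dynamical system, which this sketch does not reproduce.

```python
import math

N_PARAMS = 20  # size of the (hypothetical) motion primitive parameter vector
N_SKILL = 2    # size of the low-dimensional skill parametrization

# Hypothetical linear stand-in for the skill memory: theta = W z + B.
# The two columns of W are orthonormal, so both searches are equally conditioned.
SCALE = math.sqrt(2.0 / N_PARAMS)
W = [[SCALE * math.cos(2 * math.pi * i / N_PARAMS),
      SCALE * math.sin(2 * math.pi * i / N_PARAMS)]
     for i in range(N_PARAMS)]
B = [0.1 * i for i in range(N_PARAMS)]


def decode(z):
    """Map a skill parameter z to full motion primitive parameters."""
    return [sum(W[i][j] * z[j] for j in range(N_SKILL)) + B[i]
            for i in range(N_PARAMS)]


# Assumed task: reach a target that the decoder can represent exactly.
TARGET = decode([1.0, -0.5])


def rollout_cost(theta):
    """Toy cost of one 'rollout': squared distance to the target parameters."""
    return sum((t - g) ** 2 for t, g in zip(theta, TARGET))


def fd_policy_search(cost, x0, lr=0.25, eps=1e-4, tol=1e-6, max_iters=1000):
    """Gradient descent with central finite differences; counts cost evaluations."""
    x = list(x0)
    evals = 0
    for _ in range(max_iters):
        evals += 1
        if cost(x) < tol:
            break
        grad = []
        for k in range(len(x)):
            hi = x[:k] + [x[k] + eps] + x[k + 1:]
            lo = x[:k] + [x[k] - eps] + x[k + 1:]
            grad.append((cost(hi) - cost(lo)) / (2 * eps))
            evals += 2
        x = [xi - lr * gi for xi, gi in zip(x, grad)]
    return x, evals


def compare():
    """Run the search in skill space and in full parameter space."""
    z, evals_low = fd_policy_search(
        lambda z: rollout_cost(decode(z)), [0.0] * N_SKILL)
    theta, evals_full = fd_policy_search(
        rollout_cost, decode([0.0] * N_SKILL))
    return rollout_cost(decode(z)), evals_low, rollout_cost(theta), evals_full
```

Calling `compare()` returns the final cost and the number of cost evaluations for each search. In this toy setting both searches start from the same motion and reach the tolerance, but the finite-difference gradient needs two evaluations per search dimension, so the search over the 2-dimensional skill parameter uses far fewer evaluations than the search over all 20 primitive parameters.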
