Journal: Frontiers in Robotics and AI

A User Study on Robot Skill Learning Without a Cost Function: Optimization of Dynamic Movement Primitives via Naive User Feedback



Abstract

Enabling users to teach their robots new tasks at home is a major challenge for research in personal robotics. This work presents a user study in which participants were asked to teach the robot Pepper a game of skill. The robot was equipped with a state-of-the-art skill learning method based on dynamic movement primitives (DMPs). The only feedback participants could give was a discrete rating after each of Pepper's movement executions ("very good", "good", "average", "not so good", "not good at all"). We compare the robot's learning performance under user-provided feedback with a version of the learning that uses an objectively determined cost, obtained from a hand-coded cost function and an external tracking system. Our findings suggest that a) an intuitive graphical user interface for providing discrete feedback can be used for robot learning of complex movement skills with DMP-based optimization, making the tedious definition of a cost function unnecessary; and b) inexperienced users with no knowledge of the learning algorithm naturally tend to apply a working rating strategy, leading to learning performance similar to that obtained with the objectively determined cost. We discuss the difficulties of learning from user-provided feedback and suggest how learning continuous movement skills from non-expert humans could be improved.
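
To make the idea of learning from discrete ratings concrete, here is a minimal Python sketch of how the five rating labels could stand in for a cost function. It is an illustration under stated assumptions, not the method used in the study: the numeric cost assigned to each label, the function names, and the reward-weighted averaging update (a simple stand-in for a DMP parameter optimizer) are all hypothetical, since the abstract does not specify them.

```python
import math
import random

# Assumed mapping from the five rating labels quoted in the abstract to
# numeric costs (lower is better). The numeric values are hypothetical.
RATING_TO_COST = {
    "very good": 0.0,
    "good": 0.25,
    "average": 0.5,
    "not so good": 0.75,
    "not good at all": 1.0,
}


def rating_to_cost(label):
    """Convert a rating label into a cost; raises KeyError for unknown labels."""
    return RATING_TO_COST[label.strip().lower()]


def sample_around(mean, sigma, n, rng):
    """Draw n perturbed copies of a parameter vector (e.g. DMP weights)."""
    return [[m + rng.gauss(0.0, sigma) for m in mean] for _ in range(n)]


def reward_weighted_update(samples, costs, h=10.0):
    """Return the cost-weighted mean of the sampled parameter vectors.

    Costs are normalized to [0, 1] within the batch and turned into
    weights exp(-h * normalized_cost), so low-cost (well rated) samples
    dominate. This update rule is an assumption made for illustration.
    """
    lo, hi = min(costs), max(costs)
    span = hi - lo
    if span == 0:
        weights = [1.0] * len(costs)  # all rated equally: plain average
    else:
        weights = [math.exp(-h * (c - lo) / span) for c in costs]
    total = sum(weights)
    dim = len(samples[0])
    return [
        sum(w * s[i] for w, s in zip(weights, samples)) / total
        for i in range(dim)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    mean = [0.0, 0.0]  # hypothetical two-parameter movement
    samples = sample_around(mean, sigma=0.5, n=5, rng=rng)
    # In a real session these labels would come from the user interface.
    labels = ["good", "average", "very good", "not so good", "not good at all"]
    costs = [rating_to_cost(label) for label in labels]
    print(reward_weighted_update(samples, costs))
```

In this sketch the user only ever supplies a label per execution; everything else is bookkeeping. With only five distinct cost levels, several samples in a batch can receive the same weight, which is one way discrete feedback gives the optimizer less information than a continuous cost.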
