Journal of Intelligent & Robotic Systems: Theory & Application

Generalization of Force Control Policies from Demonstrations for Constrained Robotic Motion Tasks



Abstract

Although learning control policies from demonstrations has been thoroughly investigated in the literature, generalizing policies to new contexts remains a challenge, since existing approaches exhibit limited performance when generalizing to new tasks. In this article, we propose two policy generalization approaches for generalizing motion-based force control policies, with a view to performing constrained motions in the presence of motion-dependent external forces. The key concept of the proposed methods is to use not only policy values but also policy derivatives or differences, which express how the policy varies with respect to variations in its input, and to combine these two kinds of information to estimate the policy at new inputs. The first approach learns policy values and policy derivatives by linear regression and combines them into a first-order Taylor-like polynomial to estimate the policy at new inputs. The second approach learns policy values and policy differences by locally weighted regression and combines them in a superposition fashion to estimate the policy at new inputs. The policy differences in this approach represent variations of the policy in the direction that minimizes the distance between the new incoming input and the average demonstrated input. The proposed approaches are evaluated on real-world constrained robotic motion tasks using a linearly actuated, two-degrees-of-freedom haptic device.
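
To make the first approach concrete, the sketch below (in Python with NumPy) fits the policy value and gradient at a reference input by ordinary least squares and evaluates the resulting first-order Taylor-like polynomial at a new input. This is a minimal reading of the abstract under assumed simplifications: a scalar policy over a vector-valued task input, a single reference point at the mean demonstrated input, and illustrative names throughout (none of this is the authors' notation).

```python
import numpy as np

def fit_taylor_policy(X, y):
    """Fit the policy value and its derivative at the mean demonstrated
    input by least squares on the local model y ~ v + g . (x - x0)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x0 = X.mean(axis=0)                            # reference input
    A = np.hstack([np.ones((len(X), 1)), X - x0])  # design matrix [1, x - x0]
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return x0, coef[0], coef[1:]                   # x0, pi(x0), d pi/dx at x0

def taylor_generalize(x_new, x0, value, grad):
    """First-order Taylor-like estimate of the policy at a new input."""
    return value + grad @ (np.asarray(x_new, dtype=float) - x0)

# Toy usage with a synthetic "policy" (for illustration only)
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(20, 2))           # demonstrated inputs
y = 3.0 * X[:, 0] - 1.5 * X[:, 1] + 0.5           # demonstrated policy values
x0, v, g = fit_taylor_policy(X, y)
print(taylor_generalize([0.8, 0.3], x0, v, g))    # estimate at a new input
```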
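The second approach can be sketched with a generic locally weighted regression: the intercept of a kernel-weighted local linear fit plays the role of the weighted policy value, and the fitted local slope plays the role of the policy-difference term applied toward the query input. The Gaussian kernel, the bandwidth `h`, and this particular superposition are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def lwr_generalize(x_new, X, y, h=0.2):
    """Locally weighted regression estimate at x_new: a kernel-weighted
    local linear fit whose intercept is the prediction at the query."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_new = np.asarray(x_new, dtype=float)
    d = X - x_new                                    # offsets from the query
    w = np.exp(-np.sum(d**2, axis=1) / (2 * h**2))   # Gaussian weights
    sw = np.sqrt(w)[:, None]
    A = np.hstack([np.ones((len(X), 1)), d])         # local model [1, x - x_new]
    beta, *_ = np.linalg.lstsq(A * sw, y * sw.ravel(), rcond=None)
    return beta[0]                                   # prediction at x_new

# Toy usage on synthetic demonstrations (for illustration only)
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(20, 2))
y = 3.0 * X[:, 0] - 1.5 * X[:, 1] + 0.5
print(lwr_generalize([0.8, 0.3], X, y))
```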


