We consider the problem of learning from demonstrations to manipulate deformable objects. Recent work [1], [2], [3] has shown promising results on robotic manipulation of deformable objects through learning from demonstrations: their approach generalizes from a single demonstration to new test situations, using a nearest neighbor criterion to select which demonstration to adapt to a given test situation. Such a nearest neighbor approach, however, ignores two important aspects of the problem: the brittleness (versus robustness) of demonstrations when generalized in this way, and the extent to which a demonstration makes progress toward the goal. In this paper, we frame the selection of which demonstration to transfer as an options Markov decision process (MDP). We present max-margin Q-function estimation, an approach to learning a Q-function from expert demonstrations. Our learned policies account for the variable robustness of demonstrations and for the sequential nature of our tasks. We developed two knot-tying benchmarks to experimentally validate the effectiveness of the proposed approach. On these benchmarks, the selection strategy described in [2] achieves success rates of 70% and 54%, respectively; our approach performs significantly better, with success rates of 88% and 76%.
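To make the idea of max-margin Q-function estimation concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical linear Q-function Q(s, a) = w · φ(s, a) over a hand-made feature map, trained with a hinge loss so that the demonstration the expert selected scores higher than every other candidate by a margin, using plain subgradient descent. All names (`feature`, `train_max_margin_q`) and the feature construction are illustrative assumptions.

```python
# Hedged sketch of max-margin Q-function estimation (illustrative, not the
# paper's code). A linear Q-function is fit so the expert-chosen candidate
# beats all other candidates by a margin of 1 (structured hinge loss).
import numpy as np

def feature(state, action):
    # Hypothetical feature map: state, action, and their elementwise product.
    return np.concatenate([state, action, state * action])

def train_max_margin_q(episodes, dim, lr=0.01, reg=1e-3, epochs=200):
    """episodes: list of (state, candidate_actions, expert_index) tuples.

    Returns a weight vector w defining Q(s, a) = w . feature(s, a).
    """
    w = np.zeros(dim)
    for _ in range(epochs):
        for state, actions, expert_idx in episodes:
            scores = np.array([w @ feature(state, a) for a in actions])
            # Margin violations relative to the expert's choice.
            margins = scores - scores[expert_idx] + 1.0
            margins[expert_idx] = 0.0  # no margin against itself
            worst = int(np.argmax(margins))
            if margins[worst] > 0:
                # Hinge loss is active: step away from the violating candidate
                # and toward the expert's candidate.
                grad = feature(state, actions[worst]) - feature(state, actions[expert_idx])
                w -= lr * (grad + reg * w)
            else:
                w -= lr * reg * w  # only the L2 regularizer is active
    return w
```

In this sketch the "actions" are the candidate demonstrations available for transfer in a given state; the learned w then ranks candidates by Q-value instead of by nearest-neighbor distance.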