Chinese Automation Congress

Effective Policy Adjustment via Meta-Learning for Complex Manipulation Tasks


Abstract

The ability to adjust a policy is key to learning decision making for agents completing complex manipulation tasks. To address this problem while balancing exploration and exploitation, we propose a novel deep reinforcement learning algorithm that combines Hindsight Experience Replay (HER) with Model-Agnostic Meta-Learning (MAML). For complex manipulation tasks in environments where rewards are sparse and binary, HER provides relatively effective exploration by converting the single-goal task into a multi-goal one, so that better policies can be searched for using not only successful transition trajectories but also failures. MAML, in turn, promotes exploitation: the proposed algorithm can learn faster and adjust the policy model from limited experience within a few iterations. Extensive simulations on complex object-manipulation tasks with a robotic arm show that HER integrated with MAML accelerates fine-tuning of the original policy-gradient reinforcement learning with a neural-network policy and also improves the success rate.
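
As context for the HER mechanism described in the abstract, the following is a minimal Python sketch of hindsight goal relabeling under a "future" strategy. The transition format, the helper names (compute_reward, her_relabel), and the k_future parameter are illustrative assumptions, not details taken from the paper.

    import random

    def compute_reward(achieved_goal, desired_goal, tol=0.05):
        # Sparse, binary reward: 0 when the achieved goal is close enough to
        # the desired goal, -1 otherwise (an assumed convention).
        dist = sum((a - d) ** 2 for a, d in zip(achieved_goal, desired_goal)) ** 0.5
        return 0.0 if dist < tol else -1.0

    def her_relabel(episode, k_future=4):
        # episode: list of dicts with keys obs, action, next_obs,
        # achieved_goal (goal reached in next_obs), desired_goal, reward.
        # Each original transition is kept, and up to k_future copies are added
        # whose desired goal is an achieved goal from a later step, so that
        # failed trajectories still yield rewarded, useful experience.
        relabeled = []
        for t, tr in enumerate(episode):
            relabeled.append(tr)
            future = episode[t + 1:]
            for _ in range(min(k_future, len(future))):
                new_goal = random.choice(future)["achieved_goal"]
                relabeled.append({
                    "obs": tr["obs"],
                    "action": tr["action"],
                    "next_obs": tr["next_obs"],
                    "achieved_goal": tr["achieved_goal"],
                    "desired_goal": new_goal,
                    "reward": compute_reward(tr["achieved_goal"], new_goal),
                })
        return relabeled

In the combined method described by the abstract, such relabeled transitions would serve as the limited experience from which a MAML-style inner loop adapts the policy in a few gradient steps; that adaptation step is not shown here.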
