Robotics and Autonomous Systems

Deep reinforcement learning with smooth policy update: Application to robotic cloth manipulation

Abstract

Deep Reinforcement Learning (DRL), which can learn complex policies from high-dimensional observations such as images, has been successfully applied to a variety of tasks. It is therefore a natural candidate for enabling robots to learn and perform daily activities like washing and folding clothes, cooking, and cleaning, since such tasks are difficult for non-DRL methods, which typically require either (1) direct access to state variables or (2) well-designed hand-engineered features extracted from sensory inputs. However, applying DRL to real robots remains very challenging because conventional DRL algorithms require a huge number of training samples, which are difficult to collect on physical hardware. To alleviate this dilemma, in this paper we propose two sample-efficient DRL algorithms: Deep P-Network (DPN) and Dueling Deep P-Network (DDPN). The core idea is to combine the nature of smooth policy updates with the automatic feature extraction capability of deep neural networks, improving sample efficiency and learning stability when few samples are available. The proposed methods were first compared against previous DRL methods on a simulated robot-arm reaching task, and then applied to two real robotic cloth manipulation tasks with a limited number of samples: (1) flipping a handkerchief and (2) folding a t-shirt. All results suggest that our methods outperformed the previous DRL methods. (C) 2018 The Authors. Published by Elsevier B.V.
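The "smooth policy update" the abstract refers to keeps each new policy close to the previous one instead of jumping straight to the greedy policy, which is what stabilizes learning from few samples. As a rough illustration only (not the authors' exact DPN/DDPN update rule), the following is a minimal tabular sketch of a KL-regularized update of this kind; the `smooth_policy_update` helper and the `eta` temperature are illustrative assumptions:

```python
import numpy as np

def smooth_policy_update(pi_old, q_values, eta=1.0):
    """KL-regularized (smooth) policy update: the new policy reweights the
    old one by exponentiated action values, so it changes gradually.

    pi_old:   (num_states, num_actions) current stochastic policy
    q_values: (num_states, num_actions) estimated action values
    eta:      inverse temperature; smaller eta -> more conservative update
    """
    logits = np.log(pi_old + 1e-12) + eta * q_values
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    pi_new = np.exp(logits)
    return pi_new / pi_new.sum(axis=1, keepdims=True)

# Toy usage: 3 states, 2 actions; the policy drifts toward the greedy
# actions over several updates rather than switching abruptly.
pi = np.full((3, 2), 0.5)
q = np.array([[1.0, 0.0], [0.2, 0.8], [0.0, 0.0]])
for _ in range(5):
    pi = smooth_policy_update(pi, q, eta=0.5)
print(pi)
```

With small `eta`, each update is a small step away from the previous policy, which is the stabilizing property the paper exploits when combining such updates with deep-network feature extraction.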
