Deep reinforcement learning system for robotic manipulation

Abstract

A non-transitory computer-readable storage medium having instructions stored thereon, the instructions, when executed by one or more processors, causing the one or more processors to perform steps comprising: during performance of a plurality of episodes by each of a plurality of robots, each of the episodes being an exploration of performing a task based on a neural policy network that represents a reinforcement learning policy for the task: storing, in a buffer, instances of robot experience data generated by the robots during the episodes, each of the instances of the robot experience data being generated during a corresponding one of the episodes and being generated, at least in part, on the basis of corresponding output produced using the neural policy network with corresponding policy parameters for the neural policy network; iteratively generating updated policy parameters of the neural policy network, each of the iterations comprising generating the updated policy parameters using a group of one or more of the instances of the robot experience data that are in the buffer during the iteration; and, by each of the robots in conjunction with a start of each of a plurality of the episodes performed by the robot, updating the neural policy network to be used by the robot in the episode, wherein updating the neural policy network comprises using the updated policy parameters of a most recent iteration of the iteratively generating of the updated policy parameters.
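The three claimed steps (robots writing experience to a shared buffer, a trainer iteratively producing updated policy parameters from that buffer, and each robot adopting the most recent parameters at the start of an episode) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the names `ParameterServer`, `run_episode`, `train_iteration` and `run_collective`, the two-parameter linear policy, the toy dynamics and reward, and the update rule, which uses a known one-step reward in place of whatever learned critic or reinforcement learning method an actual system would use. Robots and trainer are interleaved sequentially here, whereas a real deployment would presumably run them concurrently.

```python
import random
from collections import deque
from dataclasses import dataclass


@dataclass
class ParameterServer:
    """Hypothetical holder of the latest policy parameters."""
    params: list
    version: int = 0  # number of training iterations published so far

    def publish(self, new_params):
        self.params = list(new_params)
        self.version += 1

    def latest(self):
        return list(self.params), self.version


def policy(params, state):
    """Toy linear policy (assumption): action = w * state + b."""
    w, b = params
    return w * state + b


def run_episode(robot_id, server, buffer, steps=5, rng=random):
    """One episode: sync parameters at the start, then log experience."""
    params, version = server.latest()  # most recent parameters at episode start
    state = rng.uniform(-1.0, 1.0)
    for _ in range(steps):
        action = policy(params, state) + rng.gauss(0.0, 0.1)  # exploration noise
        reward = -(action + state) ** 2  # toy task: best action is -state
        next_state = state + action      # toy dynamics
        buffer.append((robot_id, version, state, action, reward, next_state))
        state = next_state
    return version


def train_iteration(server, buffer, batch_size=16, lr=0.05, rng=random):
    """One update from a batch of buffered experience (stand-in update rule)."""
    w, b = server.latest()[0]
    batch = rng.sample(list(buffer), min(batch_size, len(buffer)))
    grad_w = grad_b = 0.0
    for _robot, _ver, state, _action, _reward, _next in batch:
        err = policy([w, b], state) + state  # current output minus best action
        grad_w += 2.0 * err * state / len(batch)
        grad_b += 2.0 * err / len(batch)
    server.publish([w - lr * grad_w, b - lr * grad_b])


def run_collective(num_robots=3, episodes_per_robot=4, seed=0):
    """Interleave episodes from several robots with training iterations."""
    rng = random.Random(seed)
    server = ParameterServer(params=[0.0, 0.0])
    buffer = deque(maxlen=1000)  # shared experience buffer
    versions_used = []
    for _ in range(episodes_per_robot):
        for robot_id in range(num_robots):
            versions_used.append(run_episode(robot_id, server, buffer, rng=rng))
            train_iteration(server, buffer, rng=rng)
    return server, buffer, versions_used
```

Calling `run_collective()` returns the parameter server, the shared buffer, and the parameter version each episode started with, which makes it easy to check that every episode began from the output of the latest training iteration. Tagging each buffered instance with the parameter version is a design choice of this sketch; the abstract only requires that each instance be generated with the parameters in effect for its episode.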

Bibliographic data

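Publication number: DE202017105598U1
Patent type:
Publication date: 2018-05-24
Original format: PDF
Applicant/assignee: GOOGLE LLC (N.D.GES.D. STAATES DELAWARE)
Application number: DE201720105598U
Inventors:
Filing date: 2017-09-15
Classification: B25J9/22
Country: DE
Added to database: 2022-08-21 12:33:37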