SMOOTHED SARSA: REINFORCEMENT LEARNING FOR ROBOT DELIVERY TASKS

Abstract

The present invention provides a method for learning a policy used by a computing system to perform a task, such as delivery of one or more objects by the computing system. During a first time interval, the computing system determines a first state, a first action and a first reward value. As the computing system determines different states, actions and reward values during subsequent time intervals, it stores a state description identifying the current state, the current action, the current reward and a predicted action. Responsive to a variance of a stored state description falling below a threshold value, the stored state description is used to modify one or more weights in the policy associated with the first state.
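
The delayed update described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how the policy is parameterized or how the variance is computed, so the sketch assumes a linear action-value function over a feature vector, a standard SARSA temporal-difference update, and a scalar variance supplied by some external state estimator. The names `StateDescription`, `DelayedSarsa`, `store`, `flush`, `alpha`, `gamma` and `var_threshold` are all hypothetical.

```python
from dataclasses import dataclass
from typing import List

import numpy as np


@dataclass
class StateDescription:
    """One stored transition. All field names are hypothetical."""
    features: np.ndarray       # assumed feature vector for (current state, current action)
    reward: float              # current reward
    next_features: np.ndarray  # assumed feature vector for (next state, predicted action)
    variance: float            # uncertainty of the state estimate, refined externally


class DelayedSarsa:
    """SARSA with a linear Q-function whose updates wait for low variance."""

    def __init__(self, n_features: int, alpha: float = 0.1,
                 gamma: float = 0.9, var_threshold: float = 0.05):
        self.w = np.zeros(n_features)       # policy weights
        self.alpha = alpha                  # learning rate (assumed)
        self.gamma = gamma                  # discount factor (assumed)
        self.var_threshold = var_threshold  # variance threshold from the abstract
        self.pending: List[StateDescription] = []

    def q(self, features: np.ndarray) -> float:
        """Linear action-value estimate."""
        return float(self.w @ features)

    def store(self, desc: StateDescription) -> None:
        """Keep a state description until its variance is low enough."""
        self.pending.append(desc)

    def flush(self) -> int:
        """Apply updates for descriptions below the threshold; return how many."""
        still_pending = []
        applied = 0
        for d in self.pending:
            if d.variance < self.var_threshold:
                # Standard SARSA temporal-difference error.
                td_error = (d.reward + self.gamma * self.q(d.next_features)
                            - self.q(d.features))
                self.w += self.alpha * td_error * d.features
                applied += 1
            else:
                still_pending.append(d)
        self.pending = still_pending
        return applied
```

In use, a caller would `store` a description at each time interval, lower its `variance` as later observations sharpen the state estimate, and call `flush` periodically. Only descriptions whose variance has dropped below `var_threshold` change the weights, so uncertain state estimates do not influence the policy until they have been refined.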

Bibliographic details

Publication number: WO2010045272A1
Publication date: 2010-04-22
Original format: PDF
Applicant/Assignee: HONDA MOTOR CO. LTD.; GUPTA RAKESH; RAMACHANDRAN DEEPAK
Application number: WO2009US60567
Inventors: GUPTA RAKESH; RAMACHANDRAN DEEPAK
Filing date: 2009-10-13
Classification: G06F15/18
Country: WO
Database entry time: 2022-08-21 18:38:30