IMITATION LEARNING USING A GENERATIVE PREDECESSOR NEURAL NETWORK

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection policy neural network. In one aspect, a method comprises: obtaining an expert observation; processing the expert observation using a generative neural network system to generate a given observation-given action pair, wherein the generative neural network system has been trained to be more likely to generate a particular observation-particular action pair if performing the particular action in response to the particular observation is more likely to result in the environment later reaching the state characterized by a target observation; processing the given observation using the action selection policy neural network to generate a given action score for the given action; and adjusting the current values of the action selection policy neural network parameters to increase the given action score for the given action.
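
The training step described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the policy is a linear softmax over a small discrete action set, the "given action score" is taken to be the softmax probability of the given action, and `sample_predecessor` is a hypothetical stand-in for the trained generative neural network system (it only perturbs the expert observation and picks a random action). The update shown is one gradient-ascent step on the log-probability of the given action, which is one plausible way to "increase the given action score".

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    shifted = logits - logits.max()
    exp = np.exp(shifted)
    return exp / exp.sum()


def sample_predecessor(expert_obs, rng, num_actions):
    """Hypothetical stand-in for the trained generative neural network system.

    A real system would be trained so that likely predecessors of the target
    observation are sampled more often. Here we only perturb the expert
    observation and draw a random action so the sketch stays self-contained.
    """
    given_obs = expert_obs + 0.1 * rng.standard_normal(expert_obs.shape)
    given_action = int(rng.integers(num_actions))
    return given_obs, given_action


def action_score(weights, obs, action):
    """Score (assumed here: softmax probability) the policy gives to `action`."""
    return softmax(weights @ obs)[action]


def policy_update(weights, obs, action, learning_rate=0.5):
    """One gradient-ascent step on log pi(action | obs) for a linear policy."""
    probs = softmax(weights @ obs)
    one_hot = np.zeros_like(probs)
    one_hot[action] = 1.0
    # Gradient of the log-softmax with respect to the weight matrix.
    grad = np.outer(one_hot - probs, obs)
    return weights + learning_rate * grad


rng = np.random.default_rng(0)
obs_dim, num_actions = 4, 3
weights = np.zeros((num_actions, obs_dim))  # policy parameters (illustrative)

# 1. Obtain an expert observation (made-up values).
expert_obs = np.array([1.0, 0.5, -0.5, 2.0])

# 2. Generate a given observation / given action pair.
given_obs, given_action = sample_predecessor(expert_obs, rng, num_actions)

# 3. Score the given action under the current policy.
score_before = action_score(weights, given_obs, given_action)

# 4. Adjust the policy parameters to increase that score.
weights = policy_update(weights, given_obs, given_action)
score_after = action_score(weights, given_obs, given_action)
```

In this sketch the policy starts uniform, so `score_before` equals 1/3 and the single update raises the score of the sampled action. A real implementation would replace both the linear policy and the stand-in sampler with trained neural networks.
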