IMITATION LEARNING USING A GENERATIVE PREDECESSOR NEURAL NETWORK
Abstract
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection policy neural network. In one aspect, a method comprises: obtaining an expert observation; processing the expert observation using a generative neural network system to generate a given observation-given action pair, wherein the generative neural network system has been trained to be more likely to generate a particular observation-particular action pair if performing the particular action in response to the particular observation is more likely to result in the environment later reaching the state characterized by a target observation; processing the given observation using the action selection policy neural network to generate a given action score for the given action; and adjusting the current values of the action selection policy neural network parameters to increase the given action score for the given action.
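The training step described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the patented implementation: the policy is a small linear softmax model standing in for the action selection policy neural network, `sample_predecessor` is a hypothetical toy stand-in for the trained generative neural network system, and the names, dimensions, and learning rate are invented. The update is plain gradient ascent on the log of the given action score.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def action_scores(weights, observation):
    """Toy policy: one weight row per action, scores are softmax probabilities."""
    logits = [sum(w * x for w, x in zip(row, observation)) for row in weights]
    return softmax(logits)


def sample_predecessor(expert_observation, rng):
    """Hypothetical stand-in for the trained generative neural network system.

    A real system would be trained to favor observation-action pairs that
    tend to lead to the target observation. This toy rule only returns a
    noisy copy of the expert observation and the index of its largest entry.
    """
    observation = [x + rng.gauss(0.0, 0.01) for x in expert_observation]
    action = max(range(len(expert_observation)), key=lambda i: expert_observation[i])
    return observation, action


def policy_update(weights, observation, action, learning_rate=0.5):
    """One gradient-ascent step on log(score of `action` given `observation`)."""
    probs = action_scores(weights, observation)
    for a, row in enumerate(weights):
        indicator = 1.0 if a == action else 0.0
        for j, x in enumerate(observation):
            # Gradient of log-softmax with respect to weights[a][j].
            row[j] += learning_rate * (indicator - probs[a]) * x
    return weights


rng = random.Random(0)
weights = [[0.0, 0.0], [0.0, 0.0]]  # two actions, two observation features
expert_observation = [1.0, 0.2]

# 1. Obtain an expert observation and generate a given observation-action pair.
given_observation, given_action = sample_predecessor(expert_observation, rng)

# 2. Score the given action with the current policy.
score_before = action_scores(weights, given_observation)[given_action]

# 3. Adjust the policy parameters to increase that score.
policy_update(weights, given_observation, given_action)
score_after = action_scores(weights, given_observation)[given_action]
```

After the update, `score_after` exceeds `score_before` for the sampled pair, which is the effect the abstract's final step describes. In practice the update would be applied over many generated pairs and with a neural network policy.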