
REINFORCEMENT LEARNING WITH AUXILIARY TASKS


Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, are described for training a reinforcement learning system. The method includes training an action selection policy neural network and, during that training, also training one or more auxiliary control neural networks and a reward prediction neural network. Each auxiliary control neural network is configured to receive a respective intermediate output generated by the action selection policy neural network and to generate a policy output for a corresponding auxiliary control task. The reward prediction neural network is configured to receive one or more intermediate outputs generated by the action selection policy neural network and to generate a corresponding predicted reward. Training each of the auxiliary control neural networks and the reward prediction neural network comprises adjusting the values of the respective auxiliary control parameters and reward prediction parameters, as well as the action selection policy network parameters.
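The parameter sharing described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the patented implementation: the layer sizes, the names (`trunk`, `aux_head`, `reward_head`, `train_step`), the log-likelihood stand-in for the auxiliary control loss, the squared-error reward loss, and the use of finite-difference gradients instead of backpropagation are all assumptions made for brevity. The main policy loss is omitted. The sketch only shows that a gradient step on the auxiliary and reward-prediction losses adjusts both the head parameters and the shared action selection policy network parameters.

```python
import math
import random

random.seed(0)

def linear(weights, x):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def init(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

# Hypothetical sizes, chosen only for illustration.
OBS, HID, ACTIONS, AUX_ACTIONS = 4, 3, 2, 2

params = {
    "trunk": init(HID, OBS),             # shared layer of the action selection policy network
    "policy_head": init(ACTIONS, HID),   # main policy output
    "aux_head": init(AUX_ACTIONS, HID),  # auxiliary control network
    "reward_head": init(1, HID),         # reward prediction network
}

def intermediate(p, obs):
    """Intermediate output of the action selection policy network."""
    return [math.tanh(v) for v in linear(p["trunk"], obs)]

def policy(p, obs):
    """Main action distribution, computed from the same intermediate output."""
    return softmax(linear(p["policy_head"], intermediate(p, obs)))

def auxiliary_loss(p, obs, aux_action, reward):
    """Assumed stand-in losses for one auxiliary control task plus reward prediction."""
    h = intermediate(p, obs)
    aux_policy = softmax(linear(p["aux_head"], h))
    predicted_reward = linear(p["reward_head"], h)[0]
    return -math.log(aux_policy[aux_action]) + (predicted_reward - reward) ** 2

def numerical_grad(p, loss_fn, eps=1e-5):
    """Central finite differences over every parameter (slow, but dependency-free)."""
    grads = {}
    for name, mat in p.items():
        g = [[0.0] * len(row) for row in mat]
        for i, row in enumerate(mat):
            for j in range(len(row)):
                old = row[j]
                row[j] = old + eps
                up = loss_fn(p)
                row[j] = old - eps
                down = loss_fn(p)
                row[j] = old
                g[i][j] = (up - down) / (2 * eps)
        grads[name] = g
    return grads

def train_step(p, obs, aux_action, reward, lr=0.05):
    """One gradient-descent step on the auxiliary losses; updates p in place."""
    loss_fn = lambda q: auxiliary_loss(q, obs, aux_action, reward)
    before = loss_fn(p)
    grads = numerical_grad(p, loss_fn)
    for name, mat in p.items():
        for i, row in enumerate(mat):
            for j in range(len(row)):
                row[j] -= lr * grads[name][i][j]
    return before, loss_fn(p), grads

obs = [0.5, -0.2, 0.1, 0.9]
before, after, grads = train_step(params, obs, aux_action=1, reward=1.0)
```

In this sketch the gradient for `trunk` is nonzero even though only auxiliary losses are used, while the gradient for `policy_head` is zero because the auxiliary losses do not depend on it. A fuller training loop would add the main reinforcement learning loss for `policy_head` and `trunk` to the same update.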
