LEARNING OPTIONS FOR ACTION SELECTION WITH META-GRADIENTS IN MULTI-TASK REINFORCEMENT LEARNING

Abstract

A reinforcement learning system, method, and computer program code for controlling an agent to perform a plurality of tasks while interacting with an environment. The system learns options, where an option comprises a sequence of primitive actions performed by the agent under control of an option policy neural network. In implementations, the system discovers options that are useful for multiple different tasks by meta-learning the rewards used to train the option policy neural network while the agent is interacting with the environment.
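
To make the notion of an option concrete, the following is a minimal Python sketch of an option executing as a sequence of primitive actions until it terminates. Everything in it is an illustrative assumption rather than the patented method: the one-dimensional environment, the `Option` class, the `step` and `run_option` helpers, and the hand-written policies that stand in for an option policy neural network. It does not attempt to show the meta-learning of rewards.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical toy setup: states are integers on a line,
# and the primitive actions are -1 (left) or +1 (right).
State = int
Action = int


@dataclass
class Option:
    """An option: a policy over primitive actions plus a termination test."""
    name: str
    policy: Callable[[State], Action]    # stands in for an option policy network
    terminates: Callable[[State], bool]  # decides when the option ends


def step(state: State, action: Action) -> State:
    """Assumed toy dynamics: move along the line, clipped to [0, 10]."""
    return max(0, min(10, state + action))


def run_option(option: Option, state: State,
               max_steps: int = 20) -> Tuple[State, List[Action]]:
    """Execute primitive actions under the option's policy until it terminates."""
    actions: List[Action] = []
    for _ in range(max_steps):
        if option.terminates(state):
            break
        action = option.policy(state)
        state = step(state, action)
        actions.append(action)
    return state, actions


# Two hand-written options; in the system described above they would be learned.
go_right = Option("go_right", policy=lambda s: +1, terminates=lambda s: s >= 7)
go_left = Option("go_left", policy=lambda s: -1, terminates=lambda s: s <= 2)

final_state, actions = run_option(go_right, state=4)
```

In this sketch a higher-level controller would pick between `go_right` and `go_left` depending on the task, which is one way options can be reused across several tasks.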