STACKED CONVOLUTIONAL LONG SHORT-TERM MEMORY FOR MODEL-FREE REINFORCEMENT LEARNING

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for controlling an agent interacting with an environment. One of the methods includes obtaining a representation of an observation; processing the representation using a convolutional long short-term memory (LSTM) neural network comprising a plurality of convolutional LSTM neural network layers; processing an action selection input comprising the final LSTM hidden state output for the time step using an action selection neural network that is configured to receive the action selection input and to process the action selection input to generate an action selection output that defines an action to be performed by the agent at the time step; selecting, from the action selection output, the action to be performed by the agent at the time step in accordance with an action selection policy; and causing the agent to perform the selected action.
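
The per-time-step flow described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the patented implementation: it assumes a single-channel feature map, 3x3 kernels, a linear action selection network, and a softmax policy, none of which are specified in the abstract. The names `ConvLSTMCell`, `zero_states`, `act`, and `head` are hypothetical. The last step, causing the agent to perform the action, is left to whatever environment interface is in use.

```python
import math
import random


def conv3x3_same(x, k):
    """Zero-padded 3x3 cross-correlation over a 2D grid (list of rows)."""
    rows, cols = len(x), len(x[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < rows and 0 <= jj < cols:
                        out[i][j] += x[ii][jj] * k[di + 1][dj + 1]
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


class ConvLSTMCell:
    """Single-channel convolutional LSTM layer (hypothetical, for illustration)."""

    GATES = ("i", "f", "o", "g")  # input, forget, output, candidate

    def __init__(self, rng):
        def kernel():
            return [[rng.uniform(-0.1, 0.1) for _ in range(3)] for _ in range(3)]
        self.wx = {g: kernel() for g in self.GATES}  # kernels applied to the layer input
        self.wh = {g: kernel() for g in self.GATES}  # kernels applied to the previous hidden state

    def step(self, x, h, c):
        rows, cols = len(x), len(x[0])
        ax = {g: conv3x3_same(x, self.wx[g]) for g in self.GATES}
        ah = {g: conv3x3_same(h, self.wh[g]) for g in self.GATES}
        new_h = [[0.0] * cols for _ in range(rows)]
        new_c = [[0.0] * cols for _ in range(rows)]
        for r in range(rows):
            for q in range(cols):
                pre = {g: ax[g][r][q] + ah[g][r][q] for g in self.GATES}
                i, f, o = sigmoid(pre["i"]), sigmoid(pre["f"]), sigmoid(pre["o"])
                new_c[r][q] = f * c[r][q] + i * math.tanh(pre["g"])
                new_h[r][q] = o * math.tanh(new_c[r][q])
        return new_h, new_c


def zero_states(num_layers, rows, cols):
    """Initial (hidden, cell) state pair for each layer in the stack."""
    def zeros():
        return [[0.0] * cols for _ in range(rows)]
    return [(zeros(), zeros()) for _ in range(num_layers)]


def act(layers, states, head, representation, rng=None):
    """Run one time step and pick an action.

    head: one weight row per action (assumed linear action selection network).
    rng:  if given, sample from the softmax policy; otherwise act greedily.
    """
    x, new_states = representation, []
    for cell, (h, c) in zip(layers, states):
        h, c = cell.step(x, h, c)
        new_states.append((h, c))
        x = h  # each layer's hidden state is the next layer's input
    final_hidden = [v for row in x for v in row]  # last layer's hidden state, flattened
    logits = [sum(w * v for w, v in zip(row, final_hidden)) for row in head]
    top = max(logits)
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    if rng is None:
        action = max(range(len(probs)), key=probs.__getitem__)
    else:
        action = rng.choices(range(len(probs)), weights=probs)[0]
    return action, probs, new_states


# Example: two stacked layers, a 4x4 observation representation, three actions.
rng = random.Random(0)
layers = [ConvLSTMCell(rng) for _ in range(2)]
head = [[rng.uniform(-0.1, 0.1) for _ in range(16)] for _ in range(3)]
states = zero_states(2, 4, 4)
obs = [[rng.random() for _ in range(4)] for _ in range(4)]
action, probs, states = act(layers, states, head, obs)
```

Passing the returned `states` back into the next call to `act` carries the recurrent memory from one time step to the next. Supplying an `rng` switches from greedy selection to sampling from the policy.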
