TRAINING AN UNSUPERVISED MEMORY-BASED PREDICTION SYSTEM TO LEARN COMPRESSED REPRESENTATIONS OF AN ENVIRONMENT

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a memory-based prediction system configured to receive an input observation characterizing a state of an environment interacted with by an agent and to process the input observation and data read from a memory to update data stored in the memory and to generate a latent representation of the state of the environment. The method comprises: for each of a plurality of time steps: processing an observation for the time step and data read from the memory to: (i) update the data stored in the memory, and (ii) generate a latent representation of the current state of the environment as of the time step; and generating a predicted return that will be received by the agent as a result of interactions with the environment after the observation for the time step is received.
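The per-time-step loop in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the patented method: the class name `MemoryPredictor`, the softmax-attention read, the circular slot write, the `tanh` latent, and the linear return head are all hypothetical choices, and the weights are random and untrained. The sketch only shows the data flow: read from memory, update memory, produce a latent representation, and produce a predicted return.

```python
import numpy as np


class MemoryPredictor:
    """Toy memory-based predictor (hypothetical; weights are random, untrained)."""

    def __init__(self, obs_dim, latent_dim, slots, seed=0):
        rng = np.random.default_rng(seed)
        # Memory: one latent vector per slot, initially empty (all zeros).
        self.memory = np.zeros((slots, latent_dim))
        # Assumed parameters: query projection, latent encoder, return head.
        self.W_q = rng.normal(scale=0.1, size=(latent_dim, obs_dim))
        self.W_z = rng.normal(scale=0.1, size=(latent_dim, obs_dim + latent_dim))
        self.v = rng.normal(scale=0.1, size=latent_dim)
        self.write_idx = 0

    def step(self, obs):
        """Process one observation; return (latent, predicted_return)."""
        # Read from memory with softmax attention (an assumed read mechanism).
        query = self.W_q @ obs
        scores = self.memory @ query
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()
        read = weights @ self.memory

        # (ii) Latent representation of the current state from obs + memory read.
        latent = np.tanh(self.W_z @ np.concatenate([obs, read]))

        # (i) Update memory: overwrite the oldest slot (an assumed write rule).
        self.memory[self.write_idx] = latent
        self.write_idx = (self.write_idx + 1) % len(self.memory)

        # Predicted return from the latent via an assumed linear head.
        predicted_return = float(self.v @ latent)
        return latent, predicted_return


if __name__ == "__main__":
    model = MemoryPredictor(obs_dim=3, latent_dim=4, slots=2)
    for t, obs in enumerate(np.ones((5, 3))):
        latent, ret = model.step(obs)
        print(t, latent.shape, ret)
```

In an actual training setup the predicted return would be compared against a target derived from the rewards the agent later receives, and the parameters updated accordingly; the abstract does not specify that objective, so it is left out here.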