Neural episode control

Abstract

This disclosure relates to reinforcement learning. The method includes: maintaining respective episodic memory data for each of a plurality of actions; receiving a current observation that characterizes the current state of an environment with which an agent is interacting; processing the current observation using an embedding neural network, in accordance with the current values of the parameters of the embedding neural network, to generate a current key embedding for the current observation; for each action of the plurality of actions, determining, according to a distance measure, the p nearest-neighbor key embeddings to the current key embedding in the episodic memory data for the action, and determining a Q value for the action from the return estimates to which those p nearest-neighbor key embeddings are mapped in the episodic memory data for the action; and selecting, using the Q values for the actions, an action from the plurality of actions as the action to be performed by the agent.
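The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `embed`, `q_value`, and `select_action` are hypothetical, a single fixed tanh layer stands in for the embedding neural network, squared Euclidean distance is assumed as the distance measure, the return estimates are combined by inverse-distance weighting, and the action is chosen greedily. The abstract specifies none of these choices.

```python
import math

def embed(observation, weights):
    """Stand-in for the embedding neural network: one fixed tanh layer (assumption)."""
    return [math.tanh(sum(w * x for w, x in zip(row, observation)))
            for row in weights]

def squared_distance(a, b):
    """Distance measure assumed for this sketch: squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def q_value(key, memory, p, delta=1e-3):
    """Estimate the Q value of one action from its episodic memory.

    `memory` is a list of (key_embedding, return_estimate) pairs.
    The p stored keys nearest to `key` are looked up, and their return
    estimates are averaged with inverse-distance weights (an assumption;
    the abstract does not say how the estimates are combined).
    """
    if not memory:
        return 0.0  # no experience stored for this action yet
    nearest = sorted(memory, key=lambda pair: squared_distance(key, pair[0]))[:p]
    weights = [1.0 / (squared_distance(key, k) + delta) for k, _ in nearest]
    total = sum(weights)
    return sum(w * ret for w, (_, ret) in zip(weights, nearest)) / total

def select_action(observation, memories, weights, p=2):
    """Embed the observation, score every action, and pick the best greedily."""
    key = embed(observation, weights)
    q_values = {action: q_value(key, memory, p)
                for action, memory in memories.items()}
    best = max(q_values, key=q_values.get)
    return best, q_values

# Toy data: two actions, each with its own episodic memory.
weights = [[1.0, 0.0], [0.0, 1.0]]
memories = {
    "left": [([0.0, 0.0], 1.0), ([0.9, 0.9], 0.0)],
    "right": [([0.0, 0.0], 0.0), ([0.9, 0.9], 2.0)],
}
action, q_values = select_action([0.0, 0.0], memories, weights, p=1)
print(action, q_values)
```

In the toy data the observation embeds to a key that matches a stored key in both memories, and the memory for "left" maps that key to the higher return estimate, so "left" is selected. Keeping one memory per action means each lookup only compares the current key against experience gathered under that action.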

Bibliographic data

Publication number: JP2021064387A

Patent type:
Publication date: 2021-04-22

Original format: PDF
Applicant/patentee: ディープマインドテクノロジーズリミテッド

Application number: JP20200213556
Inventors: ベニグノ・ウリア－マルティネス; アレクサンダー・プリッツェル; チャールズ・ブランデル; アドリア・ピュイグドメネク・バディア

Filing date: 2020-12-23
Classification: G06N3/08; G06N20
Country: JP
Date added to database: 2022-08-24 18:21:11
