OPPONENT MODELING WITH ASYNCHRONOUS METHODS IN DEEP RL

Abstract

A computer system and method are described for extending parallelized asynchronous reinforcement learning to include agent modeling when training a neural network. A plurality of hardware processors or threads operate in coordination, each functioning as a worker process configured to interact with a target computing environment simultaneously with the others, compute local gradients based on a loss determination mechanism, and update global network parameters. The loss determination mechanism includes at least a policy loss term (actor), a value loss term (critic), and a supervised cross-entropy loss. Further variations are described in which the neural network is adapted to include a latent space that tracks agent policy features.
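The three-term loss named in the abstract can be illustrated with a minimal Python sketch. This is not the patent's implementation: the function names, the loss weights `value_coef` and `opponent_coef`, the single-time-step formulation, and the assumption that the supervised cross-entropy term is computed against the observed action of the modeled agent are all illustrative choices that do not come from the record.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combined_loss(policy_logits, action, value, ret,
                  opponent_logits, opponent_action,
                  value_coef=0.5, opponent_coef=1.0):
    """Return (total, policy, value, opponent) losses for one time step.

    value_coef and opponent_coef are hypothetical weights chosen for
    illustration; the source does not specify how the terms are combined.
    """
    # Advantage estimate: observed return minus the critic's prediction.
    advantage = ret - value
    # Actor term: negative log-probability of the taken action,
    # scaled by the advantage.
    log_prob = math.log(softmax(policy_logits)[action])
    policy_loss = -log_prob * advantage
    # Critic term: squared error between return and predicted value.
    value_loss = (ret - value) ** 2
    # Supervised term (assumed): cross-entropy between the predicted
    # distribution over the other agent's actions and its observed action.
    opponent_probs = softmax(opponent_logits)
    opponent_loss = -math.log(opponent_probs[opponent_action])
    total = policy_loss + value_coef * value_loss + opponent_coef * opponent_loss
    return total, policy_loss, value_loss, opponent_loss


# Example call with two actions per agent and uniform predictions.
total, pi_loss, v_loss, om_loss = combined_loss(
    policy_logits=[0.0, 0.0], action=0, value=0.0, ret=1.0,
    opponent_logits=[0.0, 0.0], opponent_action=1,
)
```

In an asynchronous setup of the kind the abstract describes, each worker would presumably evaluate a loss like this on its own trajectories, differentiate it locally, and apply the resulting gradients to the shared global parameters.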

Bibliographic details

Publication number: US2020143208A1

Patent type:
Publication date: 2020-05-07

Original format: PDF
Applicant/Assignee: ROYAL BANK OF CANADA

Application number: US201916674782
Inventors: PABLO FRANCISCO HERNANDEZ LEAL; BILAL KARTAL; MATTHEW EDMUND TAYLOR

Filing date: 2019-11-05
Classification: G06K9/62; G06N3/04; G06N3/08; G06F9/38; G06F17/16; G06N5/04
Country: US
Date added to database: 2022-08-21 11:19:39
