OPPONENT MODELING WITH ASYNCHRONOUS METHODS IN DEEP RL
Abstract
A computer system and method are described for extending parallelized asynchronous reinforcement learning to include agent modeling when training a neural network. A plurality of hardware processors or threads operate in coordination such that each functions as a worker process configured to simultaneously interact with a target computing environment, compute local gradients based on a loss determination mechanism, and update global network parameters. The loss determination mechanism includes at least a policy loss term (actor), a value loss term (critic), and a supervised cross entropy loss. Variations are further described in which the neural network is adapted to include a latent space to track agent policy features.
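The three loss terms named in the abstract can be illustrated with a minimal single-step sketch. This is not the patented implementation. It assumes that the supervised cross entropy term scores a prediction of the other agent's observed action, that the terms are combined as a weighted sum, and that the advantage is the return minus the value estimate. The function names and the weights `value_coef` and `opponent_coef` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combined_loss(policy_logits, action, value, ret,
                  opponent_logits, opponent_action,
                  value_coef=0.5, opponent_coef=1.0):
    """Single-step actor + critic + supervised cross entropy loss.

    Assumptions (not stated in the abstract): the terms are summed
    with the weights value_coef and opponent_coef, and the cross
    entropy target is the opponent's observed action.
    """
    # Actor term: negative log-probability of the chosen action,
    # scaled by the advantage (treated as a constant here).
    advantage = ret - value
    policy_loss = -math.log(softmax(policy_logits)[action]) * advantage

    # Critic term: squared error between return and value estimate.
    value_loss = (ret - value) ** 2

    # Supervised term: cross entropy between the predicted opponent
    # action distribution and the opponent's observed action.
    opponent_loss = -math.log(softmax(opponent_logits)[opponent_action])

    total = policy_loss + value_coef * value_loss + opponent_coef * opponent_loss
    return {
        "policy": policy_loss,
        "value": value_loss,
        "opponent": opponent_loss,
        "total": total,
    }


losses = combined_loss(
    policy_logits=[0.0, 0.0], action=0, value=0.0, ret=1.0,
    opponent_logits=[0.0, 0.0], opponent_action=1,
)
print(losses)
```

In an asynchronous setup of the kind the abstract describes, each worker would compute gradients of such a total loss locally and then apply them to the shared global parameters. That step is omitted here because it depends on the chosen framework.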