Mathematical Problems in Engineering: Theory, Methods and Applications

Deep Q-Network with Predictive State Models in Partially Observable Domains



Abstract

While deep reinforcement learning (DRL) has achieved great success in some large domains, most of the related algorithms assume that the state of the underlying system is fully observable. However, many real-world problems are actually partially observable. For systems with continuous observations, most of the related algorithms, e.g., the deep Q-network (DQN) and the deep recurrent Q-network (DRQN), use history observations to represent states; however, they are often computationally expensive and ignore the information carried by actions. Predictive state representations (PSRs) offer a powerful framework for modelling partially observable dynamical systems with discrete or continuous state spaces, representing the latent state in terms of fully observable actions and observations. In this paper, we present a PSR model-based DQN approach that combines the strengths of the PSR model and DQN planning. We use a recurrent network to build the recurrent PSR model, which can fully learn the dynamics of a partially observable environment with continuous observations. The model is then used for the state representation and state update of DQN, so that in partially observable environments DQN no longer relies on a fixed number of history observations or on a recurrent neural network (RNN) to represent states. The strong performance of the proposed approach is demonstrated on a set of robotic control tasks from OpenAI Gym by comparing it with the memory-based DRQN and the state-of-the-art recurrent predictive state policy (RPSP) networks. Source code is available at https://github.com/RPSR-DQN/paper-code.git.
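As a minimal illustration of the idea summarised above, the sketch below maintains a recurrent predictive state from fully observable (action, observation) pairs and computes Q-values from that state rather than from a window of raw history. All dimensions, weight shapes, and function names are hypothetical and the weights are untrained; this is a toy sketch of the mechanism, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes; the paper's actual architecture is not specified here.
OBS_DIM, N_ACTIONS, STATE_DIM = 4, 2, 8

# Untrained, randomly initialised parameters (illustration only).
W_in = rng.normal(size=(STATE_DIM, N_ACTIONS + OBS_DIM))  # (action, obs) projection
W_rec = rng.normal(size=(STATE_DIM, STATE_DIM))           # recurrent projection
W_q = rng.normal(size=(N_ACTIONS, STATE_DIM))             # linear Q head

def psr_state_update(state, action_onehot, obs):
    """Recurrent predictive-state update: the new state depends only on the
    previous state and the latest fully observable (action, observation) pair,
    so the agent never stores a fixed-length window of raw history."""
    x = np.concatenate([action_onehot, obs])
    return np.tanh(W_in @ x + W_rec @ state)

def q_values(state):
    """Q-values are computed from the learned state, not from stacked frames."""
    return W_q @ state

# Roll the state forward through a short action/observation sequence.
state = np.zeros(STATE_DIM)
for _ in range(5):
    action = int(np.argmax(q_values(state)))  # greedy action selection
    obs = rng.normal(size=OBS_DIM)            # continuous observation
    state = psr_state_update(state, np.eye(N_ACTIONS)[action], obs)

print(q_values(state).shape)  # one Q-value per action: (2,)
```

In a full training loop the state produced this way would replace the frame-stacked input of a standard DQN, with the usual replay buffer and temporal-difference updates left unchanged.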
