IEEE/CAA Journal of Automatica Sinica

Feature-based aggregation and deep reinforcement learning: a survey and some new implementations



Abstract

In this paper we discuss policy iteration methods for approximate solution of a finite-state discounted Markov decision problem, with a focus on feature-based aggregation methods and their connection with deep reinforcement learning schemes. We introduce features of the states of the original problem, and we formulate a smaller “aggregate” Markov decision problem, whose states relate to the features. We discuss properties and possible implementations of this type of aggregation, including a new approach to approximate policy iteration. In this approach the policy improvement operation combines feature-based aggregation with feature construction using deep neural networks or other calculations. We argue that the cost function of a policy may be approximated much more accurately by the nonlinear function of the features provided by aggregation, than by the linear function of the features provided by neural network-based reinforcement learning, thereby potentially leading to more effective policy improvement.
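As a concrete illustration of the aggregation framework (a minimal sketch, not the paper's implementation), the following Python code builds and solves a hard-aggregation version of a finite-state discounted MDP. It assumes the feature map phi partitions the original states into nonempty aggregate classes, uses uniform disaggregation probabilities, and solves the aggregate problem by plain value iteration; the function names and these modeling choices are illustrative assumptions.

```python
import numpy as np

def build_aggregate_mdp(P, g, phi, n_agg):
    """Form the hard-aggregation MDP of a finite discounted MDP.

    P     : list over controls u of (n, n) arrays; P[u][i, j] = p(j | i, u)
    g     : list over controls u of (n,) arrays; g[u][i] = expected stage cost
    phi   : length-n int array; phi[i] is the aggregate class (feature) of state i
    n_agg : number of aggregate states; every class is assumed nonempty
    """
    n = P[0].shape[0]
    # aggregation matrix: original state j belongs to class phi[j]
    Phi = np.zeros((n, n_agg))
    Phi[np.arange(n), phi] = 1.0
    # uniform disaggregation: an aggregate state spreads its probability
    # mass evenly over the original states in its class
    D = Phi.T / Phi.sum(axis=0)[:, None]
    P_hat = [D @ P[u] @ Phi for u in range(len(P))]  # aggregate transition probs
    g_hat = [D @ g[u] for u in range(len(P))]        # aggregate stage costs
    return P_hat, g_hat

def solve_aggregate(P_hat, g_hat, alpha, iters=500):
    """Value iteration on the aggregate problem; returns optimal costs r*(x)."""
    r = np.zeros(P_hat[0].shape[0])
    for _ in range(iters):
        r = np.min([g_hat[u] + alpha * P_hat[u] @ r
                    for u in range(len(P_hat))], axis=0)
    return r
```

The resulting approximation of the original cost function, J~(i) = r*(phi[i]), is piecewise constant over the feature classes and hence a nonlinear function of the features, which is the property the abstract contrasts with the linear-in-features approximations produced by neural network-based reinforcement learning.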
