Reinforcement Learning in Continuous Time and Space

Kenji Doya


Abstract

This article presents a reinforcement learning framework for continuous-time dynamical systems without a priori discretization of time, state, and action. Based on the Hamilton-Jacobi-Bellman (HJB) equation for infinite-horizon, discounted reward problems, we derive algorithms for estimating value functions and improving policies with the use of function approximators.
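As a rough illustration of the value-estimation idea summarized above, the following Python sketch fits a value function by driving a continuous-time temporal-difference (TD) error toward zero. It is not taken from the article. It assumes the usual discounted value V(x(t)) = integral from t to infinity of exp(-(s - t)/tau) r(s) ds, for which a consistent value function satisfies r - V/tau + dV/dt = 0. The toy system (dx/dt = -x with reward r = -x^2 and no control input), the one-parameter approximator V(x) = w * x^2, the residual-gradient update, and all function names and parameter values are assumptions chosen for illustration.

```python
import numpy as np


def td_error(w, x, x_next, r, dt, tau):
    """Continuous-time TD error: delta = r - V/tau + dV/dt.

    dV/dt is approximated by a finite difference over one Euler step,
    with the (hypothetical) approximator V(x) = w * x**2.
    """
    v, v_next = w * x**2, w * x_next**2
    return r - v / tau + (v_next - v) / dt


def learn_value_weight(tau=1.0, dt=0.01, lr=0.05, episodes=200, steps=200, seed=0):
    """Fit w by gradient descent on 0.5 * delta**2 along simulated trajectories."""
    rng = np.random.default_rng(seed)
    w = 0.0
    for _ in range(episodes):
        x = rng.uniform(-1.0, 1.0)       # random initial state
        for _ in range(steps):
            x_next = x + dt * (-x)       # Euler step of the assumed dynamics dx/dt = -x
            r = -x**2                    # assumed reward
            delta = td_error(w, x, x_next, r, dt, tau)
            # Derivative of delta with respect to w (residual-gradient form).
            grad = -x**2 / tau + (x_next**2 - x**2) / dt
            w -= lr * delta * grad
            x = x_next
    return w


if __name__ == "__main__":
    print(learn_value_weight())
```

For this toy system the exact value function is V(x) = -x^2 / (1/tau + 2), so with tau = 1 the learned weight should land close to -1/3. The finite-difference approximation of dV/dt shifts the fixed point slightly, to -1 / (1/tau + 2 - dt).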


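Bibliographic details

Source: Neural Computation, 2000, No. 1, pp. 219-245 (27 pages)
Author: Kenji Doya
Indexed in: Science Citation Index (SCI); Chemical Abstracts (CA)
Original format: PDF
Language: English
Chinese Library Classification: Artificial intelligence theory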

Similar literature

1. Reinforcement Learning in Continuous Time and Space: A Stochastic Control Approach [J]. Haoran Wang, Thaleia Zariphopoulou, Xun Yu Zhou. Journal of machine learning research. 2020, No. a
2. Reinforcement Learning in Continuous Time and Space: Interference and Not Ill Conditioning Is the Main Problem When Using Distributed Function Approximators [J]. Baddeley B. IEEE transactions on systems, man, and cybernetics. Part B, Cybernetics. 2008, No. 4
3. Reinforcement Learning in High-dimensional Continuous State Spaces - A State Space Compression Method Based on Multivariate Analysis - [J]. Hideki Satoh (佐藤仁樹). 電子情報通信学会技術研究報告. 非線形問題. Nonlinear Problems. 2006, No. 547
4. Reinforcement Learning Based Continuous-Time On-line Spacecraft Dynamics Control: Case Study of NASA SPHERES Spacecraft [C]. Fei Sun, Kamran Turkoglu. AIAA guidance, navigation, and control conference; AIAA SciTech forum. 2018
5. A Smoothing Framework for Stochastic Continuous-Time Reinforcement Learning Problem [D]. Hu, Bowen. 2021
6. Correction: Spike-Based Reinforcement Learning in Continuous State and Action Space: When Policy Gradient Methods Fail [O]. Eleni Vasilaki, Nicolas Frémaux, Robert Urbanczik. 2009
7. Policy iterations for reinforcement learning problems in continuous time and space: Fundamental theory and methods [O]. Jaeyoung Lee, Richard S. Sutton. 2021
