An information-theoretic analysis of return maximization in reinforcement learning.

Iwata K

Abstract

We present a general analysis of return maximization in reinforcement learning. This analysis does not require assumptions of Markovianity, stationarity, and ergodicity for the stochastic sequential decision processes of reinforcement learning. Instead, our analysis assumes the asymptotic equipartition property fundamental to information theory, providing a substantially different view from that in the literature. As our main results, we show that return maximization is achieved by the overlap of typical and best sequence sets, and we present a class of stochastic sequential decision processes with the necessary condition for return maximization. We also describe several examples of best sequences in terms of return maximization in the class of stochastic sequential decision processes, which satisfy the necessary condition.

Bibliographic record

Source: Neural Networks: The Official Journal of the International Neural Network Society, 2011, Issue 10, 8 pages
Author: Iwata K
Original format: PDF
Text language: English
Chinese Library Classification: Neurology

Similar literature

1. An information-theoretic analysis of return maximization in reinforcement learning. [J]. Iwata K. Neural Networks: The Official Journal of the International Neural Network Society. 2011, Issue 10.
2. A Role of the Asymptotic Equipartition Property in Return Maximization of Reinforcement Learning. [J]. Kazunori Iwata, Hideaki Sakai, Kazushi Ikeda. 電子情報通信学会技術研究報告. ニューロコンピューティング. Neurocomputing. 2005, Issue 759.
3. A Role of the Asymptotic Equipartition Property in Return Maximization of Reinforcement Learning. [J]. Kazunori Iwata, Hideaki Sakai, Kazushi Ikeda. 電子情報通信学会技術研究報告. ニューロコンピューティング. Neurocomputing. 2004, Issue 759.
4. An Information-Spectrum Approach to Analysis of Return Maximization in Reinforcement Learning. [C]. Kazunori Iwata. International conference on neural information processing; ICONIP 2010. 2011.
5. Stronger bidding strategies through empirical game-theoretic analysis and reinforcement learning. [D]. Schvartzman, Leonardo Julian. 2009.
6. Frequency of reinforcement as a determinant of extinction-induced aggression during errorless discrimination learning. [O]. M Rilling, H J Caplan. 1975.
7. Structural Return Maximization for Reinforcement Learning. [O]. Joseph, Joshua; Velez, Javier; Roy, Nicholas. 2014.
