
An Information-Spectrum Approach to Analysis of Return Maximization in Reinforcement Learning



Abstract

In reinforcement learning, Markov decision processes are the most popular stochastic sequential decision processes. For analysis, we frequently assume that the process is stationary, ergodic, or both, but most stochastic sequential decision processes arising in reinforcement learning are in fact not necessarily Markovian, stationary, or ergodic. In this paper, we give an information-spectrum analysis of return maximization in more general processes than stationary or ergodic Markov decision processes. We also present a class of stochastic sequential decision processes with the necessary condition for return maximization. We provide several examples of best sequences in terms of return maximization in the class.


Bibliographic details

Source: International Conference on Neural Information Processing (ICONIP 2010), 2011, pp. 478-485 (8 pages)
Author: Kazunori Iwata
Original format: PDF
Chinese Library Classification: Information processing

