Learning Representation and Control in Markov Decision Processes.

Abstract

This research investigated algorithms for approximately solving Markov decision processes (MDPs), a widely used model of sequential decision making. Much past work on solving MDPs in adaptive dynamic programming and reinforcement learning has assumed that representations, such as basis functions, are provided by a human expert. This research investigated a variety of approaches to automatic basis construction, including reward-sensitive and reward-invariant methods, diagonalization and dilation methods, and orthogonal and over-complete representations. A unifying perspective on these basis construction methods emerges from showing that they result from different power series expansions of value functions, including the Neumann series expansion, the Laurent series expansion, and the Schultz expansion. The research also developed new computational algorithms for learning sparse solutions to MDPs using convex optimization methods.
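To make the Neumann series expansion mentioned above concrete, the following is a minimal sketch: for a fixed policy with transition matrix P, reward vector R, and discount gamma, the value function satisfies V = R + gamma*P*V, so V = (I - gamma*P)^{-1} R = sum_k gamma^k P^k R. The 3-state chain and its numbers here are hypothetical, chosen only to show that the truncated series converges to the direct linear solve; this is not code from the report.

```python
import numpy as np

gamma = 0.9
# Hypothetical transition matrix under a fixed policy (rows sum to 1)
P = np.array([[0.5, 0.5, 0.0],
              [0.0, 0.5, 0.5],
              [0.0, 0.0, 1.0]])
# Hypothetical expected one-step reward for each state
R = np.array([0.0, 0.0, 1.0])

# Direct solve of the Bellman equation V = R + gamma * P @ V
V_exact = np.linalg.solve(np.eye(3) - gamma * P, R)

# Truncated Neumann series: V ≈ sum_{k=0}^{K-1} gamma^k P^k R
V_series = np.zeros(3)
term = R.copy()          # holds P^k R
for k in range(200):
    V_series += (gamma ** k) * term
    term = P @ term

assert np.allclose(V_series, V_exact, atol=1e-6)
```

Each basis construction method in the report corresponds to a different such expansion; the Neumann case shows why powers of P (dilations) are natural candidate basis elements for representing V.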
