
Contraction Conditions for Average and alpha-Discount Optimality in Countable State Markov Games with Unbounded Rewards

Abstract

The goal of this paper is to provide a theory of N-person Markov games with unbounded cost, for a countable state space and compact action spaces. We investigate both the finite and infinite horizon problems. For the latter, we consider the discounted cost as well as the expected average cost. For the infinite horizon problems, we present conditions under which equilibrium policies exist for all players within the class of stationary policies, and show that the costs in equilibrium satisfy the optimality equations. Similar results are obtained for the finite horizon costs, for which equilibrium policies are shown to exist for all players within the class of Markov policies. As a special case of N-person games, we investigate the two-player zero-sum game, for which we establish the convergence of the value iteration algorithm. We conclude by studying an application of a zero-sum Markov game in a queueing model.

Bibliographic record

Authors
Altman, E.; Hordijk, A.; Spieksma, F. M.;
Year: 1994
Pages: 1-30
Total pages: 30
Original format: PDF
Language: English
Chinese Library Classification: Industrial technology
Keywords
Game theory; Markov processes; Iteration; Equilibrium; Costs; Optimization; Algorithms; Convergence; Queueing theory;

Similar literature

1. Contraction conditions for average and alpha-discount optimality in countable state Markov games with unbounded rewards [J]. Altman E, Hordijk A, Spieksma FM. Mathematics of Operations Research. 1997, No. 3
2. Necessary and sufficient optimality conditions for average reward of controlled Markov chains [J]. Sladky Karel. Kybernetika. 1973, No. 2
3. Weak conditions for the existence of optimal stationary policies in average Markov decision chains with unbounded costs [J]. Cavazos-Cadena Rolando. Kybernetika. 1989, No. 3
4. Some approximations for stochastic games with unbounded reward and average payoff [C]. Tidball, M.M., Pourtallier, . 1997
5. Average, sensitive and Blackwell-optimal policies in denumerable Markov decision chains with unbounded rewards [O]. Dekker, R. (Rommert), Hordijk, A. (Arie). 1988
6. Blackwell Optimality in the Class of All Policies in Markov Decision Chains with a Borel State Space and Unbounded Rewards [R]. Hordijk, A., Yushkevich, A. A. 2000
