On Canonical Forms for Zero-Sum Stochastic Mean Payoff Games

Endre Boros; Khaled Elbassioni; Vladimir Gurvich; Kazuhisa Makino

Abstract

We consider two-person zero-sum mean payoff undiscounted stochastic games and obtain sufficient conditions for the existence of a saddle point in uniformly optimal stationary strategies. Namely, these conditions enable us to bring the game, by applying potential transformations, to a canonical form in which locally optimal strategies are globally optimal, and hence the value for every initial position and the optimal strategies of both players can be obtained by playing the local game at each state. We show that these conditions hold for the class of additive transition (AT) games, that is, the special case when the transitions at each state can be decomposed into two parts, each controlled completely by one of the two players. An important special case of AT-games is formed by the so-called BWR-games, which are played by two players on a directed graph with positions of three types: Black, White and Random. We give an independent proof for the existence of a canonical form in such games, and use this result to derive the existence of a canonical form (and hence, of a saddle point in uniformly optimal stationary strategies) in a wide class of games, which includes stochastic games with perfect information (PI), switching controller (SC) games and additive rewards, additive transition (ARAT) games. Unlike the proof for AT-games, our proof for the BWR-case does not rely on the existence of a saddle point in stationary strategies. We also derive some algorithmic consequences from our reductions to BWR-games, in terms of solving PI- and ARAT-games in sub-exponential time.
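The abstract does not define a potential transformation. As a rough illustration only, the sketch below assumes the definition commonly used for mean payoff games on graphs: given a potential x on positions, the reward on each arc (v, u) becomes r(v, u) + x(v) - x(u). The graph, rewards, potential, and function names are hypothetical, and the example covers deterministic arcs only (no Random positions). It checks that the mean reward along a cycle is unchanged by the shift, which is why such a transformation can reshape local rewards without altering long-run payoffs.

```python
from fractions import Fraction


def potential_transform(rewards, potential):
    """Return shifted rewards r_x(v, u) = r(v, u) + x(v) - x(u) for every arc.

    Assumed definition; the abstract itself does not state it.
    """
    return {
        (v, u): r + potential[v] - potential[u]
        for (v, u), r in rewards.items()
    }


def cycle_mean(rewards, cycle):
    """Mean reward per move along a cycle given as a list of positions."""
    arcs = list(zip(cycle, cycle[1:] + cycle[:1]))
    return Fraction(sum(rewards[arc] for arc in arcs), len(arcs))


# Hypothetical 3-position graph with integer arc rewards.
rewards = {("a", "b"): 4, ("b", "c"): -1, ("c", "a"): 3, ("b", "a"): 0}
# Hypothetical potential; any real-valued vector would do.
potential = {"a": 0, "b": 5, "c": -2}

shifted = potential_transform(rewards, potential)

# The potential terms telescope around a cycle, so cycle means are preserved.
for cycle in (["a", "b", "c"], ["a", "b"]):
    assert cycle_mean(rewards, cycle) == cycle_mean(shifted, cycle)
```

Individual arc rewards change under the shift while every cycle mean stays the same. A canonical form in the sense of the abstract would correspond to a particular choice of potential for which locally optimal moves are also globally optimal; this sketch does not attempt to compute such a potential.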

Bibliographic details

Source
Dynamic Games and Applications, 2013, No. 2, 34 pages
Authors
Endre Boros; Khaled Elbassioni; Vladimir Gurvich; Kazuhisa Makino
Original format: PDF
Text language: English
Chinese Library Classification: 51.9
Keywords
Stochastic games; Zero-sum; Saddle point; Equilibrium; Mean-payoff games

Similar literature

1. On Canonical Forms for Zero-Sum Stochastic Mean Payoff Games [J]. Endre Boros, Khaled Elbassioni, Vladimir Gurvich. Dynamic Games and Applications. 2013, No. 2
2. Zero-Sum Stochastic Games with Partial Information and Average Payoff [J]. Subhamay Saha. Journal of Optimization Theory and Applications. 2014, No. 1
3. Constant payoff in zero-sum stochastic games [J]. Catoni Olivier, Oliu-Barton Miquel, Ziliotto Bruno. Annales de l'Institut Henri Poincare. Probabilites et Statistiques. 2021, No. 4
4. Stochastic Recursive Zero-Sum Differential Game and Mixed Zero-Sum Differential Game Problem with Payoff Functional in BDSDES [C]. Renwei Jia, Lifeng Wei, Xiaodong Liu. IEEE International Conference of Safe Production and Informatization. 2020
5. Deception in two-player zero-sum stochastic games: Theory and application to warfare games [D]. Singh, Rajdeep. 2006
6. Zero-Sum Matrix Game with Payoffs of Dempster-Shafer Belief Structures and Its Applications on Sensors [O]. Xinyang Deng, Wen Jiang, Jiandong Zhang. 2017
7. Zero-Sum Stochastic Games with Partial Information and Average Payoff [O]. Saha, Subhamay. 2014
8. I. Criterion Equivalence in Discrete Dynamic Programming. II. Stochastic Games with Perfect Information and Time Average Payoff [R]. Lippman, S. A., Liggett, T. M. 1968
