Journal: Journal of Economic Theory

Best-response dynamics in zero-sum stochastic games



Abstract

We define and analyse three learning dynamics for two-player zero-sum discounted-payoff stochastic games. A continuous-time best-response dynamic in mixed strategies is proved to converge to the set of Nash equilibrium stationary strategies. Extending this, we introduce a fictitious-play-like process in a continuous-time embedding of a stochastic zero-sum game, which is again shown to converge to the set of Nash equilibrium strategies. Finally, we present a modified δ-converging best-response dynamic, in which the discount rate converges to 1, and the learned value converges to the asymptotic value of the zero-sum stochastic game. The critical feature of all the dynamic processes is a separation of adaption rates: beliefs about the value of states adapt more slowly than the strategies adapt, and in the case of the δ-converging dynamic the discount rate adapts more slowly than everything else. © 2020 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
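The separation of adaptation rates described in the abstract can be illustrated with a small numerical sketch. The code below is not the authors' construction: it is an Euler discretisation of one plausible reading, in which each state carries an auxiliary matrix game built from the stage payoff plus the discounted continuation value (a Shapley-style assumption), strategies move toward best responses at rate 1, and value beliefs move toward the current auxiliary payoff at the slower rate `eps`. The function name, the parameters `h` and `eps`, and the toy game are all hypothetical.

```python
def two_timescale_best_response(r, P, delta, steps=40_000, h=0.005, eps=0.1):
    """Euler-discretised two-timescale best-response dynamics (illustrative only).

    r[s][a][b]    : stage payoff to player 1 (the maximiser) in state s
    P[s][a][b][t] : probability of moving from state s to state t
    delta         : discount factor in (0, 1)
    h             : Euler step for the fast strategy updates (assumed)
    eps           : relative speed of the slow value updates, eps << 1 (assumed)
    """
    n_states = len(r)
    # Start every state at the first pure action, with value beliefs at zero.
    x = [[1.0] + [0.0] * (len(r[s]) - 1) for s in range(n_states)]
    y = [[1.0] + [0.0] * (len(r[s][0]) - 1) for s in range(n_states)]
    v = [0.0] * n_states
    for _ in range(steps):
        v_next = list(v)
        for s in range(n_states):
            m, n = len(r[s]), len(r[s][0])
            # Auxiliary game: stage payoff plus discounted believed continuation value.
            q = [[r[s][a][b]
                  + delta * sum(P[s][a][b][t] * v[t] for t in range(n_states))
                  for b in range(n)] for a in range(m)]
            row = [sum(q[a][b] * y[s][b] for b in range(n)) for a in range(m)]
            col = [sum(x[s][a] * q[a][b] for a in range(m)) for b in range(n)]
            payoff = sum(x[s][a] * row[a] for a in range(m))
            a_star = max(range(m), key=row.__getitem__)  # player 1 maximises
            b_star = min(range(n), key=col.__getitem__)  # player 2 minimises
            # Fast timescale: move each mixed strategy toward a pure best response.
            x[s] = [(1 - h) * xa + (h if a == a_star else 0.0)
                    for a, xa in enumerate(x[s])]
            y[s] = [(1 - h) * yb + (h if b == b_star else 0.0)
                    for b, yb in enumerate(y[s])]
            # Slow timescale: move the value belief toward the current payoff.
            v_next[s] = v[s] + h * eps * (payoff - v[s])
        v = v_next
    return x, y, v


# Toy game (hypothetical): matching pennies in both states,
# with action-dependent transitions out of state 0.
MP = [[1.0, -1.0], [-1.0, 1.0]]
r = [MP, MP]
P = [
    [[[0.9, 0.1], [0.2, 0.8]], [[0.2, 0.8], [0.9, 0.1]]],
    [[[0.5, 0.5], [0.5, 0.5]], [[0.5, 0.5], [0.5, 0.5]]],
]
x, y, v = two_timescale_best_response(r, P, delta=0.5)
```

In this toy game both states are matching pennies, whose value is 0, so zero value in every state solves the discounted value equations. The returned `v` should therefore sit close to zero and the strategies close to (0.5, 0.5), up to the error introduced by the constant step `h`.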
