Source: IEEE Conference on Decision and Control
Independently Randomized Symmetric Policies are Optimal for Exchangeable Stochastic Teams with Infinitely Many Decision Makers

Abstract

We study stochastic team problems (also known as decentralized stochastic control or identical-interest stochastic games) with a large or countably infinite number of decision makers, and characterize the existence and structural properties of (globally) optimal policies. In particular, we consider both static and dynamic non-convex team problems in which the cost function and dynamics satisfy an exchangeability condition. We first establish a de Finetti type representation theorem for exchangeable decentralized policies, that is, for the probability measures induced by admissible policies under decentralized information structures. For a general setup of stochastic team problems with N decision makers, under exchangeability of the decision makers' observations and of the cost function, we show that, without loss of global optimality, the search for optimal policies over any convex set of probability measures on policies can be restricted to those that are N-exchangeable. Then, by extending N-exchangeable policies to infinitely exchangeable ones, establishing a convergence argument for the induced costs, and using the de Finetti type theorem, we establish the existence of an optimal decentralized policy for static and dynamic teams with a countably infinite number of decision makers; this policy turns out to be symmetric (i.e., identical) and randomized. In particular, unlike prior work, convexity of the cost is not assumed.
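
The two main steps of the abstract can be sketched in symbols. This is only an illustrative outline under stated assumptions, not the paper's own statement: the notation ($P$, $P^\sigma$, $J$, $\mu$, $\theta$) is hypothetical and does not appear in the abstract, the expected cost is assumed to be linear in the probability measure on policies, and the convex set is assumed to be closed under permutations of decision makers. The second display is the classical de Finetti representation for an infinitely exchangeable sequence of random variables in a Borel space; the paper's theorem is an analogue for decentralized policies, and its exact form and conditions may differ.

```latex
% Assumed notation (not from the abstract):
%   P        : probability measure on N-tuples of policies (\gamma^1, \dots, \gamma^N)
%   P^\sigma : P with the decision-maker indices permuted by \sigma \in S_N
%   J(P)     : expected cost under P, assumed linear in P

% Step 1 (symmetrization). If exchangeability of the observations and the cost
% gives J(P^\sigma) = J(P) for every permutation \sigma, then
\[
  \bar{P} := \frac{1}{N!} \sum_{\sigma \in S_N} P^\sigma
  \quad\Longrightarrow\quad
  J(\bar{P}) = \frac{1}{N!} \sum_{\sigma \in S_N} J(P^\sigma) = J(P),
\]
% and \bar{P} is N-exchangeable by construction, so restricting the search
% to N-exchangeable measures loses nothing.

% Step 2 (classical de Finetti representation). For an infinitely exchangeable
% sequence X_1, X_2, \dots there is a mixing measure \mu on probability
% measures \theta such that, for every n and measurable sets A_1, \dots, A_n,
\[
  \Pr(X_1 \in A_1, \dots, X_n \in A_n)
  = \int \prod_{i=1}^{n} \theta(A_i) \, \mu(d\theta).
\]
```

Read this way, an infinitely exchangeable policy is a mixture of i.i.d. (identical and independently randomized) policies, which is consistent with the symmetric, randomized optimal policy described in the abstract.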