Journal: IEEE Transactions on Information Theory

Indexability of Restless Bandit Problems and Optimality of Whittle Index for Dynamic Multichannel Access

Abstract

In this paper, we consider a class of restless multiarmed bandit processes (RMABs) that arises in dynamic multichannel access, user/server scheduling, and optimal activation in multiagent systems. For this class of RMABs, we establish indexability and obtain the Whittle index in closed form for both the discounted and average reward criteria. These results lead to a direct implementation of the Whittle index policy with remarkably low complexity. When arms are stochastically identical, we show that the Whittle index policy is optimal under certain conditions. Furthermore, it has a semiuniversal structure that obviates the need to know the Markov transition probabilities. Both the optimality and the semiuniversal structure result from the equivalence between the Whittle index policy and the myopic policy established in this work. For nonidentical arms, we develop efficient algorithms for computing a performance upper bound given by Lagrangian relaxation. The tightness of the upper bound and the near-optimal performance of the Whittle index policy are illustrated with simulation examples.
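
To make the myopic policy mentioned in the abstract concrete, the following is a minimal sketch, not the paper's implementation. It assumes, beyond what the abstract states, that each channel is an independent two-state Markov chain (0 = bad, 1 = good) with transition probabilities `p01` and `p11`, that all channels are stochastically identical, and that sensing a good channel yields unit reward. All function and parameter names are hypothetical.

```python
import random


def update_belief(belief, p01, p11, observed=None):
    """Belief that the channel is good in the next slot.

    observed is None when the channel was not sensed; otherwise it is the
    sensed state (0 or 1), which makes the next-slot belief exact.
    """
    if observed is None:
        # Unobserved channel: propagate the belief through the Markov chain.
        return belief * p11 + (1.0 - belief) * p01
    return p11 if observed == 1 else p01


def myopic_action(beliefs, k):
    """Indices of the k channels with the largest belief of being good."""
    ranked = sorted(range(len(beliefs)), key=lambda i: beliefs[i], reverse=True)
    return sorted(ranked[:k])


def simulate(n=4, k=1, p01=0.2, p11=0.8, horizon=1000, seed=0):
    """Average per-slot reward of the myopic policy (assumed model)."""
    rng = random.Random(seed)
    stationary = p01 / (p01 + 1.0 - p11)  # stationary probability of state 1
    states = [1 if rng.random() < stationary else 0 for _ in range(n)]
    beliefs = [stationary] * n
    reward = 0
    for _ in range(horizon):
        chosen = myopic_action(beliefs, k)
        for i in range(n):
            if i in chosen:
                reward += states[i]  # unit reward when the sensed channel is good
                beliefs[i] = update_belief(beliefs[i], p01, p11, states[i])
            else:
                beliefs[i] = update_belief(beliefs[i], p01, p11)
        # Every channel evolves whether or not it was sensed (restless arms).
        states = [1 if rng.random() < (p11 if s == 1 else p01) else 0
                  for s in states]
    return reward / horizon
```

Calling `simulate()` returns the empirical average reward per slot under these assumptions. The sketch selects arms by one-step expected reward only; under the conditions the abstract refers to, the Whittle index policy would select the same arms.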