Thompson Sampling for the Control of a Queue with Demand Uncertainty

Abstract

We study an admission control problem in which the customer arrival rate is unknown and must be learned from data using Bayesian inference. Two defining features of this model are that (1) when the arrival rate is known, the dynamic programming (DP) equations can be solved explicitly to obtain the optimal policy over the infinite horizon, and (2) uninformative actions are unavoidable and occur infinitely often.

We extend the standard proof techniques for Thompson sampling to admission control, in which uninformative actions occur infinitely often, and show that asymptotically optimal convergence rates of the posterior error and worst-case average regret are achieved. Finally, we show that under simple assumptions, our techniques generalize to a broader class of policies, which we call Generalized Thompson sampling. We show that this class of policies achieves asymptotically optimal convergence rates and can outperform standard Thompson sampling in numerical simulation.
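To make the setting concrete, here is a minimal Python sketch of Thompson sampling with uninformative actions. It is not the thesis's model or algorithm. The abstract does not specify the arrival process, the prior, or the form of the optimal policy, so the following are all assumptions made for illustration: Poisson arrivals with a conjugate Gamma prior, a hypothetical threshold rule standing in for the DP-derived policy, and the convention that arrivals are observed only in periods when the gate is open. The names `thompson_admission`, `sample_poisson`, and `threshold` are likewise hypothetical.

```python
import random


def sample_poisson(rate, rng):
    """Count arrivals in one unit of time for a Poisson process with the given rate."""
    count = 0
    t = rng.expovariate(rate)  # time of the first arrival
    while t < 1.0:
        count += 1
        t += rng.expovariate(rate)  # add the next exponential interarrival time
    return count


def thompson_admission(true_rate, threshold, periods, alpha=1.0, beta=1.0, seed=0):
    """Run a toy Thompson sampling loop; return (posterior mean rate, informative periods).

    Assumptions (not taken from the thesis): a Gamma(alpha, beta) prior on the
    arrival rate with beta as the rate parameter, and a threshold rule that
    admits only when the sampled rate is below `threshold`.
    """
    rng = random.Random(seed)
    informative = 0
    for _ in range(periods):
        # Thompson step: draw one plausible arrival rate from the posterior.
        # random.gammavariate takes (shape, scale), so the scale is 1 / beta.
        sampled_rate = rng.gammavariate(alpha, 1.0 / beta)
        if sampled_rate < threshold:
            # Gate open: arrivals are observed, so the posterior is updated.
            arrivals = sample_poisson(true_rate, rng)
            alpha += arrivals
            beta += 1.0
            informative += 1
        # Gate closed: nothing is observed, so the posterior is left unchanged.
        # This is the uninformative action in this toy model.
    return alpha / beta, informative


if __name__ == "__main__":
    estimate, informative = thompson_admission(true_rate=2.0, threshold=3.0, periods=2000)
    print(f"posterior mean rate: {estimate:.3f}, informative periods: {informative}")
```

Because a fresh rate is drawn from the posterior every period, a closed gate does not stay closed forever in this sketch: the posterior is frozen while the gate is shut, but a later draw can fall below the threshold and reopen it.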

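Bibliographic record

  • Author: Gimelfarb, Michael
  • Author affiliation: University of Toronto (Canada)
  • Degree-granting institution: University of Toronto (Canada)
  • Subject: Operations research; Statistics
  • Degree: M.A.S.
  • Year: 2017
  • Pages: 53 p.
  • Total pages: 53
  • Original format: PDF
  • Language of text: English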