Journal: IEEE Journal of Selected Topics in Signal Processing

Optimally Sensing a Single Channel Without Prior Information: The Tiling Algorithm and Regret Bounds

Abstract

We consider the task of optimally sensing a two-state Markovian channel with an observation cost and without any prior information regarding the channel's transition probabilities. This task is of interest in the field of cognitive radio as a model for opportunistic access to a communication network by a secondary user. The optimal sensing problem may be cast into the framework of model-based reinforcement learning in a specific class of partially observable Markov decision processes (POMDPs). We propose the Tiling Algorithm, an original method aimed at reaching an optimal tradeoff between the exploration (or estimation) and exploitation requirements. It is shown that this algorithm achieves finite horizon regret bounds that are as good as those recently obtained for multi-armed bandits and finite-state Markov decision processes (MDPs).
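To make the channel model in the abstract concrete, here is a minimal Python sketch of a two-state Markov channel and the belief (probability that the channel is in the good state) that a sensing policy would track. It is not the Tiling Algorithm itself, which the abstract does not specify. The parameter names `alpha` (probability of moving from the bad state to the good state) and `beta` (probability of staying in the good state), the 0/1 state encoding, and all function names are assumptions made for illustration only.

```python
import random


def propagate_belief(p, alpha, beta):
    """Belief update for a slot in which the channel is NOT sensed.

    p is the current probability that the channel is in the good state (1).
    alpha = P(bad -> good) and beta = P(good -> good) are assumed names.
    """
    return p * beta + (1.0 - p) * alpha


def belief_after_sensing(observed_state, alpha, beta):
    """Belief for the next slot after the state was observed exactly."""
    return beta if observed_state == 1 else alpha


def stationary_good_probability(alpha, beta):
    """Fixed point of propagate_belief; requires alpha > 0 or beta < 1."""
    return alpha / (1.0 - beta + alpha)


def simulate_channel(alpha, beta, steps, seed=0):
    """Sample a state trajectory (list of 0/1), starting from the bad state."""
    rng = random.Random(seed)
    state = 0
    states = []
    for _ in range(steps):
        p_good = beta if state == 1 else alpha
        state = 1 if rng.random() < p_good else 0
        states.append(state)
    return states
```

In the setting of the abstract, `alpha` and `beta` are unknown to the secondary user and would have to be estimated from sensed slots, which is where the exploration versus exploitation tradeoff arises. The sketch assumes they are known purely to show how the belief evolves between observations.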
