Journal: IEEE Transactions on Cognitive Communications and Networking

A Deep Actor-Critic Reinforcement Learning Framework for Dynamic Multichannel Access

Abstract

To make efficient use of limited spectral resources, this work proposes a deep actor-critic reinforcement learning framework for dynamic multichannel access. We consider both a single-user case and a scenario in which multiple users attempt to access channels simultaneously. The proposed framework is employed as a single agent in the single-user case and extended to a decentralized multi-agent framework in the multi-user scenario. In both cases, we develop actor-critic deep reinforcement learning algorithms and evaluate the resulting policies through experiments and numerical results. In the single-user model, to evaluate the performance of the proposed channel access policy and the framework's tolerance to uncertainty, we explore different channel switching patterns and different switching probabilities. In the multi-user case, we analyze the probability that each user accesses a channel with favorable conditions and the probability of collision. We also consider a time-varying environment to assess the adaptability of the proposed framework. Additionally, we compare the proposed actor-critic deep reinforcement learning framework, a deep Q-network (DQN) based approach, random access, and the optimal policy when the channel dynamics are known, in terms of both average reward and time efficiency.
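
The single-user setting described in the abstract can be illustrated with a small sketch. This is not the authors' deep framework: it is a tabular one-step actor-critic, and every specific here is an assumption made for illustration, including the round-robin switching pattern, the +1/-1 reward, the observation (last channel chosen and whether it was good), and the hypothetical name `run_actor_critic` with its parameters.

```python
import math
import random


def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def run_actor_critic(n_channels=4, switch_prob=0.9, steps=20000,
                     alpha_actor=0.1, alpha_critic=0.1, gamma=0.9, seed=0):
    """Tabular one-step actor-critic on a toy channel-access task.

    Assumed environment (illustrative only): exactly one channel is good
    in each slot, and after each slot the good channel moves to the next
    index with probability `switch_prob`, otherwise it stays.
    Returns the average reward over the final 20% of the steps.
    """
    rng = random.Random(seed)
    n_states = n_channels * 2  # (last channel, last outcome) pairs
    theta = [[0.0] * n_channels for _ in range(n_states)]  # actor preferences
    value = [0.0] * n_states                               # critic estimates
    good, state, rewards = 0, 0, []

    for _ in range(steps):
        # Actor: sample a channel from the softmax policy for this state.
        probs = softmax(theta[state])
        action = rng.choices(range(n_channels), weights=probs)[0]

        # Assumed reward: +1 for a good channel, -1 otherwise.
        reward = 1.0 if action == good else -1.0

        # Environment: round-robin switch with probability switch_prob.
        if rng.random() < switch_prob:
            good = (good + 1) % n_channels
        next_state = action * 2 + (1 if reward > 0 else 0)

        # Critic: the one-step TD error also serves as the advantage signal.
        td_error = reward + gamma * value[next_state] - value[state]
        value[state] += alpha_critic * td_error

        # Actor: policy-gradient step for a softmax policy.
        for a in range(n_channels):
            grad = (1.0 if a == action else 0.0) - probs[a]
            theta[state][a] += alpha_actor * td_error * grad

        state = next_state
        rewards.append(reward)

    tail = rewards[-(steps // 5):]
    return sum(tail) / len(tail)


if __name__ == "__main__":
    print(run_actor_critic())
```

Under this assumed reward convention, uniformly random access over four channels has an expected reward of -0.5 per slot, which gives a reference point for the returned average. Varying `switch_prob` is one simple way to probe tolerance to uncertainty in the spirit of the experiments the abstract mentions.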
