Journal: Wireless Networks

Actor-critic deep learning for efficient user association and bandwidth allocation in dense mobile networks with green base stations



Abstract

In this paper, we introduce an efficient user-association and bandwidth-allocation scheme based on an actor-critic deep learning framework for downlink data transmission in dense mobile networks. In this kind of network, small cells are densely deployed within a single macrocell and share the same spectrum band with it. The small-cell base stations are also called green base stations because they are powered solely by solar-energy harvesters. We propose an actor-critic deep learning (ACDL) algorithm that aims to maximize long-term network performance while adhering to constraints on harvested energy and spectrum sharing. The agent of the ACDL algorithm seeks an optimal user-association and bandwidth-allocation policy by interacting with the network environment. We first formulate the optimization problem as a Markov decision process, in which the agent learns how the environment evolves through trial-and-error experience. We then use deep neural networks to model the policy function in the actor and the value function in the critic. The actor selects an action based on the output of the policy network, and the critic uses the output of the value network to help the actor evaluate that action. Numerical results demonstrate that the proposed algorithm can enhance network performance in the long run.
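
The actor-critic loop described in the abstract can be illustrated with a minimal sketch. This is not the paper's ACDL algorithm. It assumes a made-up toy environment with one green base station, a battery with four levels, and a two-way association choice, and it swaps the deep policy and value networks for small lookup tables so that it runs on the Python standard library alone. All names and numbers (`step`, `train`, `N_LEVELS`, the rewards, the harvesting probability, the learning rates) are hypothetical.

```python
import math
import random

N_LEVELS = 4    # hypothetical battery levels 0..3 of one green base station
N_ACTIONS = 2   # 0 = serve the user from the macrocell, 1 = from the small cell
GAMMA = 0.9     # discount factor (assumed value)


def step(battery, action, rng):
    """Toy environment, all numbers made up: returns (reward, next_battery)."""
    if action == 1 and battery > 0:
        reward, battery = 1.0, battery - 1   # small cell: high rate, costs energy
    elif action == 1:
        reward = 0.0                         # small cell has no energy: no service
    else:
        reward = 0.3                         # macrocell: lower rate, no battery cost
    if rng.random() < 0.5:                   # solar energy arrives at random
        battery = min(battery + 1, N_LEVELS - 1)
    return reward, battery


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train(steps=20000, actor_lr=0.05, critic_lr=0.1, seed=0):
    """One-step actor-critic with tabular stand-ins for the two networks."""
    rng = random.Random(seed)
    theta = [[0.0] * N_ACTIONS for _ in range(N_LEVELS)]  # actor: preferences
    value = [0.0] * N_LEVELS                              # critic: state values
    s = N_LEVELS - 1
    for _ in range(steps):
        probs = softmax(theta[s])
        a = rng.choices(range(N_ACTIONS), weights=probs)[0]  # actor picks action
        r, s_next = step(s, a, rng)
        # Critic evaluates the action through the temporal-difference error.
        td_error = r + GAMMA * value[s_next] - value[s]
        value[s] += critic_lr * td_error
        # Actor follows the softmax policy gradient, scaled by the TD error.
        for b in range(N_ACTIONS):
            grad = (1.0 if b == a else 0.0) - probs[b]
            theta[s][b] += actor_lr * td_error * grad
        s = s_next
    return theta, value
```

Calling `train()` returns the learned action preferences and state values; applying `softmax` to a row of the preferences gives the association policy for that battery level. A deep version would replace the two tables with neural networks and take the same two update steps by gradient descent.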
