This paper considers throughput maximization in millimeter-wave (mm-wave) ultra-dense networks (UDNs) using dynamic spectrum sharing (DSS). Most existing works allow only temporal-domain access and admit at most one user per time slot, which significantly under-utilizes the spectrum and makes such schemes less attractive for mm-wave UDN applications. A generalized temporal-spatial sharing scheme is proposed for UDNs that exploits the location information of incumbent devices, so that multiple users can access each channel simultaneously through spatial separation. For distributed applications, global information exchange among secondary users (SUs) tends to be impractical because of the unaffordable signaling overhead and latency. Thus, a non-cooperative game with fine-grained two-dimensional reuse is formulated, which leads to a more efficient access strategy. The game is then proved to be an ordinary potential game (OPG), which guarantees the existence of a Nash equilibrium (NE). Finally, an improved decentralized reinforcement learning algorithm is designed, with which SUs learn from the wireless environment and adapt toward an NE point, relying on individual observations and the action-reward history rather than on global information exchange. The convergence efficiency of the new scheme is also rigorously proved. Numerical simulations are provided to validate the performance of the proposed schemes.
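To make the decentralized learning idea concrete, the following is a minimal sketch and not the paper's improved algorithm. It swaps in a standard linear reward-inaction learning automaton, and the conflict-graph model, the reward `1 / (1 + conflicts)`, and all function and parameter names are assumptions introduced purely for illustration. Each SU keeps a probability vector over channels and updates it from its own action and observed reward, without exchanging any information with other SUs.

```python
import random


def utility(i, channel, actions, neighbors):
    """Assumed reward: 1 / (1 + number of interfering neighbors on the same channel)."""
    conflicts = sum(1 for j in neighbors[i] if actions[j] == channel)
    return 1.0 / (1 + conflicts)


def learn_channel_access(neighbors, num_channels, steps=2000, step_size=0.1, seed=0):
    """Linear reward-inaction learning; each SU uses only its own action and reward."""
    rng = random.Random(seed)
    n = len(neighbors)
    # Every SU starts with a uniform mixed strategy over the channels.
    probs = [[1.0 / num_channels] * num_channels for _ in range(n)]
    for _ in range(steps):
        # Each SU samples a channel independently from its own strategy.
        actions = [rng.choices(range(num_channels), weights=p)[0] for p in probs]
        for i in range(n):
            reward = utility(i, actions[i], actions, neighbors)
            # Shift probability mass toward the chosen channel in proportion to the reward.
            for c in range(num_channels):
                target = 1.0 if c == actions[i] else 0.0
                probs[i][c] += step_size * reward * (target - probs[i][c])
    return probs


def is_nash(actions, neighbors, num_channels):
    """Return True if no SU can strictly raise its utility by switching channel alone."""
    for i in range(len(actions)):
        current = utility(i, actions[i], actions, neighbors)
        for c in range(num_channels):
            if utility(i, c, actions, neighbors) > current:
                return False
    return True


if __name__ == "__main__":
    # Hypothetical topology: three SUs in a line, where only adjacent SUs interfere.
    neighbors = [[1], [0, 2], [1]]
    probs = learn_channel_access(neighbors, num_channels=2)
    profile = [max(range(len(p)), key=p.__getitem__) for p in probs]
    print(profile, is_nash(profile, neighbors, 2))
```

In this toy model, SUs 0 and 2 are spatially separated and can share a channel, which is the kind of spatial reuse the abstract describes. The `is_nash` helper checks a pure-strategy profile against unilateral deviations only. The sketch makes no claim about convergence speed.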