IEEE Transactions on Vehicular Technology

Cooperative Caching and Fetching in D2D Communications - A Fully Decentralized Multi-Agent Reinforcement Learning Approach

Abstract

To satisfy the increasing demands of cellular traffic, cooperative content caching at the network edge (e.g., on User Equipment) has become a promising paradigm for next-generation cellular networks. Device-to-Device (D2D) communications can improve content caching and fetching performance without deploying additional infrastructure. In this paper, we investigate the joint optimization of cooperative caching and fetching in a dynamic D2D environment to minimize the overall content fetching delay. We formulate the problem as a decentralized partially observable Markov game in which each agent seeks an optimal policy. To solve it, we propose a Fully Decentralized Soft Multi-Agent Reinforcement Learning (FDS-MARL) algorithm, which extends the soft actor-critic framework to a non-stationary multi-agent environment for fully decentralized learning. FDS-MARL comprises three major design components: Graph Attention Network based self-attention for cooperative inter-agent coordination; a consensus communication mechanism that reduces information loss and the non-stationarity of the environment while maintaining gradual global consensus; and an influence-based transmission scheduling mechanism for effective credit assignment and for alleviating potential transmission contentions among agents. Simulation results show that FDS-MARL significantly improves content caching and fetching performance compared with representative work in the literature.
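
The abstract describes the architecture only at a high level, so below is a minimal, hypothetical sketch of what one FDS-MARL-style agent could look like: a discrete soft actor-critic whose observation encoder applies GAT-style self-attention over neighbor embeddings, plus a naive parameter-averaging stand-in for the consensus communication step. It is written in PyTorch; every name here (GATLayer, CacheAgent, consensus_step, obs_dim, n_actions, the mixing weight mix) is an illustrative assumption, not taken from the paper, and the influence-based transmission scheduling component is omitted.

```python
# Hypothetical sketch of an FDS-MARL-style agent; not the paper's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GATLayer(nn.Module):
    """Single-head graph-attention aggregation over an agent's neighbors."""
    def __init__(self, dim):
        super().__init__()
        self.W = nn.Linear(dim, dim, bias=False)    # shared linear transform
        self.a = nn.Linear(2 * dim, 1, bias=False)  # attention scoring vector

    def forward(self, h_self, h_neigh):
        # h_self: (dim,); h_neigh: (n_neighbors, dim)
        h_all = torch.cat([h_self.unsqueeze(0), h_neigh], dim=0)
        hw = self.W(h_all)
        # score each (self, j) pair, then softmax into attention weights
        pairs = torch.cat([hw[0].expand_as(hw), hw], dim=-1)
        alpha = F.softmax(F.leaky_relu(self.a(pairs)).squeeze(-1), dim=0)
        return F.elu((alpha.unsqueeze(-1) * hw).sum(dim=0))

class CacheAgent(nn.Module):
    """One agent: local observation encoder + soft (entropy-regularized) policy."""
    def __init__(self, obs_dim, n_actions, dim=64, alpha=0.05):
        super().__init__()
        self.encode = nn.Sequential(nn.Linear(obs_dim, dim), nn.ReLU())
        self.attend = GATLayer(dim)
        self.pi = nn.Linear(dim, n_actions)  # actor head over cache/fetch actions
        self.q = nn.Linear(dim, n_actions)   # critic head
        self.alpha = alpha                   # entropy temperature

    def act(self, obs, neigh_embs):
        h = self.attend(self.encode(obs), neigh_embs)
        dist = torch.distributions.Categorical(logits=self.pi(h))
        a = dist.sample()
        return a, dist.log_prob(a), h

    def soft_value(self, h):
        # Soft state value: V(s) = E_pi[ Q(s,a) - alpha * log pi(a|s) ]
        probs = F.softmax(self.pi(h), dim=-1)
        logp = torch.log(probs + 1e-8)
        return (probs * (self.q(h) - self.alpha * logp)).sum(-1)

def consensus_step(agents, mix=0.5):
    # Gradual consensus by mixing each agent's parameters toward the mean of
    # its peers; a crude stand-in for the paper's consensus communication.
    with torch.no_grad():
        avg = [torch.stack(ps).mean(0)
               for ps in zip(*[list(a.parameters()) for a in agents])]
        for a in agents:
            for p, m in zip(a.parameters(), avg):
                p.mul_(1 - mix).add_(mix * m)
```

In a real deployment each agent would mix parameters only with its one-hop D2D neighbors and train the actor and critic with the usual soft actor-critic losses; the sketch only shows how attention-weighted neighbor information and gradual parameter consensus can be wired into a fully decentralized agent.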