IEEE International Conference on Communications Workshops

Federated Double Deep Q-learning for Joint Delay and Energy Minimization in IoT networks

Abstract

In this paper, we propose a federated deep reinforcement learning framework to solve a multi-objective optimization problem, in which we minimize the expected long-term task completion delay and energy consumption of IoT devices. This is done by jointly optimizing the offloading decisions, computation resource allocation, and transmit power allocation. Since the formulated problem is a mixed-integer non-linear program (MINLP), we first cast it as a multi-agent distributed deep reinforcement learning (DRL) problem and address it using a double deep Q-network (DDQN), where the actions are the offloading decisions. The immediate cost of each agent is calculated by solving either the transmit power optimization or the local computation resource optimization, depending on the selected offloading decision (action). Then, to enhance the learning speed of the IoT devices (agents), we incorporate federated learning (FDL) at the end of each episode. FDL enhances the scalability of the proposed DRL framework, creates a context for cooperation between agents, and minimizes their privacy concerns. Our numerical results demonstrate the efficacy of the proposed federated DDQN framework in terms of learning speed compared to federated deep Q-network (DQN) and non-federated DDQN algorithms. In addition, we investigate the impact of the batch size, the number of network layers, and the DDQN target-network update frequency on the learning speed of the FDL.
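The two mechanisms the abstract leans on can be made concrete with a short sketch: the double-DQN update, where the online network selects the next action and the target network evaluates it, and the federated step, where agents average their weights at the end of every episode. Below is a minimal Python sketch of both under toy assumptions: the linear Q-approximator, the random state transitions, the stand-in `immediate_cost` function, and all constants (`N_AGENTS`, `TARGET_SYNC`, and so on) are illustrative, not the paper's actual networks or system model. In the paper, the per-step cost comes from solving the transmit-power or local-computation sub-problem for the chosen offloading action.

```python
import numpy as np

rng = np.random.default_rng(0)

N_AGENTS = 3        # number of IoT devices (illustrative)
N_STATES = 4        # toy state features
N_ACTIONS = 2       # offloading decision: 0 = compute locally, 1 = offload
GAMMA = 0.9         # discount factor
LR = 0.05           # learning rate
TARGET_SYNC = 20    # DDQN target-network update frequency (steps)

class LinearDDQN:
    """Tiny linear Q-approximator with online and target weights."""
    def __init__(self):
        self.w = rng.normal(scale=0.1, size=(N_STATES, N_ACTIONS))
        self.w_target = self.w.copy()

    def q(self, s, target=False):
        return s @ (self.w_target if target else self.w)

    def train_step(self, s, a, r, s_next):
        # Double DQN target: the online net SELECTS the next action,
        # the target net EVALUATES it.
        a_star = int(np.argmax(self.q(s_next)))
        y = r + GAMMA * self.q(s_next, target=True)[a_star]
        td_error = y - self.q(s)[a]
        self.w[:, a] += LR * td_error * s   # semi-gradient update

def immediate_cost(action):
    """Random stand-in for the per-step delay/energy cost; the paper
    obtains it by solving the transmit-power or local-computation
    sub-problem for the chosen offloading action."""
    return rng.normal(loc=1.0 + 0.2 * action)

def federated_average(agents):
    """FDL step at episode end: average online weights across agents."""
    w_avg = np.mean([ag.w for ag in agents], axis=0)
    for ag in agents:
        ag.w = w_avg.copy()

agents = [LinearDDQN() for _ in range(N_AGENTS)]
for episode in range(5):
    for ag in agents:
        s = rng.normal(size=N_STATES)
        for step in range(50):
            # epsilon-greedy offloading decision
            a = int(np.argmax(ag.q(s))) if rng.random() > 0.1 else int(rng.integers(N_ACTIONS))
            r = -immediate_cost(a)            # cost minimization -> negative reward
            s_next = rng.normal(size=N_STATES)
            ag.train_step(s, a, r, s_next)
            if step % TARGET_SYNC == 0:
                ag.w_target = ag.w.copy()     # periodic target-network sync
            s = s_next
    federated_average(agents)                 # share model weights each episode
print(agents[0].w)
```

Averaging only the network weights at episode boundaries mirrors the FedAvg pattern: agents exchange model parameters rather than raw experience, which is what underpins the scalability and privacy claims in the abstract.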
