Published in: IEEE Systems Journal
An Actor-Critic Deep Reinforcement Learning Approach for Transmission Scheduling in Cognitive Internet of Things Systems

Abstract

The cognitive Internet of Things (CIoT) has recently attracted much interest in wireless networking because of its wide applications in smart cities, intelligent transportation systems, and smart metering networks. However, scheduling packet transmissions intelligently in CIoT systems remains a key challenge: how to design a smart agent that achieves intelligent decision making and effective interoperability. In this paper, we model the system state transitions as a Markov decision process and propose an actor-critic deep reinforcement learning algorithm based on a fuzzy normalized radial basis function neural network (called AC-FNRBF) to efficiently solve the intelligent transmission scheduling problem in CIoT systems with high-dimensional variables. The proposed AC-FNRBF algorithm can better approximate both the action function of the actor and the state-action value function of the critic without requiring prior knowledge of the system. A new reward function is established to maximize the system benefit; it jointly accounts for the transmission packet rate, the system throughput, the power consumption, and the transmission delay. Moreover, AC-FNRBF can adjust its learning structure and parameters in dynamic environments. Simulation results verify that the proposed algorithm achieves a higher transmission packet rate and system throughput with lower power consumption and transmission delay than other existing reinforcement learning algorithms.
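
The abstract names three ingredients: normalized radial basis function features, a critic that learns a value estimate, and a reward that trades packet rate and throughput against power and delay. The Python sketch below illustrates how such pieces could fit together. It is not the paper's AC-FNRBF algorithm. It omits the fuzzy layer, the actor, and the structure adaptation, and it uses a state-value critic with a plain TD(0) update in place of the state-action value function. The function names, the linear reward form, its weights, and the Gaussian width are all assumptions made for illustration; the abstract does not give them.

```python
import math


def normalized_rbf(state, centers, width):
    """Gaussian RBF activations divided by their sum (assumed form, no fuzzy layer)."""
    raw = [
        math.exp(-sum((s - c) ** 2 for s, c in zip(state, center)) / (2.0 * width ** 2))
        for center in centers
    ]
    total = sum(raw)
    if total == 0.0:
        # All activations underflowed; fall back to a uniform feature vector.
        return [1.0 / len(raw)] * len(raw)
    return [r / total for r in raw]


def reward(packet_rate, throughput, power, delay, weights=(1.0, 1.0, 1.0, 1.0)):
    """Hypothetical reward: weighted benefits minus weighted costs."""
    w_rate, w_thr, w_pow, w_delay = weights
    return w_rate * packet_rate + w_thr * throughput - w_pow * power - w_delay * delay


def critic_td_update(weights, phi, phi_next, r, gamma=0.9, lr=0.1):
    """One TD(0) step for a linear critic V(s) = w . phi(s).

    Returns the updated weights and the TD error, which an actor
    could use as its learning signal.
    """
    v = sum(w * p for w, p in zip(weights, phi))
    v_next = sum(w * p for w, p in zip(weights, phi_next))
    td_error = r + gamma * v_next - v
    new_weights = [w + lr * td_error * p for w, p in zip(weights, phi)]
    return new_weights, td_error


# Usage with made-up numbers: two RBF centers in a 2-D state space.
centers = [(0.0, 0.0), (1.0, 1.0)]
phi = normalized_rbf((0.0, 0.0), centers, width=1.0)
phi_next = normalized_rbf((1.0, 1.0), centers, width=1.0)
r = reward(packet_rate=1.0, throughput=2.0, power=0.5, delay=0.5)
critic_weights, td_error = critic_td_update([0.0, 0.0], phi, phi_next, r)
```

Normalizing the activations makes the features sum to one, so the critic's output is a weighted average of its weights. This keeps the value estimate bounded even where the state lies far from every center.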
