Evolving Systems

Adaptive maximum-lifetime routing in mobile ad-hoc networks using temporal difference reinforcement learning

Abstract

Mobile Ad-hoc NETworks (MANETs) are highly dynamic environments, so a routing protocol for MANETs should be adaptive in order to operate correctly in the presence of variable network conditions. Reinforcement learning (RL) is a technique recently applied to achieve adaptive routing in MANETs. Compared with other machine learning and computational intelligence techniques, RL achieves optimal results at low processing cost and medium memory cost. To address the adaptive energy-aware routing problem in MANETs, an RL-based maximum-lifetime routing strategy is proposed. Each mobile node learns how to adjust the forwarding rate of its route-request packets according to its energy profile. Four temporal-difference RL algorithms are used: Q-learning, SARSA, Q(λ), and SARSA(λ). The proposed RL model is implemented on top of the AODV routing protocol. Simulation results show that the RL-based AODV performs well compared with time-delay-based and probability-based AODV. In particular, the Q-learning-based AODV achieves the best overall performance in terms of energy efficiency and end-to-end delay.
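To make the temporal-difference updates named above concrete, the sketch below shows how a node might use tabular Q-learning or SARSA to choose a route-request (RREQ) forwarding rate from its discretized residual-energy level. This is a minimal illustration, not the paper's actual design: the state and action discretizations, the reward signal, and all parameter values are assumptions made here for clarity.

```python
import random

# Hypothetical sketch: a MANET node learns an RREQ forwarding rate
# from its residual-energy level via tabular temporal-difference RL.
# All constants below are illustrative assumptions, not the paper's values.

ENERGY_LEVELS = 5                        # discretized residual-energy states (assumed)
FORWARD_RATES = [0.25, 0.5, 0.75, 1.0]   # candidate RREQ forwarding rates (assumed)
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1        # learning rate, discount factor, exploration rate

# Q-table indexed as Q[state][action]
Q = [[0.0] * len(FORWARD_RATES) for _ in range(ENERGY_LEVELS)]

def energy_state(residual, capacity):
    """Map residual battery energy to a discrete state index in [0, ENERGY_LEVELS)."""
    frac = max(0.0, min(residual / capacity, 0.999))
    return int(frac * ENERGY_LEVELS)

def choose_action(state):
    """Epsilon-greedy selection of a forwarding-rate index."""
    if random.random() < EPS:
        return random.randrange(len(FORWARD_RATES))
    row = Q[state]
    return row.index(max(row))

def q_learning_update(s, a, reward, s_next):
    """Off-policy TD(0): Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    td_target = reward + GAMMA * max(Q[s_next])
    Q[s][a] += ALPHA * (td_target - Q[s][a])

def sarsa_update(s, a, reward, s_next, a_next):
    """On-policy TD(0): Q(s,a) += alpha * (r + gamma * Q(s',a') - Q(s,a))."""
    td_target = reward + GAMMA * Q[s_next][a_next]
    Q[s][a] += ALPHA * (td_target - Q[s][a])
```

The two update rules differ only in the bootstrap term: Q-learning bootstraps from the value of the greedy next action, while SARSA bootstraps from the action the node actually takes next. The Q(λ) and SARSA(λ) variants used in the paper extend these updates by propagating each TD error backwards along an eligibility trace over recently visited state-action pairs.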