Journal of Parallel and Distributed Computing

Improving reliability in resource management through adaptive reinforcement learning for distributed systems



Abstract

Demand for capacity in distributed systems (e.g., Grid and Cloud) plays a crucial role in today's information era as these systems grow in scale. While distributed systems provide a vast amount of computing power, their reliability is often hard to guarantee. This paper presents effective resource management using adaptive reinforcement learning (RL) that focuses on improving successful execution with low computational complexity. The approach combines an emerging RL methodology with a neural network to help a scheduler observe and adapt effectively to dynamic changes in execution environments. Observation of the environment at various learning stages, normalized by resource-aware availability and feedback-based scheduling, brings the environment significantly closer to optimal solutions. Our approach also addresses the high computational complexity of RL systems through on-demand information sharing. Results from our extensive simulations demonstrate the effectiveness of adaptive RL for improving system reliability.
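
The feedback loop described in the abstract can be illustrated with a deliberately simplified sketch. It is not the paper's method: it replaces the neural network with a plain value table, omits on-demand information sharing, and models each resource as a fixed success probability. The names `train_scheduler` and `pick_resource`, and all parameter values, are hypothetical and chosen only for illustration.

```python
import random


def train_scheduler(success_prob, episodes=5000, alpha=0.02, epsilon=0.1, seed=0):
    """Learn one value estimate per resource from success/failure feedback.

    success_prob: assumed per-resource probability that a task succeeds
                  (a stand-in for a real, changing execution environment).
    alpha:        constant step size, so recent feedback keeps some weight.
    epsilon:      probability of trying a random resource (exploration).
    """
    rng = random.Random(seed)
    q = [0.0] * len(success_prob)  # estimated reliability per resource
    for _ in range(episodes):
        if rng.random() < epsilon:
            # Explore: pick any resource uniformly at random.
            r = rng.randrange(len(q))
        else:
            # Exploit: pick the resource with the highest current estimate.
            r = max(range(len(q)), key=q.__getitem__)
        # Feedback from the simulated environment: 1 on success, 0 on failure.
        reward = 1.0 if rng.random() < success_prob[r] else 0.0
        # Move the estimate a small step toward the observed outcome.
        q[r] += alpha * (reward - q[r])
    return q


def pick_resource(q):
    """Return the index of the resource currently estimated most reliable."""
    return max(range(len(q)), key=q.__getitem__)
```

Calling `train_scheduler([0.2, 0.95, 0.5])` and then `pick_resource` on the result selects the resource the scheduler has learned to trust most. A constant step size is used here, rather than a running average, so that the estimates can keep tracking a resource whose reliability drifts over time.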
