Journal of Combinatorial Optimization

A reinforcement-learning approach for admission control in distributed network service systems


Abstract

In distributed network service systems, such as streaming-media systems and resource-sharing systems with multiple service nodes, admission control (AC) is an essential technique for enhancing performance. Model-based optimization approaches are well suited to analyzing and computing the optimal AC policy. However, owing to the "curse of dimensionality", computing such a policy for practical systems is a rather difficult task. In this paper, we consider a general model of distributed network service systems and address the problem of designing an optimal AC policy. An analytical model based on a semi-Markov decision process (SMDP) is presented for the system with fixed parameters. We design an event-driven AC policy, adopting the stationary randomized policy as the policy structure. To solve the SMDP, we apply both a state-aggregation approach and a reinforcement-learning (RL) method with an online policy optimization algorithm. We then extend the problem to systems with time-varying parameters, where the arrival rate of requests at each service node may change over time. For this situation, an AC policy switching mechanism is presented: it allows the system to decide whether to adjust its AC policy according to a policy switching rule. To maximize the system's gain, that is, to obtain the optimal AC policy switching rule, another RL-based algorithm is applied. To assess the effectiveness of the SMDP-based AC policy and the policy switching mechanism, we present numerical experiments comparing the performance of the optimal policies obtained by the proposed methods with that of other classical AC policies. The simulation results show that higher performance and computational efficiency can be achieved with the SMDP model and RL-based algorithms proposed in this paper.
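To make the policy structure concrete, the following is a minimal sketch of an event-driven, stationary randomized AC policy of the kind the abstract describes: each aggregated state (e.g., a node's load level) carries an admit probability, consulted on every request-arrival event, and nudged by a crude REINFORCE-style update. The class name, state encoding, reward signal, and learning rule here are all illustrative assumptions; the paper's actual state space, SMDP formulation, and online policy optimization algorithm are not reproduced.

```python
import random

class RandomizedACPolicy:
    """Sketch of a stationary randomized admission-control policy.

    For each aggregated state s (e.g., the current load level of a
    service node), the policy stores an admit probability p[s]. On a
    request arrival -- the driving event -- the request is admitted
    with probability p[s]. All names and the update rule are
    illustrative, not the paper's exact algorithm.
    """

    def __init__(self, n_states, init_prob=0.5, lr=0.05):
        self.p = [init_prob] * n_states  # admit probability per state
        self.lr = lr                     # learning-rate for updates

    def decide(self, state, rng=random):
        """Event-driven decision: admit with probability p[state]."""
        return rng.random() < self.p[state]

    def update(self, state, admitted, reward):
        """Crude policy-gradient-style update: push p[state] toward
        the action taken when reward is positive, away when negative,
        then clamp back into [0, 1]."""
        grad = reward * (1.0 if admitted else -1.0)
        self.p[state] = min(1.0, max(0.0, self.p[state] + self.lr * grad))
```

In a simulation loop, `decide` would be called on each arrival and `update` on each observed reward (e.g., revenue for a completed service, a penalty for blocking), so the admit probabilities drift toward the better action in each aggregated state.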
