Journal: Elevator World

A STUDY OF ELEVATOR DYNAMIC SCHEDULING POLICY BASED ON REINFORCEMENT LEARNING



Abstract

The elevator group scheduling problem is formulated within the framework of the Markov Decision Process (MDP), and the elements of the reinforcement learning model are then defined. In applying reinforcement learning, a stochastic action-selection policy handles exploration and a feed-forward neural network handles generalization of the value function. Both are integrated into the value iteration algorithm known as "Q-learning" to form the complete algorithm for elevator group scheduling. The simulation results demonstrate the algorithm's good learning ability, good scheduling performance and adaptability to different traffic flows.
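The three ingredients named in the abstract (stochastic action selection, a feed-forward value approximator, and the Q-learning update) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the Boltzmann (softmax) form of the stochastic policy, the single tanh hidden layer, the state encoding, and every name and parameter (`boltzmann_probs`, `TinyQNetwork`, `temperature`, `gamma`, `lr`) are assumptions introduced here.

```python
import math
import random


def boltzmann_probs(q_values, temperature):
    """Softmax over Q-values (assumed form of the stochastic policy)."""
    m = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def select_action(q_values, temperature, rng):
    """Sample an action index; a higher temperature means more exploration."""
    probs = boltzmann_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


class TinyQNetwork:
    """Hypothetical feed-forward net: one tanh hidden layer, one Q per action."""

    def __init__(self, n_inputs, n_hidden, n_actions, rng):
        self.w1 = [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
                   for _ in range(n_actions)]
        self.b2 = [0.0] * n_actions

    def forward(self, x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        q = [sum(w * hi for w, hi in zip(row, h)) + b
             for row, b in zip(self.w2, self.b2)]
        return h, q

    def q_values(self, x):
        return self.forward(x)[1]

    def update(self, x, action, target, lr):
        """One gradient step moving Q(x, action) toward target; returns the error."""
        h, q = self.forward(x)
        err = target - q[action]
        # Hidden-layer sensitivities, computed before w2 is modified.
        grad_h = [self.w2[action][j] * (1.0 - h[j] ** 2) for j in range(len(h))]
        for j in range(len(h)):
            self.w2[action][j] += lr * err * h[j]
        self.b2[action] += lr * err
        for j in range(len(h)):
            for i in range(len(x)):
                self.w1[j][i] += lr * err * grad_h[j] * x[i]
            self.b1[j] += lr * err * grad_h[j]
        return err


def q_learning_step(net, state, action, reward, next_state, gamma, lr):
    """Q-learning update: target = reward + gamma * max_a' Q(next_state, a')."""
    target = reward + gamma * max(net.q_values(next_state))
    return net.update(state, action, target, lr)
```

In an elevator setting the state vector would encode quantities such as car positions and pending hall calls, and the reward would penalize passenger waiting time. Those encodings are not given in the abstract, so they are left to the caller here.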
