Journal: IEEE Transactions on Automation Science and Engineering (a publication of the IEEE Robotics and Automation Society)

Real-Time Scheduling for Dynamic Partial-No-Wait Multiobjective Flexible Job Shop by Deep Reinforcement Learning

Abstract

In modern discrete flexible manufacturing systems, dynamic disturbances frequently occur in real time, and due to technological requirements each job may contain several special operations subject to a partial-no-wait constraint. In this regard, a hierarchical multiagent deep reinforcement learning (DRL)-based real-time scheduling method named hierarchical multiagent proximal policy optimization (HMAPPO) is developed to address the dynamic partial-no-wait multiobjective flexible job shop scheduling problem (DMOFJSP-PNW) with new job insertions and machine breakdowns. The proposed HMAPPO contains three proximal policy optimization (PPO)-based agents operating at different spatiotemporal scales, namely, an objective agent, a job agent, and a machine agent. The objective agent acts as a higher-level controller that periodically determines the temporary objective to be optimized. The job agent and machine agent are lower-level actuators that, at each rescheduling point, choose a job selection rule and a machine assignment rule, respectively, to achieve the current temporary objective. Five job selection rules and six machine assignment rules are designed to select an uncompleted job and assign its next operation, together with that operation's successors under the no-wait constraint, to the corresponding processing machines. A hierarchical PPO-based training algorithm is developed. Extensive numerical experiments confirm the effectiveness and superiority of the proposed HMAPPO compared with other well-known dynamic scheduling methods.

Note to Practitioners—The motivation of this article stems from the need to develop real-time scheduling methods for modern discrete flexible manufacturing factories, such as aerospace product manufacturing and steel manufacturing, where dynamic events frequently occur and each job may contain several operations subject to the no-wait constraint. Traditional dynamic scheduling methods, such as metaheuristics or dispatching rules, either suffer from poor time efficiency or fail to ensure good solution quality across multiple objectives in the long run. Meanwhile, few previous studies have considered the partial-no-wait constraint among several operations of the same job, which widely exists in many industries. In this article, we propose a hierarchical multiagent deep reinforcement learning (DRL)-based real-time scheduling method named HMAPPO to address the dynamic partial-no-wait multiobjective flexible job shop scheduling problem (DMOFJSP-PNW) with new job insertions and machine breakdowns. The proposed HMAPPO uses three DRL-based agents to adaptively select temporary objectives and choose the most suitable dispatching rules to achieve them at different rescheduling points, through which rescheduling can be performed in real time and a good compromise among the different objectives can be obtained in the long-term schedule. Extensive experimental results demonstrate the effectiveness and superiority of the proposed HMAPPO. For industrial applications, this method can be extended to many other production scheduling problems, such as hybrid flow shops and open shops with different uncertainties and objectives.
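To make the hierarchy concrete, below is a minimal Python sketch of the control flow described above. The Agent class stands in for a trained PPO policy, and the objective and rule names are hypothetical placeholders rather than the paper's actual sets; only the two-timescale structure (a periodic objective choice on top, per-rescheduling-point rule choices underneath) follows the abstract.

    # Minimal sketch of the hierarchical decision flow described above.
    # Agent is a stand-in for a trained PPO policy; objective and rule
    # names are hypothetical placeholders, not the paper's definitions.
    import random
    from dataclasses import dataclass

    OBJECTIVES = ["makespan", "total_tardiness", "machine_load"]   # assumed set
    JOB_RULES = [f"job_rule_{i}" for i in range(1, 6)]             # 5 job selection rules
    MACHINE_RULES = [f"machine_rule_{i}" for i in range(1, 7)]     # 6 machine assignment rules

    @dataclass
    class Agent:
        actions: list
        def act(self, state):
            # A trained PPO policy would score actions from shop-floor
            # features; uniform sampling just exposes the control flow.
            return random.choice(self.actions)

    objective_agent = Agent(OBJECTIVES)   # higher controller, slow timescale
    job_agent = Agent(JOB_RULES)          # lower actuator, every rescheduling point
    machine_agent = Agent(MACHINE_RULES)  # lower actuator, every rescheduling point

    def run_episode(n_points=12, objective_period=4):
        objective = None
        for t in range(n_points):   # t indexes rescheduling points
            state = {"t": t}        # real state: queue lengths, loads, tardiness...
            if t % objective_period == 0:          # objective agent acts periodically
                objective = objective_agent.act(state)
            state["objective"] = objective
            job_rule = job_agent.act(state)
            machine_rule = machine_agent.act(state)
            print(f"t={t:2d} optimize {objective}: apply {job_rule} + {machine_rule}")

    run_episode()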
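The partial-no-wait constraint implies that, once a job's next operation is dispatched, each no-wait successor must start exactly when its predecessor finishes, so the chain can be placed as one rigid block. Below is a minimal sketch of that placement under assumed data structures; the Operation fields, the machine_free bookkeeping, and the block-shift computation are illustrative assumptions, not the authors' implementation.

    # Minimal sketch: schedule an operation and its no-wait successors
    # back to back. All data structures here are illustrative assumptions.
    from dataclasses import dataclass

    @dataclass
    class Operation:
        machine: int      # machine the assignment rule chose for this operation
        proc_time: float  # processing time on that machine

    def schedule_chain(chain, machine_free, job_ready):
        """Place a no-wait chain as one rigid block.

        chain: the chosen operation followed by its no-wait successors.
        machine_free: machine id -> earliest time that machine is free.
        job_ready: earliest time the job may resume processing.
        """
        # Offset of each operation from the chain's start; the zero-wait
        # gaps make these offsets fixed.
        offsets, t = [], 0.0
        for op in chain:
            offsets.append(t)
            t += op.proc_time
        # Shifting the whole block right preserves the zero-wait gaps, so
        # the earliest feasible start is the tightest machine/job bound.
        start = max([job_ready] + [machine_free[op.machine] - off
                                   for op, off in zip(chain, offsets)])
        plan = []
        for op, off in zip(chain, offsets):
            s = start + off
            plan.append((op.machine, s, s + op.proc_time))
            machine_free[op.machine] = s + op.proc_time
        return plan

    chain = [Operation(0, 3.0), Operation(1, 2.0)]   # O1 -> no-wait -> O2
    print(schedule_chain(chain, {0: 4.0, 1: 0.0}, job_ready=1.0))
    # -> [(0, 4.0, 7.0), (1, 7.0, 9.0)]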
