Computers & Industrial Engineering

Reinforcement learning with Gaussian processes for condition-based maintenance



Abstract

Condition-based maintenance strategies are effective in enhancing reliability and safety for complex engineering systems that exhibit degradation phenomena with uncertainty. Such sequential decision-making problems are often modeled as Markov decision processes (MDPs) when the underlying process has the Markov property. Recently, reinforcement learning (RL) has become increasingly effective at addressing MDP problems with large state spaces. In this paper, we model the condition-based maintenance problem as a discrete-time continuous-state MDP without discretizing the deterioration condition of the system. Gaussian process regression is used as a function approximator to model the state transitions and the state value functions in reinforcement learning. An RL algorithm is then developed to minimize the long-run average cost (instead of the commonly used discounted reward) with iterations on the state-action value function and the state value function, respectively. We verify the capability of the proposed algorithm through simulation experiments and demonstrate its advantages in a case study on a battery maintenance decision-making problem. The proposed algorithm outperforms the discrete MDP approach by achieving lower long-run average costs.
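To make the abstract's ideas concrete, the following is a minimal illustrative sketch of GP-based relative value iteration for an average-cost maintenance MDP with a continuous degradation state. It is not the authors' algorithm: the gamma-increment degradation model, the cost values, and all parameters are assumptions, and the GP-learned transition model described in the abstract is simplified here to a known simulator. Only the state value function is approximated by a Gaussian process.

```python
# A minimal sketch of GP-based relative value iteration for a condition-based
# maintenance MDP with a continuous degradation state in [0, 1].
# All dynamics, costs, and parameters below are illustrative assumptions,
# not values from the paper.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)

# Assumed degradation dynamics: gamma-distributed deterioration increments.
def step(x, action):
    if action == 1:                      # preventive replacement: as good as new
        return 0.0
    return min(1.0, x + rng.gamma(2.0, 0.02))

# Assumed per-period cost structure (hypothetical values).
C_REPLACE, C_FAIL = 5.0, 50.0
def cost(x, action):
    if x >= 1.0:                         # failed system incurs corrective cost
        return C_FAIL
    return C_REPLACE if action == 1 else 0.0

# Support states on which the value function is learned; the GP generalizes
# the value estimates to the whole continuous state space.
X = np.linspace(0.0, 1.0, 25).reshape(-1, 1)
V = np.zeros(len(X))                     # relative value estimates
gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.2), alpha=1e-3)

g = 0.0                                  # long-run average cost (gain) estimate
for it in range(60):
    gp.fit(X, V)                         # GP approximates V over continuous states
    V_new = np.empty_like(V)
    for i, x in enumerate(X[:, 0]):
        q = []
        for a in (0, 1):                 # one-step lookahead, Monte Carlo expectation
            nxt = np.array([[step(x, a)] for _ in range(30)])
            q.append(cost(x, a) + gp.predict(nxt).mean())
        V_new[i] = min(q)                # greedy backup over the two actions
    g = V_new[0]                         # gain estimate at the reference state
    V = V_new - g                        # subtract gain so values stay bounded

print(f"estimated long-run average cost per period: {g:.3f}")
```

The subtraction of the gain at a reference state is what distinguishes average-cost (relative) value iteration from the discounted formulation mentioned in the abstract: without it, the value estimates would grow without bound, since no discount factor contracts the backup.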


