Condition-based maintenance (CBM) optimization can become intractable when the asset of interest is a complex system with multiple units. This paper aims to find a CBM policy for a multi-unit series system subject to stochastic degradation, where, at each inspection, the next inspection is scheduled based on age and condition-monitoring data. The novelty of this study lies in a modified deep reinforcement learning (DRL) algorithm for semi-Markov decision processes (SMDPs) that finds an opportunistic CBM policy for a multi-unit system with economic dependency over an infinite horizon, where a range of repair actions is allowed under an aperiodic inspection scheme. We also propose a novel environment simulator that accounts for the simultaneous impact of age and covariates using the proportional hazards (PH) model and the system's reliability characteristics. DRL serves both as a learning algorithm that obviates the need to fully specify the model and as an approximation scheme that produces a solution within a limited computational budget. The proposed algorithm is applied to a multi-unit hydroelectric power system with a damage self-healing property, both to demonstrate that the DRL algorithm achieves lower costs than alternative policies and to explain how enhancing system reliability reduces costs during the learning process.