Journal: Empirical Software Engineering

Testing self-healing cyber-physical systems under uncertainty with reinforcement learning: an empirical study


Abstract

Self-healing is becoming an essential feature of Cyber-Physical Systems (CPSs). CPSs with this feature are called Self-Healing CPSs (SH-CPSs). SH-CPSs detect and recover from errors caused by hardware or software faults at runtime, and they handle uncertainties arising from their interactions with their environments. It is therefore critical to test whether SH-CPSs still behave as expected under uncertainties. By testing an SH-CPS under various conditions and learning from the test results, reinforcement learning algorithms can gradually optimize their testing policies and apply those policies to detect failures, i.e., cases in which the SH-CPS fails to behave as expected. However, there is insufficient evidence to determine which reinforcement learning algorithms perform best at testing SH-CPSs' behaviors, including their self-healing behaviors, under uncertainties. To this end, we conducted an empirical study to evaluate the performance of 14 combinations of reinforcement learning algorithms: two value-function-learning-based methods for operation invocations, each paired with one of seven policy-optimization-based algorithms for introducing uncertainties. Experimental results reveal that the 14 combinations achieved similar coverage of system states and transitions, and that the combination of Q-learning and Uncertainty Policy Optimization (UPO) detected the most failures among the 14. On average, the Q-learning and UPO combination discovered two times more failures than the others and took 52% less time to find a failure. Regarding scalability, the time and space costs of the value-function-learning-based methods grow as the number of states and transitions of the system under test increases. In contrast, increasing the system's complexity has little impact on the policy-optimization-based algorithms.
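To make the idea of a value-function-learning method that chooses operation invocations more concrete, the following is a minimal sketch of tabular Q-learning on a toy state machine. It is not the study's implementation: the states, the operation names (`op_a`, `op_b`), the reward of 1.0 for reaching a failure state, and all hyperparameters are assumptions made purely for illustration, and the policy-optimization side that introduces uncertainties (e.g., UPO) is not modeled here.

```python
import random

# Hypothetical toy model of a system under test (illustrative only).
# TRANSITIONS[state][operation] -> next state; state 3 stands for a failure.
TRANSITIONS = {
    0: {"op_a": 1, "op_b": 0},
    1: {"op_a": 0, "op_b": 2},
    2: {"op_a": 3, "op_b": 0},
}
FAILURE_STATE = 3
OPERATIONS = ["op_a", "op_b"]


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2,
               max_steps=20, seed=0):
    """Learn Q-values for invoking operations so that failures are reached."""
    rng = random.Random(seed)
    q = {(s, op): 0.0 for s in TRANSITIONS for op in OPERATIONS}
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            # Epsilon-greedy choice of the next operation to invoke.
            if rng.random() < epsilon:
                op = rng.choice(OPERATIONS)
            else:
                op = max(OPERATIONS, key=lambda o: q[(state, o)])
            nxt = TRANSITIONS[state][op]
            # Assumed reward: 1.0 when the invocation exposes a failure.
            reward = 1.0 if nxt == FAILURE_STATE else 0.0
            future = 0.0 if nxt == FAILURE_STATE else max(
                q[(nxt, o)] for o in OPERATIONS)
            # Standard Q-learning update rule.
            q[(state, op)] += alpha * (reward + gamma * future - q[(state, op)])
            if nxt == FAILURE_STATE:
                break
            state = nxt
    return q


def greedy_path(q, max_steps=10):
    """Follow the learned greedy policy from state 0 and record operations."""
    state, path = 0, []
    for _ in range(max_steps):
        op = max(OPERATIONS, key=lambda o: q[(state, o)])
        path.append(op)
        state = TRANSITIONS[state][op]
        if state == FAILURE_STATE:
            break
    return path
```

The Q-table here holds one entry per state-operation pair, which is one way to see why the abstract reports that the time and space costs of value-function-learning methods grow with the number of states and transitions of the system under test.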
