SAFE AND FAST EXPLORATION FOR REINFORCEMENT LEARNING USING CONSTRAINED ACTION MANIFOLDS

Abstract

According to an aspect of the present invention, a computer-implemented method is provided for reinforcement learning. The method includes reading, by a processor device, an action manifold which is described as an n-polytope, at least one physical action limit, and at least one safety constraint. The method further includes updating, by the processor device, the action manifold based on the at least one physical action limit and the at least one safety constraint. The method also includes performing, by the processor device, the reinforcement learning by selecting a constrained action from among a set of constrained actions in the action manifold.
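
The three steps in the abstract (read the manifold and constraints, update the manifold, select a constrained action) can be illustrated with a minimal sketch. It rests on assumptions that are not stated in the abstract: the n-polytope is stored in half-space form {a : A a <= b}, physical limits are per-dimension bounds, the update is an intersection of half-spaces, and the constrained action is drawn by rejection sampling. The names `ActionPolytope`, `box_limits`, and `sample_constrained_action` are hypothetical, and the patent's actual procedure may differ.

```python
import numpy as np


class ActionPolytope:
    """Convex action set {a : A @ a <= b}. Half-space form is an assumption."""

    def __init__(self, A, b):
        self.A = np.atleast_2d(np.asarray(A, dtype=float))
        self.b = np.asarray(b, dtype=float).ravel()

    def add_constraints(self, A_new, b_new):
        """Return the intersection of this polytope with extra half-spaces."""
        A_new = np.atleast_2d(np.asarray(A_new, dtype=float))
        b_new = np.asarray(b_new, dtype=float).ravel()
        return ActionPolytope(np.vstack([self.A, A_new]),
                              np.concatenate([self.b, b_new]))

    def contains(self, a, tol=1e-9):
        """True if action `a` satisfies every half-space constraint."""
        return bool(np.all(self.A @ np.asarray(a, dtype=float) <= self.b + tol))


def box_limits(low, high):
    """Encode per-dimension bounds low <= a <= high as rows of A @ a <= b."""
    low = np.asarray(low, dtype=float)
    high = np.asarray(high, dtype=float)
    n = low.size
    return np.vstack([np.eye(n), -np.eye(n)]), np.concatenate([high, -low])


def sample_constrained_action(polytope, low, high, rng, max_tries=10_000):
    """Rejection-sample an action from the bounding box [low, high]."""
    for _ in range(max_tries):
        a = rng.uniform(low, high)
        if polytope.contains(a):
            return a
    raise RuntimeError("no feasible action found within max_tries")


# Step 1: read an initial 2-D action manifold (illustrative values only).
manifold = ActionPolytope(*box_limits([-2.0, -2.0], [2.0, 2.0]))

# Step 2: update it with a physical action limit and a safety constraint.
manifold = manifold.add_constraints(*box_limits([-1.0, -1.0], [1.0, 1.0]))
manifold = manifold.add_constraints([[1.0, 1.0]], [0.5])  # a0 + a1 <= 0.5

# Step 3: select a constrained action for exploration.
rng = np.random.default_rng(0)
action = sample_constrained_action(manifold, [-1.0, -1.0], [1.0, 1.0], rng)
```

Rejection sampling is chosen here only for brevity. It becomes inefficient when the feasible region is a small fraction of the bounding box, in which case projecting a policy's proposed action onto the polytope would be one alternative.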

Bibliographic data

Publication number: US2020065666A1

Patent type:
Publication date: 2020-02-27

Original format: PDF
Applicant/assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Application number: US201816112076
Inventors: GIOVANNI DE MAGISTRIS; TU-HOA PHAM; ASIM MUNAWAR; RYUKI TACHIBANA

Filing date: 2018-08-24
Classification: G06N3/08; G06N3/04
Country: US
Database entry time: 2022-08-21 11:20:45
