Restraining Bolts for Reinforcement Learning Agents

Abstract

In this work we investigate the concept of a "restraining bolt", inspired by science fiction. We have two distinct sets of features extracted from the world: one by the agent, and one by the authority imposing restraining specifications on the behaviour of the agent (the "restraining bolt"). The two sets of features, and hence the models of the world attainable from them, are apparently unrelated, since they are of interest to independent parties. However, they both account for (aspects of) the same world. We consider the case in which the agent is a reinforcement learning agent operating on a set of low-level (subsymbolic) features, while the restraining bolt is specified logically, using linear-time logics on finite traces (LTL_f/LDL_f), over a set of high-level symbolic features. We show formally, and illustrate with examples, that under general circumstances the agent can learn while shaping its goals to conform, as much as possible, to the restraining bolt specifications.
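The separation described in the abstract can be illustrated with a small sketch. It assumes that the finite-trace specification has already been compiled into a deterministic finite automaton (DFA), which is a standard way to handle LTL_f formulas; the abstract itself does not spell out the construction. The corridor world, the fluent names `at_a` and `at_b`, the reward values, and every function name below are hypothetical and chosen only for illustration. The agent sees its cell index, the bolt sees only the symbolic fluents, and tabular Q-learning runs over the pair (cell, DFA state).

```python
import random

# Hypothetical toy world: a corridor of cells 0..4. The agent observes only its
# cell index (low-level feature); the bolt observes only two symbolic fluents.
N_CELLS = 5
ACTIONS = (-1, +1)
START_CELL = 2
BOLT_REWARD = 10.0  # assumed bonus paid when the bolt's DFA accepts
STEP_COST = -0.1    # assumed environment reward, unrelated to the bolt


def fluents(cell):
    """High-level features visible to the restraining bolt (assumed labelling)."""
    return {"at_a": cell == 0, "at_b": cell == N_CELLS - 1}


def dfa_step(q, f):
    """Hand-written DFA for 'eventually at_a, and afterwards eventually at_b'.

    States: 0 = waiting for at_a, 1 = waiting for at_b, 2 = accepting.
    """
    if q == 0 and f["at_a"]:
        return 1
    if q == 1 and f["at_b"]:
        return 2
    return q


def move(cell, action):
    """Deterministic corridor dynamics with walls at both ends."""
    return min(max(cell + action, 0), N_CELLS - 1)


def train(episodes=500, horizon=30, alpha=0.5, gamma=0.95, eps=0.2, seed=0):
    """Tabular Q-learning over the extended state (cell, dfa_state)."""
    rng = random.Random(seed)
    q_table = {}  # keyed by (cell, dfa_state, action)
    for _ in range(episodes):
        cell = START_CELL
        q = dfa_step(0, fluents(cell))
        for _ in range(horizon):
            if rng.random() < eps:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda act: q_table.get((cell, q, act), 0.0))
            nxt = move(cell, a)
            nq = dfa_step(q, fluents(nxt))
            done = nq == 2
            # Environment reward plus the bolt's reward on acceptance.
            r = STEP_COST + (BOLT_REWARD if done else 0.0)
            best_next = 0.0 if done else max(
                q_table.get((nxt, nq, b), 0.0) for b in ACTIONS
            )
            old = q_table.get((cell, q, a), 0.0)
            q_table[(cell, q, a)] = old + alpha * (r + gamma * best_next - old)
            cell, q = nxt, nq
            if done:
                break
    return q_table


def greedy_rollout(q_table, horizon=30):
    """Follow the learned greedy policy; return visited cells and final DFA state."""
    cell = START_CELL
    q = dfa_step(0, fluents(cell))
    trace = [cell]
    for _ in range(horizon):
        a = max(ACTIONS, key=lambda act: q_table.get((cell, q, act), 0.0))
        cell = move(cell, a)
        q = dfa_step(q, fluents(cell))
        trace.append(cell)
        if q == 2:
            break
    return trace, q
```

Calling `greedy_rollout(train())` returns the cells visited by the learned policy together with the final DFA state; a final state of 2 means the trace satisfied the bolt's specification. Note that the learner never reads `fluents` directly: it only receives the DFA state and the extra reward, which is one way to keep the agent's features and the authority's features separate.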