Zero-shot policy generation in lifelong reinforcement learning

Abstract

Lifelong reinforcement learning (LRL) is an important approach to continual, lifelong learning across multiple reinforcement learning tasks. The two major methods used in LRL are task decomposition and policy knowledge extraction. Policy knowledge extraction can share knowledge both among tasks in different task domains and among tasks in the same task domain that differ in their system environmental coefficients. However, its generalization ability is limited to learned tasks and does not extend to learned task domains. In this paper, we propose a cross-domain lifelong reinforcement learning algorithm with zero-shot policy generation ability (CDLRL-ZPG), which extends the generalization ability of policy knowledge extraction from learned tasks to learned task domains. We evaluated CDLRL-ZPG on four task domains. The results show that the proposed algorithm can directly produce satisfactory results without a trial-and-error learning process, achieving zero-shot learning in general. (c) 2021 Elsevier B.V. All rights reserved.

Bibliographic details

  • Source
    Neurocomputing | 2021, Issue 25 | pp. 65-73 | 9 pages
  • Author affiliations

    Univ Chinese Acad Sci UCAS Sch Artificial Intelligence Beijing 100049 Peoples R China|Chinese Acad Sci Inst Automat State Key Lab Management & Control Complex Syst Beijing 100190 Peoples R China;

    Chinese Acad Sci Inst Automat State Key Lab Management & Control Complex Syst Beijing 100190 Peoples R China|Meituan Beijing Peoples R China;

    Univ Chinese Acad Sci UCAS Sch Artificial Intelligence Beijing 100049 Peoples R China|Chinese Acad Sci Inst Automat State Key Lab Management & Control Complex Syst Beijing 100190 Peoples R China|Chinese Acad Sci CAS Ctr Excellence Brain Sci & Intelligence Techn Shanghai 200031 Peoples R China;

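  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification
  • Keywords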

    Lifelong reinforcement learning; Generalization policy; Task domain;
