
Self-improvement Based On Reinforcement Learning, Planning and Teaching


Abstract

AHC-learning and Q-learning are slow learning methods. This paper investigates three extensions to the two basic learning algorithms. The three extensions are 1) experience replay, 2) learning action models for planning, and 3) teaching. The basic algorithms and their extensions were evaluated using a dynamic environment as a testbed. The environment is nontrivial and nondeterministic. The results show that the extensions can effectively improve the learning rate and in many cases even the asymptotic performance.
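To make the first extension concrete, the following is a minimal sketch of tabular Q-learning combined with experience replay. It is not the paper's implementation: the five-state chain environment, the names `chain_step`, `q_update`, `greedy`, and `train`, the hyperparameters, and the choice to replay stored transitions in reverse temporal order are all assumptions made for illustration. The idea it shows is that each stored transition is reused for additional Q-learning backups instead of being discarded after a single update.

```python
import random
from collections import defaultdict

ACTIONS = (0, 1)  # hypothetical toy actions: 0 = move left, 1 = move right
N_STATES = 5      # hypothetical chain of states 0..4; state 4 is the goal


def chain_step(s, a):
    """Toy deterministic chain (an assumption, not the paper's testbed)."""
    s_next = max(0, min(N_STATES - 1, s + (1 if a == 1 else -1)))
    done = s_next == N_STATES - 1
    return s_next, (1.0 if done else 0.0), done


def q_update(Q, s, a, r, s_next, done, alpha=0.1, gamma=0.9):
    """One-step tabular Q-learning backup for a single transition."""
    best_next = 0.0 if done else max(Q[(s_next, b)] for b in ACTIONS)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])


def greedy(Q, s, rng=None):
    """Greedy action; ties are broken randomly when an rng is supplied."""
    best = max(Q[(s, b)] for b in ACTIONS)
    ties = [b for b in ACTIONS if Q[(s, b)] == best]
    return rng.choice(ties) if rng else ties[0]


def train(episodes=50, replay_passes=5, epsilon=0.2, max_steps=100, seed=0):
    rng = random.Random(seed)
    Q = defaultdict(float)
    buffer = []  # stored experiences: (s, a, r, s_next, done)
    for _ in range(episodes):
        s = 0
        for _ in range(max_steps):
            # Epsilon-greedy exploration.
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)
            else:
                a = greedy(Q, s, rng)
            s_next, r, done = chain_step(s, a)
            q_update(Q, s, a, r, s_next, done)      # ordinary online update
            buffer.append((s, a, r, s_next, done))  # remember the experience
            s = s_next
            if done:
                break
        # Experience replay: reuse stored transitions for extra backups,
        # newest first so reward information propagates backward quickly.
        for _ in range(replay_passes):
            for transition in reversed(buffer):
                q_update(Q, *transition)
    return Q
```

Calling `train()` and then `greedy(Q, s)` for each non-goal state gives the learned policy on this toy chain. Setting `replay_passes=0` reduces the sketch to plain Q-learning, which allows a rough comparison of how quickly the value estimates spread from the goal with and without replay.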

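Bibliographic details

  • Source
    Machine Learning | 1991 | pp. 323-327 | 5 pages
  • Conference location: Evanston, IL (US)
  • Author

    Long-Ji Lin

  • Affiliation

    School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213

  • Conference organizer
  • Original format: PDF
  • Language: English
  • Chinese Library Classification: Computer applications
  • Keywords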
