
Learning using multidimensional internal rewards

Abstract

Complicated tasks are often difficult to express as a single reward system. In human learning, the relation between sensory inputs and action outputs can be understood as having been acquired beforehand through an internal multidimensional reward system. We introduce reinforcement learning under multidimensional evaluation. The internal reward system includes both immediate evaluations and delayed rewards. The proposed learning system is a two-layered Q-learning architecture combined with a dynamic cell structure. For a pushing task performed by a manipulator, we assume that information from touch sensors and from the motion detector of the vision system is available. The simulation showed that the knowledge acquired in the lower layer greatly helps in learning the pushing task.
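As a rough illustration of learning under a multidimensional reward, the sketch below applies a tabular Q-learning update separately to each reward dimension. It is not the system described in the abstract: the two-layer arrangement and the dynamic cell structure are omitted, and the function name `q_update`, the state and action labels, the learning parameters, and the rule of picking the greedy next action by the sum over dimensions are all assumptions made for the example.

```python
def q_update(q, state, action, rewards, next_state, actions,
             alpha=0.1, gamma=0.9):
    """One tabular Q-learning step with a vector-valued reward.

    q maps (state, action) to a list of Q-values, one per reward dimension.
    rewards is the reward vector observed for this transition.
    """
    n = len(rewards)
    zeros = [0.0] * n

    # Assumption: the greedy next action maximizes the summed Q-vector.
    best = max(actions, key=lambda a: sum(q.get((next_state, a), zeros)))
    q_next = q.get((next_state, best), zeros)
    q_old = q.get((state, action), zeros)

    # Standard Q-learning update, applied to each reward dimension.
    q[(state, action)] = [
        qo + alpha * (r + gamma * qn - qo)
        for qo, r, qn in zip(q_old, rewards, q_next)
    ]
    return q[(state, action)]


# Hypothetical usage: dimension 0 could stand for a touch-sensor reward,
# dimension 1 for a motion-detector reward.
q_table = {}
updated = q_update(q_table, "s0", "push", [1.0, 0.0], "s1", ["push", "wait"])
```

Keeping one Q-value per reward dimension lets immediate evaluations and delayed rewards be tracked separately before they are combined for action selection.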
