Learning using multidimensional internal rewards


Abstract

Complicated tasks are often difficult to express as a single reward system. In human learning, the relation between sensory inputs and action outputs can be understood as having been acquired beforehand through an internal multidimensional reward system. We introduce reinforcement learning under multidimensional evaluation. The internal reward system includes both immediate evaluations and delayed rewards. The proposed learning system is a two-layered Q-learning architecture combined with a dynamic cell structure. For a pushing task performed by a manipulator, we assume that information from touch sensors and from the motion detector of the vision system is available. The simulation showed that the knowledge acquired in the lower layer greatly helps in learning the pushing task.
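
As a rough illustration of what "reinforcement learning under multidimensional evaluation" could look like, the sketch below runs a tabular Q-learning update with a vector-valued reward, keeping one value estimate per reward dimension. It is not the authors' two-layered architecture and does not include a dynamic cell structure; the function names `make_q` and `q_update`, the weighted-sum rule for picking the greedy action, and the parameter values are all assumptions made for this example.

```python
def make_q(n_states, n_actions, n_dims):
    """Build a zero-initialised table q[state][action][reward_dim]."""
    return [[[0.0] * n_dims for _ in range(n_actions)]
            for _ in range(n_states)]


def q_update(q, state, action, reward_vec, next_state,
             alpha=0.1, gamma=0.9, weights=None):
    """Apply one Q-learning step with a multidimensional reward.

    Each reward dimension has its own value estimate. The greedy next
    action is chosen by a weighted sum over dimensions, which is an
    assumption of this sketch rather than something the abstract states.
    """
    n_dims = len(reward_vec)
    w = [1.0] * n_dims if weights is None else list(weights)

    def scalarise(values):
        # Collapse a per-dimension value vector into a single score.
        return sum(wi * vi for wi, vi in zip(w, values))

    next_values = q[next_state]
    best_next = max(range(len(next_values)),
                    key=lambda a: scalarise(next_values[a]))

    # Standard temporal-difference update, applied per reward dimension.
    for k in range(n_dims):
        target = reward_vec[k] + gamma * next_values[best_next][k]
        q[state][action][k] += alpha * (target - q[state][action][k])
    return q


# Usage: two states, two actions, two reward dimensions
# (for example, one immediate evaluation and one delayed reward).
q = make_q(2, 2, 2)
q_update(q, state=0, action=1, reward_vec=[1.0, 0.5], next_state=1, alpha=0.5)
```

Keeping the dimensions separate in the table means the weighting can be changed later without relearning each per-dimension estimate; whether the paper combines its evaluations this way is not stated in the abstract.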


Bibliographic details

Source
IEEE/RSJ International Conference on Intelligent Robots and Systems | 2000 | 6 pages
Authors
Kobayashi Y.; Yuasa H.; Institute of Electric and Electronic Engineer;
Original format: PDF
Chinese Library Classification: Automation technology, computer technology

Similar literature


1. Task Learning Over Multi-Day Recording via Internally Rewarded Reinforcement Learning Based Brain Machine Interfaces [J] . Shen Xiang, Zhang Xiang, Huang Yifan, IEEE Transactions on Neural Systems and Rehabilitation Engineering . 2020, No. 12

2. Necessary Contributions of Human Frontal Lobe Subregions to Reward Learning in a Dynamic, Multidimensional Environment [J] . Vaidya Avinash R., Fellows Lesley K., The Journal of Neuroscience: The Official Journal of the Society for Neuroscience . 2016, No. 38

3. Multi-Agent Cooperation Based on Reinforcement Learning with Internal Reward in Maze Problem [J] . Fumito UWANO, Naoki TATEBE, Yusuke TAJIMA, SICE Journal of Control, Measurement, and System Integration (SICE JCMSI) . 2018, No. 4

4. Learning using multidimensional internal rewards [C] . Kobayashi, Y., Yuasa, Intelligent Robots and Systems, 2000. (IROS 2000). Proceedings. 2000 IEEE/RSJ International Conference on . 2000

5. Effects of Nicotine Withdrawal on Motivation, Reward Sensitivity and Reward-Learning. [D] . Oliver, Jason A. 2015

6. Necessary Contributions of Human Frontal Lobe Subregions to Reward Learning in a Dynamic Multidimensional Environment [O] . Avinash R. Vaidya, Lesley K. Fellows 2016

7. Necessary Contributions of Human Frontal Lobe Subregions to Reward Learning in a Dynamic, Multidimensional Environment [O] . A. R. Vaidya, L. K. Fellows 2016

8. Framing Reinforcement Learning from Human Reward: Reward Positivity, Temporal Discounting, Episodicity, and Performance. [R] . Knox, W. B., Stone, P. 2014

