Tackling the Credit Assignment Problem in Reinforcement Learning-Induced Pedagogical Policies with Neural Networks

Abstract

Intelligent Tutoring Systems (ITS) provide a powerful tool for students to learn in an adaptive, personalized, and goal-oriented manner. In recent years, Reinforcement Learning (RL) has been shown to be capable of leveraging previous student data to induce effective pedagogical policies for future students. One of the most desirable goals of these policies is to maximize student learning gains while minimizing training time. However, this metric is often not available until a student has completed the entire tutor, so the reinforcement signal that reflects the tutor's effectiveness is delayed. Assigning credit to each intermediate action based on a delayed reward is a challenging problem known as the temporal Credit Assignment Problem (CAP), and it makes it difficult for most RL algorithms to determine how much each action contributed to the outcome. In this work, we develop a general Neural Network-based algorithm that tackles the CAP by inferring immediate rewards from delayed rewards. We perform two empirical classroom studies, and the results show that this algorithm, in combination with a Deep RL agent, can improve student learning performance while reducing training time.
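
The abstract does not specify how immediate rewards are inferred, so the following is only a minimal sketch of one plausible formulation, not the authors' implementation. It assumes a model is trained so that its predicted per-step rewards sum to the observed delayed reward of each episode. A linear model stands in for the neural network, the data is synthetic, and all names (`predict_step_rewards`, `fit_reward_model`, `true_weights`) are hypothetical.

```python
import random


def predict_step_rewards(weights, episode):
    """Per-step reward estimates (a linear stand-in for the neural network)."""
    return [sum(w * x for w, x in zip(weights, step)) for step in episode]


def fit_reward_model(episodes, delayed_rewards, epochs=500, lr=0.05):
    """Gradient descent on mean (sum of predicted step rewards - delayed reward)^2."""
    n_features = len(episodes[0][0])
    weights = [0.0] * n_features
    for _ in range(epochs):
        grad = [0.0] * n_features
        for episode, delayed in zip(episodes, delayed_rewards):
            # Only the episode-level total is supervised, never individual steps.
            error = sum(predict_step_rewards(weights, episode)) - delayed
            for step in episode:
                for j, x in enumerate(step):
                    grad[j] += 2.0 * error * x
        weights = [w - lr * g / len(episodes) for w, g in zip(weights, grad)]
    return weights


# Synthetic data (assumption): per-step rewards are hidden, and only their
# sum is observed at the end of each variable-length episode.
random.seed(0)
true_weights = [1.0, -0.5]
episodes = [
    [[random.uniform(-1, 1) for _ in true_weights]
     for _ in range(random.randint(3, 6))]
    for _ in range(50)
]
delayed_rewards = [sum(predict_step_rewards(true_weights, ep)) for ep in episodes]

weights = fit_reward_model(episodes, delayed_rewards)
inferred = predict_step_rewards(weights, episodes[0])
print("learned weights:", [round(w, 3) for w in weights])
print("inferred step rewards for episode 0:", [round(r, 3) for r in inferred])
```

Under this formulation, the inferred per-step rewards could replace the single delayed reward when training an RL agent, giving each intermediate action its own learning signal.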

Bibliographic details

Source
International Conference on Artificial Intelligence in Education | 2021 | pp. 356-368 | 13 pages
Authors
Markel Sanz Ausin; Mehak Maniktala; Tiffany Barnes; Min Chi
Original format: PDF
Keywords
Pedagogical agent; Credit assignment problem; Deep reinforcement learning
