Conference: International Conference on Artificial Intelligence in Education

Tackling the Credit Assignment Problem in Reinforcement Learning-Induced Pedagogical Policies with Neural Networks

Abstract

Intelligent Tutoring Systems (ITSs) provide a powerful tool for students to learn in an adaptive, personalized, and goal-oriented manner. In recent years, Reinforcement Learning (RL) has been shown to be capable of leveraging previous student data to induce effective pedagogical policies for future students. One of the most desirable goals of these policies is to maximize student learning gains while minimizing training time. However, this metric is often not available until a student has completed the entire tutor, so the reinforcement signal that reflects the tutor's effectiveness is delayed. Assigning credit to each intermediate action based on a delayed reward is a challenging problem known as the temporal Credit Assignment Problem (CAP), and it makes it difficult for most RL algorithms to learn which actions actually contributed to the outcome. In this work, we develop a general Neural Network-based algorithm that tackles the CAP by inferring immediate rewards from delayed rewards. We perform two empirical classroom studies, and the results show that this algorithm, in combination with a Deep RL agent, can improve student learning performance while reducing training time.
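The idea of inferring immediate rewards from a delayed reward can be illustrated with a toy sketch. This is not the paper's algorithm: the abstract does not describe the network, so a linear per-step reward model fitted by least squares stands in for the neural network here. It relies on the assumption that the per-step rewards of a trajectory sum to its delayed reward. The function name `infer_immediate_rewards`, the feature dimensions, and the synthetic data are all hypothetical.

```python
import numpy as np

def infer_immediate_rewards(trajectories, delayed_rewards):
    """Fit a linear per-step reward model (a stand-in for a neural network).

    Assumption: the per-step rewards of a trajectory sum to its delayed reward.
    For a linear model r_t = x_t @ w, that sum equals (sum_t x_t) @ w, so w can
    be fitted by least squares on the summed features of each trajectory.
    """
    summed = np.stack([traj.sum(axis=0) for traj in trajectories])
    targets = np.asarray(delayed_rewards, dtype=float)
    w, *_ = np.linalg.lstsq(summed, targets, rcond=None)
    # Redistribute each delayed reward over the steps of its trajectory.
    return [traj @ w for traj in trajectories], w

# Synthetic data (hypothetical): 50 trajectories of 3 to 7 steps, 3 features per step.
rng = np.random.default_rng(0)
w_true = np.array([0.5, -1.0, 2.0])
trajectories = [rng.normal(size=(int(rng.integers(3, 8)), 3)) for _ in range(50)]
# Only the end-of-trajectory (delayed) reward is observed.
delayed = [float((traj @ w_true).sum()) for traj in trajectories]

inferred, w_hat = infer_immediate_rewards(trajectories, delayed)
```

Under these assumptions, `inferred[i]` holds one estimated reward per step of trajectory `i`, which an RL agent could then use in place of the single delayed signal. A nonlinear model would need gradient-based training rather than a closed-form least-squares fit.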