Exploring the Impact of Simple Explanations and Agency on Batch Deep Reinforcement Learning Induced Pedagogical Policies

Abstract

In recent years, reinforcement learning (RL), especially deep RL (DRL), has shown outstanding performance in video games ranging from Atari and Mario to StarCraft. However, there is little evidence that DRL can be successfully applied to real-life human-centric tasks such as education or healthcare. Unlike classic game playing, where the RL goal is to make an agent smart, in human-centric tasks the ultimate RL goal is to make human-agent interactions productive and fruitful. Additionally, in many real-life human-centric tasks, data can be noisy and limited. As a sub-field of RL, batch RL is designed for situations where data is limited and noisy and building simulations is challenging. In two consecutive classroom studies, we investigated applying batch DRL to pedagogical policy induction for an Intelligent Tutoring System (ITS) and empirically evaluated the effectiveness of the induced pedagogical policies. In Fall 2018 (F18), the DRL policy was compared against an expert-designed baseline policy; in Spring 2019 (S19), we examined the impact of explaining the batch DRL-induced policy, comparing it against student decision making and the expert baseline policy. Our results showed that 1) while no significant difference was found between the batch RL-induced policy and the expert policy in F18, the batch RL-induced policy with simple explanations improved students' learning performance significantly more than the expert policy alone in S19; and 2) no significant differences were found between student decision making and the expert policy. Overall, our results suggest that pairing simple explanations with induced RL policies can be an important and effective technique for applying RL to real-life human-centric tasks.

Bibliographic record

Source
International Conference on Artificial Intelligence in Education, 2020, pp. 472-485 (14 pages)
Authors
Markel Sanz Ausin; Mehak Maniktala; Tiffany Barnes; Min Chi
Original format
PDF
Keywords
Deep reinforcement learning; Pedagogical policy; Explanation

