Journal: Machine Learning

Bandit algorithms to personalize educational chatbots

Abstract

To emulate the interactivity of in-person math instruction, we developed MathBot, a rule-based chatbot that explains math concepts, provides practice questions, and offers tailored feedback. We evaluated MathBot through three Amazon Mechanical Turk studies in which participants learned about arithmetic sequences. In the first study, we found that more than 40% of our participants indicated a preference for learning with MathBot over videos and written tutorials from Khan Academy. The second study measured learning gains, and found that MathBot produced comparable gains to Khan Academy videos and tutorials. We solicited feedback from users in those two studies to emulate a real-world development cycle, with some users finding the lesson too slow and others finding it too fast. We addressed these concerns in the third and main study by integrating a contextual bandit algorithm into MathBot to personalize the pace of the conversation, allowing the bandit to either insert extra practice problems or skip explanations. We randomized participants between two conditions in which actions were chosen uniformly at random (i.e., a randomized A/B experiment) or by the contextual bandit. We found that the bandit learned a similarly effective pedagogical policy to that learned by the randomized A/B experiment while incurring a lower cost of experimentation. Our findings suggest that personalized conversational agents are promising tools to complement existing online resources for math education, and that data-driven approaches such as contextual bandits are valuable tools for learning effective personalization.
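
The comparison in the third study can be illustrated with a small simulation. The abstract does not say which contextual bandit algorithm, context features, or reward signal MathBot used, so everything below is an assumption made for illustration: a tabular epsilon-greedy bandit, a single hypothetical context ("novice" or "advanced"), a binary reward, and invented success probabilities. Only the two pacing actions (extra practice, skipping an explanation) and the two conditions (uniform random versus bandit) come from the abstract.

```python
import random
from collections import defaultdict

# The two pacing actions named in the abstract.
ACTIONS = ["extra_practice", "skip_explanation"]
# Hypothetical learner contexts, not taken from the paper.
CONTEXTS = ["novice", "advanced"]


class EpsilonGreedyContextualBandit:
    """Tabular epsilon-greedy bandit: one running mean reward per (context, action)."""

    def __init__(self, actions, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)
        self.means = defaultdict(float)

    def choose(self, context):
        # Explore with probability epsilon, otherwise pick the best-looking action.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.means[(context, a)])

    def update(self, context, action, reward):
        # Incremental update of the running mean for this (context, action) cell.
        key = (context, action)
        self.counts[key] += 1
        self.means[key] += (reward - self.means[key]) / self.counts[key]


def simulated_reward(context, action, rng):
    """Invented learner model: success probabilities are assumptions, not study data."""
    p = {
        ("novice", "extra_practice"): 0.8,
        ("novice", "skip_explanation"): 0.3,
        ("advanced", "extra_practice"): 0.4,
        ("advanced", "skip_explanation"): 0.7,
    }[(context, action)]
    return 1.0 if rng.random() < p else 0.0


def run(condition, n_learners=2000, seed=1):
    """Simulate one condition ("uniform" or "bandit").

    Returns the average reward during the experiment and the greedy policy
    estimated from the data that condition collected.
    """
    rng = random.Random(seed)
    learner = EpsilonGreedyContextualBandit(ACTIONS, epsilon=0.1, seed=seed)
    total = 0.0
    for _ in range(n_learners):
        context = rng.choice(CONTEXTS)
        if condition == "uniform":
            action = rng.choice(ACTIONS)      # randomized A/B assignment
        else:
            action = learner.choose(context)  # bandit assignment
        reward = simulated_reward(context, action, rng)
        learner.update(context, action, reward)  # both conditions log their data
        total += reward
    policy = {c: max(ACTIONS, key=lambda a: learner.means[(c, a)]) for c in CONTEXTS}
    return total / n_learners, policy


if __name__ == "__main__":
    for condition in ("uniform", "bandit"):
        avg, policy = run(condition)
        print(condition, round(avg, 3), policy)
```

In this toy setting both conditions end up estimating the same policy, but the bandit condition assigns fewer learners to the worse action along the way, which is one way to read "a lower cost of experimentation".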
