Conference: Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Dialogue Learning with Human Teaching and Feedback in End-to-End Trainable Task-Oriented Dialogue Systems


Abstract

In this work, we present a hybrid learning method for training task-oriented dialogue systems through online user interactions. Popular methods for learning task-oriented dialogues include applying reinforcement learning with user feedback on top of supervised pre-trained models. The efficiency of such a learning method may suffer from the mismatch in dialogue state distribution between the offline training and online interactive learning stages. To address this challenge, we propose a hybrid imitation and reinforcement learning method, with which a dialogue agent can effectively learn from its interactions with users by learning from human teaching and feedback. We design a neural-network-based task-oriented dialogue agent that can be optimized end-to-end with the proposed learning method. Experimental results show that our end-to-end dialogue agent can learn effectively from the mistakes it makes via imitation learning from user teaching. Applying reinforcement learning with user feedback after the imitation learning stage further improves the agent's capability to successfully complete a task.
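
The two-stage schedule described in the abstract (imitation learning from user corrections, followed by reinforcement learning from user feedback) can be illustrated with a minimal sketch. This is not the paper's model: a toy tabular softmax policy stands in for the neural dialogue agent, and the `TEACHER` table, the simulated user, the reward of 1 for task success, and all function names are assumptions made purely for illustration.

```python
import math
import random

# Hypothetical toy task: in dialogue state s the correct system action is TEACHER[s].
TEACHER = [1, 2, 0]
N_ACTIONS = 3


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class TabularPolicy:
    """Toy stand-in for the agent: one row of action logits per state."""

    def __init__(self, n_states, n_actions):
        self.logits = [[0.0] * n_actions for _ in range(n_states)]

    def probs(self, state):
        return softmax(self.logits[state])

    def sample(self, state, rng):
        p = self.probs(state)
        return rng.choices(range(len(p)), weights=p)[0]

    def greedy(self, state):
        p = self.probs(state)
        return p.index(max(p))

    def update(self, state, action, weight, lr):
        # Gradient ascent on weight * log pi(action | state) w.r.t. the logits.
        p = self.probs(state)
        for a in range(len(p)):
            grad = (1.0 if a == action else 0.0) - p[a]
            self.logits[state][a] += lr * weight * grad


def imitation_stage(policy, epochs=5, lr=0.5):
    """The agent acts; the simulated user corrects each mistake;
    the policy is trained on the corrected action."""
    for _ in range(epochs):
        for state, correct in enumerate(TEACHER):
            if policy.greedy(state) != correct:
                policy.update(state, correct, weight=1.0, lr=lr)


def reinforcement_stage(policy, rng, episodes=200, lr=0.5):
    """REINFORCE with a single end-of-dialogue reward:
    1 if every action in the dialogue was correct, 0 otherwise."""
    for _ in range(episodes):
        actions = [policy.sample(s, rng) for s in range(len(TEACHER))]
        reward = 1.0 if actions == TEACHER else 0.0
        for state, action in enumerate(actions):
            policy.update(state, action, weight=reward, lr=lr)


def success_probability(policy):
    """Probability that a sampled dialogue takes the correct action in every state."""
    p = 1.0
    for state, correct in enumerate(TEACHER):
        p *= policy.probs(state)[correct]
    return p


if __name__ == "__main__":
    policy = TabularPolicy(len(TEACHER), N_ACTIONS)
    imitation_stage(policy)
    print("after imitation:", success_probability(policy))
    reinforcement_stage(policy, random.Random(0))
    print("after reinforcement:", success_probability(policy))
```

In this sketch the imitation stage only fires on states where the agent's own greedy choice is wrong, which mirrors the idea of learning from corrections on states the agent actually visits. The reinforcement stage sees only a single success signal per dialogue, standing in for end-of-dialogue user feedback.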
