
Reinforcement Learning Based Dialogue Management Strategy



Abstract

This paper proposes a novel Markov Decision Process (MDP) formulation for learning an optimal strategy for the Dialogue Manager of a flight enquiry system. It presents a unique state representation, followed by a relevant action set and a reward model that is specific to different time-steps. Several Reinforcement Learning (RL) algorithms, based on both classical methods and Deep Learning techniques, have been implemented for the Dialogue Management component. To establish the robustness of the system, an existing Slot-Filling (SF) module has been integrated with it. The system can still generate valid responses and act sensibly even if the SF module falters. The experimental results indicate that the proposed MDP and the system hold promise of being scalable in satisfying the intent of the user.
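The abstract does not spell out the state, action, or reward definitions, so the following is only a minimal sketch of how a dialogue-management MDP of this general kind could be trained with tabular Q-learning, one of the classical RL methods. Everything specific here is an assumption made for illustration and is not taken from the paper: the three slot names, the `ask_*` and `close` actions, the reward values (a per-turn penalty and a terminal reward or penalty), and the `sf_error` probability that stands in for a faltering Slot-Filling module.

```python
import random
from collections import defaultdict

# Hypothetical slots and actions; the paper's own definitions are not given here.
SLOTS = ("origin", "destination", "date")
ACTIONS = tuple(f"ask_{s}" for s in SLOTS) + ("close",)


def step(state, action, rng, sf_error=0.2):
    """Simulate one dialogue turn and return (next_state, reward, done)."""
    if action == "close":
        # Assumed terminal reward: positive only if every slot is filled.
        return state, (20.0 if all(state) else -10.0), True
    i = ACTIONS.index(action)  # ask_* actions share their index with SLOTS
    filled = list(state)
    # The simulated slot filler misses the answer with probability sf_error.
    if rng.random() >= sf_error:
        filled[i] = True
    return tuple(filled), -1.0, False  # assumed per-turn penalty


def train(episodes=5000, alpha=0.1, gamma=0.95, epsilon=0.1,
          max_turns=20, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = defaultdict(float)  # maps (state, action) to an estimated return
    for _ in range(episodes):
        state = (False,) * len(SLOTS)
        for _ in range(max_turns):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            nxt, reward, done = step(state, action, rng)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in ACTIONS)
            target = reward + gamma * best_next
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = nxt
            if done:
                break
    return q


def greedy_action(q, state):
    """Return the action with the highest learned value in a state."""
    return max(ACTIONS, key=lambda a: q[(state, a)])


if __name__ == "__main__":
    q = train()
    for s in [(False, False, False), (True, True, False), (True, True, True)]:
        print(s, "->", greedy_action(q, s))
```

Because a missed slot leaves the state unchanged, the learned policy simply asks again instead of closing early, which is one simple way a dialogue manager can keep acting sensibly when slot filling fails. A deep RL variant would replace the table `q` with a function approximator over the same state encoding.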
