Proceedings: Natural Language Understanding and Intelligent Applications

Cascaded LSTMs Based Deep Reinforcement Learning for Goal-Driven Dialogue


Abstract

This paper proposes a deep neural network model for jointly modeling Natural Language Understanding and Dialogue Management in goal-driven dialogue systems. The model has three parts. A Long Short-Term Memory (LSTM) at the bottom of the network encodes the utterances of each dialogue turn into a turn embedding. Dialogue embeddings are learned by an LSTM in the middle of the network and are updated as the turn embeddings are fed in. The top part is a feed-forward Deep Neural Network that converts dialogue embeddings into Q-values for the different dialogue actions. The cascaded-LSTM-based reinforcement learning network is jointly optimized using the rewards received at each dialogue turn as the only supervision information; there is no explicit NLU module or dialogue state in the network. Experimental results show that our model outperforms both a traditional Markov Decision Process (MDP) model and a single LSTM with a Deep Q-Network on meeting-room booking tasks. Visualization of the dialogue embeddings illustrates that the model can learn a representation of the dialogue state.
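
The abstract describes a three-level architecture: a turn-level LSTM that encodes each utterance, a dialogue-level LSTM that accumulates turn embeddings into a dialogue embedding, and a feed-forward head that outputs Q-values over dialogue actions. The following is a minimal PyTorch sketch of such a network; the class name, layer dimensions, vocabulary size, and action inventory are illustrative assumptions, not details taken from the paper.

# Minimal sketch of a cascaded-LSTM Q-network (hyperparameters are assumptions).
import torch
import torch.nn as nn

class CascadedLSTMQNetwork(nn.Module):
    def __init__(self, vocab_size, embed_dim=64, turn_dim=128,
                 dialogue_dim=128, num_actions=10):
        super().__init__()
        self.word_embedding = nn.Embedding(vocab_size, embed_dim)
        # Bottom LSTM: encodes the word sequence of one turn into a turn embedding.
        self.turn_lstm = nn.LSTM(embed_dim, turn_dim, batch_first=True)
        # Middle LSTM: consumes the sequence of turn embeddings and keeps
        # the dialogue embedding (an implicit dialogue state).
        self.dialogue_lstm = nn.LSTM(turn_dim, dialogue_dim, batch_first=True)
        # Top feed-forward network: maps the dialogue embedding to Q-values
        # over the system's dialogue actions.
        self.q_head = nn.Sequential(
            nn.Linear(dialogue_dim, dialogue_dim),
            nn.ReLU(),
            nn.Linear(dialogue_dim, num_actions),
        )

    def forward(self, turns):
        # turns: list of LongTensors, each of shape (num_words,) for one turn.
        turn_embeddings = []
        for words in turns:
            emb = self.word_embedding(words.unsqueeze(0))  # (1, T, embed_dim)
            _, (h_n, _) = self.turn_lstm(emb)              # h_n: (1, 1, turn_dim)
            turn_embeddings.append(h_n.squeeze(0))         # (1, turn_dim)
        seq = torch.stack(turn_embeddings, dim=1)          # (1, num_turns, turn_dim)
        _, (d_n, _) = self.dialogue_lstm(seq)              # d_n: (1, 1, dialogue_dim)
        return self.q_head(d_n.squeeze(0))                 # (1, num_actions)

if __name__ == "__main__":
    net = CascadedLSTMQNetwork(vocab_size=1000)
    dialogue = [torch.randint(0, 1000, (5,)), torch.randint(0, 1000, (7,))]
    q_values = net(dialogue)
    # A greedy policy would take the argmax over Q-values; training would use
    # DQN-style updates driven by the per-turn rewards described in the abstract.
    print(q_values.argmax(dim=-1))

In this sketch the final hidden state of each LSTM serves as the turn or dialogue embedding, so no explicit NLU output or hand-crafted dialogue state appears anywhere in the pipeline, matching the end-to-end setup the abstract describes.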
