Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing

Attention-Based Response Generation Using Parallel Double Q-Learning for Dialog Policy Decision in a Conversational System

Abstract

This article proposes an approach to response generation using a Parallel Double Q-learning algorithm for dialog policy decision in a conversational system. First, a new semantic representation of the user's input sentence is constructed by using the CKIP parser to derive the semantic dependency sequence of the input sentence. Then, a Gated Recurrent Unit-based Autoencoder is used to obtain the user's turn representation as well as the context representation. A Parallel Double Q-learning algorithm with a Deep Neural Network (PD-DQN), which combines two Double DQNs in parallel for the contextual and semantic information in the user's message, respectively, is proposed to determine the dialog act. Finally, the user's input and the determined dialog act are fed to an attention-based Transformer model to generate the response template. The semantic slots in the generated response template are then filled with their corresponding values to obtain the final sentence response. This article collects a multi-turn conversation database consisting of 4186 turns in the travel domain and 447 chitchat question-answer pairs as the evaluation corpus. Five-fold cross-validation is employed for performance evaluation. Experimental results show that the proposed semantic-dependency-based approach to intent detection increases accuracy by 4.3%. For dialog policy decision, the PD-DQN achieves an 87.57% task success rate, which is 13.9 percentage points higher than the baseline Double DQN (73.67%). Finally, using the attention-based Transformer for response template generation yields a BLEU score of 13.6, an improvement of 1.5 over the Sequence-to-Sequence model. In subjective evaluation, both the dialog policy and the sentence generation model achieve higher appropriateness and grammatical correctness scores than the baseline system.
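As a rough illustration of the Double Q-learning idea the abstract builds on, the sketch below computes a standard Double DQN target, in which the online network selects the next action and the target network evaluates it. The function names, the discount factor, and in particular the `parallel_dialog_act` rule (summing the Q-values of a contextual branch and a semantic branch) are assumptions made for illustration only. The abstract does not state how the two Double DQNs are combined, so this should not be read as the authors' method.

```python
from typing import List


def double_q_target(reward: float,
                    next_q_online: List[float],
                    next_q_target: List[float],
                    gamma: float = 0.99,
                    done: bool = False) -> float:
    """Standard Double DQN target for a single transition.

    The online network picks the greedy next action; the target
    network supplies the value of that action.
    """
    if done:
        # Terminal transition: no bootstrapped future value.
        return reward
    best_action = max(range(len(next_q_online)),
                      key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


def parallel_dialog_act(q_context: List[float],
                        q_semantic: List[float]) -> int:
    """HYPOTHETICAL combination rule (not taken from the paper):
    add the Q-values of the two branches and pick the best dialog act.
    """
    combined = [c + s for c, s in zip(q_context, q_semantic)]
    return max(range(len(combined)), key=lambda a: combined[a])
```

Selecting the action with one network and evaluating it with the other is what distinguishes Double Q-learning from plain Q-learning, which uses a single maximum for both steps and tends to overestimate action values.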