Computer Speech and Language

A Bi-LSTM memory network for end-to-end goal-oriented dialog learning



Abstract

We develop a model to satisfy the requirements of Dialog System Technology Challenge 6 (DSTC6) Track 1: building an end-to-end dialog system for goal-oriented applications. This task involves learning a dialog policy from transactional dialogs in a given domain; automatic system responses are generated from the provided task-oriented dialog data (http://workshop.colips.org/dstc6/index.html). Because this task is structurally similar to a question answering task (Weston et al., 2015), we employ the MemN2N architecture (Sukhbaatar et al., 2015), which outperforms models based on recurrent neural networks or long short-term memory (LSTM). However, two problems arise when applying this model to the DSTC6 task. First, we encounter an out-of-vocabulary problem, which we resolve by categorizing words that appear in the knowledge base into metadata types; these metadata types are similar to named entities. Second, the original memory network has a weak ability to capture temporal information, because it uses only sentence-level embeddings. We therefore add a bidirectional LSTM (Bi-LSTM) at the front of the model to better capture temporal information. The experimental results demonstrate that our model captures temporal features well. Furthermore, it achieves state-of-the-art performance among memory networks and is comparable to hybrid code networks (Ham et al., 2017) and the hierarchical LSTM model (Bai et al., 2017), which is not an end-to-end architecture. (C) 2018 Elsevier Ltd. All rights reserved.
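The out-of-vocabulary fix described above (mapping knowledge-base words to metadata-type tokens, analogous to named-entity tags) can be sketched as follows. This is a minimal illustration, not the paper's implementation; the function name, type tokens, and sample knowledge base below are hypothetical.

```python
def delexicalize(tokens, kb_types):
    """Replace each knowledge-base entity token with its metadata-type
    placeholder, leaving ordinary vocabulary words unchanged. Unseen
    entity values thus never enter the model's vocabulary."""
    return [kb_types.get(tok, tok) for tok in tokens]

# Hypothetical mapping from KB entity values to shared type tokens.
kb_types = {
    "resto_rome_cheap_1": "<R_NAME>",
    "rome": "<R_LOCATION>",
    "cheap": "<R_PRICE>",
}

utterance = "i want a cheap restaurant in rome".split()
print(delexicalize(utterance, kb_types))
# ['i', 'want', 'a', '<R_PRICE>', 'restaurant', 'in', '<R_LOCATION>']
```

Because every restaurant name, location, or price value collapses to one shared type token, the embedding table stays fixed no matter how many new entities appear in the knowledge base at test time.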
