Conference: International Joint Conference on Artificial Intelligence

Reinforcing Coherence for Sequence to Sequence Model in Dialogue Generation

Abstract

The sequence-to-sequence (Seq2Seq) approach has gained great attention in the field of single-turn dialogue generation. However, one serious problem is that most existing Seq2Seq-based models tend to generate common responses that lack specific meaning. Our analysis shows that the underlying reason is that Seq2Seq is equivalent to optimizing the Kullback-Leibler (KL) divergence, and thus does not penalize cases in which the generated probability is high while the true probability is low. However, the true probability is unknown, which makes this problem challenging to tackle. Inspired by the fact that the coherence (i.e., similarity) between post and response is consistent with human evaluation, we hypothesize that the true probability of a response is proportional to its degree of coherence. The coherence scores are then used as the reward function in a reinforcement learning framework to penalize cases in which the generated probability is high while the true probability is low. Three different types of coherence models are proposed in this paper: an unlearned similarity function, a pretrained semantic matching function, and an end-to-end dual learning architecture. Experimental results on both a Chinese Weibo dataset and an English Subtitle dataset show that the proposed models produce more specific and meaningful responses, outperforming Seq2Seq models in both metric-based and human evaluations.
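The argument in the abstract can be illustrated with a small toy sketch. It is not the paper's implementation: the two-response distributions, the bag-of-words cosine score (a stand-in for an unlearned similarity function), and the helper names `kl_terms`, `cosine_coherence`, and `reinforce_loss` are all assumptions made for illustration. The first helper shows that a response with low true probability but high generated probability adds no positive term to KL(p || q); the other two show how a coherence score could serve as a reward in a single-sample REINFORCE-style loss.

```python
import math
from collections import Counter


def kl_terms(p, q):
    """Per-response contributions p(y) * log(p(y) / q(y)) to KL(p || q)."""
    return {y: p[y] * math.log(p[y] / q[y]) for y in p}


def cosine_coherence(post, response):
    """Toy coherence score: cosine similarity of bag-of-words count vectors."""
    a = Counter(post.lower().split())
    b = Counter(response.lower().split())
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def reinforce_loss(log_prob, reward, baseline=0.0):
    """Single-sample REINFORCE surrogate: -(reward - baseline) * log q(response | post).

    Minimizing it raises log_prob when reward > baseline and lowers it otherwise.
    """
    return -(reward - baseline) * log_prob


# Hypothetical distributions over two candidate responses.
true_p = {"specific": 0.99, "generic": 0.01}   # assumed "true" distribution
model_q = {"specific": 0.40, "generic": 0.60}  # model over-generates the generic reply
terms = kl_terms(true_p, model_q)
# The generic reply (low true probability, high generated probability)
# contributes a small negative term, so the KL objective does not penalize it directly.

post = "what movie did you watch last night"
specific = "i watched a great movie last night"
generic = "i do not know"
r_specific = cosine_coherence(post, specific)
r_generic = cosine_coherence(post, generic)
baseline = (r_specific + r_generic) / 2

# With coherence as reward, the generic reply gets a below-baseline reward,
# so minimizing its loss pushes its log-probability down.
loss_generic = reinforce_loss(math.log(model_q["generic"]), r_generic, baseline)
loss_specific = reinforce_loss(math.log(model_q["specific"]), r_specific, baseline)
```

In this sketch the mean reward is used as a baseline so that low-coherence responses receive a negative advantage; whether and how the paper uses a baseline is not stated in the abstract.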
