ACM Transactions on Information Systems

PONE: A Novel Automatic Evaluation Metric for Open-domain Generative Dialogue Systems


Abstract

Open-domain generative dialogue systems have attracted considerable attention over the past few years, yet evaluating them automatically remains a major challenge. As far as we know, there are three kinds of automatic evaluation metrics for open-domain generative dialogue systems: (1) word-overlap-based metrics; (2) embedding-based metrics; (3) learning-based metrics. Because no systematic comparison exists, it is not clear which kind of metric is most effective. In this article, we first systematically compare all three kinds. Extensive experiments demonstrate that learning-based metrics are the most effective evaluation metrics for open-domain generative dialogue systems. Moreover, we observe that nearly all learning-based metrics depend on a negative sampling mechanism, which yields extremely unbalanced and low-quality samples for training the scoring model. To address this issue, we propose PONE, a novel learning-based metric that significantly improves the correlation with human judgments by using augmented POsitive samples and valuable NEgative samples. Extensive experiments demonstrate that PONE significantly outperforms the state-of-the-art learning-based evaluation method. We have also publicly released the code for our proposed metric and the state-of-the-art baselines.
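
The abstract judges every metric by how well its scores correlate with human judgments. The sketch below illustrates that comparison under stated assumptions: it uses Spearman rank correlation (the abstract does not name the coefficient), a toy unigram-overlap score as a stand-in for the word-overlap family (it is not BLEU and not PONE), and made-up sample data. All function names are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(xs), average_ranks(ys))


def unigram_overlap(reference, response):
    """Toy word-overlap score: share of response tokens found in the reference."""
    ref_tokens = set(reference.lower().split())
    resp_tokens = response.lower().split()
    if not resp_tokens:
        return 0.0
    return sum(t in ref_tokens for t in resp_tokens) / len(resp_tokens)


# Hypothetical (reference, generated response, human rating) triples.
samples = [
    ("i love hiking on weekends", "i love hiking too", 5),
    ("i love hiking on weekends", "the weather is nice", 2),
    ("i love hiking on weekends", "i do not know", 1),
    ("i love hiking on weekends", "hiking on weekends sounds fun", 4),
]

metric_scores = [unigram_overlap(ref, resp) for ref, resp, _ in samples]
human_scores = [rating for _, _, rating in samples]
rho = spearman(metric_scores, human_scores)
print(round(rho, 2))
```

A learning-based metric would replace `unigram_overlap` with a trained scoring model; the correlation step stays the same, which is what makes the three metric families directly comparable.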
