Conference of the European Chapter of the Association for Computational Linguistics

Rethinking Coherence Modeling: Synthetic vs. Downstream Tasks

Abstract

Although coherence modeling has come a long way in developing novel models, the evaluation of these models on the downstream applications for which they are purportedly developed has largely been neglected. With the advancements made by neural approaches in applications such as machine translation (MT), summarization, and dialog systems, the need for coherence evaluation of these tasks is now more crucial than ever. However, coherence models are typically evaluated only on synthetic tasks, which may not be representative of their performance in downstream applications. To investigate how representative the synthetic tasks are of downstream use cases, we benchmark well-known traditional and neural coherence models on synthetic sentence ordering tasks, and contrast this with their performance on three downstream applications: coherence evaluation for MT and summarization, and next utterance prediction in retrieval-based dialog. Our results demonstrate a weak correlation between model performance on the synthetic tasks and on the downstream applications, motivating alternative training and evaluation methods for coherence models.
