Annual Meeting of the Association for Computational Linguistics

Handling Divergent Reference Texts when Evaluating Table-to-Text Generation


Abstract

Automatically constructed datasets for generating text from semi-structured data (tables), such as WikiBio (Lebret et al., 2016), often contain reference texts that diverge from the information in the corresponding semi-structured data. We show that metrics which rely solely on the reference texts, such as BLEU and ROUGE, show poor correlation with human judgments when those references diverge. We propose a new metric, PARENT, which aligns n-grams from the reference and generated texts to the semi-structured data before computing their precision and recall. Through a large-scale human evaluation study of table-to-text models for WikiBio, we show that PARENT correlates with human judgments better than existing text generation metrics. We also adapt and evaluate the information extraction based evaluation proposed in Wiseman et al. (2017), and show that PARENT has comparable correlation to it, while being easier to use. We show that PARENT is also applicable when the reference texts are elicited from humans using the data from the WebNLG challenge.
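
The idea of crediting n-grams against the table as well as the reference can be illustrated with a small sketch. This is a simplified illustration, not the PARENT formulation from the paper: the function names (`ngrams`, `table_support`, `table_aware_precision`), the word-overlap heuristic for table support, and the toy table are all assumptions made here for the example, and only the precision side is shown.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def table_support(gram, table_tokens):
    """Assumed heuristic: fraction of the n-gram's tokens found in the table."""
    return sum(tok in table_tokens for tok in gram) / len(gram)


def table_aware_precision(generated, reference, table, n=1):
    """Precision of generated n-grams, crediting reference matches fully
    and unmatched n-grams in proportion to their table support."""
    # Flatten all table values into a lowercase token set.
    table_tokens = {tok for value in table.values()
                    for tok in value.lower().split()}
    gen_counts = Counter(ngrams(generated.lower().split(), n))
    ref_counts = Counter(ngrams(reference.lower().split(), n))

    total = sum(gen_counts.values())
    if total == 0:
        return 0.0

    credit = 0.0
    for gram, count in gen_counts.items():
        matched = min(count, ref_counts[gram])  # clipped reference matches
        unmatched = count - matched
        credit += matched + unmatched * table_support(gram, table_tokens)
    return credit / total


# Hypothetical example: the reference omits the occupation stored in the table.
table = {"name": "Ada Lovelace", "occupation": "mathematician"}
reference = "ada lovelace was an english writer"
generated = "ada lovelace was a mathematician"
score = table_aware_precision(generated, reference, table)
```

In this toy case a reference-only unigram precision would credit three of the five generated tokens, whereas the table-aware variant also credits "mathematician" because the table supports it, which is the kind of divergence the abstract describes.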
