Conference: International Joint Conference on Natural Language Processing; Conference on Empirical Methods in Natural Language Processing

Judge the Judges: A Large-Scale Evaluation Study of Neural Language Models for Online Review Generation


Abstract

We conduct a large-scale, systematic study of existing evaluation methods for natural language generation in the context of generating online product reviews. We compare human evaluators with a variety of automated evaluation procedures, including discriminative evaluators that measure how well machine-generated text can be distinguished from human-written text, as well as word-overlap metrics that assess how similar the generated text is to human-written references. We determine to what extent these different evaluators agree on the ranking of a dozen state-of-the-art generators for online product reviews. We find that human evaluators do not correlate well with discriminative evaluators, which raises the broader question of whether adversarial accuracy is the correct objective for natural language generation. In general, distinguishing machine-generated text is challenging even for human evaluators, and human decisions correlate better with lexical overlap. We find lexical diversity to be an intriguing metric that is indicative of the assessments of different evaluators. A post-experiment survey of participants provides insights into how to evaluate and improve the quality of natural language generation systems.
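
To illustrate what "agreeing on the ranking of generators" and "lexical diversity" can mean in practice, the following is a minimal sketch and not the procedure used in the paper. The abstract does not name the correlation coefficient or the diversity measure, so Spearman rank correlation and a distinct-n ratio are assumptions chosen for illustration, and the function names `ranks`, `spearman`, and `distinct_n` are hypothetical.

```python
from typing import Sequence


def ranks(scores: Sequence[float]) -> list[float]:
    """Return 1-based ranks (highest score gets rank 1); ties share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the window over all items tied with the item at `pos`.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = avg_rank
        pos = end + 1
    return result


def spearman(a: Sequence[float], b: Sequence[float]) -> float:
    """Spearman correlation, computed as the Pearson correlation of the rank vectors.

    Assumes both inputs score the same generators in the same order and that
    neither input is constant (a constant input gives zero variance).
    """
    ra, rb = ranks(a), ranks(b)
    n = len(ra)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    return cov / (var_a * var_b) ** 0.5


def distinct_n(texts: Sequence[str], n: int = 2) -> float:
    """Ratio of unique n-grams to total n-grams over whitespace tokens (assumed diversity measure)."""
    total = 0
    unique: set[tuple[str, ...]] = set()
    for text in texts:
        tokens = text.split()
        grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        total += len(grams)
        unique.update(grams)
    return len(unique) / total if total else 0.0
```

Given one score per generator from two evaluators (for example, a human panel and a word-overlap metric), `spearman(human_scores, overlap_scores)` close to 1 would indicate that the two evaluators rank the generators similarly, while a value near 0 would indicate little agreement.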
