
Enhancing Neural Data-To-Text Generation Models with External Background Knowledge

Abstract

Recent neural models for data-to-text generation rely on massive parallel pairs of data and text to learn writing knowledge. They often assume that writing knowledge can be acquired from the training data alone. However, when people write, they not only rely on the data but also consider related knowledge. In this paper, we enhance neural data-to-text models with external knowledge in a simple but effective way to improve the fidelity of generated text. Besides relying on parallel data and text as in previous work, our model attends to relevant external knowledge, encoded as a temporary memory, and combines this knowledge with the context representation of the data before generating words. This allows the model to infer relevant facts from an external knowledge source that are not explicitly stated in the data table. Experimental results on twenty-one Wikipedia infobox-to-text datasets show that our model, KBAtt, consistently improves a state-of-the-art model on most of the datasets. In addition, to quantify when and why external knowledge is effective, we design a metric, KBGain, which shows a strong correlation with the observed performance boost. This result demonstrates that the relevance of the external knowledge and the sparseness of the original data are the main factors affecting system performance.
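The abstract does not spell out the attention mechanism, but the described fusion of a knowledge memory with the data context can be illustrated with a minimal sketch. The following PyTorch snippet is an assumption-laden illustration, not the paper's implementation: the names KnowledgeAttention, knowledge_memory, and data_context, the dot-product scoring, and the tanh fusion layer are all hypothetical choices standing in for unspecified details.

```python
import torch
import torch.nn as nn

class KnowledgeAttention(nn.Module):
    """Hypothetical sketch: attend over an external knowledge memory and
    fuse the attended facts with the data-context vector before the
    decoder predicts the next word."""

    def __init__(self, hidden_dim: int, kb_dim: int):
        super().__init__()
        # Project the decoder state into the knowledge-embedding space.
        self.query_proj = nn.Linear(hidden_dim, kb_dim)
        # Combine the table context with the attended knowledge.
        self.fuse = nn.Linear(hidden_dim + kb_dim, hidden_dim)

    def forward(self, decoder_state, data_context, knowledge_memory):
        # decoder_state:    (batch, hidden_dim) current decoder hidden state
        # data_context:     (batch, hidden_dim) attention result over the table
        # knowledge_memory: (batch, n_facts, kb_dim) encoded external facts
        query = self.query_proj(decoder_state)                    # (batch, kb_dim)
        scores = torch.bmm(knowledge_memory, query.unsqueeze(2))  # (batch, n_facts, 1)
        weights = torch.softmax(scores.squeeze(2), dim=1)         # (batch, n_facts)
        kb_context = torch.bmm(weights.unsqueeze(1),
                               knowledge_memory).squeeze(1)       # (batch, kb_dim)
        # Fuse knowledge with the data context before the output layer.
        return torch.tanh(self.fuse(torch.cat([data_context, kb_context], dim=1)))
```

Under this reading, the knowledge memory is "temporary" in the sense that it is built per input instance from retrieved facts and attended over at each decoding step, rather than stored as persistent model parameters.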
