Conference on Empirical Methods in Natural Language Processing

Neural Text Generation from Structured Data with Application to the Biography Domain

Abstract

This paper introduces a neural model for concept-to-text generation that scales to large, rich domains. It generates biographical sentences from fact tables on a new dataset of biographies from Wikipedia. This dataset is an order of magnitude larger than existing resources, with over 700k samples and a 400k-word vocabulary. Our model builds on conditional neural language models for text generation. To deal with the large vocabulary, we extend these models to mix a fixed vocabulary with copy actions that transfer sample-specific words from the input database to the generated output sentence. To deal with structured data, we allow the model to embed words differently depending on the data fields in which they occur. Our neural model significantly outperforms a templated Kneser-Ney language model by nearly 15 BLEU.
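
The two ideas in the abstract, copy actions and field-dependent word features, can be illustrated with a small Python sketch. It is not the authors' implementation: the `field_position` token format, the tiny `FIXED_VOCAB`, and the helper names `index_table`, `encode_target`, and `decode_output` are all assumptions made for illustration, and the example table is invented. The sketch only shows the data-side bookkeeping; a real model would learn embeddings and output scores over these tokens.

```python
from collections import defaultdict

# Hypothetical fixed output vocabulary; a real system would derive it from corpus frequencies.
FIXED_VOCAB = {"was", "born", "in", "is", "a", "an", "(", ")", ".", ","}


def index_table(table):
    """Map each lowercased word to the (field, position) slots where it occurs.

    The same word gets different slots in different fields, which is what lets
    a model embed it differently depending on the field.
    """
    slots = defaultdict(list)
    for field, value in table.items():
        for pos, word in enumerate(value.split(), start=1):
            slots[word.lower()].append((field, pos))
    return dict(slots)


def encode_target(sentence, table):
    """Rewrite table words that are outside the fixed vocabulary as copy actions."""
    slots = index_table(table)
    tokens = []
    for word in sentence.split():
        key = word.lower()
        if key in FIXED_VOCAB:
            tokens.append(key)
        elif key in slots:
            field, pos = slots[key][0]  # assumption: use the first matching slot
            tokens.append(f"{field}_{pos}")
        else:
            tokens.append("<unk>")
    return tokens


def decode_output(tokens, table):
    """Resolve copy actions back to the sample-specific words in the table."""
    copy_map = {
        f"{field}_{pos}": word
        for field, value in table.items()
        for pos, word in enumerate(value.split(), start=1)
    }
    return " ".join(copy_map.get(tok, tok) for tok in tokens)


if __name__ == "__main__":
    table = {"name": "Jane Doe", "occupation": "chemist"}  # invented example record
    tokens = encode_target("Jane Doe is a chemist .", table)
    print(tokens)
    print(decode_output(tokens, table))
```

Under these assumptions, rare words such as names never need an entry in the fixed vocabulary: the model only has to predict a slot such as `name_1`, and the surface word is filled in from the input table afterwards.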
