Neural Text Generation from Structured Data with Application to the Biography Domain

Abstract

This paper introduces a neural model for concept-to-text generation that scales to large, rich domains. The model generates biographical sentences from fact tables and is evaluated on a new dataset of biographies from Wikipedia. This dataset is an order of magnitude larger than existing resources, with over 700k samples and a 400k-word vocabulary. Our model builds on conditional neural language models for text generation. To deal with the large vocabulary, we extend these models to mix a fixed vocabulary with copy actions that transfer sample-specific words from the input database to the generated output sentence. To deal with structured data, we allow the model to embed words differently depending on the data fields in which they occur. Our neural model significantly outperforms a templated Kneser-Ney language model, by nearly 15 BLEU.
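The two mechanisms named in the abstract, field-dependent word representations and copy actions mixed with a fixed vocabulary, can be illustrated with a small sketch. This is not the authors' implementation: the toy vocabulary, the table contents, and the function names (`field_keyed_tokens`, `copy_actions`, `output_space`, `realize`) are all assumptions made for illustration, and no neural scoring is shown.

```python
from typing import Dict, List, Tuple, Union

# Hypothetical toy fixed vocabulary (assumption); the paper's is far larger.
FIXED_VOCAB: List[str] = ["<unk>", "is", "a", "was", "born", "in", "writer"]

Table = Dict[str, List[str]]          # field name -> words in that field
CopyAction = Tuple[str, int]          # (field name, position within field)
Action = Union[str, CopyAction]       # either a vocabulary word or a copy


def field_keyed_tokens(table: Table) -> List[Tuple[str, str]]:
    """Pair each table word with its field, so that an embedding lookup
    keyed on (field, word) can differ for the same word across fields."""
    return [(field, word) for field, words in table.items() for word in words]


def copy_actions(table: Table) -> List[CopyAction]:
    """One copy action per table slot, identified by field and position."""
    return [(field, i) for field, words in table.items()
            for i in range(len(words))]


def output_space(table: Table) -> List[Action]:
    """Fixed vocabulary followed by the sample-specific copy actions."""
    return list(FIXED_VOCAB) + list(copy_actions(table))


def realize(action: Action, table: Table) -> str:
    """Turn a chosen action into a surface word: copy actions read the
    word from the input table, vocabulary actions are emitted as-is."""
    if isinstance(action, tuple):
        field, position = action
        return table[field][position]
    return action


if __name__ == "__main__":
    # Made-up example table, not taken from the dataset.
    table = {"name": ["jane", "doe"], "occupation": ["writer"]}
    chosen = [("name", 0), ("name", 1), "is", "a", ("occupation", 0)]
    print(" ".join(realize(a, table) for a in chosen))
```

In this sketch the names "jane" and "doe" never need to be in the fixed vocabulary, since they reach the output only through copy actions, which is the motivation the abstract gives for handling a large vocabulary.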

Bibliographic record

Source: Conference on Empirical Methods in Natural Language Processing, 2016, pp. 1203-1213 (11 pages)
Authors: Remi Lebret; David Grangier; Michael Auli
Original format: PDF
Date indexed: 2022-08-26 14:31:35

