KPTimes: A Large-Scale Dataset for Keyphrase Generation on News Documents

Abstract

Keyphrase generation is the task of predicting a set of lexical units that conveys the main content of a source text. Existing datasets for keyphrase generation are only readily available for the scholarly domain and include non-expert annotations. In this paper we present KPTimes, a large-scale dataset of news texts paired with editor-curated keyphrases. Exploring the dataset, we show how editors tag documents, and how their annotations differ from those found in existing datasets. We also train and evaluate state-of-the-art neural keyphrase generation models on KPTimes to gain insights on how well they perform on the news domain. The dataset is available online at https://github.com/ygorg/KPTimes.

Bibliographic record

Source: International Natural Language Generation Conference, 2019, pp. 130-135 (6 pages)
Authors: Ygor Gallina; Florian Boudin; Beatrice Daille
Original format: PDF

Similar literature

1. Automatic keyphrase extraction for Arabic news documents based on KEA system [J]. Duwairi Rehab, Hedaya Mona. Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology. 2016, No. 4
2. Face Retrieval in Large-Scale News Video Datasets [J]. Thanh Duc NGO, Hung Thanh VU, Duy-Dinh LE. IEICE Transactions on Information and Systems. 2013, No. 8
3. KPTimes: A Large-Scale Dataset for Keyphrase Generation on News Documents [C]. Ygor Gallina, Florian Boudin, Beatrice Daille. International Natural Language Generation Conference. 2019
4. Spatial Discovery and the Research Library: Linking Research Datasets and Documents [D]. Lafia, Sara. 2017
5. Representing Documents via Latent Keyphrase Inference [O]. Jialu Liu, Xiang Ren, Jingbo Shang
6. Keyphrase Cloud Generation of Broadcast News [O]. Marujo, Luis; Viveiros, Márcio; Neto, João Paulo da Silva. 2013
