Unsupervised-learning-based keyphrase extraction from a single document by the effective combination of the graph-based model and the modified C-value method

Yeom Hongseon; Ko Youngjoong; Seo Jungyun


Abstract

Keyphrases of a document represent its main topic and offer a simple way to represent the document. Among unsupervised approaches, statistical and graph-based models have been studied most. Statistical models have difficulty extracting keyphrases from a single document because most of them require statistical information from a large corpus. Graph-based models, on the other hand, can extract keyphrases using only the information in a single document; nevertheless, they have drawbacks. Edge scores can be biased because a single document does not contain enough information to score the edges of a graph, and this bias in turn influences the node scores. In this paper, we propose an effective method that combines a statistical model, the C-value method, with a graph-based model to overcome the drawbacks of each. The graph-based model yields a new scoring method for keyphrase candidates, and the scores it produces are applied to the modified C-value method to estimate the final importance scores of the candidates. The proposed model is evaluated on two datasets, SemEval 2010 and Inspec, where it outperforms the state-of-the-art unsupervised model and existing graph-based ranking models. (C) 2019 Elsevier Ltd. All rights reserved.
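The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify the new graph-based scoring or the modification to C-value, so the code below only shows textbook versions of each part, namely PageRank over a word co-occurrence graph and the classic C-value formula. The function names, the `window` and `damping` parameters, and the toy inputs are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def pagerank_word_scores(tokens, window=2, damping=0.85, iterations=50):
    """Score words by PageRank on an undirected co-occurrence graph.

    Hypothetical helper: two words are linked if they appear within
    `window` positions of each other in the token sequence.
    """
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)
    words = list(neighbors)
    scores = {w: 1.0 / len(words) for w in words}
    for _ in range(iterations):
        # Each word receives an equal share of every neighbor's score.
        scores = {
            w: (1 - damping) / len(words)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[w])
            for w in words
        }
    return scores


def contains(longer, shorter):
    """Return True if `shorter` occurs as a contiguous run inside `longer`."""
    n = len(shorter)
    return any(longer[i : i + n] == shorter for i in range(len(longer) - n + 1))


def c_value(freq):
    """Classic C-value for candidate phrases.

    `freq` maps each candidate (a tuple of words) to its frequency.
    A candidate nested in longer candidates is penalized by the mean
    frequency of those longer candidates. Note that log2(1) = 0, so the
    classic formula gives single-word candidates a score of zero.
    """
    result = {}
    for cand, f_cand in freq.items():
        longer = [b for b in freq if len(b) > len(cand) and contains(b, cand)]
        penalty = sum(freq[b] for b in longer) / len(longer) if longer else 0.0
        result[cand] = math.log2(len(cand)) * (f_cand - penalty)
    return result


# Toy inputs, invented for this example only.
tokens = "graph based model graph based ranking".split()
word_scores = pagerank_word_scores(tokens)
phrase_scores = c_value({("graph", "based", "model"): 2, ("graph", "based"): 5})
```

In this toy run, `("graph", "based")` is nested in one longer candidate with frequency 2, so its C-value is log2(2) * (5 - 2) = 3. How the paper feeds graph-based scores into its modified C-value is described only in the full text and is not reproduced here.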


Bibliographic details

Source
Computer Speech and Language | 2019, No. 11 | pp. 304-318 | 15 pages
Authors
Yeom Hongseon; Ko Youngjoong; Seo Jungyun
Author affiliations

Sogang Univ, Dept Comp Engn, 35 Baekbeom Ro, Seoul 04107, South Korea;

Dong A Univ, Dept Comp Engn, 840 Hadan 2 Dong, Busan 604714, South Korea;

Sogang Univ, Dept Comp Engn, 35 Baekbeom Ro, Seoul 04107, South Korea;

Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
Keywords
Automatic keyphrase extraction; Graph-based ranking algorithm; C-value; PageRank; Information extraction

