Automatic keyphrase extraction using word embeddings

Yuxiang Zhang; Huan Liu; Suge Wang; W. H. Ip; Wei Fan; Chunjing Xiao


Abstract

Unsupervised random-walk keyphrase extraction models rely mainly on the global structural information of the word graph, in which nodes represent candidate words and edges capture the co-occurrence information between candidate words. However, using word embedding methods to integrate multiple kinds of useful information into the random-walk model, and thereby extract keyphrases more accurately, remains relatively unexplored. In this paper, we propose a random-walk-based ranking method that extracts keyphrases from text documents using word embeddings. Specifically, we first design a heterogeneous text graph embedding model to integrate local context information of the word graph (i.e., the local word collocation patterns) with some crucial features of the candidate words and edges of the word graph. Then, a novel random-walk-based ranking model is designed to score candidate words by leveraging the learned word embeddings. Finally, a new and generic similarity-based phrase scoring model using word embeddings is proposed to score phrases, and the top-scoring phrases are selected as keyphrases. Experimental results show that the proposed method consistently outperforms eight state-of-the-art unsupervised methods on three real datasets for keyphrase extraction.
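
The pipeline described in the abstract (word graph, random-walk word ranking, phrase scoring) can be illustrated with a minimal Python sketch. This is not the authors' model: the heterogeneous text graph embedding step is replaced by hand-made toy vectors, the random walk is a plain PageRank-style iteration whose edge weights multiply co-occurrence counts by embedding cosine similarity, and phrases are scored by simply summing word scores rather than by the paper's similarity-based phrase model. All function names, parameters, and data below are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
import math


def build_cooccurrence_graph(tokens, window=2):
    """Undirected word graph; edge weight = co-occurrence count within `window`."""
    graph = defaultdict(dict)
    for i, w in enumerate(tokens):
        for v in tokens[i + 1 : i + 1 + window]:
            if v != w:
                graph[w][v] = graph[w].get(v, 0.0) + 1.0
                graph[v][w] = graph[v].get(w, 0.0) + 1.0
    return dict(graph)


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def rank_words(graph, embeddings, damping=0.85, iterations=50):
    """PageRank-style walk (assumed scheme, not the paper's ranking model).

    Transition weight from u to v = co-occurrence count * cosine similarity
    of the two word vectors, clipped at zero.
    """
    words = list(graph)
    weights = {}
    for u in words:
        weights[u] = {
            # The small epsilon keeps every edge usable even at zero similarity.
            v: count * (max(cosine(embeddings[u], embeddings[v]), 0.0) + 1e-6)
            for v, count in graph[u].items()
        }
    out_sum = {u: sum(weights[u].values()) for u in words}
    scores = {w: 1.0 / len(words) for w in words}
    for _ in range(iterations):
        scores = {
            v: (1.0 - damping) / len(words)
            + damping * sum(scores[u] * weights[u][v] / out_sum[u] for u in graph[v])
            for v in words
        }
    return scores


def score_phrase(phrase, word_scores):
    """Baseline phrase score: sum of the scores of the phrase's words."""
    return sum(word_scores.get(w, 0.0) for w in phrase.split())


# Toy data: tokens stand in for candidate words, vectors for learned embeddings.
TOKENS = ["keyphrase", "extraction", "word", "embedding", "random",
          "walk", "word", "graph", "keyphrase", "ranking"]
TOY_EMBEDDINGS = {
    "keyphrase": (1.0, 0.2), "extraction": (0.9, 0.3),
    "word": (0.2, 1.0), "embedding": (0.3, 0.9),
    "random": (0.7, 0.7), "walk": (0.6, 0.8),
    "graph": (0.5, 0.5), "ranking": (0.8, 0.4),
}

word_scores = rank_words(build_cooccurrence_graph(TOKENS), TOY_EMBEDDINGS)
candidates = ["keyphrase extraction", "word embedding", "random walk"]
ranked_phrases = sorted(candidates, key=lambda p: score_phrase(p, word_scores), reverse=True)
```

In this sketch `ranked_phrases` holds the candidate phrases ordered from highest to lowest score. In practice the toy vectors would be replaced by embeddings learned from the document's word graph, and candidate phrases would come from a part-of-speech filter rather than a hard-coded list.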


Bibliographic information

Source
Soft computing: A fusion of foundations, methodologies and applications | 2020, Issue 8 | 16 pages
Authors
Yuxiang Zhang; Huan Liu; Suge Wang; W. H. Ip; Wei Fan; Chunjing Xiao
Author affiliations

1. School of Computer Science and Technology, Civil Aviation University of China, Tianjin, China (GRID grid.411713.1; ISNI 0000 0000 9364 0373)

2. School of Computer and Information Technology, Shanxi University, Taiyuan, China (GRID grid.163032.5; ISNI 0000 0004 1760 2008)

3. Department of Industrial and Systems Engineering, Hong Kong Polytechnic University, Kowloon, Hong Kong SAR, China (GRID grid.16890.36; ISNI 0000 0004 1764 6123)

Indexing information
Original format: PDF
Language: English
Chinese Library Classification: Computer software
Keywords
Keyphrase extraction; Random-walk-based keyphrase extraction model; Word embedding; Phrase scoring model


Similar literature


1. Automatic keyphrase extraction using word embeddings [J]. Yuxiang Zhang, Huan Liu, Suge Wang. Soft computing: A fusion of foundations, methodologies and applications. 2020, Issue 8

2. Geoscience keyphrase extraction algorithm using enhanced word embedding [J]. Qiu Qinjun, Xie Zhong, Wu Liang. Expert Systems with Applications. 2019, Issue JULa

3. Exploiting Position and Contextual Word Embeddings for Keyphrase Extraction from Scientific Papers [C]. Krutarth Patel, Cornelia Caragea. Conference of the European Chapter of the Association for Computational Linguistics. 2021

4. Evaluation techniques and graph-based algorithms for automatic summarization and keyphrase extraction [D]. Hamid, Fahmida. 2016

5. The Fractal Patterns of Words in a Text: A Method for Automatic Keyword Extraction [O]. Elham Najafi, Amir H. Darooneh

6. Semantic Unsupervised Automatic Keyphrases Extraction by Integrating Word Embedding with Clustering Methods [O]. Isabella Gagliardi, Maria Teresa Artese. 2020
