International Conference on Computational Linguistics

A Joint Learning Approach based on Self-Distillation for Keyphrase Extraction from Scientific Documents



Abstract

Keyphrase extraction is the task of extracting a small set of phrases that best describe a document. Most existing benchmark datasets for the task typically have limited numbers of annotated documents, making it challenging to train increasingly complex neural networks. In contrast, digital libraries store millions of scientific articles online, covering a wide range of topics. While a significant portion of these articles contain keyphrases provided by their authors, most other articles lack such kind of annotations. Therefore, to effectively utilize these large amounts of unlabeled articles, we propose a simple and efficient joint learning approach based on the idea of self-distillation. Experimental results show that our approach consistently improves the performance of baseline models for keyphrase extraction. Furthermore, our best models outperform previous methods for the task, achieving new state-of-the-art results on two public benchmarks: Inspec and SemEval-2017.
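The paper itself does not include code, but the core idea in the abstract — a model trained on a small annotated set while also learning from its own soft predictions on unlabeled data — can be illustrated with a minimal sketch. Everything below is a toy construction, not the authors' method: a single sigmoid scorer stands in for the keyphrase tagger, the 2-d feature vectors and the distillation weight `lam` are made up, and the teacher is simply a per-epoch frozen snapshot of the student.

```python
import math

def sigmoid(z):
    # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, z))))

def predict(w, x):
    # probability that a token/phrase is a keyphrase (toy linear model)
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))

def sgd_step(w, x, target, lr):
    # gradient of cross-entropy for a sigmoid unit: (p - target) * x
    p = predict(w, x)
    return [wi - lr * (p - target) * xi for wi, xi in zip(w, x)]

# hypothetical data: a few labeled examples and a larger unlabeled pool
labeled = [([1.0, 0.1], 1), ([0.9, 0.2], 1), ([0.1, 1.0], 0), ([0.2, 0.8], 0)]
unlabeled = [[1.1, 0.0], [0.0, 1.1], [0.8, 0.3], [0.3, 0.9]]

w = [0.0, 0.0]
lam = 0.5  # weight on the distillation term (assumed value)

for epoch in range(200):
    # 1) supervised updates on the small annotated set
    for x, y in labeled:
        w = sgd_step(w, x, y, lr=0.1)
    # 2) self-distillation: a frozen snapshot of the current model acts as
    #    its own teacher, producing soft targets on unlabeled examples
    teacher = list(w)
    for x in unlabeled:
        soft = predict(teacher, x)
        w = sgd_step(w, x, soft, lr=0.1 * lam)

# after joint training, the toy model separates the two clusters
assert predict(w, [1.0, 0.1]) > 0.5
assert predict(w, [0.1, 1.0]) < 0.5
```

The key design point mirrored here is that the distillation term trains the student against *soft* teacher probabilities rather than hard pseudo-labels, so unlabeled examples the teacher is unsure about contribute only a weak training signal.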
